Splitting Light: Season 1 - Episode 12


SFP+ shenanigans

It’s one thing to have the network working inside the board but until you are able to connect it to an external network it’s not very useful.

The C2 main board had four SFP+ cages and we had to make them work. An SFP+ cage is a metal slot into which you slide a compatible connector. It’s a standard connector that supports multiple types of cables and different transport media. Think of it as an Ethernet cable replacement with more bells and whistles.

That flexibility brings added complexity. You have to test every single connector and cable you can get your hands on to validate that you are configuring them correctly. There was a way to communicate with these connectors, so once Greg had set up the channel and a few of the cables worked, I picked up that work and continued on.

There were all kinds of these cables: passive copper, active copper, SFP to Ethernet, single-mode fiber, multi-mode fiber and long-range fiber, and each brand worked slightly differently.

We didn't have many data sheets so it was a lot of fiddling and a bit of reverse engineering to make them work. Plug in the cable, dump the information, plug in the other end, dump the information, link up, dump the information… You get the idea.

Then we compared the data dumps to spot the differences and used the few data sheets we had to infer most of the information. We were able to extract the link status, which frequencies the connector and cable supported, which modes they supported, the signal noise level in the cable and a lot of other information that we needed.
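To give an idea of what comparing two dumps looks like, here is a minimal Python sketch. It is not the tooling we used at the time: the function name `diff_dumps` and the sample bytes are made up for illustration, and it simply assumes each dump is a fixed-size block of raw bytes read from the connector.

```python
from typing import List, Tuple


def diff_dumps(before: bytes, after: bytes) -> List[Tuple[int, int, int]]:
    """Return (offset, old_value, new_value) for every byte that differs."""
    if len(before) != len(after):
        raise ValueError("dumps must have the same length to be compared")
    return [
        (offset, old, new)
        for offset, (old, new) in enumerate(zip(before, after))
        if old != new
    ]


# Hypothetical dumps taken before and after the link came up.
cable_plugged = bytes([0x03, 0x04, 0x00, 0x00])
link_up = bytes([0x03, 0x04, 0x01, 0x80])

for offset, old, new in diff_dumps(cable_plugged, link_up):
    print(f"offset 0x{offset:02x}: 0x{old:02x} -> 0x{new:02x}")
```

Bytes that never change between steps are likely static identity data, while bytes that flip when you plug in the other end or when the link comes up are good candidates for status fields.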

We built an internal database that had every cable we had identified and how exactly we should configure the network chip ports to make them work.
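A database like this can be pictured as a lookup table from an identified cable to the port settings it needs. The Python sketch below is only an illustration under assumptions: the key (vendor name plus part number), the `PortConfig` fields and every entry in `CABLE_DB` are hypothetical, not the real schema or real cables.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class PortConfig:
    """Hypothetical settings to apply to a network chip port."""
    medium: str        # e.g. "passive-copper", "fiber-multimode"
    speed_gbps: int
    autoneg: bool


# Hypothetical entries, keyed by (vendor, part number) as read from the cable.
CABLE_DB: Dict[Tuple[str, str], PortConfig] = {
    ("ACME", "DAC-1M"): PortConfig("passive-copper", 10, False),
    ("ACME", "SR-850"): PortConfig("fiber-multimode", 10, False),
}


def lookup(vendor: str, part_number: str) -> Optional[PortConfig]:
    """Return the known port configuration, or None for an unknown cable."""
    key = (vendor.strip().upper(), part_number.strip().upper())
    return CABLE_DB.get(key)


print(lookup(" acme ", "dac-1m"))
print(lookup("Unknown", "XYZ"))
```

Normalizing the strings before the lookup is a design choice for the sketch: identification fields read from hardware often come padded with spaces, and different brands do not agree on letter case.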

There were some additional difficulties. Some network equipment manufacturers did not recognize cables that they had not manufactured themselves. Even if we knew how to configure the cable, the devices from these companies wanted nothing to do with it. Sometimes the cable would work, but our device was rejected or not all available modes could be used. Having worked on these cables, I could understand why: unless you had one of the cables to test, you could not know whether it worked correctly. However, you could still enable a degraded mode, and someone at that manufacturer had decided not to do that.

It was a mix of physical tasks and abstract tasks. I could clearly see the data changing as I plugged in connectors and fiddled with settings in memory. To make sure we supported the cables we used in production, we had cables brought back from the data centers so that we could test them, and the equipment, as close to production conditions as we could. The workbench behind me had the raw main board just sitting on it with no casing, and on my desk was a pile of cables to implement. It was very interesting.

To pair with:

  • Vale W. Group - Shanghai Den
  • Naruto series by Masashi Kishimoto

Vincent Auclair
