Splitting Light: Season 1 - Episode 11


Compounding iterations

By that time we had received both the server board and the main board. Pierro was working on the boot sequence and the BIOS for the Intel chips. Jérome was working on the fan and temperature control in the firmware, as well as designing the chassis metalwork in CAD tools. I was enabling the network features of the network chip and writing some of the firmware code. Greg was already drawing the traces for the next generation in his EDA tool. Four people working on different bits and pieces to make the hardware work.

Even though it was a new design, both in terms of the server chip and the network chip, a lot of behind-the-scenes improvements were made compared to the first generation. Some of the issues we had found on our own; others came from user feedback.

The NAND chips were swapped for NOR chips. The fan and temperature control board was merged into the main board. The network chip had a dedicated management network instead of in-band management. The chassis form factor was switched from telecommunication width to standard rack width. Other small improvements that came from the experience of the first version were also made.

The biggest changes, I think, were for the end customers. The C1 server had 2 gigabytes of RAM and no local storage. This was an issue because it could not handle many types of workloads. For the C2 server, we added ways to better customize the servers. We could use three different versions of the Intel chip. Instead of soldered memory, we used standard memory sticks. Lastly, signals were drawn from the server board to a special place on the main board where we would vertically stack the storage drives.

A lot could be reused, but many more new things had to be qualified. Every one of these hardware features had to be thoroughly implemented and tested. To do that we had to write software in multiple places: firmware code, operating system code, application code, and tooling code. Then we had to devise tests to check that everything was correct. For the SD card slot, it was simple enough. Testing that the drives linked correctly at version 3 of the SATA protocol, and that there were no errors when we pushed a lot of data, was a bit more difficult.
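
Purely as an illustration, and not the tooling from that time, the link speed half of such a check can be sketched in Python on a Linux machine. It assumes the kernel exposes the negotiated speed of each link as a string such as "6.0 Gbps" in /sys/class/ata_link/*/sata_spd. The function names are made up for the example, and the second half of the test, watching for errors while pushing a lot of data, is not covered here.

```python
import re
from pathlib import Path

# Nominal link speed in Gbps for each SATA revision.
SATA_GBPS = {1: 1.5, 2: 3.0, 3: 6.0}


def parse_link_speed(text):
    """Return the speed in Gbps from a string like '6.0 Gbps', else None."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*Gbps\s*", text)
    return float(match.group(1)) if match else None


def links_below(revision, root="/sys/class/ata_link"):
    """List (link name, speed) for every link slower than the given revision.

    Assumption: each link directory under `root` holds a `sata_spd` file.
    Links whose speed cannot be parsed (for example an empty port) are
    reported with a speed of None so that a human can look at them.
    """
    wanted = SATA_GBPS[revision]
    slow = []
    for spd_file in sorted(Path(root).glob("*/sata_spd")):
        speed = parse_link_speed(spd_file.read_text())
        if speed is None or speed < wanted:
            slow.append((spd_file.parent.name, speed))
    return slow
```

Calling links_below(3) on the machine under test returns an empty list when every link reports the SATA version 3 speed, and otherwise the links that deserve a closer look.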

Whatever we worked on, most of the time the only information we had was the electrical traces, the PDF data sheets for dozens of components, and the underlying protocol information. Sometimes those data sheets were not very detailed, so you had to guess and check.

The impact of both the design and the physical changes was humbling. I could see the changes, I could understand the why and the end result, but the process of accomplishing the change was still beyond my reach.


We huddled around each time we received a new revision of the bright red boards, looking at the soldered components and flipping the boards around to check for issues. We closed ticket after ticket, one feature at a time, slowly moving forward centimeter by centimeter. This was the way.

To pair with:

  • Springtime Linn - Clark
  • Le Comte de Monte-Cristo (The Count of Monte Cristo) by Alexandre Dumas

Vincent Auclair

Symbol Sled

Business, tech, and life by a nerd. New every Tuesday: Splitting Light: The Prism of Growth and Discovery.
