Splitting Light: Season 1 - Episode 18



C2 Launch


Two years after I had joined the lab, the manufacturing orders for the second generation of compute hardware were sent. A few weeks later, the machines were being racked in the data center rooms, and shortly after that they were released for customer use.

Finally, all the hours we had put in and all the hair we had lost were worth it as the hardware hit production. I knew a lot of the components intimately. I had traced signals on the boards. I had handled the boards and the metal chassis many times, plugging in or seating components to run tests.

I remember doing one safety test on my hardware bench. I had pushed the CPU of the compute board to its maximum and was monitoring the temperature, power consumption and other parameters. Once it was stable, I took a soldering heat gun and blew 500 °C air onto the CPU heatsink. This test was to check that, as the temperature rose, the chip and operating system would start to reduce the frequency and power usage. We also wanted to check that past a certain threshold it would shut off as a safety mechanism. The chip went from 35 °C to 60 °C, started to throttle, climbed to 80 °C and abruptly shut down at 98 °C. Then I had to make sure it would turn back on and work correctly. I adjusted some of the settings and repeated the test. We ran those tests to ensure that the components would not melt down and be rendered into useless glass.

Other tests included sending multiple gigabytes of data per second between the compute boards while checking the dropped-packet counters, or checking the power consumption curves at power-up. Boy, did we learn a lot. Because the C2 used a standard 220V power supply, unlike the C1, it could be sent to a data center that the company did not own. For that, the hardware had to be tested for electromagnetic noise in a special room at a third-party facility. You could really feel the gap between theoretical knowledge and knowledge gained from experience. Both were absolutely necessary.

By the time the C2 was going live, the first generation network was also being installed in the data centers. What was the lab working on? It was working on the C3, preparing the second generation network and finalizing the design for a piece of storage hardware. That last piece of hardware would have a tremendous impact on my career and change its direction, but I could not know that yet.

We had three pieces of hardware from the lab live in production in our data centers, not counting all the SCADA equipment. The team had grown from five, when I joined, to six. The latest hire had joined to give a hand on the x86 BIOS because it was such a headache. Six people working to make hardware that had a lasting business impact. I don't know how long they had worked on the C1 before I joined in 2013, but at this point in time, in 2016, the lab was starting to give a good return on the investment.

But there was this nagging feeling. Somehow it felt as if no one believed us. No one believed a team of six people could do all these things. We were about to have interesting discussions.

To pair with:

  • Vood(oo) - Rone
  • The Fifth Head of Cerberus by Gene Wolfe



Vincent Auclair


Symbol Sled

Business, tech, and life by a nerd. New every Tuesday: Splitting Light: The Prism of Growth and Discovery.
