Business, tech, and life by a nerd. New every Tuesday: Splitting Light: The Prism of Growth and Discovery.
Splitting Light: Season 2 - Episode 37
Published about 6 hours ago • 4 min read
Hardware made redundant
Around May 2019
In May 2019, the hardware lab’s C3 was in production. It was the third-generation compute hardware, the last one I had worked on before switching to storage. There was an issue with the compute node: the way it had been fitted with RAM and storage created a voltage drop under certain conditions, which would shut the node down. The hardware team had to solder a capacitor on-site to fix the issue. But how did that issue arise in the first place?
The nodes had been procured with a configuration that had not been tested in the lab, even though, since the C2 in 2015, we, the lab, had requested that every deployed configuration be tested in the lab before procurement and deployment. This step, for some reason, had been skipped this time.
Around May 2019, I had initiated the discussion to start the process to create custom storage hardware. Théo (a) and Greg (b) were defining the requirements. It was now appropriate since we had gained a lot of experience and could focus on what would really make a difference. What would bring leverage.
C1 servers without radiator: codename Pimouss (1)
It was a surprise when one day a message was posted on Slack. It shocked everyone. It shocked me to the core. It nearly broke me. It was suddenly announced that Scaleway would no longer be doing custom hardware. The hardware team and the lab were declared redundant. We would immediately stop the R&D and no longer have custom hardware.
C2 chassis: codename Suchard (2)
I correlated the shutdown of the lab to the capacitor issue. But who was at fault? Who had created the bug? Was it the hardware team that had not installed that capacitor? Or the person who placed the procurement order for components that had not been tested?
Ever since we had started doing custom hardware, the instruction had always been to test before procurement. The lab had to approve the configuration. Yet for some reason, that step had been skipped.
First generation network router: codename Choucou (3)
After confusion came anger. I could not find any logical reason why such a decision had been taken. But then, none of us were in the meeting where that decision was taken. Making the hardware team redundant would have a very long-lasting impact. We, in the storage team, would feel it deeply. Some things were only possible at the hardware level.
When Scaleway had to start doing hardware again, they would have to start from scratch. That time would come; I was 100% certain of that. Almost every single hyperscaler does custom hardware. The vendor relations would have to be rebuilt. The experience re-acquired. The processes streamlined again. Everything would have to be redone. It was such a waste. Scaleway did start doing hardware again 3 years later and expanded later on.
C3 chassis rack: codename Tagada (4)
For me personally, the impact was very hard. All that metaphorical blood and sweat. All those years spent, all that effort invested, it was for nothing. There was no legacy. I thought the lab was our single differentiator. The one thing that made us different from other small cloud providers. Our pride. Just like Free, the telecom operator, builds its own hardware, we did it ourselves and it brought us business value. Or so I thought. I would later understand that it did have a legacy, just not the one I thought.
Racks of cold storage: codename MrFreeze (5)
After the hardware team was suddenly disbanded, its members were told to find another product to work on. After a few weeks they found one: they built the IoT product line. Gregoire continued to work on heavy R&D within Scaleway and built many other products.
For me, the anger eventually led to sadness. I can now see in me the legacy of the hardware team as well as the legacy of this distressing event. From that event, I learned to always prove the value of what I built with business numbers and to focus a lot more on potential issues before they happened. The legacy of my work, I find it everywhere with me. The knowledge, the work process, the resourcefulness, the resilience… This is the legacy. The legacy was not bundles of metal, silicon and plastic. The legacy is what was etched in my mental circuits.
I believe this is a photo of the board of a second generation router: codename Croco (6)
In parallel, as I was signing my mortgage, at work we had to assess the tech debt.