Splitting Light: Season 1 - Episode 29



Hardware documentation


For each new piece of hardware, we handed a lot of documentation to the cloud team or the dedicated team. This documentation helped them implement support for the hardware, adapt their information system, and operate it. Their day to day was managing the hardware, and we tried to make that as simple as possible for them.

There is a reason why computer engineering is layer upon layer of abstractions: you can't know everything. Abstractions are useful because they free up cognitive space in the head of the engineer writing code, which makes it easier to add new features. That is exactly what an operating system does: it abstracts the hardware into a set of more or less standard interfaces, exposed through what is called an application programming interface (API).
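As a toy illustration (my own, not from the project), here is the file API doing exactly that kind of abstraction: the same call works whatever storage sits underneath.

```python
import os
import tempfile

def store(path: str, data: bytes) -> int:
    # open/write/close is the abstraction; the kernel and its drivers
    # translate these calls into whatever the hardware actually needs,
    # whether the bytes land on NVMe, SATA, or a network filesystem.
    with open(path, "wb") as f:
        return f.write(data)

with tempfile.TemporaryDirectory() as d:
    written = store(os.path.join(d, "demo.bin"), b"hello")
    print(written)  # 5 bytes written, regardless of the disk below
```

The caller never sees a block device, a driver, or a bus; that is the whole point of the layer.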

If you stretch the concept of abstraction layers, it also applies to the relationship we had with the teams who used our hardware. Our mission was to give them an abstraction layer for the hardware. They did not need to know which protocols were used or how the components talked to each other. What mattered was how to operate it.

They did not need to know these details; we did. What we strived for was to keep the layers from being pierced by the concepts beneath them. If the documentation said that to diagnose an SFP you needed to read several I2C pages on a particular sub-system, it was useless to them. The mental cost of understanding that was out of proportion to the importance of the task.
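To sketch what that layering looks like in practice (a hypothetical example, not our actual toolset): the operator calls one function and gets a readable answer, while the I2C page details, here the SFF-8472 diagnostic page of an SFP module, stay below the line. `read_diag_page()` is a stand-in for the real bus access.

```python
def read_diag_page() -> bytes:
    # In production this would read the module's diagnostic page over
    # I2C (address 0xA2 for SFF-8472 modules). Canned bytes keep the
    # sketch self-contained.
    page = bytearray(256)
    page[96], page[97] = 0x1A, 0x80  # temperature: 26 + 128/256 = 26.5 C
    return bytes(page)

def sfp_temperature_c(page: bytes) -> float:
    # SFF-8472 temperature: signed 16-bit value at bytes 96-97,
    # in units of 1/256 of a degree Celsius.
    raw = int.from_bytes(page[96:98], "big", signed=True)
    return raw / 256.0

def diagnose_sfp() -> str:
    # The operator-facing abstraction: one call, one readable answer.
    temp = sfp_temperature_c(read_diag_page())
    return f"SFP temperature: {temp:.1f} C"

print(diagnose_sfp())  # SFP temperature: 26.5 C
```

The operator sees the last function only; the byte offsets and bus addresses live in our layer, where they belong.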

We were knowledgeable in many protocols: CAN, I2C, SPI, 1-Wire, UART, and we had some basics in USB. We relied on vendor datasheets, and sometimes on a bit of reverse engineering. Each component had its datasheet; sometimes you needed three or four of them open at once to implement a feature or fix a bug.

I didn't expect the operators to know how to do these things, just as they did not expect me to understand their information system and the procedures to hand over a virtual machine to a client. The documentation was written to reflect those knowledge boundaries. The procedures were simple. If something was too complicated, they would escalate to us, and we would fix it or add a feature to the toolset to help them manage production.

It was a constant back and forth of talking and understanding needs. In a sense, not unlike my personal life.

If you have missed it, you can read the previous episode here

To pair with:

  • Low Pressure Zone - Clubroot
  • Absolution Gap by Alastair Reynolds

Vincent Auclair



Symbol Sled

Business, tech, and life by a nerd. New every Tuesday: Splitting Light: The Prism of Growth and Discovery.
