Splitting Light: Season 1 - Episode 21


1+ meter track length

This next assignment would become the premise of a career pivot, even though I could not know it at the time. Greg had interesting ideas that I can only fully appreciate now that I am much more experienced and battle-tested. He did the hardware designs in a modular way. There had been a project to reuse the C1 node and package it like a Raspberry Pi, but it fizzled out because handling mass-market hardware is a very different world from handling datacenter hardware. From that project, on which I had done a bit of software qualification, a storage board was born. A new type of hard drive had just come out: SMR drives. They had a higher capacity and were 25% less expensive per GB. However, they had one flaw, or constraint, depending on how you saw it: random read/write performance was very bad.

What if you could hedge this? Greg spun a new design in which 56 3.5-inch drives plugged vertically into a large, squarish PCB, with a C1 node slotted in at one corner. A maze of lanes and small components routed and switched the SATA data lanes to the node.

Once the voltage tests were done, I was handed the board and started to qualify the 56 slots. I wrote a bit of Python to help me, but 90% of the process was manual. I would plug an SSD into the slot, power it up from the terminal, check that the drive linked up with the operating system, check for data errors while transferring some data back and forth, then power down the drive and move on to the next slot.
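The data-error check in that routine can be sketched in a few lines of Python. This is not the script I used back then, only a minimal illustration: the function names `sha256_of` and `round_trip_ok` are made up for the example, and the temporary directory stands in for wherever the drive under test would be mounted.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large test files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def round_trip_ok(directory, size_bytes=4 << 20):
    """Write random bytes under `directory`, read them back, compare hashes."""
    payload = os.urandom(size_bytes)
    expected = hashlib.sha256(payload).hexdigest()
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())  # push the data out to the device
        return sha256_of(path) == expected
    finally:
        os.remove(path)  # leave the drive clean for the next run


if __name__ == "__main__":
    # Hypothetical stand-in: a real run would point at the mount point
    # of the drive sitting in the slot under test.
    with tempfile.TemporaryDirectory() as mount_point:
        print("slot OK" if round_trip_ok(mount_point) else "data mismatch")
```

One caveat with a sketch like this: the read-back may be served from the operating system's page cache rather than from the drive, so a serious test would need to bypass or drop that cache before reading.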

Right away, I found that half of the slots were not working. After checking the tracks and the schematics, we found the issue. For simplicity, one of the SATA buses had some of its SerDes signal pairs inverted, which meant that the system on chip (SoC) was receiving positive voltage on the wire where it expected negative, and vice versa on the other wire. It was documented that we could invert the lanes by configuring the SATA PHY, but the specific configuration was nowhere to be found in our documents. I sent an email to our chip support engineer for assistance.
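To see why a crossed pair breaks the link, and why a polarity setting on the receiving side repairs it, here is a toy model in Python. It is only an illustration: the voltage levels and the function names (`drive`, `swap_pair`, `receive`) are assumptions for the example, not the actual PHY registers or SATA signal levels.

```python
def drive(bits, high=0.25, low=-0.25):
    """Encode bits as (P, N) voltage pairs; P and N always carry opposite levels."""
    return [(high, low) if bit else (low, high) for bit in bits]


def swap_pair(samples):
    """Model a board where the P and N tracks of a lane are crossed."""
    return [(n, p) for p, n in samples]


def receive(samples, invert_polarity=False):
    """Recover bits from the sign of P - N, optionally flipping them
    the way a polarity-inversion option on the receiver would."""
    bits = [1 if p - n > 0 else 0 for p, n in samples]
    return [1 - b for b in bits] if invert_polarity else bits


sent = [1, 0, 1, 1, 0, 0, 1, 0]
crossed = swap_pair(drive(sent))

print(receive(crossed))                        # every bit comes out flipped
print(receive(crossed, invert_polarity=True))  # matches `sent` again
```

The receiver only looks at the difference between the two wires, so crossing them flips every bit, and flipping them back in configuration is enough to recover the data without touching the board.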

Continuing on the working SATA bus, I was powering up a new slot when I heard a big mechanical snap. A few seconds later I smelled burnt plastic, and my device was no longer responding. I turned right away to the test bench. The differential circuit breaker had done its job and prevented a fire. We slowly disconnected everything and started inspecting the board.

We turned it around, looked everywhere, and looked at the schematics, but couldn't find anything. Eventually Greg found the issue. To understand it, you have to understand how a SATA connector is seated. It's actually a sort of bridge that is seated or soldered, but the place where the connector meets the line pads on the board is open and visible. In the SATA slot I had just turned on, underneath the connector, were several solder bubbles. They were the cause of the short circuit. We had been trying out a less expensive manufacturer, and they had not caught the defect. Neither had we, until it burned a few components. I cleaned up the bubbles with desoldering braid, replaced the burnt components and continued the tests.

Our support engineer had responded with the memory register configuration, and after patching the operating system a bit we had the second SATA bus working. Continuing the tests, I found that some of the slots were not reachable. The components that controlled them did not respond to my commands. The digital oscilloscope came to the rescue. My first probes, at the exit of the C1 node, didn't show anything unusual. Greg suggested I probe the component directly. I did and, lo and behold, there was something unusual. One of the things the presenter in "Indistinguishable from magic" had said was that digital is analog. The resistance of the copper along the length of the track had attenuated the signal enough that it was out of spec for the components. We increased the power of the signal and it became very digital (square) again.
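As a rough back-of-the-envelope illustration of how much plain copper a long track adds, the DC resistance of a trace is its resistivity times its length, divided by its cross-section. Every dimension below is an assumption picked for the example (a 1 m track, 0.15 mm wide, in 35 µm copper, with a 100 Ω load), not a measurement from the board, and the sketch ignores capacitance and every other high-frequency effect.

```python
COPPER_RESISTIVITY = 1.68e-8  # ohm * metre, approximate value around 20 °C


def trace_resistance(length_m, width_m, thickness_m,
                     resistivity=COPPER_RESISTIVITY):
    """DC resistance of a rectangular trace: R = rho * L / (w * t)."""
    return resistivity * length_m / (width_m * thickness_m)


def voltage_at_load(v_source, r_trace, r_load):
    """Voltage left across the load once the trace has taken its share
    (a plain resistive divider)."""
    return v_source * r_load / (r_trace + r_load)


# Assumed geometry, for illustration only.
r = trace_resistance(length_m=1.0, width_m=0.15e-3, thickness_m=35e-6)
print(f"trace resistance: {r:.2f} ohm")
print(f"voltage at a 100 ohm load from 3.3 V: {voltage_at_load(3.3, r, 100.0):.3f} V")
```

With those assumed numbers the track alone comes out at a few ohms, which is the kind of loss that goes unnoticed on a short run and starts to matter when the track crosses a whole board.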

After testing every single slot, I wrote some Python code to make the board more manageable, as well as documentation for the hardware. I was ready to hand it over. But that would not happen…

To pair with:

  • Megumi the Milkyway Above - Connan Mockasin
  • Magician: Apprentice by Raymond E. Feist


Vincent Auclair
