Business, tech, and life by a nerd. New every Tuesday: Splitting Light: The Prism of Growth and Discovery.
Splitting Light: Season 2 - Episode 17
Published about 2 months ago • 3 min read
Increasing scope
Just as we were iterating to reduce risk and gain experience, we were also increasing the scope of our work as the iterations went.
We didn’t expect to have to handle the hardware. Yet we learned how to choose it: how many servers we wanted per rack and why, and how to measure the actual power load at full usage. With the help of the network team, we chose the right routers and cabling and finally drew up the bill of quantities (BoQ). All that while learning to speak to the vendors. And at last we submitted the orders.
A top view of a storage server that we used in object storage (1)
Then it was taking over the authentication components. The Identity and Access Management (IAM) & Billing team was too busy to deliver what we required, so we built our own authentication service and made the database change ourselves. We connected to their database and added the required token.
Then we had difficulties deploying our supporting services. We needed a dozen virtual machines but there was no platform team yet. So… We built our own infrastructure, provisioning the required hardware.
There was a push to have a single application programming interface (API) entry point. We couldn’t get guarantees for bandwidth and requests per second capacity. We also used a different transport protocol. So we incorporated our own load balancer. That meant choosing the hardware, doing the network design, and assembling the right software.
A view of the back of an object storage rack (2)
Historically, the first version of S3 had used the Scaleway domain as its top level, with a bucket being `bucket.s3.scaleway.com` or something similar. We were against using that domain for two reasons. The first was the domain's reputation score. A bucket could be used to host customer data, and we did not control what customers put in it. It could be illegal content, which could hurt the domain's score. The second was that AWS used a different domain, so why should we think we were smarter than them?
Increasing scope, like layering an onion
So, after some back and forth, we chose one. Théo then purchased it for the company. It was the scw.cloud domain. We promptly hooked it up and used it for object storage. Every other team subsequently started using it.
Nicolas, with two screens hooked up (3)
It’s not that we wanted to do everything ourselves. It’s that we assessed each component. Was there a clear team that was responsible for this? If not, we would do it ourselves. If yes, did the team have something that matched our requirements? A requirement could be API availability or simply being able to handle the required throughput. If not, we did it ourselves.
My general idea at the time was that everyone was building. We could take shortcuts and maintain them until we could hand the components back to the right team.
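For the technically minded, the assessment above can be written down as a small decision rule. This is only an illustrative sketch, not something we ran at the time: the `Component` class, its fields, and the `build_ourselves` function are names invented for this example, and the two sample components are simplified from the stories told earlier in this episode.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Component:
    """Hypothetical record describing one dependency we had to assess."""
    name: str
    owning_team: Optional[str]        # None when no team is clearly responsible
    meets_requirements: bool = False  # e.g. API availability, required throughput


def build_ourselves(component: Component) -> bool:
    """Return True when we should take temporary ownership of the component."""
    if component.owning_team is None:
        # No clear owner: we do it ourselves.
        return True
    # An owner exists: we only step in if their offering does not match our needs.
    return not component.meets_requirements


# Simplified examples based on the episode; the flags are assumptions.
components = [
    Component("virtual machines", owning_team=None),
    Component("authentication", owning_team="IAM & Billing", meets_requirements=False),
]

for c in components:
    decision = "build it ourselves" if build_ourselves(c) else "use the team's offering"
    print(f"{c.name} -> {decision}")
```

The rule deliberately says nothing about how long we keep a component. Ownership was meant to be temporary, so in practice each "build it ourselves" answer came with the intent to hand the component back later.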
It was a balancing act. It was required because of Scaleway’s context. As time flowed, the context changed. A few years later it would require different methods.
Why was it so important to take temporary ownership? Because of latency.