- Scalability and load testing VALORANT: Nice discussion of how to set up a load-testing harness. “Simulated player”, “scenario”, and “player pool” are the basic abstractions they settled on. The architectural concerns for the game server that they thought about up front were microservices, sharding their data store, and caching.
- Strava, The Boring Option: A story about a schema design decision (the width of an id field in one of their tables) that worked great from 2009 to 2020 but then needed to change. A 32-bit unsigned, monotonically increasing id field is good for about 4.3 billion unique values before it wraps around. Depending on how quickly you use them up, that can last a long time. It did for Strava. The COVID-19 pandemic meant their users were using the service far more than normal, which accelerated the need for rework here. They were pragmatic about what they did. They considered different datastores for this data (a huge table with lots of read/write activity) but in the end decided they knew MySQL and were comfortable with it. They found a way to solve it with their current datastore, and reserved the right to consider different ones in the future, because they had a problem to solve today. Great story!
- DNS load balancing: This company is using DNS load balancing to good effect for some of its traffic. Not machine-to-machine API traffic, it sounds like. (It works OK for human clients that honour the TTL.) The two big problems with DNS load balancing are 1) uneven distribution of load (a problem for load balancers too, but there you at least have some say in how requests are forwarded), and 2) how failed servers get removed from the pool.
- Cloudflare postmortem (Byzantine failure in etcd cluster): Interesting. A few distributed systems bolted together to create a bigger one. Each individual component is “fault tolerant” on its own, but new kinds of failure emerge when they are connected to each other. Keep it boring for as long as you possibly can! This is usually a lot longer than you think.
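
On the Strava item above, the arithmetic behind a 32-bit id running out is easy to sketch. This is a rough illustration only: the function name and the figures in the usage lines are made up for the example and are not taken from Strava's post. It shows why a jump in usage pulls the exhaustion date forward so sharply.

```python
# Rough sketch: how long until an unsigned 32-bit auto-increment id runs out.
# The function name and all numbers used with it are hypothetical.

UINT32_MAX = 2**32 - 1  # 4,294,967,295, the largest unsigned 32-bit value


def days_until_exhausted(current_max_id: int, ids_per_day: float,
                         limit: int = UINT32_MAX) -> float:
    """Days of headroom left at a constant allocation rate."""
    if ids_per_day <= 0:
        raise ValueError("ids_per_day must be positive")
    remaining = max(limit - current_max_id, 0)
    return remaining / ids_per_day


# Hypothetical table already three quarters of the way through the id space.
used = 3 * 2**30
normal = days_until_exhausted(used, ids_per_day=1_000_000)
surge = days_until_exhausted(used, ids_per_day=2_000_000)
print(f"normal rate: {normal:.0f} days, doubled rate: {surge:.0f} days")
```

Doubling the allocation rate halves the remaining runway, which is roughly the situation a sudden surge in usage creates.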