Categories
links

Links

  • The majestic monolith: A post from DHH about monoliths and microservices that resonates with me quite a bit. The side effects of complexity in a system that I’m thinking of right now: failures, emergent behaviour, dev + operator cognitive load, trickier production support, … Many, many applications don’t need the extra complexity now and never will
  • Humble objects: Making a class easier to test by factoring smaller pieces out into their own, easily tested classes. I’ve recently heard the term “sprouting” used for the same idea (a small sketch follows below)
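
A minimal sketch of the idea, with hypothetical names and a made-up scenario (not from the linked post): the hard-to-test I/O stays in a thin “humble” shell, while the decision logic is sprouted into a small class that plain unit tests can hit directly.

    import java.math.BigDecimal;

    // Small, pure class: trivially unit-testable with ordinary assertions.
    final class DiscountPolicy {
        BigDecimal discountFor(int itemCount) {
            return itemCount >= 10 ? new BigDecimal("0.15") : BigDecimal.ZERO;
        }
    }

    // Humble shell: entangled with I/O, so it keeps almost no logic of its own.
    final class CheckoutEndpoint {
        private final DiscountPolicy policy = new DiscountPolicy();

        String handle(int itemCount) {
            // ...parse the request, call the payment gateway, render a response...
            return "discount=" + policy.discountFor(itemCount);
        }
    }

A test can now exercise DiscountPolicy directly without standing up CheckoutEndpoint at all.
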
Categories
links

Links

  • Garbage collection in JDK 16: ZGC enhancements reduce GC pause time. Memory relocation during heap collections is more efficient, and scanning of the heap root object set no longer happens inside the pause at all.
  • Name your thread pools: Being able to trace work back to its origin in a system doesn’t happen on its own. You have to plan for it. So important (a small sketch follows below).
  • Serverless app: Lenskart built a system out of simple components that performs well for the current feature set, at a reasonable cost
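
A minimal sketch of doing that in Java with a naming ThreadFactory (the pool and thread names here are made up):

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ThreadFactory;
    import java.util.concurrent.atomic.AtomicInteger;

    final class NamedThreadPools {

        // Builds a fixed-size pool whose threads are named "<poolName>-worker-N".
        static ExecutorService newFixedPool(String poolName, int size) {
            AtomicInteger counter = new AtomicInteger(1);
            ThreadFactory factory = runnable ->
                    new Thread(runnable, poolName + "-worker-" + counter.getAndIncrement());
            return Executors.newFixedThreadPool(size, factory);
        }

        public static void main(String[] args) {
            ExecutorService pool = newFixedPool("invoice-export", 2);
            // Thread dumps and log lines now show "invoice-export-worker-1" instead of
            // "pool-1-thread-1", so the work can be traced back to where it came from.
            pool.submit(() -> System.out.println("running on " + Thread.currentThread().getName()));
            pool.shutdown();
        }
    }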

Categories
links

Links

  • Async task framework design doc from Dropbox: Nice discussion of the design of their job scheduler service. At-least-once execution, priorities, no concurrent execution of the same task, and guaranteed start times for most jobs, at a scale of 10,000 jobs per second (at least at the time of writing). A small sketch of what at-least-once means for task authors follows below.
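
A hedged sketch (my own names, not Dropbox’s API) of the consequence of at-least-once execution: the scheduler may deliver the same task more than once, so handlers need to be idempotent, for example by deduplicating on a stable task id.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Hypothetical task shape for an at-least-once scheduler.
    interface Task {
        String taskId();   // stable id, reused across redeliveries
        int priority();    // assumed convention: lower value runs sooner
        void run();
    }

    final class IdempotentRunner {
        private final Map<String, Boolean> completed = new ConcurrentHashMap<>();

        // Runs the task unless an execution with the same id already completed.
        // A real system would persist this record; the in-memory map is only for illustration.
        void execute(Task task) {
            if (completed.containsKey(task.taskId())) {
                return; // duplicate delivery; already ran to completion
            }
            task.run();
            completed.put(task.taskId(), Boolean.TRUE);
        }

        public static void main(String[] args) {
            IdempotentRunner runner = new IdempotentRunner();
            Task sendEmail = new Task() {
                public String taskId() { return "email-42"; }
                public int priority()  { return 1; }
                public void run()      { System.out.println("sending email 42"); }
            };
            runner.execute(sendEmail); // runs
            runner.execute(sendEmail); // redelivered: skipped
        }
    }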

Categories
systems

An Availability Story

Marc Brooker from AWS talks about availability. 20 minutes, very relevant stuff.

  • Availability is personal
  • Correlated failure limits availability
    • Redundancy isn’t always perfect (e.g. single points of failure)
  • Blast radius is critical to availability
  • My availability depends on the availability of my dependencies (rough arithmetic below)
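
To make that last point concrete (my own back-of-the-envelope numbers, not Marc’s): if a service is itself 99.99% available but makes hard, serial calls to three dependencies that are each 99.9% available, the best it can offer is roughly 0.9999 × 0.999³ ≈ 0.997, i.e. about 99.7%, lower than any of the individual figures.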

The purpose of our system is not to hit an availability goal (99.95% uptime); it’s to serve our customers (people!). An uptime goal is a proxy for that.

Source

Categories
links

Links

  • Post-incident report from Twilio for the Feb 26, 2021 incident: Nice writeup with aggressive, hopefully impactful action items. During incident response, a critical-path service was found to have insufficient capacity and autoscaling behaviour; when it went down, dependent services followed. Dependent services have since been built to handle a failure of this upstream service (a hedged sketch of that kind of guard follows below)
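
A minimal sketch of one way dependents can survive an upstream outage (a generic circuit breaker with a fallback; not Twilio's actual remediation, and all names here are mine):

    import java.util.function.Supplier;

    // Tiny circuit breaker: after repeated failures, stop calling the upstream for a
    // cool-down period and serve a fallback instead, so the caller degrades gracefully.
    final class CircuitBreaker<T> {
        private final int failureThreshold;
        private final long openMillis;
        private int consecutiveFailures = 0;
        private long openedAt = 0;

        CircuitBreaker(int failureThreshold, long openMillis) {
            this.failureThreshold = failureThreshold;
            this.openMillis = openMillis;
        }

        synchronized T call(Supplier<T> upstream, Supplier<T> fallback) {
            boolean open = consecutiveFailures >= failureThreshold
                    && System.currentTimeMillis() - openedAt < openMillis;
            if (open) {
                return fallback.get(); // upstream presumed down; don't pile on more load
            }
            try {
                T result = upstream.get();
                consecutiveFailures = 0; // a healthy response closes the breaker
                return result;
            } catch (RuntimeException e) {
                if (++consecutiveFailures >= failureThreshold) {
                    openedAt = System.currentTimeMillis(); // trip open for the cool-down
                }
                return fallback.get();
            }
        }
    }

Callers would wrap each request to the upstream service in something like breaker.call(() -> client.fetch(), () -> cachedResponse), trading a stale or degraded answer for staying up.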