Marc Brooker from AWS talks about availability. 20m, very relevant stuff.
- Availability is personal
- Correlated failure limits availability
- Redundancy isn’t always perfect (eg. Single points of failure)
- Blast radius is critical to availability
- My availability depends on the availability of my dependencies
The purpose of our system is not to hit an availability goal. (99.95% uptime) It’s to service our customers. (People!) An uptime goal is a proxy for this.