Checklist: Prod Readiness

Things I think about when I’m getting a new application ready to run in prod. Each of them has caused me pain in one way or another at some point … 🙂

  • How critical is it to the business? (Should an engineer be woken up in the middle of the night if it goes down?)
  • Monitoring
    • Metrics for a webapp: traffic volume, latency, and errors
    • Logs are being shipped to a central place where we can set up filters and alerts on them
      • Are logs being rotated? Should they be?
    • Exceptions are being captured and reviewed by somebody
  • Is there data to be backed up? (If we are taking backups we should be verifying we can restore them)
  • Do we have environments including develop, staging, and production, and a process to promote changes through them?
    • How do we deploy new versions of this?
  • Is it well documented?
    • Service pages are nice (Who owns the service, an architecture diagram, links to runbooks, links to dashboards)
  • Show me the tests! (Unit, end-to-end, and others. They should be automated and able to run all the time)
  • Have we gone through a threat modelling exercise with it? (Talk about principals, goals, adversities, invariants)
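
For the monitoring item above, here is a rough sketch of what tracking traffic volume, latency, and errors for a webapp might look like. It is stdlib-only Python, and the names (`RequestMetrics`, `record`, `summary`) are made up for illustration. It also assumes that only 5xx responses count as errors and that a 95th-percentile latency is the number you care about. In a real setup you would most likely hand this job to a metrics library or agent rather than roll your own.

```python
import math
from dataclasses import dataclass, field


@dataclass
class RequestMetrics:
    """Hypothetical in-memory tracker for the three basic webapp signals."""

    latencies_ms: list = field(default_factory=list)
    errors: int = 0

    def record(self, latency_ms: float, status_code: int) -> None:
        # Every request counts towards traffic volume and latency.
        self.latencies_ms.append(latency_ms)
        # Assumption: only 5xx responses are treated as errors.
        if status_code >= 500:
            self.errors += 1

    def summary(self) -> dict:
        total = len(self.latencies_ms)
        if total == 0:
            return {"requests": 0, "error_rate": 0.0, "p95_latency_ms": None}
        ordered = sorted(self.latencies_ms)
        # Nearest-rank 95th percentile: the value at position ceil(0.95 * n).
        rank = math.ceil(0.95 * total)
        return {
            "requests": total,
            "error_rate": self.errors / total,
            "p95_latency_ms": ordered[rank - 1],
        }


# Example usage with a handful of made-up requests (latency in ms, HTTP status).
metrics = RequestMetrics()
for latency, status in [(12.0, 200), (48.5, 200), (950.0, 503), (30.1, 404)]:
    metrics.record(latency, status)
print(metrics.summary())
```

Keeping the raw latencies in a list is fine for a sketch but grows without bound. Anything long-running would want fixed-size buckets or a rolling window instead.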