- How big technology changes happen at Slack: Explore, expand, migrate. They’ve chosen a practice with three distinct phases in which anyone (or nearly anyone?) can advocate for a new technology, but they must convince their peers of its value by getting other people in the org to use it.
- Most experiments fail fast, which is something they like. The ones that do achieve widespread adoption make it to the migration phase, where the company actively rolls them out across the board.
- It sounds great, but my question would be: how do you stop a proliferation of technologies from being put to use in different spots? The maintainability of a system in such a state seems monstrous. If you bake something new that no one else uses deeply into a service, you have to learn that new thing in order to properly support and enhance that service. Does every service at Slack have one to three things like this that are uniquely its own? How does this shake out? How do experiments work?
- There has to be some friction to get to phase 1 (along with a bunch of communication across the immediate team). You always start with a real problem you need to solve. Can you find one or a few other people who are also concerned about your problem and talk it through with them?
- Another great tracing post from Slack: This one predates the last one and describes the internal project requirements, the problem they were trying to solve, and the limitations of Zipkin and Jaeger. One key point was that the tracing system should be useful for non-backend use cases.
- Client tracing at Slack: Talks about how Slack is able to visualize what happens when a request is sent from a client (browser, application) to the backend. Really neat. Mentions Honeycomb.
- Lightstep distributed tracing guide: A high-level guide that covers tracing, sampling, and when you need to think about this stuff. It contrasts head-based sampling (i.e. the decision to trace is made up front, at the start of a request, and tracing can use a non-trivial amount of server resources) with tail-based sampling (you’ve already buffered the data and can decide to keep it or throw it away based on whether it contains anything interesting).
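
To make the head-based vs. tail-based distinction concrete, here is a minimal Python sketch. It is not taken from the Lightstep guide or from any tracing library: the `Span` type, the `head_sample` and `tail_sample` functions, and the 500 ms "slow" threshold are all hypothetical names and values chosen for illustration.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Span:
    """Hypothetical span record; real tracers carry far more fields."""
    name: str
    duration_ms: float
    error: bool = False


def head_sample(rate: float, rng: random.Random) -> bool:
    """Head-based: decide once, up front, before any span is recorded.

    The decision cannot depend on how the request turns out.
    """
    return rng.random() < rate


def tail_sample(spans: List[Span], slow_ms: float = 500.0) -> bool:
    """Tail-based: the spans are already buffered, so inspect them and
    keep the trace only if something looks interesting (assumed here to
    mean an error or a slow span)."""
    return any(s.error or s.duration_ms >= slow_ms for s in spans)


# Two buffered traces: one boring, one containing a failed span.
fast = [Span("GET /ping", 12.0)]
failed = [Span("GET /channels", 40.0), Span("db.query", 35.0, error=True)]

print(tail_sample(fast))    # False: nothing interesting, drop it
print(tail_sample(failed))  # True: contains an error, keep it
```

The trade-off shows up in the signatures: `head_sample` needs no data, so it is cheap but blind to errors, while `tail_sample` needs the whole buffered trace before it can decide.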