Watching LizTheGrey’s stream: https://www.twitch.tv/videos/821587009
She’s fantastic. She’s walking through the computer science principles that factor into the choices she’s making
Today I learned
- Thinking about how much work we’re doing is an important exercise. Avoiding repeatedly computing the same thing makes a lot of sense (relates to algorithmic complexity)
- If you are making an assumption about input (e.g. no negative values, no zeros) you can make this explicit by adding an assertion so that your program fails fast (with, hopefully, a helpful error) when that assumption is invalidated
- This one is a bit of a bugger. My solution uses a binary search strategy with low/high pointers that shift as you close in on the target. Not hard to write, but it was a bit fiddly
- Liz (and others I can see) did something much simpler: set ‘B’ -> 1 and ‘R’ -> 1 in the input, set ‘F’ and ‘L’ to zero (or ignored them), and with hardly any extra work got the answer. What the hell is going on here?
Have to think about this one a bit more
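Here’s a sketch of what I think is going on, in Python (the sample pass is the worked example from the puzzle itself):

```python
def seat_id(spec: str) -> int:
    # B/R mean "take the upper half" -> bit 1; F/L mean "lower half" -> bit 0.
    # The usual formula row * 8 + col is the same as reading all ten
    # characters as one binary number, because multiplying the row by 8
    # just shifts it left past the three column bits.
    bits = spec.translate(str.maketrans("BFRL", "1010"))
    return int(bits, 2)

print(seat_id("FBFBBFFRLR"))  # 357 — matches the puzzle's worked example
```

So the binary search I wrote by hand was really just decoding a binary number one bit at a time.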
Had to remind myself today not to over-anticipate what part b will ask me to do. More often than not part b raises the complexity of part a, and what I guessed was coming next isn’t it.
Keeping it simple is an important principle in systems design!
Alright, so I don’t understand the answer for part 2. I have to think about this one a bit more. It’s only a few lines long …
- Revisit day 13, part b
Lots to think about here: the way I think about organizing logic, process management, deferred code execution, …
One of my favourite tech talks I found in 2020
Really great talk about training yourself to focus on the things that matter and let go of everything else. This kind of thinking actually helps you be more effective.
Starting a bit of a focused read this week into ETL pipelines – what they are, the abstractions, common practice around organizing logic, etc – as the topic is starting to come up at work. (We have recently hired a business analyst / revenue management type person who’s keen on being able to ask ad hoc, deep questions about our users and how they’re using the system.)
ETL stands for extract, transform, load:
- Extract: the first step in a pipeline is always collection. It can work at different granularities (individual events, or batches) and involves ingesting data up front (and possibly filtering it) for use by later stages
- Transform: Sometimes data needs to be cleaned up, enriched, de-duped, etc. before it is delivered to a final destination (which may itself be only the beginning of another pipeline!). An example: taking a record formatted for the origin system and converting it into something the destination system can consume
- Load: Concerned with delivery to systems where data will rest either for analysis or to become input into downstream pipelines
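To make the three stages concrete, here’s a toy end-to-end sketch in Python. The CSV shape, field names, and table are all made up for illustration:

```python
import csv
import io
import sqlite3

def extract(raw_csv: str) -> list[dict]:
    # Extract: ingest records from the source format (here, a CSV string).
    return list(csv.DictReader(io.StringIO(raw_csv)))

def transform(rows: list[dict]) -> list[tuple]:
    # Transform: clean up (strip whitespace, normalize case) and de-dupe.
    seen, out = set(), []
    for row in rows:
        email = row["email"].strip().lower()
        if email not in seen:
            seen.add(email)
            out.append((email, int(row["logins"])))
    return out

def load(records: list[tuple], conn: sqlite3.Connection) -> None:
    # Load: deliver to a destination where analysts can query it.
    conn.execute("CREATE TABLE IF NOT EXISTS users (email TEXT, logins INT)")
    conn.executemany("INSERT INTO users VALUES (?, ?)", records)

raw = "email,logins\nA@x.com,3\na@x.com ,5\nb@y.com,1\n"
conn = sqlite3.connect(":memory:")
load(transform(extract(raw)), conn)
print(conn.execute("SELECT COUNT(*) FROM users").fetchone()[0])  # 2 rows after de-duping
```

Real pipelines are obviously messier (partitioning, retries, idempotency), but the shape is the same.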
extract // transform // load // data partitioning // ingest // enrich // airflow // batch // event driven // event stream // idempotent // workflow engine // real-time //
- Airflow’s creator talks about it like it’s a workflow engine
- ETL type processing
- Data warehousing: data that is clean and well structured, to support ad hoc querying
- This is an apache project
- It’s a python application
- https://medium.com/@rchang/a-beginners-guide-to-data-engineering-part-ii-47c4e7cbda71. Part 1 was good too; it’s linked from this second article in the series.
- When you’re small, direct integrations to specific data sinks / destinations are fine
- Real-time can be more complex than batch ETL pipelines (but is it really?)
- One consideration here: what happens if / when the transactional db and the data warehouse db (destination) drift …
We’re heavy users of Spring at work, and the more time I spend with it, the more I appreciate everything it’s doing for me. It’s been around for many years and represents the combined experience of thousands of developers building distributed apps!
SpringOne was a virtual event this year because of the times we live in (Covid 😢), and a bunch of videos came out of it that I’m going to start watching …
- What is Spring? is a great starter talk for the conference. Historical context for the framework which is super important for me because I’m coming back to Java and Spring after a few years away from the ecosystem
- Spring Framework: inversion of control, dependency injection, MVC, testing. The framework is the foundation everything else builds on. It’s an integration tool that lets us compose and combine disparate technologies reasonably in code
- Security, batch, integrations, data are the other bedrock components
- Boot came a bit later and combines all these pieces tastefully with sensible defaults and world class capabilities
- Start every project at start.spring.io
- Batching for the Modern Enterprise
- Batch computing: “working on finite amount of data without interaction or interruption”
- Jargon: job, step, tasklet, chunk
- Jobs are a series of steps. Each looks like: read some data -> process it -> then write something
- Scaling: He talked about 5 different methods … I can only remember a few 😀: threads, partitioning, chunking, …
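This isn’t Spring Batch itself, but here’s a rough Python sketch of the chunk-oriented step idea as I understand it (reader/processor/writer names borrowed from the jargon above):

```python
def run_chunked_step(reader, processor, writer, chunk_size=3):
    # Read items one at a time, process each, and write them out in
    # chunks — in Spring Batch, each chunk is one transaction, so a
    # failure only rolls back the current chunk.
    chunk = []
    for item in reader:
        chunk.append(processor(item))
        if len(chunk) == chunk_size:
            writer(chunk)
            chunk = []
    if chunk:
        writer(chunk)  # flush the final partial chunk

written = []
run_chunked_step(
    reader=iter(range(7)),
    processor=lambda x: x * x,
    writer=written.append,
)
print(written)  # [[0, 1, 4], [9, 16, 25], [36]]
```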
- Spring Boot Observability
- I really need to look at this more closely. Such observability power with very little energy from a dev
- Including the Spring Boot Actuator dependency lights up capabilities around standard JVM usage metrics, JMX things if present, and a framework for custom metrics (timers, counters, gauges)
- Spring Security Patterns
- Very good primer for using the spring security module
- We kept coming back to security by default
- Java-based configuration of Spring Security looks great. You can create a UserDetailsService to identify users and help establish sessions. You can also leverage a built-in capability to run a resource server
- The url space of an application can be secured from a central place in code (SecurityConfig)
- Basic auth (username, password) is intentionally slow and expensive: a password encoder is designed to take a lot of compute and time, which isn’t great for an API server under load. Using an auth token (OAuth2?) gives you the same security but is faster to verify
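A rough illustration of that cost difference using only Python’s stdlib — PBKDF2 standing in for the password encoder, a plain HMAC standing in for token verification (the iteration count and secret are arbitrary for the example):

```python
import hashlib
import hmac
import secrets
import time

SECRET = secrets.token_bytes(32)  # made-up signing key for the example

def verify_password(password: str, salt: bytes, stored: bytes) -> bool:
    # PBKDF2 with a high iteration count: deliberately expensive so that
    # brute-forcing stolen hashes is costly. A bad fit per-request.
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)
    return hmac.compare_digest(candidate, stored)

def sign_token(payload: bytes) -> bytes:
    return hmac.new(SECRET, payload, "sha256").digest()

def verify_token(payload: bytes, sig: bytes) -> bool:
    # A single HMAC: microseconds, and still tamper-proof for the session.
    return hmac.compare_digest(sign_token(payload), sig)

salt = secrets.token_bytes(16)
stored = hashlib.pbkdf2_hmac("sha256", b"hunter2", salt, 200_000)

t0 = time.perf_counter(); verify_password("hunter2", salt, stored); slow = time.perf_counter() - t0
t0 = time.perf_counter(); verify_token(b"user=42", sign_token(b"user=42")); fast = time.perf_counter() - t0
print(slow > fast)  # the password check is orders of magnitude slower
```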
- A Deep Dive into Spring Application Events
- Spring has a built in way to publish events in business logic with event registration. Super flexible.
- Events can be produced and consumed in process as needed to help with maintaining bounded contexts and a loosely coupled architecture
- Events can also be sent outside the origin process, to a message queue or some such
- Really neat stuff!
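This isn’t Spring’s ApplicationEventPublisher, but a tiny sketch of the in-process pub/sub shape the talk described (the event and handler names are made up):

```python
from collections import defaultdict

class EventBus:
    # Business logic publishes an event; any number of listeners react,
    # without the publisher knowing who they are — loose coupling.
    def __init__(self):
        self._listeners = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._listeners[event_type].append(handler)

    def publish(self, event):
        for handler in self._listeners[type(event)]:
            handler(event)

class OrderPlaced:  # hypothetical domain event for the example
    def __init__(self, order_id):
        self.order_id = order_id

bus = EventBus()
log = []
bus.subscribe(OrderPlaced, lambda e: log.append(f"email for {e.order_id}"))
bus.subscribe(OrderPlaced, lambda e: log.append(f"metrics for {e.order_id}"))

bus.publish(OrderPlaced("A-1"))
print(log)  # ['email for A-1', 'metrics for A-1']
```

The neat part is the same publish call could just as easily fan out to a message queue instead of in-process handlers.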
- Security Patterns for Microservice Architectures
- List of things we should be doing in our services
- Dependency scanning, openid connect for authentication / authorization, secrets handling, secure coding practices
- Book: Secure by Design has a few chapters worth a skim
- 12 factor, cloud based design techniques
- DDD and object immutability show up
- Failure handling looks nice
- Light, fun romp through several topics with pointers for going deeper