The Dagster Almanack: From Complexity to Composability
Summary
Simon Späti presents a comprehensive 'almanack' of Dagster, tracing its evolution from a developer-friendly ETL framework in 2018 to a full open data platform. He argues that Dagster's key innovations (data-aware orchestration, the shift from task-based DAGs to declarative data assets, and a composable architecture) address the inherent complexity of enterprise data systems. Drawing on Rich Hickey's 'Simple Made Easy' philosophy, he frames composability as the antidote to that complexity. The post positions Dagster as the right abstraction layer for building open data platforms, with its control plane unifying metadata, lineage, and observability across heterogeneous systems.
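To make the asset-based idea concrete, here is a minimal plain-Python sketch of declaring data assets and swapping the storage resource through configuration alone. It is an illustration of the concept, not Dagster's API or implementation: the `asset` decorator, `materialize` function, `InMemoryStorage`, and `LoggingStorage` below are all hypothetical stand-ins written for this example.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

# Hypothetical registry: asset name -> (upstream asset names, compute function).
# This only mimics the declarative idea; it is not how Dagster stores assets.
ASSETS: Dict[str, Tuple[Tuple[str, ...], Callable]] = {}


def asset(deps: Tuple[str, ...] = ()) -> Callable:
    """Declare a data asset and the upstream assets it is derived from."""
    def register(fn: Callable) -> Callable:
        ASSETS[fn.__name__] = (tuple(deps), fn)
        return fn
    return register


class InMemoryStorage:
    """Hypothetical storage resource that keeps asset values in a dict."""
    def __init__(self) -> None:
        self.data: Dict[str, object] = {}

    def save(self, key: str, value: object) -> None:
        self.data[key] = value

    def load(self, key: str) -> object:
        return self.data[key]


class LoggingStorage(InMemoryStorage):
    """A second backend that also records write order, to show swapping."""
    def __init__(self) -> None:
        super().__init__()
        self.writes: List[str] = []

    def save(self, key: str, value: object) -> None:
        self.writes.append(key)
        super().save(key, value)


@asset()
def raw_orders() -> List[dict]:
    # Made-up sample rows, for illustration only.
    return [{"id": 1, "amount": 20}, {"id": 2, "amount": 35}]


@asset(deps=("raw_orders",))
def order_total(raw_orders: List[dict]) -> int:
    return sum(row["amount"] for row in raw_orders)


def materialize(name: str, storage: InMemoryStorage,
                done: Optional[Set[str]] = None) -> None:
    """Compute an asset after its upstream assets, persisting via `storage`."""
    done = set() if done is None else done
    if name in done:
        return
    deps, fn = ASSETS[name]
    for dep in deps:
        materialize(dep, storage, done)
    inputs = [storage.load(dep) for dep in deps]
    storage.save(name, fn(*inputs))
    done.add(name)


# The asset definitions never mention storage; only this call site chooses it.
memory = InMemoryStorage()
materialize("order_total", memory)

logged = LoggingStorage()
materialize("order_total", logged)
```

The asset functions describe what each dataset is and what it depends on, while the storage backend is passed in from outside. Switching from `InMemoryStorage` to `LoggingStorage` changes no asset code, which is the decoupling the post attributes to resources.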
Key Insight
Dagster's evolution from ETL orchestrator to open data platform demonstrates that composability and declarative asset-based thinking are the keys to taming enterprise data complexity.
Spicy Quotes
- 2
Heterogeneous data complexity is a fact of the enterprise data lifecycle.
- 4
It's not code but data pipelines and DAGs, but what everyone cares about are their outcomes: the data assets.
- 3
Composable is what makes systems simpler: the ability to assemble, reassemble, and swap individual components into a flexible whole.
- 4
Resources decouple storage from compute — both are interchangeable without changing pipeline logic, by pure configuration — that's the beauty of declarative data systems.
- 5
State is never simple. Unfortunately for us, data engineering is all state: every datum is tied to a timestamp of when it was created, processed, or backfilled.
- 3
It's the abstraction layer for data engineering to solve hard business problems, an open data platform with opinionated design decisions that compound the longer you build on them.
- 3
The data platform layer is the priority, so AI can build on top of a great foundation.
Tone
enthusiastic, reflective, technical
