The Saga of Sagas
The Oxide control plane coordinates multiple services to carry out complex, compound operations. Early on, we knew we wanted to provide a robust structure for these multi-part workflows. We stumbled onto Distributed Sagas and built our own implementation in Steno. Bryan and Adam are joined by several members of the Oxide team who built and use Steno to drive the complex operation of the control plane.
In addition to Bryan Cantrill and Adam Leventhal, speakers included Dave Pacheco, Eliza Weisman, Andrew Stone, Greg Colombo, and James MacMahon.
Some of the topics we hit on, in the order that we hit them:
- Distributed Sagas: A Protocol for Coordinating Microservices - Caitie McCaffrey
- Oxide RFD 107: Workflows Engine
- Steno
- chat: "the trouble with other people's workflow engines, somehow with all the yaml in the world they're never quite extensible enough"
- Not our first bit of background noise on OxF (trombone)
- SAGAS paper
- chat: "when i hear sagas i think 'transaction semantics enforced at the application layer' and when i hear workflow i hear 'a dsl that doesn't have a for loop'"
- Automated saga testing
- Oxide RFD 289: Steno Upgrade
- Feral Concurrency Control paper from Berkeley and the University of Sydney
- Eliza's PR
- Steno's description of its divergence from Distributed Sagas
- AWS "constant work" blog
- chat: "Now, migrate the owl."
- OxF on formal methods
- A complex bug with sagas: "tl;dr there's TWENTY steps in 5042 that leads to an accounting bug"
- Oxide RFD 373: Reliable Persistent Workflows
- Eliza's novella on updating an instance
If we got something wrong or missed something, please file a PR! Our next show will likely be on Monday at 5p Pacific Time on our Discord server; stay tuned to our Mastodon feeds for details, or subscribe to this calendar. We'd love to have you join us, as we always love to hear from new speakers!