Moritz Hoffmann
@antiguru.bsky.social
Computer scientist at @materialize.com. Systems, databases, dataflows. Playing the bass guitar.
We also added a new way to chain multiple flat-map calls into a single Timely operator. Coming from Rust's iterator API, it's sometimes not obvious that Timely's API has a different cost structure (each transform renders an operator). Here, a builder can mitigate some of the cost.
Introduce builder for flatmap operators by frankmcsherry · Pull Request #704 · TimelyDataflow/timely-dataflow
We discussed this a year back, but I couldn't recall why we didn't move it forward. I copy/pasted the code, and dressed up the documentation and examples. Worth taking a peek!
github.com
September 15, 2025 at 7:44 PM
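A rough sketch of the cost difference described in this post, in plain Rust without the timely crate. The names `staged` and `fused` are hypothetical and are not the builder from the pull request. `staged` stands in for one rendered operator that materializes its output before the next one runs, while `fused` composes two flat-map closures so the data is traversed by a single stage.

```rust
/// Hypothetical stand-in for one rendered operator: it materializes its
/// whole output before any later stage can run.
fn staged<T, U, I>(input: Vec<T>, logic: impl Fn(T) -> I) -> Vec<U>
where
    I: IntoIterator<Item = U>,
{
    input.into_iter().flat_map(logic).collect()
}

/// Hypothetical fused variant: both closures run inside one stage, so no
/// intermediate Vec<U> is built between them.
fn fused<T, U, V, I, J>(
    input: Vec<T>,
    first: impl Fn(T) -> I,
    second: impl Fn(U) -> J,
) -> Vec<V>
where
    I: IntoIterator<Item = U>,
    J: IntoIterator<Item = V>,
{
    // Borrow the closures so the inner iterator can hold a copyable
    // reference to `second` for each outer record.
    let (first, second) = (&first, &second);
    let out: Vec<V> = input
        .into_iter()
        .flat_map(move |t| first(t).into_iter().flat_map(second))
        .collect();
    out
}
```

Calling `staged` twice and calling `fused` once produce the same records. The difference is only in how many intermediate collections (in Timely terms, operators) sit between input and output.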
I think these changes are a great example of taking existing APIs and moving them from row-first to batch-of-data processing. Computers generally seem better at handling data by column rather than by row, and these changes allow Timely programs to represent data in their own form.
September 15, 2025 at 7:44 PM
A Distributor trait takes the concepts of Exchange, but moves away from row-by-row processing, opening up new kinds of containers that aren't row-first. For example, what is a row in a trie-structured container? The new trait allows partitioning by whatever the user wants, not just by a "row".
Distributor trait by antiguru · Pull Request #700 · TimelyDataflow/timely-dataflow
Introduce a distributor that knows how to partition containers across multiple pushers. This moves the partition logic from container builders into a bespoke trait and implementation instead of mix...
github.com
September 15, 2025 at 7:44 PM
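To make the idea of partitioning a whole container concrete, here is a minimal sketch in plain Rust. The trait `Distribute`, the type `Columns`, and `ByKeyModulo` are all hypothetical; the actual trait in the pull request may look quite different. The point is only that the implementation receives a full container and decides for itself how to split it.

```rust
/// Hypothetical columnar container: two parallel columns instead of a
/// Vec of (key, value) rows.
#[derive(Debug, Default, Clone, PartialEq)]
struct Columns {
    keys: Vec<u64>,
    vals: Vec<String>,
}

/// Hypothetical stand-in for a distributor: it is handed a whole
/// container and one output container per target.
trait Distribute<C> {
    fn distribute(&mut self, container: C, outputs: &mut [C]);
}

/// Routes each entry to the output chosen by `key % number of outputs`.
struct ByKeyModulo;

impl Distribute<Columns> for ByKeyModulo {
    fn distribute(&mut self, container: Columns, outputs: &mut [Columns]) {
        if outputs.is_empty() {
            return; // nothing to route to
        }
        let peers = outputs.len() as u64;
        // The container chooses how to walk itself; here it zips its two
        // columns, but a trie-shaped container could split by subtree.
        for (key, val) in container.keys.into_iter().zip(container.vals) {
            let target = (key % peers) as usize;
            outputs[target].keys.push(key);
            outputs[target].vals.push(val);
        }
    }
}
```

The caller never sees a "row" type. It only hands over a container and a slice of outputs.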
Previously, the Container trait described behavior that is now separate: Accountable reveals a container's record count for progress tracking; nothing else. For higher-level APIs, IterContainer and DrainContainer reveal their contents (non-)destructively, for container-agnostic row-by-row operators.
Rework container abstractions by antiguru · Pull Request #697 · TimelyDataflow/timely-dataflow
This is another attempt at reworking the container abstractions in Timely. At the moment, we have two types that a lot of the container abstractions hinge on, Container and ContainerBuilder. Both d...
github.com
September 15, 2025 at 7:44 PM
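The separation of roles described in this post can be sketched with three small local traits. These are simplified, hypothetical stand-ins (`CountRecords`, `IterRows`, `DrainRows`) that mirror the roles of Accountable, IterContainer and DrainContainer; they are not the signatures from the timely crate. `Encoded` is likewise a made-up container that can be counted but has no row-by-row view.

```rust
/// Role of Accountable: report a record count, nothing else.
trait CountRecords {
    fn record_count(&self) -> i64;
}

/// Role of IterContainer: non-destructive, row-by-row access.
trait IterRows {
    type Row;
    fn for_each_row(&self, f: impl FnMut(&Self::Row));
}

/// Role of DrainContainer: destructive, row-by-row access.
trait DrainRows {
    type Row;
    fn drain_rows(&mut self, f: impl FnMut(Self::Row));
}

impl<T> CountRecords for Vec<T> {
    fn record_count(&self) -> i64 { self.len() as i64 }
}
impl<T> IterRows for Vec<T> {
    type Row = T;
    fn for_each_row(&self, f: impl FnMut(&T)) { self.iter().for_each(f) }
}
impl<T> DrainRows for Vec<T> {
    type Row = T;
    fn drain_rows(&mut self, f: impl FnMut(T)) { self.drain(..).for_each(f) }
}

/// Hypothetical opaque container: countable, but with no notion of rows.
struct Encoded {
    bytes: Vec<u8>,
    records: i64,
}
impl CountRecords for Encoded {
    fn record_count(&self) -> i64 { self.records }
}

/// Progress tracking needs only the count.
fn progress_only<C: CountRecords>(c: &C) -> i64 { c.record_count() }

/// A container-agnostic, row-by-row operator needs the iteration role.
fn sum_lengths<C: IterRows<Row = String>>(c: &C) -> usize {
    let mut total = 0;
    c.for_each_row(|row| total += row.len());
    total
}

/// Draining empties the container and hands out owned rows.
fn take_all<C: DrainRows>(c: &mut C) -> Vec<C::Row> {
    let mut out = Vec::new();
    c.drain_rows(|row| out.push(row));
    out
}
```

`Encoded` works with `progress_only` but deliberately cannot be passed to `sum_lengths` or `take_all`, which is the kind of container the narrower counting trait leaves room for.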
Jan '24: 1,338 kWh. That was a newly-built apartment in NYC. I don't miss this part!
March 11, 2025 at 3:40 PM
Check out namespace.so
namespace.so
December 4, 2024 at 11:17 AM
It'll certainly be an opt-in feature in its current form. Forcing a certain deployment pattern will be hard. Otoh, we could automate graceful replica restarts!
November 28, 2024 at 1:40 AM
So, we didn't solve the underlying problem, but mitigated it for many of our workloads.
November 27, 2024 at 6:22 PM
Now, in Materialize, time advances about every second, and the result is high CPU utilization 😿 But: we restart most things every week, so updates beyond a week will never be surfaced! The mitigation we implemented is to drop such updates, and add some safeguards to never produce incorrect results. 🎉
November 27, 2024 at 6:22 PM
Future updates are stashed at the input to indexes, and the indexes only act on them once the time has advanced far enough. Because we use Differential Dataflow, the updates are only partially ordered. To extract ready updates, DD scans the data and splits it into ready and pending, in roughly linear time.
November 27, 2024 at 6:22 PM
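A minimal sketch of the ready/pending split described in this post, in plain Rust. It assumes times are pairs under the product partial order and that the frontier is a set of times that may still arrive; the names `Time`, `less_equal` and `split_ready` are hypothetical, and this is not Differential Dataflow's actual implementation. It shows why the extraction is a scan: with only a partial order, no single sort key separates ready from pending.

```rust
/// Hypothetical partially ordered timestamp: pairs under the product order.
type Time = (u64, u64);

/// (a0, a1) <= (b0, b1) only if both coordinates agree; many pairs are
/// incomparable, e.g. (2, 0) and (0, 3).
fn less_equal(a: &Time, b: &Time) -> bool {
    a.0 <= b.0 && a.1 <= b.1
}

/// Splits stashed updates into (ready, pending) in one pass.
/// An update is pending while some frontier element is <= its time,
/// because data at that time could still arrive.
fn split_ready<D>(
    stash: Vec<(D, Time)>,
    frontier: &[Time],
) -> (Vec<(D, Time)>, Vec<(D, Time)>) {
    stash
        .into_iter()
        .partition(|(_, time)| !frontier.iter().any(|f| less_equal(f, time)))
}
```

Each call visits every stashed update once (times the frontier size), so if the frontier advances every second and far-future updates stay stashed, the same pending updates get rescanned on every advance.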