Lightnews — Scholar-powered news

Abhay Bothra

@swe.dev

1K followers 590 following 12 posts

Co-founder/CTO @fennel.ai / Databases #DataBS / Distributed Systems / Infrastructure. @bothra90 on Twitter.


Abhay Bothra

@swe.dev

King to c7?

December 20, 2024 at 4:22 PM

Abhay Bothra

@swe.dev

Caveat: Some of these could be unique to Fennel’s architecture because of our reliance on Kafka for exactly-once semantics and recovery

December 6, 2024 at 5:43 AM

Abhay Bothra

@swe.dev

Why use large batches at all? To amortize the cost of Kafka transactions, which we rely on for exactly-once semantics.

December 6, 2024 at 5:43 AM
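
To make the amortization argument concrete, here is a back-of-the-envelope sketch. The overhead figures are hypothetical placeholders, not Fennel measurements; the only point is that a fixed per-transaction cost spread over a larger batch shrinks the per-record cost.

```python
def per_record_cost_us(batch_size: int,
                       txn_overhead_us: float = 10_000.0,  # hypothetical fixed cost per transaction
                       record_cost_us: float = 10.0) -> float:  # hypothetical cost per record
    """Average cost per record when one transaction commits a whole batch."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # The fixed transaction overhead is shared by every record in the batch.
    return txn_overhead_us / batch_size + record_cost_us


if __name__ == "__main__":
    for size in (1, 10, 100, 1000):
        print(size, per_record_cost_us(size))
```

Under these assumed numbers, committing one record per transaction is dominated by the transaction overhead, while a batch of 1000 brings the per-record cost close to the processing cost alone.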

Abhay Bothra

@swe.dev

The latter also keeps memory utilization proportional to mini-batch size.

December 6, 2024 at 5:43 AM

Abhay Bothra

@swe.dev

We got around that by internally sharding each batch of records and processing the sub-shards in parallel.
We also break our batches down into mini-batches so that the output of the chain can be streamed to Kafka without waiting for the full batch to finish executing.

December 6, 2024 at 5:43 AM
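
A minimal sketch of the idea in this post, not Fennel's actual implementation: a batch is cut into mini-batches, each mini-batch is split into sub-shards that run through the operator chain in parallel, and results are emitted as soon as a mini-batch finishes. All names here (`mini_batches`, `shard`, `run_chain`, `emit`) are hypothetical, records are plain integers, and sharding by modulo stands in for whatever key-based partitioning the real system uses.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def mini_batches(records: Iterable[int], size: int) -> Iterator[List[int]]:
    """Yield consecutive mini-batches of at most `size` records."""
    it = iter(records)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def shard(records: List[int], num_shards: int) -> List[List[int]]:
    """Split records into sub-shards (assumption: shard by value modulo)."""
    shards: List[List[int]] = [[] for _ in range(num_shards)]
    for r in records:
        shards[r % num_shards].append(r)
    return shards


def run_chain(records: Iterable[int],
              chain: Callable[[int], int],
              mini_batch_size: int,
              num_shards: int,
              emit: Callable[[List[int]], None]) -> None:
    """Process a batch mini-batch by mini-batch, sub-shards in parallel."""

    def apply_chain(sub_shard: List[int]) -> List[int]:
        return [chain(r) for r in sub_shard]

    with ThreadPoolExecutor(max_workers=num_shards) as pool:
        for mb in mini_batches(records, mini_batch_size):
            # Only one mini-batch is held in memory at a time.
            for out in pool.map(apply_chain, shard(mb, num_shards)):
                if out:
                    emit(out)  # stand-in for streaming output to Kafka


if __name__ == "__main__":
    run_chain(range(10), lambda x: x * 2, 4, 2, print)
```

Because `emit` is called after each mini-batch rather than after the whole batch, downstream output starts flowing early and the working set stays bounded by the mini-batch size.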

Abhay Bothra

@swe.dev

Cons: This architecture prevents concurrent, fully asynchronous operation of all operators, because each batch has to be processed in full by the operator chain before the next batch can start. That, in turn, was preventing us from running at full throttle even when CPU capacity was available.

December 6, 2024 at 5:43 AM

Abhay Bothra

@swe.dev

In hindsight, what would the right API for this look like?

November 27, 2024 at 8:29 PM

Abhay Bothra

@swe.dev

Yes, I think they do this so that the ‘a’ region doesn’t become a hotspot. Was definitely surprising when I found out, but ultimately made sense.

November 27, 2024 at 7:44 PM

Abhay Bothra

@swe.dev

It occupies a very interesting point in the design space of caches, but the fact that you can’t immediately read your writes is a problem that you still need to design for. I wonder if that is its undoing.
@jonhoo.eu might have more thoughts on this.

November 20, 2024 at 4:51 PM

Abhay Bothra

@swe.dev

That was their implementation of Noria?

November 20, 2024 at 8:10 AM

Abhay Bothra

@swe.dev

We’ve built an IVM engine at Fennel that allows Python UDFs by leveraging a fleet of Python workers for execution while keeping the other operators in Rust. Hope to write a lot more about the technical details soon. One problem that we’ve had to solve is providing IVM with time travel.

November 20, 2024 at 7:59 AM
