flaneur2024.bsky.social
@flaneur2024.bsky.social
maintainer of SlateDB
loves Rust, Datasys, Cloud Infra, AI
https://flaneur2020.github.io
Reposted
The largest egocentric dataset.

Egocentric (first person) video is a general learning framework that passively captures how skilled workers do their jobs.

- 10,000 hours
- 2,153 factory workers
- 1,080,000,000 frames
November 10, 2025 at 11:54 PM
i'm beginning to understand the appeal of HCL as a configuration format over YAML. HCL has built-in variables, effectively providing a native templating engine. with YAML, we often have to reach for macros in some Jinja variant just to battle the indentation and get the number of spaces right. 😲
November 11, 2025 at 3:31 AM
in my earlier understanding, distributed systems usually featured a fine-grained metadata service for cluster membership. however, I've recently noticed that many systems' implementations seem to prefer a fixed cluster design. 🤔 once the cluster is established, its membership becomes immutable.
November 9, 2025 at 1:18 PM
i believe household chores should be considered internal domestic logistics. for instance, handling the movement of clothing between the washing machine, drying space, and wardrobe. or dishes moving between the dining table, dishwasher, and cupboard.. 🤔
November 8, 2025 at 3:12 PM
Reposted
For everyone interested in data infra, want to get a quick sense of how big data works, how data systems are designed, and what the tradeoffs are, start with this share from @xiangpeng.systems, really nice intro!

intro-data-system.xiangpeng.systems
October 29, 2025 at 5:01 PM
I do understand why so many articles talk about how vibe coding can destroy the joy of programming.

however, I've never found any joy in manually writing GitHub/Jenkins workflows. 🤔
October 29, 2025 at 9:35 AM
Reposted
100% correct post
I gave a talk last night about "Living dangerously with Claude", on the joys and perils of --dangerously-skip-permissions and how critical it is that we run coding agents in a sandbox so that we can unlock their full potential simonwillison.net/2025/Oct/22/...
Living dangerously with Claude
I gave a talk last night at Claude Code Anonymous in San Francisco, the unofficial meetup for coding agent enthusiasts. I decided to talk about a dichotomy I’ve been struggling …
simonwillison.net
October 23, 2025 at 12:01 AM
!!!
At long last, @chris.blue and I have submitted the final manuscript of Designing Data-Intensive Applications, second edition, to the publisher. There is always more that could be improved but at some point we just have to call it done. Now it goes into production; probably shipping in ~4 months.
October 22, 2025 at 11:34 AM
Reposted
Calling database nerds in SF! I'm covering SlateDB at the systems meetup next Wednesday (10/29). If you're around, I'd love to meet you in person (that way you'll have proof I'm not just an AI bot).

👉 luma.com/e7feg2i6
October 21, 2025 at 3:51 PM
Reposted
AWS seems to have a major outage in the us-east-1 region right now 😵‍💫 health.aws.amazon.com/health/status
View the overall status and health of AWS services using the AWS Health Dashboard.
health.aws.amazon.com
October 20, 2025 at 7:26 AM
modern LLM inference engines like vLLM & SGLang are becoming tough to dive into. to learn how these inference engines work, nano-vllm is a fantastic educational project: complete PagedAttention & LLM scheduler in <1k loc. 🤯
flaneur2020.github.io/posts/2025-1...
A Walkthrough of nano-vllm | Flaneur2020
Recently, I've been delving into the architecture of production-grade inference engines. While projects like vLLM and SGLang are crazy sophisticated, …
flaneur2020.github.io
October 12, 2025 at 3:43 PM
recently Qwen3-Next-80B-A3B has become my go-to model for asking "how-to" questions at work. it's not considered a smart model but it's really dog fast 😲
October 9, 2025 at 3:49 AM
what an amazing journey! after working through this step by step, we finally have a functional transaction API with SSI support. hope it proves useful! 😁
October 8, 2025 at 4:01 AM
family mooncake 😋
September 30, 2025 at 2:08 PM
Reposted
He will give practical advice, and concrete criteria to consider, when choosing research projects, and making professional decisions, in these last few years before AGI."

docs.google.com/presentation...
Advice for a young investigator in the first and last days of the Anthropocene
Advice for a (young) investigator in the first and last days of the Anthropocene Jascha Sohl-Dickstein Anthropic Title: Advice for a young investigator in the first and last days of the Anthropocene A...
docs.google.com
September 29, 2025 at 10:02 PM
Reposted
You already know the answer, but it’s nice that someone put in the manual effort to create a benchmark

Can AI file your taxes? No

"TaxCalcBench: Evaluating Frontier Models on the Tax Calculation Task"

Paper: www.arxiv.org/abs/2507.16126
Repo: github.com/column-tax/t...
GitHub - column-tax/tax-calc-bench: Code & data for TaxCalcBench
Code & data for TaxCalcBench. Contribute to column-tax/tax-calc-bench development by creating an account on GitHub.
github.com
September 23, 2025 at 3:46 AM
Reposted
Job update: a couple of weeks ago, I joined @tensorlake.ai full time. I’m having a lot of fun building the product with @diptanu.bsky.social and the rest of this wonderful team.

We have a few open positions if you’d like to work with us: www.linkedin.com/jobs/search/...
September 15, 2025 at 7:29 PM
Reposted
On to new things!

"Ingest, query, and share telemetry data with your engineers and customers at a fraction of the cost."
September 10, 2025 at 4:45 PM
Reposted
Regarding goroutine & unbuffered channel interaction, found myself repeating this multiple times. Maybe it's time to write it down for reference.

Early return + unbuffered send = goroutine leak.

rednafi.com/go/early_ret...

#golang
Early return and goroutine leak
At work, a common mistake I notice when reviewing candidates’ home assignments is how they wire goroutines to channels and then return early. The pattern usually looks like this: start a few goroutin...
rednafi.com
September 7, 2025 at 1:49 PM
Reposted
1/ SlateDB v0.8 is now available! This release includes OpenDAL object store support, serializable snapshot isolation, first-class Go bindings, Python binding improvements, deterministic simulation tests, performance improvements, and tons of bug fixes. Details below. 👇
September 5, 2025 at 7:22 PM
Reposted
2/ Snapshot isolation — @flaneur2024.bsky.social has been hard at work on snapshots/transactions. 0.8 now has `DbSnapshots`, which provide a consistent point-in-time DB view. Sequence numbers are now core to SlateDB and will be used for many features (including transactions) going forward.
September 5, 2025 at 7:22 PM
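To make the "consistent point-in-time view via sequence numbers" idea in the post above concrete, here is a toy Go sketch. It is an assumption-laden illustration, not SlateDB's implementation or API: `Store`, `Snapshot`, `Put`, and `Get` are hypothetical names, and everything lives in memory. Each write gets the next sequence number, a snapshot pins the sequence number current at creation time, and reads through the snapshot ignore anything newer.

```go
package main

import "fmt"

// version is one write to a key, tagged with its commit sequence number.
type version struct {
	seq   uint64
	value string
}

// Store is a toy multi-version map (hypothetical, for illustration only).
type Store struct {
	seq      uint64
	versions map[string][]version // per key, ascending by seq
}

func NewStore() *Store {
	return &Store{versions: map[string][]version{}}
}

// Put assigns the next sequence number to the write and appends a version.
func (s *Store) Put(key, value string) {
	s.seq++
	s.versions[key] = append(s.versions[key], version{seq: s.seq, value: value})
}

// Snapshot pins the sequence number that was current when it was taken.
type Snapshot struct {
	store *Store
	seq   uint64
}

func (s *Store) Snapshot() Snapshot {
	return Snapshot{store: s, seq: s.seq}
}

// Get returns the newest version whose seq is <= the snapshot's seq.
func (sn Snapshot) Get(key string) (string, bool) {
	vs := sn.store.versions[key]
	for i := len(vs) - 1; i >= 0; i-- {
		if vs[i].seq <= sn.seq {
			return vs[i].value, true
		}
	}
	return "", false
}

func main() {
	db := NewStore()
	db.Put("k", "v1")
	snap := db.Snapshot() // pinned before the next write
	db.Put("k", "v2")

	old, _ := snap.Get("k")
	cur, _ := db.Snapshot().Get("k")
	fmt.Println(old, cur) // prints "v1 v2"
}
```

A real engine also has to handle deletes, concurrent writers, and garbage collection of versions no snapshot can see; the sketch leaves all of that out.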
it seems there's no gold-standard benchmark for coding models at the moment. imho, the only way to know if one is any good is to throw a (real world) task at it and see if it can handle it smoothly 🤔.
September 7, 2025 at 12:57 PM
OpenDAL can now use object_store as its backend service 🥳!

this allows users to leverage the out-of-the-box primitives of the OpenDAL operator, such as chunking / parallel fetching and caching, on user-provided object_store instances

github.com/apache/opend...
feat: allow using object_store as opendal's backend by flaneur2020 · Pull Request #6283 · apache/opendal
Which issue does this PR close? Closes #6171. Rationale for this change with allowing object_store as opendal's backend, we can leverage opendal's advanced operation like parallel fetchin...
github.com
August 26, 2025 at 10:59 AM