Lightnews — Scholar-powered news

Burak

@buremba.bsky.social

Soon to be, it looks like: youtu.be/zeonmOO9jm4?...
Otherwise, there is no point in using Parquet instead of their native DuckDB format. I’m glad they didn’t ignore the “industry standards.”

Introducing DuckLake

YouTube video by DuckDB

youtu.be

May 27, 2025 at 3:31 PM

Burak

@buremba.bsky.social

Is there any plan to support compacting data into the data lake when data inlining is used?

May 27, 2025 at 3:30 PM

Burak

@buremba.bsky.social

I was worried about Iceberg being ignored in favor of DuckLake, but it looks like you fixed Iceberg’s biggest problems and still kept compatibility. Super exciting!

May 27, 2025 at 3:20 PM

Burak

@buremba.bsky.social

Turns out the implementation wasn’t WAL-based; instead, they had a new Iceberg-compatible data lake extension. I like the direction they are going!

May 27, 2025 at 3:15 PM

Burak

@buremba.bsky.social

I have this one, but they might have a soon-to-be-public extension that uses the WAL to keep the data in sync with the data lake: github.com/duckdb/duckd...

Implement WALReader by adsharma · Pull Request #17247 · duckdb/duckdb

This could be useful to external replication tools to read WAL records similar to how wal2json (Postgres) and binlog (MySQL) work. Translation to externally consumable format is not included.

github.com

May 21, 2025 at 3:42 PM

Burak

@buremba.bsky.social

That’s a good analogy, might steal it. :) However, when the path to the destination is not clear (which is usually the case, as you need to experiment and iterate anyway), smashing can help you find the destination faster as you learn where not to go.

May 5, 2025 at 4:14 PM

Burak

@buremba.bsky.social

Ironically, the number of stale documents in our company has increased dramatically thanks to LLMs.

April 26, 2025 at 4:21 PM

Burak

@buremba.bsky.social

Oh, I’ve lost count of how much time I’ve wasted trying to infer the column names from random CSV files without a header. This is very handy!

March 17, 2025 at 6:49 PM

Burak

@buremba.bsky.social

Exactly! I think Flight will get more popular over time as it's the most efficient implementation, but this approach can help existing RESTful apps adopt SQL integrations before switching over to gRPC.

February 15, 2025 at 6:54 PM

Burak

@buremba.bsky.social

The main inspirations are github.com/PostgREST/po... and @qxip.bsky.social's DuckDB webmacro extension: duckdb.org/community_ex...

GitHub - PostgREST/postgrest: REST API for any Postgres database

REST API for any Postgres database. Contribute to PostgREST/postgrest development by creating an account on GitHub.

github.com

February 15, 2025 at 6:41 PM

Burak

@buremba.bsky.social

Pretty common, but if one of these languages is the “main” one, it might be more desirable to generate JSON Schema from Pydantic/TS and then generate the models for the other languages from that JSON Schema. It’s more about where you want the source of truth to be.

February 10, 2025 at 8:52 PM

Burak

@buremba.bsky.social

I had the exact same thought.

February 6, 2025 at 6:46 PM

Burak

@buremba.bsky.social

Thanks. I'm also a fan of your creative extensions! Quackpipe was one of the inspirations. :)

January 29, 2025 at 2:04 AM

Burak

@buremba.bsky.social

One here! 🍻

January 28, 2025 at 6:17 PM

Burak

@buremba.bsky.social

I couldn't figure out how to insert a table into an S3 Table without Spark. I tried to use the API, but it requires me to create the files and update the metadata myself. PyIceberg can't write to S3 Tables through its S3 integration yet, so I had to stick with Spark. boto3.amazonaws.com/v1/documenta...

update_table_metadata_location - Boto3 1.35.99 documentation

boto3.amazonaws.com

January 14, 2025 at 10:41 PM

Burak

@buremba.bsky.social

If AWS is serious about S3 Tables, they should support the Iceberg REST Catalog for it. Right now we can only create tables with Spark.

January 14, 2025 at 8:27 PM

Burak

@buremba.bsky.social

Qlik's Upsolver acquisition shows the importance of adopting new technologies as a potential acquisition target for bigger companies. It's a 10-year-old company, and they raised a ton, so I'm not sure how good the deal was for the co-founders.

January 14, 2025 at 5:43 PM
