I wrote a summary of Iceberg integration in GCP a couple of weeks ago:
juhache.substack.com/p/gcp-and-ic...
"immutable workflow + atomic operation" 100%!
So, if you have a 1GB dataset, does that mean you’ll share a single .duckdb file containing the entire dataset? Or a view pointing to Parquet files: CREATE VIEW ... AS SELECT * FROM read_parquet('*.parquet')?
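For illustration, the two distribution options the question contrasts might look like this in DuckDB (table/view names and file paths here are hypothetical):

```sql
-- Option 1: materialize the dataset into a single .duckdb database file,
-- then ship that one file to consumers.
CREATE TABLE events AS
SELECT * FROM read_parquet('data/*.parquet');

-- Option 2: share only a lightweight view; consumers still need
-- access to the underlying Parquet files.
CREATE VIEW events_view AS
SELECT * FROM read_parquet('data/*.parquet');
```

The trade-off is size versus indirection: the .duckdb file is self-contained but duplicates the data, while the view stays tiny but ties readers to the Parquet files' location.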
For me:
• Parquet = Standard storage format
• Iceberg = Standard metadata format
• DuckDB = One possible distribution vector