Lightnews — Scholar-powered news

Joachim Rosskopf

@jrosskopf.bsky.social

81 followers 440 following 10 posts

Benchmarks Show This:

→ @DuckDB beats @Spark for small queries.
→ Even at 700GB, DuckDB (native files) is competitive.
→ Spark scales dynamically for 1TB+ workloads.

Details: https://buff.ly/47UvlMc

🔍 The lesson? If data fits on one node, go single-node for speed. Scale to MPP only when needed.
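A minimal sketch of that rule of thumb, assuming "fits on one node" means the dataset size does not exceed whatever a single machine can comfortably handle. The function name `choose_engine` and the 1000 GB node capacity are hypothetical and do not come from the benchmark:

```python
def choose_engine(dataset_gb: float, node_capacity_gb: float) -> str:
    """Pick an execution style from dataset size alone.

    Returns "single-node" when the dataset fits within the capacity of one
    machine, and "mpp" (a distributed engine) otherwise. Both figures are
    supplied by the caller; nothing here is measured.
    """
    if dataset_gb < 0 or node_capacity_gb <= 0:
        raise ValueError("sizes must be non-negative and capacity positive")
    return "single-node" if dataset_gb <= node_capacity_gb else "mpp"


# Hypothetical node that can handle 1000 GB of data.
print(choose_engine(700, 1000))   # dataset fits on the node
print(choose_engine(2000, 1000))  # dataset exceeds the node
```

Real decisions also depend on query shape, concurrency, and memory versus disk, so treat this as an illustration of the heuristic and not as a sizing tool.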

DataFrames at Scale Comparison: TPC-H

Hendrik Makait, Sarah Johnson, Matthew Rocklin | 2024-05-14 | 14 min read

We run benchmarks derived from the TPC-H benchmark suite on a variety of scales, hardware architectures, and dataframe projects...

buff.ly

December 10, 2024 at 10:51 AM

Why Are Object Stores So Attractive?

1️⃣ Scalability: Handle massive amounts of data.
2️⃣ Flexibility: Open formats like Iceberg for interoperability.
3️⃣ Advanced Features: Replication, immutability, and consistency.

They have become the backbone of modern distributed systems.

December 8, 2024 at 10:51 AM

What Are "One-Way Door" Risks?

❌ One-way doors = irreversible decisions.
In tech: adopting new tools or models without clear exit paths.

December 8, 2024 at 10:51 AM

Curious where the data comes from?
🔗 Snowset (Snowflake's dataset): https://buff.ly/4eULXoQ
🔗 Redset (Redshift's dataset): https://buff.ly/3CScB4x

Both share real-world query samples, packed with insights into how data warehouses are used. Check them out!

GitHub - resource-disaggregation/snowset: Snowflake dataset containing statistics for 70 million queries over 14 day period

github.com

December 4, 2024 at 10:51 AM
