AI agents as execution engines, LLM inference economics, databases for AI, personalization, and product evidence.
Read more 👉 www.dataengineeringw...
#DataEngineering #AI #LLMs
I break down how in Part 2 of my “Revisiting the Medallion Architecture” series.
Here’s my guide:
Airbnb’s next-gen key-value store supports real-time ingestion and bulk uploads with sub-second latency, powering feature stores and fraud detection.
Read the full story here: www.dataengineeringw...
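The post describes two write paths into the same store: low-latency real-time puts and bulk snapshot loads. A hypothetical client interface, purely to make that shape concrete (class and method names are invented, not Airbnb's actual API):

```python
# Hypothetical sketch of a key-value store client with the two write paths
# described in the Airbnb post. Names are invented for illustration only.
from dataclasses import dataclass
from typing import Iterable


@dataclass
class KVRecord:
    key: str
    value: bytes


class KeyValueStoreClient:
    """Illustrative interface: real-time puts, bulk loads, sub-second reads."""

    def put(self, record: KVRecord) -> None:
        # Real-time path: a single low-latency write, e.g. a fraud-detection
        # feature that must be fresh within seconds.
        raise NotImplementedError

    def bulk_load(self, records: Iterable[KVRecord], snapshot_id: str) -> None:
        # Bulk path: load a precomputed snapshot (e.g. a daily feature
        # backfill) without disturbing online read latency.
        raise NotImplementedError

    def get(self, key: str) -> bytes | None:
        # Online read path expected to answer in sub-second time.
        raise NotImplementedError
```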
Real-time partner analytics at scale is tough. Grab uses Apache Pinot, Kafka–Flink ingestion, partitioning, and Star-tree indexing to cut query latency to <300 ms, enabling efficient API monitoring and fast issue resolution.
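Star-tree indexing is the piece doing most of the latency work here. An illustrative fragment of a Pinot table config enabling it; the keys follow Pinot's documented star-tree settings, but the columns (partner_id, api_endpoint, latency_ms) are invented examples, not Grab's schema:

```python
# Illustrative Pinot star-tree index config fragment (columns are made up).
star_tree_fragment = {
    "tableIndexConfig": {
        "starTreeIndexConfigs": [
            {
                # Dimensions the tree pre-aggregates over, most selective first.
                "dimensionsSplitOrder": ["partner_id", "api_endpoint", "status_code"],
                "skipStarNodeCreationForDimensions": [],
                # Aggregations pre-computed and served directly from the tree.
                "functionColumnPairs": ["COUNT__*", "SUM__latency_ms"],
                # Upper bound on records a leaf may scan at query time.
                "maxLeafRecords": 10000,
            }
        ]
    }
}
```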
Netflix evolved its Muse architecture to handle huge datasets efficiently: HyperLogLog sketches, Hollow in-memory feeds, and Druid optimizations cut query latency by ~50% and reduced concurrency load.
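HyperLogLog is the key trick above: a few kilobytes of sketch state replace exact distinct counting over huge datasets. A minimal illustration of the idea using the open-source datasketch library (not Netflix's internal tooling):

```python
# Approximate distinct counting with a HyperLogLog sketch (datasketch library).
from datasketch import HyperLogLog

hll = HyperLogLog(p=12)  # 2**12 registers, roughly 1.6% relative error
for profile_id in ("p1", "p2", "p3", "p2", "p1"):
    hll.update(profile_id.encode("utf8"))

# A few KB of state answers "how many distinct profiles?" instead of
# storing and shuffling every ID.
print(int(hll.count()))  # ~3
```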
“Real-time” has limits—disk, network, and replication delays add up. StreamNative explains latency tiers, common costs, and tuning levers like batching & async processing.
💡 Must-read for data streaming engineers!
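Batching and async sends are the two levers called out above. A sketch with the confluent-kafka client showing where each tradeoff lives; the broker address, topic, and tuning values are placeholders, not recommendations from the article:

```python
# Producer-side batching and async delivery with confluent-kafka.
from confluent_kafka import Producer

producer = Producer({
    "bootstrap.servers": "localhost:9092",
    "linger.ms": 10,       # wait up to 10 ms to fill a batch (adds latency)
    "batch.size": 131072,  # larger batches -> fewer requests, more throughput
    "acks": "all",         # replication acks add latency but protect durability
})

def delivery_report(err, msg):
    if err is not None:
        print(f"delivery failed: {err}")

# Asynchronous produce: the call returns immediately; delivery is confirmed
# later via the callback instead of blocking the producing thread.
producer.produce("events", value=b'{"k": 1}', on_delivery=delivery_report)
producer.flush()
```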
Chris Riccomini argues MCP mostly reinvents OpenAPI, gRPC & CLIs.
Resources = docs
Tools = RPC
Prompts = configs
So… could MCP have just been a JSON file?
💡 More insights: www.dataengineeringw...
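To make the thought experiment concrete, here is a hypothetical sketch of what "just a JSON file" could look like under Riccomini's mapping. None of this is a real MCP or OpenAPI artifact; every name and path is invented:

```python
# Hypothetical declarative manifest: resources as docs, tools as RPC
# references, prompts as config. Illustration only.
import json

hypothetical_manifest = {
    "resources": [
        {"name": "orders_schema", "uri": "docs/orders.md"},              # docs
    ],
    "tools": [
        {"name": "get_order", "rpc": "openapi.yaml#/paths/~1orders~1{id}/get"},  # RPC
    ],
    "prompts": [
        {"name": "triage", "template": "Summarize order {order_id} issues."},    # configs
    ],
}

print(json.dumps(hypothetical_manifest, indent=2))
```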
Subscribe → www.dataengineeringw...
Full story → medium.com/fresha-da...
Snapshots → incremental → stream-native → catalog-first.
Metadata is the bottleneck.
More insights → www.dataengineeringw...
Full story → medium.com/fresha-da...
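The post doesn't name a table format, so as one concrete illustration of snapshot-based incremental reads through a catalog, here is a sketch using Apache Iceberg's Spark incremental read options; the table name and snapshot IDs are invented:

```python
# Incremental read between two Iceberg snapshots instead of re-exporting
# a full copy of the source table.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("incremental-read").getOrCreate()

changes = (
    spark.read.format("iceberg")
    .option("start-snapshot-id", "8924558786060583479")  # exclusive
    .option("end-snapshot-id", "6536733823181975045")    # inclusive
    .load("lakehouse.db.orders")
)
changes.show()
```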
dbt Core → Transform like a champ
Airflow → Orchestrate effortlessly
CI/CD → Deploy instantly
Dev Containers → Standardized dev
📖 Full story → medium.com/blablacar...
💡 More insights → Subscribe to DEW
#DataEngineering #dbt #Airflow #CICD #DevContainers
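A minimal sketch of the pairing above: Airflow orchestrating dbt Core runs and tests. The DAG id, schedule, and project path are placeholders, not details from the BlaBlaCar write-up:

```python
# Airflow DAG that runs and then tests a dbt Core project via shell commands.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="dbt_daily",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    dbt_run = BashOperator(
        task_id="dbt_run",
        bash_command="cd /opt/dbt_project && dbt run --target prod",
    )
    dbt_test = BashOperator(
        task_id="dbt_test",
        bash_command="cd /opt/dbt_project && dbt test --target prod",
    )
    dbt_run >> dbt_test
```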
AI-ready data is:
Unified
Real-time
Human-verified
Governed
Without it, AI can confidently fail. With it? Reliable, scalable results.
📖 Read More
💡 More insights → Data Engineering Weekly
#AI #AIReady #DataEngineering
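Those four properties are abstract; one way to make them operational is a per-record gate in the pipeline. A hedged sketch with invented field names and thresholds:

```python
# Toy per-record "AI-ready" gate: unified, fresh, human-verified, governed.
from datetime import datetime, timedelta, timezone

def is_ai_ready(record: dict) -> bool:
    unified = record.get("customer_id") is not None             # joined to one entity
    # record["updated_at"] is assumed to be a timezone-aware datetime
    fresh = datetime.now(timezone.utc) - record["updated_at"] < timedelta(minutes=5)
    verified = record.get("reviewed_by") is not None            # human sign-off
    governed = record.get("pii_classification") in {"none", "masked"}
    return unified and fresh and verified and governed
```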
Stripe wanted real-time visibility into subscriptions.
Traditional batch systems weren’t fast enough. ⏱️
They built a pipeline using Flink, Spark, and Pinot v2.
Now, analytics arrive in minutes, not hours. Queries return in <300 ms. 🚀
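A sketch of what an interactive query against such a Pinot-backed pipeline might look like, using the open-source pinotdb driver; the host, table, and column names are invented, not Stripe's schema:

```python
# Interactive aggregation against a Pinot broker via the pinotdb driver.
from pinotdb import connect

conn = connect(host="localhost", port=8099, path="/query/sql", scheme="http")
cursor = conn.cursor()
cursor.execute(
    """
    SELECT plan_id, COUNT(*) AS active_subscriptions
    FROM subscription_events
    WHERE event_time > ago('PT1H')   -- assumes event_time is epoch millis
    GROUP BY plan_id
    ORDER BY active_subscriptions DESC
    LIMIT 10
    """
)
for row in cursor:
    print(row)
```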
Read more:
www.dataengineeringw...
Reference:
www.dataengineeringw...
www.warpstream.com/b...
Databricks engineers used Databricks itself to improve database reliability:
Query/Schema Scorer in CI pipelines
Delta Tables + DLT pipelines
Database Usage Scorecard across thousands of DBs
Efficiency ✅ Anti-patterns ❌
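The write-up doesn't publish the scorer's rules, so this is a hedged, made-up sketch of the general idea: score SQL for anti-patterns in CI and fail the build below a threshold:

```python
# Toy CI query scorer: flag common SQL anti-patterns and gate the build.
import re
import sys

ANTI_PATTERNS = {
    r"select\s+\*": "SELECT * pulls unneeded columns",
    r"delete\s+from\s+\w+\s*;": "unbounded DELETE without a WHERE clause",
    r"like\s+'%": "leading-wildcard LIKE defeats index usage",
}

def score_query(sql: str) -> tuple[int, list[str]]:
    findings = [msg for pat, msg in ANTI_PATTERNS.items()
                if re.search(pat, sql, flags=re.IGNORECASE)]
    return 100 - 25 * len(findings), findings

if __name__ == "__main__":
    sql_text = open(sys.argv[1]).read()
    score, findings = score_query(sql_text)
    print(f"score={score}", *findings, sep="\n")
    sys.exit(0 if score >= 75 else 1)  # fail the CI job on a low score
```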