Hasan Geren
hgeren.bsky.social
Data Engineer 🧑🏻‍💻 Stream Processing Researcher 🔬 Nerd 🤓 Metalhead 🤘🏻
Reposted by Hasan Geren
Data ingestion with dlt and Dagster: An end-to-end pipeline tutorial:

Curious, like us, to see what people are sharing with #dataBS and #datasky? Check out this post to learn how to do it using dlt!
@matthausk.bsky.social
@datateam.bsky.social
@hgeren.bsky.social
@hopefanhe.bsky.social
#dlt
Data ingestion with dlt and Dagster: An end-to-end pipeline tutorial
Ingest Data from Bluesky API to AWS S3 Using dlt and deploy it on Dagster in Just 15 Minutes.
open.substack.com
December 19, 2024 at 11:00 AM
Reposted by Hasan Geren
We are starting a 32-week Data Engineering Interview Guide program, covering everything from fundamentals to advanced topics, with sessions every Saturday.
Do you think we're missing any critical topics? We're curious about your opinions😊
#dataBS
#datasky
Week 0/32 - A Comprehensive Data Engineering Interview Preparation Guide
Join us every Saturday on This New Journey
open.substack.com
December 8, 2024 at 11:06 AM
Reposted by Hasan Geren
As a Data Engineer, understanding the data storage lifecycle and data retention policies is critical for designing efficient, cost-effective, and compliant data systems.
@joereis.bsky.social
#dataBS #datasky

substack.com/@pipeline2in...
December 4, 2024 at 12:11 PM
Reposted by Hasan Geren
In our new post, we've covered 10 of the most popular data pipeline design patterns.

We’d love to hear your thoughts. For more details, please check out the full post by @hgeren.bsky.social and @hopefanhe.bsky.social: open.substack.com/pub/pipeline...

#dataBS #datasky
10 Pipeline Design Patterns for Data Engineers
How to leverage Design Patterns for scalable and efficient data pipelines
open.substack.com
December 3, 2024 at 10:19 AM
Reposted by Hasan Geren
Discover how dlt simplifies data ingestion.
Learn its origins and real-world use cases. Follow a step-by-step guide to build your first pipeline and join the growing dlt community!
@matthausk.bsky.social
@datateam.bsky.social
@hgeren.bsky.social
@hopefanhe.bsky.social

#dataBS #datasky
Introduction to data load tool (dlt): A Python Library for Simple Data Ingestion
Discover the basics of dlt and its role in modern data engineering workflows
open.substack.com
December 1, 2024 at 10:44 AM
Reposted by Hasan Geren
Hi, wishing everyone a great Thanksgiving!

Recently we wrote about how SQL queries are executed behind the scenes.

If you are interested, check out our post: open.substack.com/pub/pipeline...

#dataBS #datasky
November 28, 2024 at 12:23 PM
Reposted by Hasan Geren
Storage is at the heart of Data Engineering.
In this post, we explore the hierarchy of data storage from the ground up, drawing inspiration from Fundamentals of Data Engineering by @joereis.bsky.social and Matt Housley, as well as insights from the DE Professionals on Coursera.
#dataBS #datasky
Storage Fundamentals For Data Engineers
Why organised and durable storage is the cornerstone of Data Engineering?
open.substack.com
November 26, 2024 at 10:59 AM
Hey #dataBS and #datasky folks,

Our new post on how understanding Big O notation and execution plans can help optimize SQL queries is now live.

Check it out if you're interested, and we'd love to hear your thoughts! @hopefanhe.bsky.social
open.substack.com/pub/pipeline...
SQL Behind the Curtain: How Are Queries Executed?
Explore the journey of your SQL query guided by execution plans
open.substack.com
November 19, 2024 at 10:45 AM
Hey #dataBS, I've been thinking of an analogy for Data Teams' roles.

Imagine a company as a vehicle. How would you map Data Engineering, Analytics, and Science to vehicle parts? Teams could have multiple parts or overlap with other Teams.

Curious about your thoughts!
November 8, 2024 at 10:46 PM
Reposted by Hasan Geren
Looking for a distraction? Try this great interview between @hannes.muehleisen.org and @medriscoll.bsky.social covering all things @duckdb.org. I especially enjoyed the philosophy around improving SQL usability. www.youtube.com/watch?v=a-Rm... #databs
Data Talks on the Rocks 5 - Hannes Mühleisen, DuckDB
YouTube video by Rill Data
www.youtube.com
November 7, 2024 at 11:16 PM
Reposted by Hasan Geren
#dataBS, can you repost?

Filled up the first 150, so I'm creating a second starter pack! Let’s all keep finding each other and make this place the best for all things data.
Data people starter packs, parts 1 and 2. Gotta follow them all!

go.bsky.app/8TdEfdK

go.bsky.app/DsDyXF3
November 7, 2024 at 12:39 PM
Reposted by Hasan Geren
Week 1 of "100 Days of SQL Optimisation" covered key techniques like column selection, multicolumn indexes, filtering, window functions, RANK, CTEs, and composite indexes, using IMDb data.

Check out the full post for more!
@hgeren.bsky.social
#dataBS #datasky
Week #1: 100 Days of SQL Optimisation
How Small Tweaks Transformed Our Queries, Saving Time and Resources
open.substack.com
November 7, 2024 at 12:01 PM
Reposted by Hasan Geren
I made an infra engineer starter pack. Folks posting about databases, stream processing, durable execution, orchestrators, service meshes, and more.

go.bsky.app/SCZe42X
October 25, 2024 at 1:16 AM
Hello everyone! I’m Hasan.

I transitioned from Industrial Engineering to Data Science, then found my passion in Data Engineering. Currently, I'm doing a PhD in distributed stream processing while working as a Data Engineer.

Looking forward to connecting with fellow data enthusiasts to learn and share.
November 7, 2024 at 3:42 AM
Just joined and heard #dataBS and #datasky are where the cool kids hang.

Wanted to introduce our blog where we regularly write about Data Engineering concepts, news, and tools.

pipeline2insights.substack.com
November 6, 2024 at 12:49 PM