Ming Tommy Tang
@tommytang.bsky.social
Director of bioinformatics at AstraZeneca. Subscribe to my YouTube channel @chatomics. On my way to helping 1 million people learn bioinformatics. Educator, Biotech, single cell. Also talks about leadership.
tommytang.bio.link
I hope you've found this post helpful.

Follow me for more.

Subscribe to my FREE newsletter, chatomics, to learn bioinformatics: divingintogeneticsandgenomics.ck.page/profile
Hi! I'm Tommy Tang
I am a bioinformatician/computational biologist with six years of wet lab experience and over 12 years of computational experience. I will help you learn computational skills to tame astronomical data and derive insights. Check out the resources I offer below and sign up for my newsletter!
divingintogeneticsandgenomics.ck.page
November 11, 2025 at 2:45 PM
14/
Full blog post:
www.tidyverse.org/blog/2025/0...
duckplyr 1.1.0: now part of the tidyverse.
Familiar. Blazing fast. Made for modern data.
duckplyr fully joins the tidyverse!
duckplyr 1.1.0 is on CRAN! A drop-in replacement for dplyr, powered by DuckDB for speed. It is the most dplyr-like of dplyr backends.
www.tidyverse.org
November 11, 2025 at 2:45 PM
13/
Install it now:
install.packages("duckplyr")

Start using your dplyr code—at DuckDB speed.
November 11, 2025 at 2:45 PM
12/
Warning: duckplyr is fast, but R might not show memory usage correctly.
Always monitor RAM if you're near the limit.
November 11, 2025 at 2:45 PM
11/
duckplyr tracks fallbacks.
You can review them and submit reports—help make it smarter.
Every fallback is a future speed boost.
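
A quick sketch of how the review step might look, assuming duckplyr 1.0.0 or later is installed. What gets logged depends on your local fallback settings, so treat this as illustrative:

```r
library(duckplyr)

# Show the current fallback logging/reporting configuration.
fallback_sitrep()

# Print any fallback events recorded locally (may be empty).
fallback_review()

# Opt-in step: send the recorded reports to the maintainers.
# Left commented out on purpose; run it only after reviewing the reports.
# fallback_upload()
```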
November 11, 2025 at 2:45 PM
10/
If you use dbplyr, good news:
duckplyr plays nice.
Convert duck frames to lazy dbplyr tables and back in one line.
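
A minimal round-trip sketch, assuming duckplyr 1.1.0 with dbplyr installed. The toy column `a` and the SQL expression are made up for illustration:

```r
library(duckplyr)

# A tiny duck frame to start from (illustrative data).
duck <- duckdb_tibble(a = 1:3)

# Duck frame -> lazy dbplyr table.
lazy <- as_tbl(duck)

# Use dbplyr features such as raw SQL, then convert back to a duck frame.
back <- lazy |>
  mutate(b = sql("a + 1")) |>
  as_duckdb_tibble()

back
```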
November 11, 2025 at 2:45 PM
9/
Need SQL-like power?
Use DuckDB functions right inside duckplyr pipelines with dd$.
Yes, even Levenshtein distance.
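
A small sketch of the `dd$` prefix, assuming duckplyr 1.1.0. The two strings are just toy inputs, and `levenshtein()` here is DuckDB's own string function:

```r
library(duckplyr)

# Two strings to compare (illustrative values).
pairs <- duckdb_tibble(a = "dbplyr", b = "duckplyr")

# dd$ forwards the call to the DuckDB function of the same name.
res <- pairs |>
  mutate(dist = dd$levenshtein(a, b))

res
```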
November 11, 2025 at 2:45 PM
8/
Got data too big for RAM?
duckplyr can work with larger-than-memory data.
Read/write Parquet. Process directly from disk. It just works.
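
Roughly how a disk-to-disk pipeline could look, assuming duckplyr 1.0.0 or later. The temp-file paths and the toy columns `x` and `y` are placeholders, not from any real dataset:

```r
library(duckplyr)

# Placeholder paths; point these at real Parquet files in practice.
path_in  <- tempfile(fileext = ".parquet")
path_out <- tempfile(fileext = ".parquet")

# Write a small demo file so the sketch is self-contained.
duckdb_tibble(x = 1:20, y = rep(c("a", "b"), 10)) |>
  compute_parquet(path_in)

# Read lazily from disk, transform, and write the result back to Parquet.
result <- read_parquet_duckdb(path_in) |>
  filter(x > 10) |>
  summarise(n = n(), .by = y) |>
  compute_parquet(path_out)

result
```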
November 11, 2025 at 2:45 PM
7/
Example:
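library(duckplyr)
# bigdata: any data frame you already have in memory
# as_duckdb_tibble() replaces as_duckplyr_df(), which is deprecated in recent duckplyr releases
df <- as_duckdb_tibble(bigdata)
df |> filter(x > 10) |> group_by(y) |> summarise(n = n())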

Familiar code. Faster engine.
November 11, 2025 at 2:45 PM
6/
Use it two ways:
Load duckplyr to override dplyr globally

Or just convert specific data frames with as_duckdb_tibble() (as_duckplyr_df() in older versions)
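
A minimal sketch of both routes, assuming duckplyr 1.0.0 or later. The toy data frame is made up for illustration:

```r
library(duckplyr)

# Toy data (illustrative only).
toy <- data.frame(x = 1:20, y = rep(c("a", "b"), 10))

# Way 1: route dplyr verbs on regular data frames through DuckDB.
methods_overwrite()
global_res <- toy |> summarise(total = sum(x), .by = y)
methods_restore()  # switch back to plain dplyr

# Way 2: opt in for one data frame only.
duck <- as_duckdb_tibble(toy)
local_res <- duck |> summarise(total = sum(x), .by = y)

local_res
```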
November 11, 2025 at 2:45 PM
5/
When DuckDB can’t handle something, duckplyr falls back to dplyr.
Same results. Always.
It’s safe to try.
November 11, 2025 at 2:45 PM
4/
duckplyr handles huge datasets with ease.
6M rows?
10× faster than dplyr in benchmarks.
Less memory. Less time. No sweat.
November 11, 2025 at 2:45 PM
3/
No new syntax to learn.
Your old dplyr pipelines? They just run faster.
Way faster.
November 11, 2025 at 2:45 PM
2/
duckplyr is a drop-in replacement for dplyr.
Same code. Same verbs.
But it runs on DuckDB under the hood—for serious speed.
November 11, 2025 at 2:45 PM
Chatomics! — The Bioinformatics Newsletter
Why Subscribe?
✅ Curated by Tommy Tang, a Director of Bioinformatics with 100K+ followers across LinkedIn, X, and YouTube
✅ No fluff, just deep insights and working code examples
✅ Trusted by grad students, postdocs, and biotech professionals
✅ 100% free
divingintogeneticsandgenomics.ck.page
November 11, 2025 at 2:15 PM
The best part is using Claude Code to understand the code base
and write technical notes.

There is no better time to learn.
Use AI tools as a learning companion.
Don't be lazy and skip the learning process.
November 11, 2025 at 2:15 PM
Claude Code did all the heavy lifting
translating py2 to py3
there were glitches, but it fixed them all

it set up tests
CI/CD on GitHub automatically
uploaded to PyPI (I only needed to register the account and get the key)
and wrote the README (with good prompting)
November 11, 2025 at 2:15 PM
A lot of errors in my experience too
November 11, 2025 at 4:08 AM
Yeah
November 11, 2025 at 4:08 AM
My son earned more tickets through bean bags that day than through luck.

Because luck comes once.

Skill comes every time you show up.

Focus on what you can control. #lifelesson
November 10, 2025 at 11:44 PM
The boring, repeatable work.
The skill you can build.
The consistency that compounds.
November 10, 2025 at 11:44 PM
And when we hit it once, we think we've found the formula.

We pour more money in. More time. More hope.

Then we watch the wheel click one space over. 3 tickets.

But the bean bag game? That's still there.
November 10, 2025 at 11:44 PM