Ming Tommy Tang
@tommytang.bsky.social
Director of Bioinformatics at AstraZeneca. Subscribe to my YouTube channel @chatomics. On my way to helping 1 million people learn bioinformatics. Educator, biotech, single cell. Also talks about leadership.
tommytang.bio.link
I hope you've found this post helpful.
Follow me for more.
Subscribe to my FREE newsletter chatomics to learn bioinformatics divingintogeneticsandgenomics.ck.page/profile
Hi! I'm Tommy Tang
I am a bioinformatician/computational biologist with six years of wet-lab experience and over 12 years of computational experience. I will help you learn the computational skills to tame astronomical data and derive insights. Check out the resources I offer below and sign up for my newsletter!
divingintogeneticsandgenomics.ck.page
November 11, 2025 at 2:45 PM
14/
Full blog post:
www.tidyverse.org/blog/2025/0...
duckplyr 1.1.0: now part of the tidyverse.
Familiar. Blazing fast. Made for modern data.
duckplyr fully joins the tidyverse!
duckplyr 1.1.0 is on CRAN! A drop-in replacement for dplyr, powered by DuckDB for speed. It is the most dplyr-like of dplyr backends.
www.tidyverse.org
November 11, 2025 at 2:45 PM
13/
Install it now:
install.packages("duckplyr")
Start using your dplyr code—at DuckDB speed.
November 11, 2025 at 2:45 PM
12/
Warning: duckplyr is fast, but R might not show memory usage correctly.
Always monitor RAM if you're near the limit.
November 11, 2025 at 2:45 PM
11/
duckplyr tracks fallbacks.
You can review them and submit reports—help make it smarter.
Every fallback is a future speed boost.
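A sketch of what reviewing fallbacks can look like. The function names below (fallback_sitrep, fallback_review, fallback_upload) reflect my reading of the duckplyr documentation; verify them in your installed version before relying on this.

```r
library(duckplyr)

# Summarize how many fallback reports have been logged locally
fallback_sitrep()

# Inspect the logged operations that fell back to dplyr
fallback_review()

# Opt in to sharing the (anonymized) reports with the maintainers
fallback_upload()
```

Reviewing before uploading lets you confirm nothing sensitive is in the reports.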
November 11, 2025 at 2:45 PM
10/
If you use dbplyr, good news:
duckplyr plays nice.
Convert duck frames to lazy dbplyr tables and back in one line.
November 11, 2025 at 2:45 PM
9/
Need SQL-like power?
Use DuckDB functions right inside duckplyr pipelines with dd$.
Yes, even Levenshtein distance.
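A minimal sketch of the dd$ prefix, assuming it exposes DuckDB's built-in functions (such as levenshtein) inside a pipeline, per the duckplyr docs; the data frame here is made up for illustration.

```r
library(duckplyr)

# Toy data, hypothetical values for illustration only
df <- as_duckplyr_df(data.frame(
  a = c("duck", "goose"),
  b = c("luck", "moose")
))

# Call DuckDB's levenshtein() directly via the dd$ prefix
df |> mutate(dist = dd$levenshtein(a, b))
```

Functions called through dd$ run inside DuckDB, so they are not available when an operation falls back to plain dplyr.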
November 11, 2025 at 2:45 PM
8/
Got data too big for RAM?
duckplyr works out-of-memory.
Read/write Parquet. Process directly from disk. It just works.
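A sketch of the out-of-memory workflow. The function names (read_parquet_duckdb, compute_parquet) match my reading of the duckplyr docs, and events.parquet / summary.parquet are hypothetical file names; check the package reference before copying this.

```r
library(duckplyr)

# Lazily scan a Parquet file from disk -- nothing is loaded into RAM yet
big <- read_parquet_duckdb("events.parquet")

# Familiar dplyr verbs; DuckDB streams the data instead of materializing it
result <- big |>
  filter(score > 0.5) |>
  count(category)

# Write the result straight back to Parquet, still without a full in-memory copy
compute_parquet(result, "summary.parquet")
```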
November 11, 2025 at 2:45 PM
7/
Example:
library(duckplyr)
df <- as_duckplyr_df(bigdata)
df |> filter(x > 10) |> group_by(y) |> summarise(n = n())
Familiar code. Faster engine.
November 11, 2025 at 2:45 PM
6/
Use it two ways:
Load duckplyr to override dplyr globally
Or just convert specific data frames with as_duckplyr_df()
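The two modes side by side, as I understand them from the duckplyr docs: attaching the package reroutes data-frame dplyr calls through DuckDB, while as_duckplyr_df() opts in a single object without masking dplyr globally.

```r
# Mode 1: attach duckplyr; subsequent dplyr verbs on data frames use DuckDB
library(duckplyr)
mtcars |> group_by(cyl) |> summarise(mpg = mean(mpg))

# Mode 2: keep plain dplyr attached, convert only the frames you choose
library(dplyr)
cars_duck <- duckplyr::as_duckplyr_df(mtcars)
cars_duck |> filter(hp > 100) |> count(gear)
```

Mode 2 is the cautious choice for existing scripts: everything else keeps running on plain dplyr.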
November 11, 2025 at 2:45 PM
5/
When DuckDB can’t handle something, duckplyr falls back to dplyr.
Same results. Always.
It’s safe to try.
November 11, 2025 at 2:45 PM
4/
duckplyr handles huge datasets with ease.
6M rows?
10× faster than dplyr in benchmarks.
Less memory. Less time. No sweat.
November 11, 2025 at 2:45 PM
3/
No new syntax to learn.
Your old dplyr pipelines? They just run faster.
Way faster.
November 11, 2025 at 2:45 PM
2/
duckplyr is a drop-in replacement for dplyr.
Same code. Same verbs.
But it runs on DuckDB under the hood—for serious speed.
November 11, 2025 at 2:45 PM
I hope you've found this post helpful.
Follow me for more.
Subscribe to my FREE newsletter chatomics to learn bioinformatics divingintogeneticsandgenomics.ck.page/profile
Chatomics! — The Bioinformatics Newsletter
Why Subscribe?✅ Curated by Tommy Tang, a Director of Bioinformatics with 100K+ followers across LinkedIn, X, and YouTube✅ No fluff—just deep insights and working code examples✅ Trusted by grad students, postdocs, and biotech professionals✅ 100% free
divingintogeneticsandgenomics.ck.page
November 11, 2025 at 2:15 PM
The best part is using Claude Code to understand the code base and write technical notes.
There is no better time to learn.
Use AI tools as a learning companion.
Don't be lazy and skip the learning process.
November 11, 2025 at 2:15 PM
Claude Code did all the heavy lifting:
translating Python 2 to Python 3 (there were glitches, but it fixed them all)
setting up tests
and CI/CD on GitHub automatically
uploading to PyPI (I only needed to register the account and get the key)
and writing the README (with good prompting)
November 11, 2025 at 2:15 PM
A lot of errors in my experience too
November 11, 2025 at 4:08 AM
My son earned more tickets through bean bags that day than through luck.
Because luck comes once.
Skill comes every time you show up.
Focus on what you can control. #lifelesson
November 10, 2025 at 11:44 PM
The boring, repeatable work.
The skill you can build.
The consistency that compounds.
November 10, 2025 at 11:44 PM
And when we hit it once, we think we've found the formula.
We pour more money in. More time. More hope.
Then we watch the wheel click one space over. 3 tickets.
But the bean bag game? That's still there.
November 10, 2025 at 11:44 PM