Ming Tommy Tang
@tommytang.bsky.social
Director of Bioinformatics at AstraZeneca. Subscribe to my YouTube channel @chatomics. On my way to helping 1 million people learn bioinformatics. Educator, biotech, single cell. Also talks about leadership.
tommytang.bio.link
I hope you've found this post helpful.
Follow me for more.
Subscribe to my FREE newsletter chatomics to learn bioinformatics divingintogeneticsandgenomics.ck.page/profile
Hi! I'm Tommy Tang
I am a bioinformatician/computational biologist with six years of wet-lab experience and over 12 years of computational experience. I will help you learn the computational skills to tame astronomical data and derive insights. Check out the resources I offer below and sign up for my newsletter!
divingintogeneticsandgenomics.ck.page
November 11, 2025 at 2:45 PM
14/
Full blog post:
www.tidyverse.org/blog/2025/0...
duckplyr 1.1.0: now part of the tidyverse.
Familiar. Blazing fast. Made for modern data.
duckplyr fully joins the tidyverse!
duckplyr 1.1.0 is on CRAN! A drop-in replacement for dplyr, powered by DuckDB for speed. It is the most dplyr-like of dplyr backends.
www.tidyverse.org
November 11, 2025 at 2:45 PM
13/
Install it now:
install.packages("duckplyr")
Start using your dplyr code—at DuckDB speed.
November 11, 2025 at 2:45 PM
12/
Warning: duckplyr is fast, but R might not show memory usage correctly.
Always monitor RAM if you're near the limit.
November 11, 2025 at 2:45 PM
11/
duckplyr tracks fallbacks.
You can review them and submit reports—help make it smarter.
Every fallback is a future speed boost.
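A sketch of what reviewing fallbacks can look like. The function names below (fallback_sitrep, fallback_review, fallback_upload) reflect my reading of the duckplyr documentation; verify them in your installed version before relying on this.

```r
library(duckplyr)

# Summarize how many fallback reports have been logged locally
fallback_sitrep()

# Inspect the logged operations that fell back to dplyr
fallback_review()

# Opt in to sharing the (anonymized) reports with the maintainers
fallback_upload()
```

Reviewing before uploading lets you confirm nothing sensitive is in the reports.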
November 11, 2025 at 2:45 PM
10/
If you use dbplyr, good news:
duckplyr plays nice.
Convert duck frames to lazy dbplyr tables and back in one line.
November 11, 2025 at 2:45 PM
9/
Need SQL-like power?
Use DuckDB functions right inside duckplyr pipelines with dd$.
Yes, even Levenshtein distance.
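A minimal sketch of the dd$ prefix, assuming it exposes DuckDB's built-in functions (such as levenshtein) inside a pipeline, per the duckplyr docs; the data frame here is made up for illustration.

```r
library(duckplyr)

# Toy data, hypothetical values for illustration only
df <- as_duckplyr_df(data.frame(
  a = c("duck", "goose"),
  b = c("luck", "moose")
))

# Call DuckDB's levenshtein() directly via the dd$ prefix
df |> mutate(dist = dd$levenshtein(a, b))
```

Functions called through dd$ run inside DuckDB, so they are not available when an operation falls back to plain dplyr.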
November 11, 2025 at 2:45 PM
8/
Got data too big for RAM?
duckplyr works out-of-memory.
Read/write Parquet. Process directly from disk. It just works.
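A sketch of the out-of-memory workflow. The function names (read_parquet_duckdb, compute_parquet) match my reading of the duckplyr docs, and events.parquet / summary.parquet are hypothetical file names; check the package reference before copying this.

```r
library(duckplyr)

# Lazily scan a Parquet file from disk -- nothing is loaded into RAM yet
big <- read_parquet_duckdb("events.parquet")

# Familiar dplyr verbs; DuckDB streams the data instead of materializing it
result <- big |>
  filter(score > 0.5) |>
  count(category)

# Write the result straight back to Parquet, still without a full in-memory copy
compute_parquet(result, "summary.parquet")
```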
November 11, 2025 at 2:45 PM
7/
Example:
library(duckplyr)
df <- as_duckplyr_df(bigdata)
df |> filter(x > 10) |> group_by(y) |> summarise(n = n())
Familiar code. Faster engine.
November 11, 2025 at 2:45 PM
6/
Use it two ways:
Load duckplyr to override dplyr globally
Or just convert specific data frames with as_duckplyr_df()
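The two modes side by side, as I understand them from the duckplyr docs: attaching the package reroutes data-frame dplyr calls through DuckDB, while as_duckplyr_df() opts in a single object without masking dplyr globally.

```r
# Mode 1: attach duckplyr; subsequent dplyr verbs on data frames use DuckDB
library(duckplyr)
mtcars |> group_by(cyl) |> summarise(mpg = mean(mpg))

# Mode 2: keep plain dplyr attached, convert only the frames you choose
library(dplyr)
cars_duck <- duckplyr::as_duckplyr_df(mtcars)
cars_duck |> filter(hp > 100) |> count(gear)
```

Mode 2 is the cautious choice for existing scripts: everything else keeps running on plain dplyr.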
November 11, 2025 at 2:45 PM
5/
When DuckDB can’t handle something, duckplyr falls back to dplyr.
Same results. Always.
It’s safe to try.
November 11, 2025 at 2:45 PM
4/
duckplyr handles huge datasets with ease.
6M rows?
10× faster than dplyr in benchmarks.
Less memory. Less time. No sweat.
November 11, 2025 at 2:45 PM
3/
No new syntax to learn.
Your old dplyr pipelines? They just run faster.
Way faster.
November 11, 2025 at 2:45 PM
2/
duckplyr is a drop-in replacement for dplyr.
Same code. Same verbs.
But it runs on DuckDB under the hood—for serious speed.
November 11, 2025 at 2:45 PM
I hope you've found this post helpful.
Follow me for more.
Subscribe to my FREE newsletter chatomics to learn bioinformatics divingintogeneticsandgenomics.ck.page/profile
Chatomics! — The Bioinformatics Newsletter
Why Subscribe?✅ Curated by Tommy Tang, a Director of Bioinformatics with 100K+ followers across LinkedIn, X, and YouTube✅ No fluff—just deep insights and working code examples✅ Trusted by grad students, postdocs, and biotech professionals✅ 100% free
divingintogeneticsandgenomics.ck.page
November 11, 2025 at 2:15 PM
The best part is using Claude Code to understand the code base and write technical notes.
There is no better time to learn.
Use AI tools as a learning companion.
Don't be lazy and skip the learning process.
November 11, 2025 at 2:15 PM
Claude Code did all the heavy lifting:
translating Python 2 to Python 3 (there were glitches, but it fixed them all)
setting up tests
and CI/CD on GitHub automatically
uploading to PyPI (I only needed to register the account and get the key)
and writing the README (with good prompting)
November 11, 2025 at 2:15 PM
A lot of errors in my experience too
November 11, 2025 at 4:08 AM
My son earned more tickets through bean bags that day than through luck.
Because luck comes once.
Skill comes every time you show up.
Focus on what you can control. #lifelesson
November 10, 2025 at 11:44 PM
The boring, repeatable work.
The skill you can build.
The consistency that compounds.
November 10, 2025 at 11:44 PM
And when we hit it once, we think we've found the formula.
We pour more money in. More time. More hope.
Then we watch the wheel click one space over. 3 tickets.
But the bean bag game? That's still there.
November 10, 2025 at 11:44 PM