Lightnews — Scholar-powered news

Rafael Irizarry

@rafalab.bsky.social

2.9K followers 12 following 20 posts

Applied statistician. I tweet data-driven observations, data science educational materials, academic research updates, and the occasional joke.


Rafael Irizarry

@rafalab.bsky.social

We agree. I meant function in its most basic definition: every p-dimensional x in your dataset is mapped to a unique 2-dimensional f(x). I did not claim f is defined for all p-dimensional xs.

My point is that this f is difficult or impossible to describe. In contrast, we can write it down for PCA.

December 24, 2024 at 10:47 PM
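
To illustrate the point in the post above that the PCA map can be written down, here is a minimal sketch. It assumes the usual formulation f(x) = Wᵀ(x − μ), where μ is the column mean of the data and W holds the top two principal directions. The names `fit_pca_map` and `f` and the synthetic data are hypothetical, chosen only for illustration, and are not taken from the thread.

```python
import numpy as np

def fit_pca_map(X, k=2):
    """Return (mu, W) such that f(x) = (x - mu) @ W is the k-dimensional PCA map."""
    mu = X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    W = Vt[:k].T  # shape (p, k)
    return mu, W

def f(x, mu, W):
    """Explicit linear map; defined for any p-dimensional x, not only the training rows."""
    return (x - mu) @ W

# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))

mu, W = fit_pca_map(X, k=2)
Z = f(X, mu, W)                       # 2D coordinates for every row of X
z_new = f(rng.normal(size=5), mu, W)  # the same formula applies to an unseen point
```

Because the whole map is the pair (`mu`, `W`), each axis is a known linear combination of the original variables. UMAP has no comparable closed-form expression, which is the contrast the post draws.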

Rafael Irizarry

@rafalab.bsky.social

I like it. Thanks for the tip.

December 24, 2024 at 10:28 PM

Rafael Irizarry

@rafalab.bsky.social

I agree. I don't use them at all with genomics data, especially sparse noisy scRNA-Seq data.

It does appear to perform impressively well with high signal-to-noise ratio datasets, such as MNIST.

December 24, 2024 at 1:23 PM

Rafael Irizarry

@rafalab.bsky.social

It depends on the specific biological insight you want to highlight or communicate.

December 24, 2024 at 1:05 PM

Rafael Irizarry

@rafalab.bsky.social

The inertia here is strong. I have not been able to convince collaborators not to use them in papers I am a co-author on... I'll keep trying, though.

At some point I might give up as I did with pie charts: simplystatistics.org/posts/2012-1...

Simply Statistics: I give up, I am embracing pie charts

simplystatistics.org

December 24, 2024 at 12:58 PM

Rafael Irizarry

@rafalab.bsky.social

As made clear in the blog post, I am not against UMAP either. But when I see a plot in a paper, I want to understand what I am being shown and why.

Other than to show that different cell types have different expression patterns, which I already know, or to decorate, why use UMAP to display in 2D?

December 24, 2024 at 12:49 PM

Rafael Irizarry

@rafalab.bsky.social

What am I supposed to learn from that plot? Cell types have different expression patterns. Those are markers for different cell types. So, this just confirms something obvious. Do all those non-linear shapes and tiny clusters represent anything biological?

December 24, 2024 at 12:40 PM

Rafael Irizarry

@rafalab.bsky.social

To be clear, as the post explains, UMAP can be useful for exploratory data analysis. My concern is the inclusion of UMAP plots in papers as if they were results. What exactly is the reader supposed to learn? And how often are we misdirected by false clusters or artifactual shapes?

December 23, 2024 at 7:35 PM

Rafael Irizarry

@rafalab.bsky.social

Can you explain what the axes represent?

As mentioned in the post, UMAP can be useful for exploring data. But why are plots included in papers? What is the reader supposed to get out of them? The 2D distance between points can't be interpreted.

It seems the only reason is that they are pretty.

December 23, 2024 at 7:25 PM

Rafael Irizarry

@rafalab.bsky.social

There are plenty of alternatives. They don’t produce flashy artwork, but they do provide scientific insights.

If journals want artwork, there is no need to pretend we are analyzing data. Just paint pretty pictures.

December 23, 2024 at 5:20 PM

Rafael Irizarry

@rafalab.bsky.social

This is unfortunately true. I would say the main reasons are that the subject is hard and deep understanding is not incentivized enough.

But note that understanding UMAP is much harder than understanding p-values.

December 23, 2024 at 2:54 PM

Rafael Irizarry

@rafalab.bsky.social

To apply, use the links below:

1️⃣ Tenure-track (any rank) in AI/ML academicpositions.harvard.edu/postings/14387

2️⃣ Assistant Professor in Single Cell Genomics academicpositions.harvard.edu/postings/14416

3️⃣ Lecturer & Director Training/Education careers.dana-farber.org/job/9572/dir...

Assistant/Associate/Full Professor of Data Science and Biostatistics

The Departments of Biostatistics at Harvard T.H. Chan School of Public Health and Data Science at the Dana-Farber Cancer Institute provide exceptional environments to pursue research and education in ...

academicpositions.harvard.edu

November 26, 2024 at 3:07 PM

Rafael Irizarry

@rafalab.bsky.social

Starting to work on it 😅

September 22, 2023 at 1:55 PM
