Leland McInnes
lelandmcinnes.bsky.social
A Mathematician dabbling in Data Science, especially unsupervised learning and data exploration. UMAP, HDBSCAN, PyNNDescent, DataMapPlot. (He/Him)
Reposted by Leland McInnes
I think it's important to note though that in spite of those incentives, the direction of the last two years has been more fungibility, *not* lock-in. And open source is the wrong fight here: when lock-in comes it will look more like the lock-in that Amazon or Uber have than Microsoft Office…
January 5, 2026 at 1:37 PM
Reposted by Leland McInnes
New preprint! Have you ever wondered, what are these fuzzy simplicial sets, the theoretical framework behind e.g. UMAP? Here we show that you may simply see them as marginal distributions over simplicial sets. This provides a generative model for UMAP. (1/2)

arxiv.org/abs/2512.03899
Probabilistic Foundations of Fuzzy Simplicial Sets for Nonlinear Dimensionality Reduction
Fuzzy simplicial sets have become an object of interest in dimensionality reduction and manifold learning, most prominently through their role in UMAP. However, their definition through tools from alg...
arxiv.org
December 4, 2025 at 12:31 PM
Reposted by Leland McInnes
Space DJ turns genre embeddings into a playable galaxy—pilot a ship, the music follows. 🚀

Key stats
768→128 PCA compression; 3D UMAP projection; three.js rendering; autopilot drift; high‑dim neighbors surfacing hidden similarities.
November 11, 2025 at 3:03 PM
Reposted by Leland McInnes
via the magic of laion_clap embeddings and umap, my live coding thingy has a sample browser at last!
October 31, 2025 at 6:27 PM
Reposted by Leland McInnes
I made this annotated scatter plot of 1 million FineWeb-Edu documents for @sashamtl.bsky.social's new TED talk.
October 31, 2025 at 2:52 PM
Reposted by Leland McInnes
Also really love how organic the plot looks with "inferno" (left) and "viridis" (right).
October 27, 2025 at 10:42 AM
Reposted by Leland McInnes
Map of the internet: 1.3M nodes (BGP)
October 26, 2025 at 1:39 PM
The video of my talk at SciPy on DataMapPlot is up at last. If you make t-SNE or UMAP plots the talk provides some guidance on how to make plots most effective, and introduces a library to help make that easier.

www.youtube.com/watch?v=-iBh...
Leland McInnes - DataMapPlot: Rich Tools for UMAP | SciPy 2025
YouTube video by SciPy
www.youtube.com
October 17, 2025 at 1:56 PM
Reposted by Leland McInnes
Despite the gutting of the National Center for Educational Statistics, the dept of Ed *did* manage to release 2024 college major counts in the usual format, so I can run it through the same code I do every year. First off, the change since peak of the largest fields -- another year of drops.
September 28, 2025 at 2:20 AM
Reposted by Leland McInnes
I'm very much a learner, but you're maybe asking if aspects of matrix factorisation approaches to dimensionality reduction apply here. But LocalMAP is a KNN approach, with a matrix factorisation initialisation. h/t @lelandmcinnes.bsky.social for his attempts to describe these youtu.be/9iol3Lk6kyU
A Bluffer's Guide to Dimension Reduction - Leland McInnes
YouTube video by PyData
youtu.be
September 26, 2025 at 2:42 PM
Reposted by Leland McInnes
📢 Save the date!
Join us for the next @ellis.eu x UniReps Speaker Series!
📅 27th August – 16:00 CEST
📍https://ethz.zoom.us/j/66426188160
🎙️ Speakers: Keynote by @lelandmcinnes.bsky.social & Flash Talk by Yu (Demi) Qin
🔔 Stay updated by joining our Google group: groups.google.com/u/2/g/ellis-...
August 14, 2025 at 7:58 AM
Reposted by Leland McInnes
🚀 We've just open-sourced Embedding Atlas – a tool for exploring large embedding spaces through rich, interactive visualizations 📊.
August 1, 2025 at 8:24 AM
Reposted by Leland McInnes
Meteoroid stream identification with HDBSCAN unsupervised clustering algorithm. Eloy Peña-Asensio et al. https://arxiv.org/abs/2507.01501
July 3, 2025 at 7:46 AM
Reposted by Leland McInnes
Ever wanted to pan through the latent🌌 space of TikTok videos? Made using the amazing toponymy and datamapplot from @lelandmcinnes.bsky.social and data from mine and @jurgenpfeffer.bsky.social's first complete TikTok slice. link below
July 11, 2025 at 4:45 PM
Reposted by Leland McInnes
🎤 Speaker Spotlight: Leland McInnes
Join Leland at #SciPy2025 for his talk "DataMapPlot: Rich Tools for UMAP Visualizations." 📊

Discover powerful new ways to explore high-dimensional data!
🔗 scipy2025.scipy.org
July 5, 2025 at 7:46 PM
Reposted by Leland McInnes
Explore Wikipedia through a data map. Pages are grouped by semantic similarity, for topic clusters.
Hover to see details, zoom to explore more fine-grained topics, click to go to a page. Search by page name to find interesting starting points for exploration.

lmcinnes.github.io/datamapplot_...
June 22, 2025 at 3:36 PM
I'll be giving a talk about DataMapPlot for visualizing data maps at SciPy this year. I would love to meet potential users and chat about where to go next.

cfp.scipy.org/scipy2025/ta...
June 23, 2025 at 11:41 PM
Reposted by Leland McInnes
I also updated the ArXiv data map example to make use of new features in datamapplot.
lmcinnes.github.io/datamapplot_...

You can tweak parameters and build your own version:
gist.github.com/lmcinnes/e11...
June 22, 2025 at 9:59 PM
Reposted by Leland McInnes
OMG I am so glad someone finally did this.

Thank you 🙏 @lelandmcinnes.bsky.social

This will now consume hours and hours of my time.

lmcinnes.github.io/datamapplot_...
June 23, 2025 at 12:12 PM
Reposted by Leland McInnes
Great idea. Did no one think of this before?
Explore Wikipedia through a data map. Pages are grouped by semantic similarity, for topic clusters.
Hover to see details, zoom to explore more fine-grained topics, click to go to a page. Search by page name to find interesting starting points for exploration.

lmcinnes.github.io/datamapplot_...
June 22, 2025 at 7:48 PM
Reposted by Leland McInnes
🔥 Meet our Keynote Speakers for #SciPy2025!

Dr Malvika Sharan, co-Director of Open Life Science (OLS) and a senior researcher at The Alan Turing Institute will be sharing with us her expertise at our favorite conference.

You can't miss her ➡️ hubs.la/Q03sdlsb0
June 19, 2025 at 12:38 PM
Reposted by Leland McInnes
🔥 Meet our Keynote Speakers for #SciPy2025!

Hon. Dr. Kathryn D. Huff 🇺🇸, nuclear engineer, policy leader, and former Assistant Secretary for the Office of Nuclear Energy will be joining us in Tacoma! 🙌

Don't miss her talk, grab your ticket now: hubs.la/Q03sdlsb0
June 19, 2025 at 1:49 AM