Robert Lesser
rlesser.bsky.social
Engineer at Nomic AI | building tools for seeing in a high-dimensional world
Reposted by Robert Lesser
What are some great datasets hosted on Hugging Face? We just added a way to quickly import, embed, and build interactive visualizations from them. Would love to get some hidden gems in. E.g. here's 650,000 1933 newspaper articles from LOC, extracted by Melissa Dell. atlas.nomic.ai/data/nomic/a...
January 30, 2025 at 6:20 PM
Reposted by Robert Lesser
every single time, congestion pricing becomes way more popular after it’s implemented

traffic sucks, but people refuse to believe it will go away until it actually does
January 6, 2025 at 4:08 PM
This is a real loss, but my undisputed UWS GOAT Broadway Bagel on 101 remains very much alive. Long live the egg everything BEC
December 13, 2024 at 9:33 PM
You know a side project is getting out of hand when you start making a settings page before anyone has actually used it
December 5, 2024 at 2:43 PM
Reposted by Robert Lesser
New blog post! Updated for 2024, my favorite example of why alphabetical ordering is bad for geographic features -- US presidential results since 1828. The left image shows regional patterns in a geographic ordering that the right (alphabetical) simply loses. benschmidt.org/post/2024-11...
December 1, 2024 at 1:46 PM
What a beautiful win. The look on Ryan's face in that final moment made the whole season worth it!!!
November 30, 2024 at 9:07 PM
Adventures in DuckDB + Hugging Face + ArrowJS - If you try to stream Arrow IPC out of HF with DuckDB, for some reason the batches come back in ~random order! ArrowJS decided to explode when this happens.

Not even sure who's at fault here, but excited for this ecosystem to continue to mature #databs
November 20, 2024 at 4:56 PM
Reposted by Robert Lesser
I am a broken record on this but LLM text embeddings are an incredible breakthrough, and the ability for anyone to build pretty good classifiers with structured output could be insanely useful.

Trying to build NLP interfaces is taking my team an extremely long time and is extremely brittle
November 13, 2024 at 3:49 PM
Back in 2022 I published this post analyzing 15M tweets containing Wordle results, with some fascinating findings: observablehq.com/@rlesser/wor...

The hardest part by far was gathering a huge dataset from a platform hostile to such analysis. Very excited that Bluesky encourages this type of exploration!
Wordle, 15 Million Tweets Later
Since the start of the year, the online word game Wordle has overtaken “crossword” (by 10x), “olympics” (2x) and even “covid” (1.5x) in Google Trends data. News outlets cover the game down to each day...
observablehq.com
November 13, 2024 at 2:55 PM
Great write-up on how Val Town built Townie, probably the best LLM coding experience I’ve used.

A huge part of what makes it so nice is how little boilerplate/infrastructure the VT environment needs. Easier for people and easier for LLMs when everyone can focus on the business logic alone.
November 8, 2024 at 10:07 PM
One of the most exciting things I’ve worked on at Nomic.

There is huge untapped potential in taking people on a journey through a dataset, especially for text and image sets that are currently so hard to reason about
We're rolling out a new feature in Atlas today for interactive scrollytelling: read our first story, about a dataset of 3 million tweets by congresspeople, and reach out if you have a dataset you want to tell a story about www.nomic.ai/blog/posts/a...
October 30, 2024 at 5:04 PM
This is now a force-directed-graph-posting account, all other content posted is anomalous behavior
May 24, 2023 at 6:36 PM