Jack Hessel
@jmhessel.bsky.social
jmhessel.com

@Anthropic. Seattle bike lane enjoyer. Opinions my own.
life update: a few weeks ago, I made the difficult decision to move on from Samaya AI. Thank you to my collaborators for an exciting 2 years!! ❤️ Starting next month, I'll be joining Anthropic. Excited for a new adventure! 🦾

(I'm still based in Seattle 🏔️🌲🏕️; but in SF regularly)
August 20, 2025 at 12:43 AM
Meanwhile, in my neighborhood in Seattle, we've been fighting 5 years for one (1) bus lane and 30 years for a one (1)-mile bike path
December 14, 2024 at 6:38 AM
Awesome work from Jacob et al. (+ collaborators who I could find on bluesky: @mrdrozdov.com @matei-zaharia.bsky.social @mcarbin.bsky.social @lateinteraction.bsky.social ; apologies if I missed anyone!)
November 27, 2024 at 9:59 PM
This can likely be explained by data sampling bias. Re-ranking training sets are often constructed by running top-K first-stage retrieval (e.g., with BM25 or a vector index).

The training data thus contains (query, doc) pairs with high word similarity, but rarely obviously irrelevant docs, and rarely relevant docs that the first stage failed to retrieve.
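
To make that bias concrete, here's a minimal sketch of the usual mining loop; `first_stage_search` and the gold-relevance dict are hypothetical stand-ins, not the paper's actual pipeline:

```python
from typing import Callable

# Hypothetical mining loop for re-ranker training data: only documents the
# first stage already retrieved ever get labeled.
def mine_reranker_pairs(
    queries: list[str],
    gold_relevant: dict[str, set[str]],                   # query -> relevant doc ids
    first_stage_search: Callable[[str, int], list[str]],  # (query, k) -> doc ids
    k: int = 100,
) -> list[tuple[str, str, int]]:
    pairs = []
    for q in queries:
        for doc_id in first_stage_search(q, k):
            label = 1 if doc_id in gold_relevant.get(q, set()) else 0
            pairs.append((q, doc_id, label))
    return pairs

# The bias: negatives are always near-misses that share vocabulary with the
# query, and relevant docs the first stage missed never appear at all.
```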
November 27, 2024 at 9:59 PM
Information retrieval systems usually operate as a model "cascade" -- fast vector search over billions of documents followed by a more expressive LLM "re-ranking" the resulting top-K.

But beware 👻!

Despite their expressivity, top-K re-rankers generalize poorly as K increases.

arxiv.org/pdf/2411.11767
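
For anyone less familiar with the setup, a minimal sketch of such a cascade (`vector_index` and `score_relevance` are hypothetical stand-ins, not the paper's components):

```python
from typing import Callable

def cascade_search(
    query: str,
    vector_index,                                  # fast ANN index over billions of docs
    score_relevance: Callable[[str, str], float],  # slow, expressive LLM re-ranker
    k_retrieve: int = 1000,
    k_return: int = 10,
) -> list[str]:
    # Stage 1: cheap approximate search narrows billions of docs to the top K.
    candidates = vector_index.search(query, k=k_retrieve)

    # Stage 2: the expressive (but expensive) model re-scores only those K.
    # The warning above: as k_retrieve grows, the re-ranker sees docs further
    # from its training distribution and its rankings degrade.
    reranked = sorted(candidates, key=lambda doc: score_relevance(query, doc), reverse=True)
    return reranked[:k_return]
```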
November 27, 2024 at 9:59 PM
LLMs generate novel word sequences not contained in their pretraining data. However, compared to humans, models generate significantly fewer novel n-grams.

RLHF = 30% *more* copying than base!

Awesome work from the awesome Ximing Lu (gloriaximinglu.github.io) et al. 🤩

arxiv.org/pdf/2410.04265
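
A back-of-the-envelope version of the measurement (whitespace tokenization and an in-memory corpus stand in for the paper's actual setup):

```python
def ngrams(tokens: list[str], n: int) -> set[tuple[str, ...]]:
    """All n-grams occurring in a token sequence."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def novelty_rate(generation: str, corpus: list[str], n: int = 4) -> float:
    """Fraction of the generation's n-grams that never appear in the corpus."""
    corpus_ngrams: set[tuple[str, ...]] = set()
    for doc in corpus:
        corpus_ngrams |= ngrams(doc.split(), n)
    gen_ngrams = ngrams(generation.split(), n)
    if not gen_ngrams:
        return 0.0
    novel = [g for g in gen_ngrams if g not in corpus_ngrams]
    return len(novel) / len(gen_ngrams)

# Lower novelty_rate = more copying; the post's claim is that RLHF'd models
# score lower than their base counterparts.
```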
November 22, 2024 at 6:14 AM