Matt Goldrick
@mattgoldrick.bsky.social
Linguistics and cognitive science at Northwestern. Opinions are my own. he/him/his
And it should be noted that this work might help us reimagine the nature of the computations underlying acquisition -- so for 'tokenization', it isn't entirely clear what the tokens should eventually be users.umiacs.umd.edu/~nhf/papers/...
November 10, 2025 at 11:04 PM
These are fantastic, and I think there's a ton of interesting work to be done here -- because tokenization/discovery of language structure is far from trivial and definitely not 'solved' in a general sense.
November 10, 2025 at 11:02 PM
I'm very excited about these models, but I think we're a long way from being able to say we have an in-principle solution for realistic training sizes
November 10, 2025 at 8:20 PM
I think this is @glupyan.bsky.social's link: ai.meta.com/blog/textles... which includes the ZeroSpeech benchmarks. I agree that self-supervised models are really interesting (I'm using them in my own work right now), but as far as I know these require huge amounts of training data
November 10, 2025 at 8:19 PM
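As an illustration of what audio-only 'tokenization' means in this thread: a minimal sketch, assuming HuggingFace's pretrained wav2vec 2.0 checkpoint (facebook/wav2vec2-base, trained on audio alone) plus scikit-learn k-means -- the checkpoint, cluster count, and per-utterance clustering are illustrative choices, not the specific pipeline anyone in the thread is using. A self-supervised encoder maps raw audio to frame-level vectors, and clustering turns those into discrete pseudo-tokens.

```python
# A minimal, illustrative sketch (not from the thread): discretize raw
# audio into "pseudo-tokens" by encoding it with a self-supervised
# speech model and clustering the frame-level features.
import torch
from sklearn.cluster import KMeans
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

MODEL_NAME = "facebook/wav2vec2-base"  # pretrained on audio alone, no text
extractor = Wav2Vec2FeatureExtractor.from_pretrained(MODEL_NAME)
model = Wav2Vec2Model.from_pretrained(MODEL_NAME).eval()

def audio_to_units(waveform, sample_rate=16000, n_units=50):
    """Map a 1-D waveform to a sequence of discrete unit IDs."""
    inputs = extractor(waveform, sampling_rate=sample_rate, return_tensors="pt")
    with torch.no_grad():
        # (1, T, 768) -> (T, 768): one feature vector per ~20 ms frame
        frames = model(**inputs).last_hidden_state.squeeze(0)
    # Toy per-utterance clustering; real unit-discovery pipelines fit
    # k-means once on features from a large corpus, then assign frames.
    km = KMeans(n_clusters=n_units, n_init=10).fit(frames.numpy())
    return km.labels_  # one discrete "token" per frame
```

Discrete unit sequences of roughly this kind are what ZeroSpeech-style systems are built on and evaluated against.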
My understanding, @glupyan.bsky.social (correct me if I'm wrong!), is that tokenization is very much an open research area, esp. without any access to text -- I'm not aware of any BabyLM work that examines audio-only or AV-only tokenization
November 10, 2025 at 4:59 PM
The mindset I try to use is: express *appropriate* gratitude for someone donating their time to sit down and think about your work. I can do that without wanting to vomit
October 31, 2025 at 5:06 PM