Lintang Sutawika
@sutawika.com
PhD @ltiatcmu.bsky.social
previously @eleutherai.bsky.social

🌐 lintang.sutawika.com
Reposted by Lintang Sutawika
Can you train a performant language model using only openly licensed text?

We are thrilled to announce the Common Pile v0.1, an 8TB dataset of openly licensed and public domain text. We train 7B models for 1T and 2T tokens and match the performance of similar models like LLaMA 1 & 2.
June 6, 2025 at 7:19 PM
Maybe. But more likely, they're using QwQ or DeepSeek.
December 2, 2024 at 4:21 AM
Transformers demonstrated how to attend over an entire sequence at once, which at the time set them apart from approaches like LSTMs that process tokens sequentially (see the sketch below). Attention spanning the whole sequence does parallel the aliens from Arrival.
The problem with that anecdote is the timeline: attention 2015, Arrival 2016, Transformers 2017. The original source might clarify what exactly Arrival's contribution was?

But it’s a nice anecdote.
interesting connection, but the transformer paper didn’t invent attention? arxiv.org/abs/1409.0473
December 1, 2024 at 4:13 PM
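A minimal sketch of the contrast in the thread above, in NumPy; the shapes and variable names are illustrative, not from any of the posts:

import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

# Self-attention: every position attends to every other position in one shot.
T, d = 5, 8                         # sequence length, model dimension
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(T, d)) for _ in range(3))  # queries, keys, values

scores = Q @ K.T / np.sqrt(d)       # (T, T): all pairwise scores in parallel
out = softmax(scores, axis=-1) @ V  # each output mixes the entire sequence

# An LSTM, by contrast, must walk the sequence one step at a time:
#   h = initial_state
#   for t in range(T):
#       h = cell(x[t], h)           # step t cannot run before step t-1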
Attended 2 different lectures (1 class and 1 invited guest lecture) on the same topic of inference-time scaling. Maybe the matrix is trying to tell me something.
November 22, 2024 at 2:14 AM
Lectures in #nlp that I see use Taylor Swift to illustrate concepts.
Every time I see someone post this image it goes viral
November 21, 2024 at 8:44 PM
Reposted by Lintang Sutawika
@eleutherai.bsky.social is our official account. Will be posting here and on Twitter from now on.
November 20, 2024 at 2:18 PM
LTI PhDs seeking refuge in Bluesky
go.bsky.app/NhTwCVb
November 7, 2024 at 4:47 PM