Stella Biderman
@stellaathena.bsky.social
I make sure that OpenAI et al. aren't the only people who are able to study large scale AI systems.
Reposted by Stella Biderman
Our #NeurIPS2025 paper shows that even comparable monolingual tokenizers have different compression rates across languages. But by getting rid of whitespace tokenization and using a custom vocab size for each language, we can reduce token premiums. Preprint out now!
October 28, 2025 at 3:11 PM
I feel like there are several blog posts or papers that put forth a research agenda of "making AI research a scientific field" or "advancing the science of AI" or something like that. I'm having trouble finding them. Does this ring a bell for anyone, or does anyone have links to notable examples?
October 3, 2025 at 4:00 PM
Reposted by Stella Biderman
I want to print it out giant and put it everywhere
September 18, 2025 at 12:05 AM
The new “teen safety” program from OpenAI repeats the same lies that companies and governments have been saying since the internet began. This won't achieve better online safety for kids, but it will suppress individual liberty and promote censorship.
openai.com/index/buildi...
Building towards age prediction
Learn how OpenAI is building age prediction and parental controls in ChatGPT to create safer, age-appropriate experiences for teens while supporting families with new tools.
openai.com
September 18, 2025 at 12:40 AM
Reposted by Stella Biderman
the *cato* institute says less than 10% of politically motivated terrorism is caused by leftists. the *cato* institute.
more than two-thirds is from the far-right.
September 17, 2025 at 5:07 PM
Reposted by Stella Biderman
How can an imitative model like an LLM outperform the experts it is trained on? Our new COLM paper outlines three types of transcendence and shows that each one relies on a different aspect of data diversity. arxiv.org/abs/2508.17669
August 29, 2025 at 9:46 PM
How did you learn to present code? Are there resources that you recommend using to help teach people?
August 27, 2025 at 6:02 PM
Reposted by Stella Biderman
On Thursday we will be joining the #UCLA Latino Policy & Politics Institute for their demo of the newly updated Latino Data Hub (LDH), a public bilingual data platform built to democratize access to critical data about #Latino communities across the country. Register here:
Welcome! You are invited to join a webinar: Exploring the 2025 Latino Data Hub Updates. After registering, you will receive a confirmation email about joining the webinar.
At a time when federal data systems are being defunded, decommissioned, or delayed, public access to reliable, community-level information has never been more critical. Join the UCLA Latino Policy…
ucla.in
August 25, 2025 at 8:01 PM
Reposted by Stella Biderman
Here are a couple of slides that I presented yesterday at #aitechgov about open-weight model risk management.
August 17, 2025 at 10:40 AM
Reposted by Stella Biderman
Thanks to @stellaathena.bsky.social for chatting with me about Deep Ignorance: the new paper/project from Eleuther AI and the UK AISI. Bottom line: Worried AI could teach people to build bioweapons? Don’t teach it how
fortune.com/2025/08/14/w...
AI safety tip: if you don’t want it giving bioweapon instructions, maybe don’t put them in the training data, say researchers
New research shows that scrubbing risky material from AI training data can build safeguards that are harder to bypass — and one author calls out tech giants for keeping such work under wraps.
fortune.com
August 15, 2025 at 3:33 AM
Reposted by Stella Biderman
“Your driver's license contains a ton of somewhat immutable information about you” like your name, address, DOB, and face, EFF’s Lisa Femia told the @thetennesean.bsky.social. It's not like a credit card number that can be replaced if it's leaked.
Age verification laws are sweeping the US, changing the future of online speech
Age verification laws have been passed in at least 24 states. Some say it’s an effort to protect kids, while others say it restricts protected speech.
www.tennessean.com
August 14, 2025 at 9:16 PM
Are you afraid of LLMs teaching people how to build bioweapons? Have you tried just... not teaching LLMs about bioweapons?
@eleutherai.bsky.social and the UK AISI joined forces to see what would happen, pretraining three 6.9B models for 500B tokens and producing 15 total models to study
August 12, 2025 at 12:40 PM
Amazing work by @lyndamk.bsky.social and @datarescueproject.org to preserve history in the face of fascism. #SaveOurSigns
In roughly six weeks, more than a dozen exhibits about slavery at Independence National Historical Park could be removed or covered up by the Department of Interior.
Philadelphians are trying to preserve or archive these sites before it could be too late.
Inside the fight to save more than a dozen Independence Park exhibits from potential Trump admin removal in September
Two Philadelphians are working to preserve or archive historic sites at Independence National Historical Park before items are removed or covered by the Trump administration in the fall.
inquirer.com
August 1, 2025 at 8:19 PM
"It's unclear if this research matters because real users speak English and Chinese" has got to be up there for worst dismissive takes about how multilingual doesn't matter.
July 30, 2025 at 7:26 PM
Reposted by Stella Biderman
I forgot that Scholar does alerts not just for keywords but also for specific people
so I made alerts for all my advisees and now I get an email when they have a paper out
maybe folks already do this and I'm late to the game but honestly those alerts feel great, esp when it's a long-gone advisee
July 26, 2025 at 8:49 PM
It really annoys me how people have made prominent careers out of writing a paper with a scary headline about AI risks that would have produced a non-scary headline with a minor tweak to the set-up and/or probably isn't a meaningfully real phenomenon in the first place.
July 24, 2025 at 7:40 PM
Congrats Moonshot on making a "modified X" license that's still open source.
This is also making me wonder about the list of models to hold the title "most powerful open source LLM in the world." GPT-2 > GPT-Neo > GPT-J > FairSeq Dense > GPT-NeoX-20B > MPT-7B > Falcon-40B > ??? > DeepSeek-R1
July 23, 2025 at 11:15 AM
Another banger from one of the most important interp researchers in the world.
🚨 New preprint! 🚨
Everyone loves causal interp. It’s coherently defined! It makes testable predictions about mechanistic interventions! But what if we had a different objective: predicting model behavior not under mechanistic interventions, but on unseen input data?
July 15, 2025 at 8:30 PM
Stop by our Discord server tomorrow, Friday June 27th, to hear about @catherinearnett.bsky.social's work!
We are launching a new speaker series at EleutherAI, focused on promoting recent research by our team and community members.
Our first talk is by @catherinearnett.bsky.social on tokenizers, their limitations, and how to improve them.
June 26, 2025 at 6:18 PM
Reposted by Stella Biderman
This continual level of ignorance (although I am tempted to assume bad faith or its rhetorical equivalent) is not only dangerous, but undermines work towards very real problems it seems to have originated against (silicon valley hype, corporate control, environmental concerns, the rising tide of
LLMs used as synthetic text extruding machines have no legitimate use cases and --- for all the reasons discussed in the stochastic parrots paper --- are prone to harmful outputs to boot.
>>
June 21, 2025 at 9:55 PM
Reposted by Stella Biderman
We got another mention! We have a lot of these @noaa.gov datasets backed up now. Be sure to check out our tracker: baserow.datarescueproject.org/public/grid/...
Scientists scramble to save threatened federal research databases pubs.aip.org/physicstoday...
Scientists scramble to save threatened federal research databases
Amid funding and workforce cuts, US physical sciences databases are in jeopardy.
pubs.aip.org
June 16, 2025 at 4:42 PM
A bunch of papers suggest that if X and Y are independent tasks, we might expect to see "emergent" behavior on "X and Y" or some task that requires first X and then Y.
I'm really surprised I can't find any papers that dig into this; it's usually a side comment. Do you know any?
June 13, 2025 at 11:02 PM
Reposted by Stella Biderman
Several other groups have put out openly licensed datasets recently, so why is ours better? Ablation studies show that models trained on Common Pile v0.1 outperform them, matching the performance of models trained on the original Pile and OSCAR, though still falling short of FineWeb.
June 6, 2025 at 7:19 PM
Reposted by Stella Biderman
Our pretrained models, Comma v0.1-1T and -2T, perform comparably to leading models trained in the same regime. These plots also include Qwen as a SOTA 8B reference, though it saw 36T tokens.
June 6, 2025 at 7:19 PM