Interested in language, reasoning, semantics and cognitive science. One day we'll have more efficient, interpretable and robust models!
Other interests: math, philosophy, cinema
https://www.juandiego-rodriguez.com/
🎯 We demonstrate that ranking-based discriminator training can significantly reduce this gap, and improvements on one task often generalize to others!
🧵👇
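For intuition, here is a minimal sketch of what ranking-based discriminator training could look like: a pairwise margin objective that pushes the score of a grammatical sentence above the score of its ungrammatical counterpart. This is an illustrative PyTorch sketch under assumptions, not the paper's code; `score_model` is a hypothetical module mapping a batch of sentences to one scalar acceptability score each.

```python
import torch
import torch.nn as nn

def ranking_step(score_model, good_batch, bad_batch, optimizer, margin=1.0):
    """One step of pairwise margin-ranking training: require
    score(grammatical) > score(ungrammatical) by at least `margin`."""
    loss_fn = nn.MarginRankingLoss(margin=margin)
    good_scores = score_model(good_batch)   # shape: (batch,)
    bad_scores = score_model(bad_batch)     # shape: (batch,)
    target = torch.ones_like(good_scores)   # +1 means "first input should rank higher"
    loss = loss_fn(good_scores, bad_scores, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The appeal of a ranking loss over plain classification is that it directly optimizes the ordering the evaluation cares about (grammatical above ungrammatical) rather than absolute probabilities.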
Language models (LMs) are remarkably good at generating novel well-formed sentences, leading to claims that they have mastered grammar.
Yet they often assign higher probability to ungrammatical strings than to grammatical strings.
How can both things be true? 🧵👇
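To see the mismatch concretely, here is a minimal sketch (assuming the HuggingFace transformers API, with GPT-2 as a stand-in model; the minimal pair is illustrative, not taken from the paper) comparing total log-probabilities of a grammatical/ungrammatical pair:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def sentence_logprob(sentence: str) -> float:
    """Total log P(sentence) = sum of log P(token_i | tokens_<i)."""
    ids = tok(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        # With labels=input_ids the model returns the mean cross-entropy
        # over the scored positions; multiply back to get the total log-prob.
        out = model(ids, labels=ids)
    n_scored = ids.size(1) - 1  # the first token has no left context
    return -out.loss.item() * n_scored

grammatical = "The keys to the cabinet are on the table."
ungrammatical = "The keys to the cabinet is on the table."
print(sentence_logprob(grammatical), sentence_logprob(ungrammatical))
# If the LM had fully "mastered grammar", the first number should always be
# higher; on many minimal pairs the ordering flips.
```

Comparing raw summed log-probabilities is only fair when the two strings are of (nearly) equal length, as in this agreement pair; length-normalized or token-matched scoring is the safer general recipe.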
🎥🔗 Livestream Link: aiscienceconference.caltech.edu
At 10:30am PST / 12:30pm CT, we’ll be awarding the Margot and Tom Pritzker Prize for AI in Science Research Excellence
www.reuters.com/investigatio...
We introduce Oolong, a dataset of simple-to-verify information aggregation questions over long inputs. No model achieves >50% accuracy on Oolong at a 128K-token context length!
Also delighted the ACL community continues to recognize unabashedly linguistic topics like filler-gaps... and the huge potential for LMs to inform such topics!
www.liberalcurrents.com/deflating-hy...
We trained three models (1.5B, 8B, and 24B parameters) from scratch on 2-4T tokens of custom data
(TLDR: we cheat and get good scores)
@wissamantoun.bsky.social @rachelbawden.bsky.social @bensagot.bsky.social @zehavoc.bsky.social
It was all made up. The file was fine. There was no problem. WTF Claude
I know white tech bros disagree as they continue to collapse our worlds.
news.sky.com/story/the-x-...
In new work yesterday, @arnabsensharma.bsky.social et al. identify a data type for *predicates*.
bsky.app/profile/arn...