Visiting Student @MITCoCoSci @csail.mit.edu
I'm interested in the inductive biases that make language learning and reasoning so easy for us humans, and what their analogues are in machines.
If you're around Boston, I would love to grab coffee!
We look at how LLMs seek out and integrate information and find that even GPT-5-tier models are bad at this, meaning we can use Bayesian inference to uplift weak LMs and beat them... at 1% of the cost 👀
Paper, code & demos: gabegrand.github.io/battleship
Here's what we learned about building rational information-seeking agents... 🧵🔽
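To make the Bayesian idea concrete, here's a minimal sketch of expected-information-gain (EIG) question selection, assuming a weighted hypothesis space of hidden boards and deterministic answers to candidate questions; all names here are illustrative, not the paper's actual API:

```python
import math
from collections import defaultdict

def eig(question, hypotheses, weights):
    """Expected information gain of asking `question`.

    `question(h)` returns the answer the question would receive if
    hypothesis `h` were the true hidden board. Because the answer is
    deterministic given the hypothesis, EIG reduces to the entropy of
    the predicted answer distribution.
    """
    total = sum(weights)
    mass_by_answer = defaultdict(float)
    for h, w in zip(hypotheses, weights):
        mass_by_answer[question(h)] += w / total
    return -sum(p * math.log2(p) for p in mass_by_answer.values() if p > 0)

# Ask whichever candidate question is most informative on average,
# e.g. from a pool of questions proposed by a weak LM:
# best_q = max(candidate_questions, key=lambda q: eig(q, hypotheses, weights))
```

Scoring and selecting by EIG, rather than trusting the LM's own pick, is one way a cheap proposal model could be uplifted toward near-optimal questioning.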
We investigate whether LMs capture these inferences from connectives when they cannot rely on world knowledge.
New paper w/ Daniel, Will, @jessyjli.bsky.social
A blog post on human-model interaction, games, and training and testing LLMs
research.ibm.com/blog/LLM-soc...
🤖📈🧠
Genie: (grinning)
Following Emergent Misalignment, we show that finetuning even a single layer via LoRA on insecure code can induce toxic outputs in Qwen2.5-Coder-32B-Instruct, and that you can extract steering vectors to make the base model similarly misaligned 🧵
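For intuition on the steering-vector step, here's a hedged difference-of-means sketch (a standard extraction technique, not necessarily this thread's exact recipe), assuming a HuggingFace Qwen2-style checkpoint where decoder blocks live at model.model.layers; the layer index and scale are placeholder assumptions:

```python
import torch

@torch.no_grad()
def mean_activation(model, tokenizer, layer_idx, prompts):
    """Mean residual-stream activation at one decoder layer over prompts."""
    acts = []
    handle = model.model.layers[layer_idx].register_forward_hook(
        # HF decoder layers return a tuple; output[0] is (batch, seq, hidden)
        lambda mod, args, output: acts.append(output[0].mean(dim=(0, 1)))
    )
    for p in prompts:
        ids = tokenizer(p, return_tensors="pt").input_ids.to(model.device)
        model(ids)
    handle.remove()
    return torch.stack(acts).mean(dim=0)

def steering_vector(base, tuned, tokenizer, layer_idx, prompts):
    """Difference of mean activations: finetuned minus base."""
    return (mean_activation(tuned, tokenizer, layer_idx, prompts)
            - mean_activation(base, tokenizer, layer_idx, prompts))

def steer(model, layer_idx, vec, scale=4.0):  # scale is an assumption
    """Add the vector into the residual stream on every forward pass."""
    def hook(mod, args, output):
        return (output[0] + scale * vec,) + tuple(output[1:])
    return model.model.layers[layer_idx].register_forward_hook(hook)
```

Per the thread, adding such a vector to the base model induces similar misalignment without any finetuning; call .remove() on the returned handle to undo the intervention.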
We spent 2 years systematically examining and demonstrating the lack of such capabilities in MLLMs: arxiv.org/abs/2410.10855
#NeuroAI #neuroskyence
www.thetransmitter.org/neuroai/the-...
Our new Cognition paper shows 20-month-olds use negative evidence to infer novel word meanings, reshaping theories of language development.
www.sciencedirect.com/science/arti...
Lionel Wong – 2024 PhD Thesis “From Words to World: Bridging Language and Thought” from @stanford.edu
Visit cognitivesciencesociety.org/glushko-diss... to learn more!
go.bsky.app/KDTg6pv