Lightnews — Scholar-powered news

Guy Davidson ✈️ NeurIPS 2025

@guydav.bsky.social

Like ~everyone, I'll also be at #NeurIPS this week! Please reach out to chat about past (goal representations, cognitive science, interp) or current interests (LLM mental state inference, social environments for RL). Also if you have leads on great coffee, craft beer, or tacos.

December 1, 2025 at 8:29 PM

Guy Davidson ✈️ NeurIPS 2025

@guydav.bsky.social

Belated update #2: my year at Meta FAIR through the AIM program was so nice that I’m sticking around for the long haul.

I’m excited to stay at FAIR and work with @asli-celikyilmaz.bsky.social and friends on fun LLM questions; I’ll be working from the New York office so we’re sticking around.

September 19, 2025 at 5:27 PM

Guy Davidson ✈️ NeurIPS 2025

@guydav.bsky.social

Belated update #1: I defended my PhD about a month ago! I appreciate the warm reception from everyone who made it in-person and virtually. Thanks to my committee, @lerrelpinto.com, @togelius.bsky.social, and @markkho.bsky.social for your feedback and fun questions.

September 17, 2025 at 7:46 PM

Guy Davidson ✈️ NeurIPS 2025

@guydav.bsky.social

Friends and virtual acquaintances! I’m defending my PhD tomorrow morning at 11:30 AM ET. If anyone would like to watch, let me know and I’ll send you the Zoom link (and if you’re in NYC and feel compelled to join in person, that works, too!)

August 6, 2025 at 6:41 PM

Guy Davidson ✈️ NeurIPS 2025

@guydav.bsky.social

#CogSci2025 friends! I'm here all week and would love to chat. I'd particularly love to talk to anyone thinking about Theory of Mind and how to evaluate it better (in both minds and machines, in different settings and contexts), and about goals and their representations. Find me at:

July 30, 2025 at 3:47 PM

Guy Davidson ✈️ NeurIPS 2025

@guydav.bsky.social

Cool new work on localizing and removing concepts using attention heads from colleagues at NYU and Meta!

Karen Ullrich (s/h) ✈️ Neurips @karen-ullrich.bsky.social · Jul 8

How would you make an LLM "forget" the concept of dog — or any other arbitrary concept? 🐶❓

We introduce SAMD & SAMI — a novel, concept-agnostic approach to identify and manipulate attention modules in transformers.

July 8, 2025 at 1:54 PM

Guy Davidson ✈️ NeurIPS 2025

@guydav.bsky.social

You (yes, you!) should work with Sydney! Either short-term this summer, or longer term at her nascent lab at NYU!

Sydney Levine @sydneylevine.bsky.social · Jun 6

🔆 I'm hiring! 🔆

There are two open positions:

1. Summer research position (best for master's or graduate student); focus on computational social cognition.
2. Postdoc (currently interviewing!); focus on computational social cognition and AI safety.

sites.google.com/corp/site/sy...

Sydney Levine - Open Positions

Summer Research Position I am seeking a part-time or full-time researcher for the summer (starting asap) to bring a project to completion. The project asks the question: do people around the world u...

sites.google.com

June 6, 2025 at 6:15 PM

Guy Davidson ✈️ NeurIPS 2025

@guydav.bsky.social

Fantastic new work by @johnchen6.bsky.social (with @brendenlake.bsky.social and me trying not to cause too much trouble).

We study systematic generalization in a safety setting and find LLMs struggle to consistently respond safely when we vary how we ask naive questions. More analyses in the paper!

John (Yueh-Han) Chen @johnchen6.bsky.social · May 29

Do LLMs show systematic generalization of safety facts to novel scenarios?

Introducing our work SAGE-Eval, a benchmark consisting of 100+ safety facts and 10k+ scenarios to test this!

- Claude-3.7-Sonnet passes only 57% of facts evaluated
- o1 and o3-mini passed <45%! 🧵

May 30, 2025 at 5:32 PM

Guy Davidson ✈️ NeurIPS 2025

@guydav.bsky.social

New preprint alert! We often prompt ICL tasks using either demonstrations or instructions. How much does the form of the prompt matter to the task representation formed by a language model? Stick around to find out 1/N

May 23, 2025 at 5:38 PM

Guy Davidson ✈️ NeurIPS 2025

@guydav.bsky.social

Another banger from Jenn, Felix, and Tomer that jumps right to the top of my reading list.

Tomer Ullman @tomerullman.bsky.social · Mar 6

new preprint on Theory of Mind in LLMs, a topic I know a lot of people care about (I care. I'm part of people):

"Re-evaluating Theory of Mind evaluation in large language models"

(by Hu* @jennhu.bsky.social , Sosa, and me)

link: arxiv.org/pdf/2502.21098

March 6, 2025 at 6:03 PM

Reposted by Guy Davidson ✈️ NeurIPS 2025

Brenden Lake

@brendenlake.bsky.social

I snuck a moment with my son Logan (2.5), ever the creative goal generator, into Fig. 1: "Papa, I made a Truck Carrier Truck!"
How do people compose existing concepts to create new goals? Can models generate and understand goals too?
nature.com/articles/s4225

February 21, 2025 at 6:41 PM

Guy Davidson ✈️ NeurIPS 2025

@guydav.bsky.social

Out today in Nature Machine Intelligence!

From childhood on, people can create novel, playful, and creative goals. Models have yet to capture this ability. We propose a new way to represent goals and report a model that can generate human-like goals in a playful setting... 1/N

February 21, 2025 at 4:29 PM

Guy Davidson ✈️ NeurIPS 2025

@guydav.bsky.social

I've been really enjoying the new Gemini 2.0-Flash as my go-to 'how do I do this', but the Experimental tag is there for a reason. The funniest failure mode I've had yet:

January 22, 2025 at 9:24 PM

Reposted by Guy Davidson ✈️ NeurIPS 2025

xuan (ɕɥɛn / sh-yen)

@xuanalogue.bsky.social

Made a second starter pack for folks in Computational Cognitive Science since the first one is now full! Again, let me know if you'd like to be added :)

Pack I: go.bsky.app/KDTg6pv
Pack II: go.bsky.app/TTjTNsu

xuan (ɕɥɛn / sh-yen) @xuanalogue.bsky.social · Nov 11

Okay the people requested one so here is an attempt at a Computational Cognitive Science starter pack -- with apologies to everyone I've missed! LMK if there's anyone I should add!

go.bsky.app/KDTg6pv

January 15, 2025 at 2:18 PM

Guy Davidson ✈️ NeurIPS 2025

@guydav.bsky.social

Big, if true
(Definitely true)
(Not very big; might grow up to be!)
(Meet Lila!!)

December 26, 2024 at 9:17 PM

Reposted by Guy Davidson ✈️ NeurIPS 2025

Najoung Kim

@najoung.bsky.social

Repost appreciated! 🙏

ACL 2025 Ling theory & Cognitive modeling track is looking for emergency reviewers. The emergency review period is between 3/18-26, and these reviewers will be excluded from the ARR cycle. If you're interested, please sign up here! docs.google.com/forms/d/1fH7...

ACL 2025 Ling theory & Cognitive modeling track emergency reviewer volunteer form

The Linguistic Theories, Cognitive Modeling, and Psycholinguistics track at ACL 2025 is looking for emergency reviewers. The emergency reviews will take place between 18th to 26th of March, 2025. Thes...

docs.google.com

December 18, 2024 at 3:37 PM

Guy Davidson ✈️ NeurIPS 2025

@guydav.bsky.social

I, for one, enjoyed the sermon delivered to us by the high priest of deep learning

December 13, 2024 at 10:47 PM

Reposted by Guy Davidson ✈️ NeurIPS 2025

Andrew Lampinen

@lampinen.bsky.social

What counts as in-context learning (ICL)? Typically, you might think of it as learning a task from a few examples. However, we’ve just written a perspective (arxiv.org/abs/2412.03782) suggesting interpreting a much broader spectrum of behaviors as ICL! Quick summary thread: 1/7

The broader spectrum of in-context learning

The ability of language models to learn a task from a few examples in context has generated substantial interest. Here, we provide a perspective that situates this type of supervised few-shot learning...

arxiv.org

December 10, 2024 at 6:17 PM

Guy Davidson ✈️ NeurIPS 2025

@guydav.bsky.social

Friends and frenemies! I'll be at #NeurIPS2024 all week and would love to catch up! Let's grab coffee/food and talk about goal generation and representation, how to think about goals with/in/for LLMs, cognitive science in 2025+, or give me advice for getting into (mech.) interpretability research!

December 8, 2024 at 11:28 PM

Guy Davidson ✈️ NeurIPS 2025

@guydav.bsky.social

New paper alert! There's a wealth of evidence that infants categorize spatial relations starting from 3-4 months onward. Can we evaluate neural networks in similar paradigms to babies, without special relational training? If so, what do we learn? 1/N (x-post from the other place)

Spatial relation categorization in infants and deep neural networks

Spatial relations, such as above, below, between, and containment, are important mediators in children’s understanding of the world (Piaget, 1954). Th…

www.sciencedirect.com

February 12, 2024 at 11:39 PM

Guy Davidson ✈️ NeurIPS 2025

@guydav.bsky.social

Today! Come see talks by some of the wonderful folks who inspire this line of work, and chat with early career researchers (incl. my amazing [twitterless?] collaborator Graham Todd and me) about our posters!
Room 260-262, starting now and until our intrinsic motivation runs out!

Guy Davidson ✈️ NeurIPS 2025 @guydav.bsky.social · Dec 9

I'm on my way to #NeurIPS2023! Let's chat about any and all of the following:
- How people represent (cognitive) goals, and how machines can generate such goals (presented at IMOL workshop on Saturday)
- The potential benefits of richer, more structured goal representations... (1/3)

The title, author list, and abstract of our NeurIPS workshop paper titled "Generating Human-Like Goals by Synthesizing Reward-Producing Programs"

December 16, 2023 at 3:52 PM

Guy Davidson ✈️ NeurIPS 2025

@guydav.bsky.social

I'm on my way to #NeurIPS2023! Let's chat about any and all of the following:
- How people represent (cognitive) goals, and how machines can generate such goals (presented at IMOL workshop on Saturday)
- The potential benefits of richer, more structured goal representations... (1/3)

December 9, 2023 at 7:13 PM
