Carolyn Anderson
carolynanderson.bsky.social
Wellesley CS professor and computational linguist. Studies meaning with computational and experimental tools.
Our new reasoning benchmark, based on the NPR Sunday Puzzle's weekly challenge, shows that o1 and o3-mini-high are significantly better at verbal reasoning than other models (e.g., R1). More below!
February 4, 2025 at 3:32 PM
We had two great students driving this paper: Francesca Lucchetti (Northeastern) and Wellesley's Zixuan Wu, who is currently applying to PhD programs!

Francesca came up with a really neat way of visualizing edits to the information content of prompts
January 23, 2025 at 6:57 PM