Lightnews — Scholar-powered news

Melanie Sclar

@melaniesclar.bsky.social

Check out our work on preference modeling through latent (& interpretable) attribute representation learning!

PrefPalette allows you to understand _why_ something is preferred and _how_ preference varies depending on context 🎨

Stella Li @stellali.bsky.social · Jul 22

WHY do you prefer something over another?

Reward models treat preference as a black-box😶‍🌫️but human brains🧠decompose decisions into hidden attributes

We built the first system to mirror how people really make decisions in our recent COLM paper🎨PrefPalette✨

Why it matters👉🏻🧵

July 22, 2025 at 7:52 PM

Reposted by Melanie Sclar

Stella Li

@stellali.bsky.social

WHY do you prefer something over another?

Reward models treat preference as a black-box😶‍🌫️but human brains🧠decompose decisions into hidden attributes

We built the first system to mirror how people really make decisions in our recent COLM paper🎨PrefPalette✨

Why it matters👉🏻🧵

July 22, 2025 at 2:59 PM

Melanie Sclar

@melaniesclar.bsky.social

See our work on procedurally generating challenging reasoning problems on detecting inconsistencies in stories! FlawedFictions is a great example of what I'm most excited about: reliable synthetic data for reasoning in under-explored domains.

(I'm at ICLR to chat, DMs open!)

Kabir Ahuja @kabirahuja2431.bsky.social · Apr 22

📢 New Paper!

Tired 😴 of reasoning benchmarks full of math & code? In our work we consider the problem of reasoning for plot holes in stories -- inconsistencies in a storyline that break the internal logic or rules of a story’s world 🌎

W @melaniesclar.bsky.social, and @tsvetshop.bsky.social

1/n

A screenshot of the first page of the paper, showing the title "Finding Flawed Fictions: Evaluating Complex Reasoning in Language Models via Plot Hole Detection" and the authors: Kabir Ahuja, Melanie Sclar, and Yulia Tsvetkov. All three authors are from the CSE department at the University of Washington in Seattle, USA. They can be reached at {kahuja,msclar,yuliats}@cs.washington.edu

April 24, 2025 at 2:26 AM

Melanie Sclar

@melaniesclar.bsky.social

Excited to be at #ICLR2025 🤩

I'll be giving an oral presentation for Creativity Index on Fri 25th 11:06, Garnet 212&219 🎙️

I'll also be presenting posters:
📍ExploreToM, Sat 26th 10:00, Hall 3 + 2B #49
📍CreativityIndex, Fri 25th 15:00, Hall 3 + 2B #618

Hope to see you there!

April 24, 2025 at 2:25 AM

Reposted by Melanie Sclar

Kabir Ahuja

@kabirahuja2431.bsky.social

📢 New Paper!

Tired 😴 of reasoning benchmarks full of math & code? In our work we consider the problem of reasoning for plot holes in stories -- inconsistencies in a storyline that break the internal logic or rules of a story’s world 🌎

W @melaniesclar.bsky.social, and @tsvetshop.bsky.social

1/n

April 22, 2025 at 6:50 PM

Reposted by Melanie Sclar

Hyunwoo Kim

@hyunwoo-kim.bsky.social

🚨New Paper! So o3-mini and R1 seem to excel on math & coding. But how good are they on other domains where verifiable rewards are not easily available, such as theory of mind (ToM)? Do they show similar behavioral patterns? 🤔 What if I told you it's...interesting, like the below?🧵

February 20, 2025 at 5:34 PM

Reposted by Melanie Sclar

Jack Hessel

@jmhessel.bsky.social

LLMs generate novel word sequences not contained in their pretraining data. However, compared to humans, models generate significantly fewer novel n-grams.

RLHF = 30% *more* copying than base!

Awesome work from the awesome Ximing Lu (gloriaximinglu.github.io) et al. 🤩

arxiv.org/pdf/2410.04265

A screenshot of Figure 1 from the linked paper. It is a fairly complicated three-column figure, but in essence it sketches how the authors compare LLM-generated sequences and human-written text against the pretraining data. Humans write more novel n-gram sequences.

November 22, 2024 at 6:14 AM

Reposted by Melanie Sclar

Ximing Lu

@gximing.bsky.social

Are LLMs 🤖 as creative as humans 👩‍🎓? Not quite!

Introducing CREATIVITY INDEX: a metric that quantifies the linguistic creativity of a text by reconstructing it from existing text snippets on the web. Spoiler: professional human writers like Hemingway are still far more creative than LLMs! 😲

November 22, 2024 at 2:00 AM
