Lightnews — Scholar-powered news

Reposted

Dang Nguyen

@divingwithorcas.bsky.social

HR Simulator™: a game where you gaslight, deflect, and “let’s circle back” your way to victory.
Every email a boss fight, every “per my last message” a critical hit… or maybe you just overplayed your hand 🫠
Can you earn Enlightened Bureaucrat status?

(link below!)

September 26, 2025 at 6:41 PM

Reposted

chenhaotan.bsky.social

@chenhaotan.bsky.social

Prompting is our most successful tool for exploring LLMs, but the term evokes eye-rolls and grimaces from scientists. Why? Because prompting as scientific inquiry has become conflated with prompt engineering.

This is holding us back. 🧵 and new paper with @ari-holtzman.bsky.social.

July 9, 2025 at 8:07 PM

aryanshri123.bsky.social

@aryanshri123.bsky.social

🤫Jailbreak prompts make aligned LMs produce harmful responses.🤔But is that info linearly decodable?

↗️We show many refused concepts are linearly represented, sometimes persist through instruction-tuning, and may also shape downstream behavior❗

arxiv.org/abs/2507.00239
🧵1/

July 3, 2025 at 8:07 PM

aryanshri123.bsky.social

@aryanshri123.bsky.social

We expose the "absence blindness" in the best LLMs, even when considering relatively short documents. Using LLMs as judges or graders may not be so reliable. Looking forward to seeing what comes out of this!

Harvey Fu @harveyfu.bsky.social · Jun 20

LLMs excel at finding surprising “needles” in very long documents, but can they detect when information is conspicuously missing?

🫥AbsenceBench🫥 shows that even SoTA LLMs struggle on this task, suggesting that LLMs have trouble perceiving “negative spaces”.
Paper: arxiv.org/abs/2506.11440

🧵[1/n]

June 20, 2025 at 10:21 PM

Reposted

Harvey Fu

@harveyfu.bsky.social

LLMs excel at finding surprising “needles” in very long documents, but can they detect when information is conspicuously missing?

🫥AbsenceBench🫥 shows that even SoTA LLMs struggle on this task, suggesting that LLMs have trouble perceiving “negative spaces”.
Paper: arxiv.org/abs/2506.11440

🧵[1/n]

June 20, 2025 at 10:03 PM
