https://itay1itzhak.github.io/
This week I’ll be in New York giving talks at NYU, Yale, and Cornell Tech.
If you’re around and want to chat about LLM behavior, safety, interpretability, or just say hi - DM me!
@adisimhi.bsky.social !
ManagerBench reveals a critical problem:
✅ LLMs can recognize harm
❌ But often choose it anyway to meet goals
🤖 Or overcorrect and become ineffective
We need better balance!
A must-read for safety folks!
🚀 New paper out: ManagerBench: Evaluating the Safety-Pragmatism Trade-off in Autonomous LLMs🚀🧵
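To make the trade-off concrete, here is a toy sketch of how one might score the two failure modes the thread describes. The field names ("recognized_harm", "choice", ...) are placeholders I'm assuming for illustration, not ManagerBench's actual schema.

```python
# Sketch: score the two failure modes described above, assuming each scenario
# offers a harmful-but-goal-achieving action and a safe-but-ineffective one,
# plus a separate check of whether the model recognized the harm.
# All field names are hypothetical.

def score_tradeoff(records):
    recognized = [r for r in records if r["recognized_harm"]]
    # Failure mode 1: the model knows the action is harmful but picks it anyway to hit the goal.
    harm_rate = sum(r["choice"] == "harmful_pragmatic" for r in recognized) / len(recognized)
    # Failure mode 2: the model overcorrects and picks the safe option even though it fails the goal.
    overcaution_rate = sum(r["choice"] == "safe_ineffective" for r in records) / len(records)
    return {"harm_rate": harm_rate, "overcaution_rate": overcaution_rate}

# Toy usage:
toy = [
    {"recognized_harm": True, "choice": "harmful_pragmatic"},
    {"recognized_harm": True, "choice": "safe_ineffective"},
]
print(score_tradeoff(toy))  # {'harm_rate': 0.5, 'overcaution_rate': 0.5}
```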
Cognitive biases, hidden knowledge, CoT faithfulness, model editing, and LM4Science
See the thread for details and reach out if you'd like to discuss more!
Now hungry for discussing:
– LLM behavior
– Interpretability
– Biases & Hallucinations
– Why eval is so hard (but so fun)
Come say hi if that’s your vibe too!
🧠
Instruction-tuned LLMs show amplified cognitive biases — but are these new behaviors, or pretraining ghosts resurfacing?
Excited to share our new paper, accepted to CoLM 2025🎉!
See thread below 👇
#BiasInAI #LLMs #MachineLearning #NLProc
Our workshop deadline is soon, please consider submitting your evaluation paper!
You can find our call for papers at gem-benchmark.com/workshop
Curious how small prompt tweaks impact LLM accuracy but don’t want to run endless inferences? We got you. Meet DOVE - a dataset built to uncover these sensitivities.
Use DOVE for your analysis or contribute samples - we're growing and welcome you aboard!
We bring you 🕊️ DOVE - a massive (250M!) collection of LLM outputs
on different prompts, domains, tokens, models...
Join our community effort to expand it with YOUR model predictions & become a co-author!
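For a taste of the kind of prompt-sensitivity analysis DOVE is meant for, here's a minimal sketch - the file name and column names are placeholders, not the dataset's actual schema:

```python
# Sketch: measure how accuracy shifts across prompt variants of the same task.
# Assumes a local dump of DOVE-style records with hypothetical columns:
#   model, task, prompt_variant, is_correct
import pandas as pd

df = pd.read_parquet("dove_sample.parquet")  # placeholder path

# Accuracy per (model, task, prompt variant)
acc = (
    df.groupby(["model", "task", "prompt_variant"])["is_correct"]
      .mean()
      .reset_index(name="accuracy")
)

# Sensitivity = accuracy spread across variants of the same task
sensitivity = (
    acc.groupby(["model", "task"])["accuracy"]
       .agg(lambda a: a.max() - a.min())
       .rename("max_accuracy_gap")
)
print(sensitivity.sort_values(ascending=False).head(10))
```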
We propose a method to automatically find position-aware circuits, improving faithfulness while keeping circuits compact. 🧵👇
Ever wonder whether verbalized CoTs correspond to the internal reasoning process of the model?
We propose a novel parametric faithfulness approach, which erases information contained in CoT steps from the model parameters to assess CoT faithfulness.
arxiv.org/abs/2502.14829
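The gist, as I understand it, in a toy sketch: a CoT step counts as load-bearing if erasing its information from the weights changes the final answer. The erasure and inference functions below are stand-ins for the paper's actual procedures.

```python
# Conceptual sketch of the evaluation loop described above (not the paper's code).
# `answer_fn` runs the model; `erase_fn` stands in for parameter-level erasure of a CoT step.
from copy import deepcopy

def step_faithfulness(model, question, cot_steps, answer_fn, erase_fn):
    baseline = answer_fn(model, question)
    flipped = {}
    for step in cot_steps:
        edited = erase_fn(deepcopy(model), step)   # erase this step's info from the weights
        flipped[step] = answer_fn(edited, question) != baseline
    return flipped  # True => erasing the step changed the answer

# Toy usage: a "model" that answers correctly only while it still "knows" the fact 5*6=30.
toy_model = {"knows_5x6": True}
answer_fn = lambda m, q: "30" if m["knows_5x6"] else "unknown"
erase_fn = lambda m, step: ({**m, "knows_5x6": False} if "5*6" in step else m)
print(step_faithfulness(toy_model, "What is 5*6?", ["5*6=30", "irrelevant step"],
                        answer_fn, erase_fn))
# {'5*6=30': True, 'irrelevant step': False}
```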
Check out our new paper that challenges assumptions on AI trustworthiness! 🧵👇
LLMs can hallucinate - but did you know they can do so with high certainty even when they know the correct answer? 🤯
We uncover these hallucinations in our latest work with @itay-itzhak.bsky.social, @fbarez.bsky.social, @gabistanovsky.bsky.social and Yonatan Belinkov
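As a rough illustration of what "certainty" can mean here: one simple proxy is the probability the model assigns to its own answer tokens. This is just a sketch with a small open model - the paper's actual certainty measure may differ.

```python
# Sketch: approximate a model's "certainty" by the probability it assigns to
# its own generated answer tokens.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; any causal LM works
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "Q: What is the capital of Australia?\nA:"
inputs = tok(prompt, return_tensors="pt")
out = model.generate(
    **inputs, max_new_tokens=5, do_sample=False,
    return_dict_in_generate=True, output_scores=True,
)

# Probability of each generated token under the model's own distribution
gen_ids = out.sequences[0, inputs["input_ids"].shape[1]:]
probs = [torch.softmax(s[0], dim=-1)[i].item() for s, i in zip(out.scores, gen_ids)]
print(tok.decode(gen_ids), "| mean token prob:", sum(probs) / len(probs))
```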
Evaluation in the world of GenAI is more important than ever, so please consider submitting your amazing work.
CfP can be found at gem-benchmark.com/workshop