https://itay1itzhak.github.io/
This week I’ll be in New York giving talks at NYU, Yale, and Cornell Tech.
If you’re around and want to chat about LLM behavior, safety, interpretability, or just say hi - DM me!
Now hungry to discuss:
– LLM behavior
– Interpretability
– Biases & Hallucinations
– Why eval is so hard (but so fun)
Come say hi if that’s your vibe too!
We swap instruction datasets between models with different pretraining.
Result: Biases follow the pretrained model!
PCA clearly shows models group by pretraining base, not by instruction.
The bias “signature” stays intact, no matter the finetuning!
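A rough sketch of the kind of check described above (hypothetical scores and labels, not the paper's actual data): represent each finetuned model by its vector of per-benchmark bias scores, project with PCA, and see whether models sharing a pretraining base land near each other.

```python
# Hypothetical sketch: project per-model bias-score vectors with PCA and
# check whether models cluster by pretraining base or by instruction data.
import numpy as np
from sklearn.decomposition import PCA

# Rows = finetuned models, columns = bias benchmarks (illustrative values only).
bias_scores = np.array([
    [0.62, 0.40, 0.71],  # base A + instruction set 1
    [0.60, 0.43, 0.69],  # base A + instruction set 2
    [0.31, 0.55, 0.48],  # base B + instruction set 1
    [0.29, 0.57, 0.50],  # base B + instruction set 2
])
pretrain_base = ["A", "A", "B", "B"]
instruction_set = ["1", "2", "1", "2"]

coords = PCA(n_components=2).fit_transform(bias_scores)
for (x, y), base, inst in zip(coords, pretrain_base, instruction_set):
    print(f"base={base} inst={inst} -> PC1={x:+.2f}, PC2={y:+.2f}")
# If bias follows pretraining, points with the same base land close together
# regardless of which instruction dataset was used.
```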
We finetune the same model 3× with different seeds.
Result: Some variation in bias scores across seeds, but behavior patterns stay stable relative to the seed-to-seed variance seen in MMLU.
✅ Aggregating across seeds reveals consistent trends.
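A small illustration of the aggregation idea with made-up numbers: compute the mean and spread of a bias score across seeds and compare against the spread of a capability score such as MMLU.

```python
# Illustrative only: compare seed-to-seed spread of a bias score vs. an
# accuracy metric, then aggregate across seeds for a stable estimate.
import statistics

# Per-seed scores for one model (hypothetical values).
bias_score_by_seed = [0.58, 0.61, 0.59]   # e.g., one bias benchmark
mmlu_by_seed = [0.612, 0.597, 0.640]      # e.g., MMLU accuracy

print("bias mean:", round(statistics.mean(bias_score_by_seed), 3),
      "stdev:", round(statistics.stdev(bias_score_by_seed), 3))
print("mmlu mean:", round(statistics.mean(mmlu_by_seed), 3),
      "stdev:", round(statistics.stdev(mmlu_by_seed), 3))
# Reporting the mean across seeds (and its spread) avoids over-reading a
# single finetuning run.
```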
We disentangle three factors:
- Pretraining
- Instruction tuning
- Training randomness
🍁 Bottom line: pretraining is the origin of bias. Finetuning? Just the messenger.
#CausalInference #TrustworthyAI #NLP
🧠
Instruction-tuned LLMs show amplified cognitive biases — but are these new behaviors, or pretraining ghosts resurfacing?
Excited to share our new paper, accepted to CoLM 2025🎉!
See thread below 👇
#BiasInAI #LLMs #MachineLearning #NLProc
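For readers new to this kind of evaluation, here is a hypothetical sketch of how a single cognitive-bias probe can be set up (a framing-effect style question; `ask_model` is a placeholder, not from the paper): ask logically equivalent questions framed differently and check whether the model's answer flips.

```python
# Hypothetical framing-effect probe: two logically equivalent prompts with
# different framings; a flipped answer counts as a biased response.
def ask_model(prompt: str) -> str:
    """Placeholder for an actual LLM call (API or local model)."""
    raise NotImplementedError

gain_frame = ("A treatment saves 200 of 600 patients. "
              "Do you approve it? Answer Yes or No.")
loss_frame = ("With a treatment, 400 of 600 patients die. "
              "Do you approve it? Answer Yes or No.")

# Example usage once ask_model is wired to a real model:
# answers = [ask_model(gain_frame), ask_model(loss_frame)]
# biased = answers[0] != answers[1]  # equivalent frames, different choice
```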