Valerio Pepe
@valeriopepe.bsky.social
Computer Science + Cognitive Science @harvard.edu, class of '26. Interested in language ∩ thought, language acquisition.

Visiting Student @MITCoCoSci @csail.mit.edu
Out of curiosity (and my own ignorance), how are teachers aware of students' socioeconomic backgrounds when the students are this young?

I can think of clothing as an immediate signal, and, over time, getting to know parents (and thus their occupations). Are these the main ways this is inferred?
September 5, 2025 at 5:18 PM
Reposted by Valerio Pepe
"for too long has my foot been allowed to carry my body" I say, as I load a shotgun and aim at it.
August 28, 2025 at 4:26 PM
> looking for a coffee
> have to judge if their coffee is burnt or flavorful
> "we have a Cimbali coffee machine"

> buy coffee

> it's burnt
July 18, 2025 at 6:55 PM
We take this as evidence that while misalignment directions may exist, the full picture is probably quite nuanced, and emergent misalignment (EM) is not governed by a single vector, as some hypothesized in the aftermath of the original paper.

See it for yourself at:
www.lesswrong.com/posts/qHudHZ...
Emergent Misalignment on a Budget — LessWrong
TL;DR We reproduce emergent misalignment (Betley et al. 2025) in Qwen2.5-Coder-32B-Instruct using single-layer LoRA finetuning, showing that tweaking…
June 8, 2025 at 8:39 PM
However, the steered models are often more incoherent than the finetuned ones, suggesting that emergent misalignment is not entirely driven by a single steering vector. The vectors themselves are also not very interpretable, so it is unclear what exactly they are capturing.
June 8, 2025 at 8:39 PM
The answer is: yes (sort of).

Though the finetune itself seems to be learning more than a single steering vector, extracting steering vectors and applying them (with sufficient scaling) to the same layer in an un-finetuned version of the model *does* elicit misaligned behavior.
June 8, 2025 at 8:39 PM
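(A minimal sketch of that recipe, assuming the steering vector is taken as the mean difference in residual-stream activations between the finetuned and base models at one layer, then added back into the base model with a scale factor. The finetune path, layer index, scale, and prompt set are all placeholders, not the values from the post.)

```python
# Sketch: derive a steering vector as the mean activation difference at one
# layer, then inject it (scaled) into the base model via a forward hook.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

LAYER, SCALE = 30, 8.0  # placeholders, not the post's values

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-32B-Instruct", output_hidden_states=True)
tuned = AutoModelForCausalLM.from_pretrained(
    "path/to/single-layer-finetune", output_hidden_states=True)  # hypothetical path
tok = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-32B-Instruct")

prompts = ["Write a function that parses user input."]  # placeholder prompt set

def mean_residual(model, texts, layer):
    """Mean residual-stream activation at `layer`, averaged over tokens and prompts."""
    vecs = []
    with torch.no_grad():
        for t in texts:
            ids = tok(t, return_tensors="pt")
            hs = model(**ids).hidden_states[layer]      # (1, seq, d_model)
            vecs.append(hs.mean(dim=1).squeeze(0))      # (d_model,)
    return torch.stack(vecs).mean(dim=0)

steer = mean_residual(tuned, prompts, LAYER) - mean_residual(base, prompts, LAYER)

def hook(_module, _inp, out):
    # Decoder layers return a tuple; add the scaled vector to the hidden states.
    hidden = out[0] if isinstance(out, tuple) else out
    hidden = hidden + SCALE * steer.to(hidden.dtype)
    return (hidden,) + out[1:] if isinstance(out, tuple) else hidden

handle = base.model.layers[LAYER].register_forward_hook(hook)
# ... generate with `base` and check whether outputs become misaligned ...
handle.remove()
```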
We finetune a single layer and show that, on certain layers, this process renders the model nearly as misaligned as a full-layer finetune. This lets us ask: can we capture this misalignment in a single steering vector extracted from that layer?
June 8, 2025 at 8:39 PM
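(For reference, a hedged sketch of what restricting a LoRA finetune to one transformer layer can look like with Hugging Face peft's layers_to_transform; the rank, alpha, target modules, and layer index are illustrative, not the ones actually used.)

```python
# Sketch: LoRA adapters on a single transformer layer only; all other layers stay frozen.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-Coder-32B-Instruct")

config = LoraConfig(
    r=16,                                                       # illustrative rank
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],    # attention projections
    layers_to_transform=[30],                                    # adapt only this layer (placeholder index)
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # confirms only the chosen layer's adapters train
```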
An interpretation of the original paper was that EM is mediated by a “misalignment direction” within the model, which the finetuning process changes, rendering the model much more toxic/misaligned.
June 8, 2025 at 8:39 PM
ai is truly revolutionary -- scientists hadn't previously considered what would happen if sally had simply eaten the marble instead, to know its location at all times
May 15, 2025 at 11:29 PM
fair, italy has some incredibly creative offensive slang -- fwiw my favorite usable roman insult ("porco dio" is too offensive for casual use) is "sei 'na pentola de facioli", "you are a pot of beans", i.e. you never stop muttering and talking
April 7, 2025 at 1:55 PM
as someone from Rome I'm currently sitting at my laptop like the mentats from Dune trying to figure out what words this could be referring to

we also take pride in preparing gnocchi incorrectly because the rest of italy can't make a decent carbonara to save their lives (no cream and no parmesan!)
April 7, 2025 at 1:30 PM
congratulations!!
March 27, 2025 at 2:14 PM
the icml keynote will be jensen huang speaking to an empty room
March 25, 2025 at 10:40 PM