Anna Tsvetkov
annatsv.bsky.social
Postdoc @ Princeton AI Lab
Natural and Artificial Minds
Prev: PhD @ Brown, MIT FutureTech
Website: https://annatsv.github.io/
Some tasks admit different algorithms that behave the same on the training data, so a model’s learned mechanism can look arbitrary unless we know what the task requires (the goals, constraints, and invariances that define a correct solution).

❓ Other cases like this or other limits of mech interp?
🧵 (2/2)
The Clock and the Pizza: Two Stories in Mechanistic Explanation of Neural Networks
Do neural networks, trained on well-understood algorithmic tasks, reliably rediscover known algorithms for solving those tasks? Several recent studies, on tasks ranging from group arithmetic to in-con...
arxiv.org
November 25, 2025 at 11:55 PM
Introspection targets our ongoing or recently past mental states. What could it mean for a system that lacks any obvious analogue of a continuous stream of experience to have current or recently past “internal states” to introspect on?

Robert Long makes a similar point on his Substack.
November 1, 2025 at 6:33 PM
Would love to be included!
November 23, 2024 at 8:21 PM