James Michaelov
@jamichaelov.bsky.social
Postdoc at MIT. Research: language, the brain, NLP.

jmichaelov.com
In the most extreme case, LMs assign sentences such as ‘the car was given a parking ticket by the explorer’ (unlikely but possible event) a lower probability than ‘the car was given a parking ticket by the brake’ (animacy-violating event, semantically related final word) over half of the time. 2/3
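If you want to try this kind of comparison yourself, here’s a minimal sketch of how to score two such sentences with an off-the-shelf causal LM via Hugging Face transformers. GPT-2 is used purely as a stand-in; the specific models and scoring details in the paper may differ.

```python
# Minimal sketch: compare the total log-probability an off-the-shelf causal LM
# assigns to the two example sentences (GPT-2 here purely as a stand-in).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def sentence_logprob(sentence: str) -> float:
    """Sum of log P(token | preceding tokens) over the whole sentence."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)  # predictions for tokens 1..n-1
    targets = ids[0, 1:]
    return log_probs[torch.arange(targets.size(0)), targets].sum().item()

unlikely = "The car was given a parking ticket by the explorer."
violation = "The car was given a parking ticket by the brake."
print(sentence_logprob(unlikely), sentence_logprob(violation))
```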
June 12, 2025 at 5:54 PM
I’ve had success using the infini-gram API for this (though it can get overloaded with user requests at times): infini-gram.io
Home
infini-gram.io
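For reference, here is roughly the kind of query I mean. The endpoint, index name, and payload fields below are from memory and may have changed, so check the documentation at infini-gram.io before relying on them.

```python
# Rough sketch of an n-gram count query against the infini-gram web API.
# NOTE: the endpoint, index name, and payload fields are assumptions from memory;
# see infini-gram.io for the current interface.
import requests

payload = {
    "index": "v4_rpj_llama_s4",  # one of the hosted corpus indexes (assumed name)
    "query_type": "count",       # count how often the query string occurs
    "query": "parking ticket",
}
response = requests.post("https://api.infini-gram.io/", json=payload, timeout=30)
print(response.json())
```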
February 8, 2025 at 12:40 PM
I don’t think this is quite what you’re looking for, but @camrobjones.bsky.social recently ran some Turing-test-style studies and found that some people believed ELIZA to be a human (and participants were asked to give reasons for their responses)
December 3, 2024 at 1:09 PM
Seems like a great initiative to have some of these location-based ones! I’d love to be added if possible!
November 19, 2024 at 4:17 PM
If there’s still space (and you accept postdocs), could I be added?
November 11, 2024 at 6:54 PM
Thanks for creating this list - looks great! I’d love to be added if there’s still room
November 11, 2024 at 6:46 PM
Thank you!
November 11, 2024 at 12:15 PM
If there’s still room, is there any chance you could add me to this list?
November 11, 2024 at 11:40 AM
Also, I’m going to be attending EMNLP next week - reach out if you want to meet/chat
November 10, 2024 at 7:34 PM
Anyway, excited to learn and chat about research along these lines and beyond here on Bluesky!
November 10, 2024 at 7:34 PM
Of course, none of this work would have been possible without my amazing PhD advisor Ben Bergen, and my other great collaborators: Seana Coulson, @catherinearnett.bsky.social, Tyler Chang, Cyma Van Petten, and Megan Bardolph!
November 10, 2024 at 7:34 PM
5: Recurrent models like RWKV and Mamba have recently emerged as viable alternatives to transformers. While they are intuitively more cognitively plausible, how do they compare to transformers when used to model human language processing? We find that they perform about the same overall:
Revenge of the Fallen? Recurrent Models Match Transformers at...
Transformers have generally supplanted recurrent neural networks as the dominant architecture for both natural language processing tasks and for modelling the effect of predictability on online...
openreview.net
November 10, 2024 at 7:34 PM
4: Is the N400 sensitive only to the predicted probability of the stimuli encountered, or also the predicted probability of alternatives? We revisit this question with state-of-the-art NLP methods, with the results supporting the former hypothesis:
Ignoring the alternatives: The N400 is sensitive to stimulus preactivation alone
The N400 component of the event-related brain potential is a neural signal of processing difficulty. In the language domain, it is widely believed to …
www.sciencedirect.com
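To make the distinction concrete: the predicted probability of the stimulus itself is usually operationalized as its surprisal, whereas measures over the alternatives, such as the entropy of the next-word distribution, also depend on the probability assigned to words that never appear. Here’s a rough sketch of both quantities with a Hugging Face causal LM; GPT-2 and the example sentence are stand-ins, and the models and predictors in the paper may differ.

```python
# Sketch: surprisal of the word actually encountered vs. entropy over all
# alternative continuations, for a sentence-final word (GPT-2 as a stand-in).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

context = "The car was given a parking ticket by the"
target = " explorer"  # leading space matters for GPT-2's tokenizer

ids = tokenizer(context, return_tensors="pt").input_ids
with torch.no_grad():
    next_word_logits = model(ids).logits[0, -1]
log_probs = torch.log_softmax(next_word_logits, dim=-1)

target_id = tokenizer(target).input_ids[0]             # first subword of the target
surprisal = -log_probs[target_id].item()               # depends only on the stimulus
entropy = -(log_probs.exp() * log_probs).sum().item()  # depends on all alternatives
print(f"surprisal = {surprisal:.2f} nats, entropy = {entropy:.2f} nats")
```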
November 10, 2024 at 7:34 PM
3: The N400, a neural index of language processing, is highly sensitive to the contextual probability of words. But to what extent can lexical prediction explain other N400 phenomena? Using GPT-3, we show that it can implicitly account for both semantic similarity and plausibility effects:
Strong Prediction: Language Model Surprisal Explains Multiple N400 Effects
Abstract. Theoretical accounts of the N400 are divided as to whether the amplitude of the N400 response to a stimulus reflects the extent to which the stimulus was predicted, the extent to which the s...
doi.org
November 10, 2024 at 7:34 PM
2: Do multilingual language models learn that different languages can have the same grammatical structures? We use the structural priming paradigm from psycholinguistics to provide evidence that they do:
Structural Priming Demonstrates Abstract Grammatical Representations in Multilingual Language Models
James Michaelov, Catherine Arnett, Tyler Chang, Ben Bergen. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. 2023.
aclanthology.org
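The logic of the paradigm, roughly: if a model assigns a higher probability to a (say) passive target sentence after a passive prime than after an active prime, and this holds even when the prime is in a different language, that suggests the structure is represented abstractly. Below is a toy sketch of the comparison; the model and sentences are placeholders rather than the paper’s actual materials.

```python
# Toy sketch of structural priming: does a passive prime raise the probability
# of a passive target sentence more than an active prime does?
# Model and sentences are illustrative placeholders, not the paper's materials;
# for the crosslingual case, substitute a multilingual causal LM and a prime
# sentence in another language.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def target_logprob(prime: str, target: str) -> float:
    """Log-probability of the target sentence given the prime as context.
    Assumes the prime's tokenization is a prefix of the joint tokenization."""
    prime_len = tokenizer(prime, return_tensors="pt").input_ids.size(1)
    full_ids = tokenizer(prime + " " + target, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = full_ids[0, 1:]
    token_lps = log_probs[torch.arange(targets.size(0)), targets]
    return token_lps[prime_len - 1:].sum().item()  # only the target's tokens

passive_prime = "The window was broken by the ball."
active_prime = "The ball broke the window."
passive_target = "The letter was written by the student."

print(target_logprob(passive_prime, passive_target)
      > target_logprob(active_prime, passive_target))
```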
November 10, 2024 at 7:34 PM
If you’re interested in hearing more of my thoughts on this topic, check out this article in Communications of the ACM by Sandrine Ceurstemont, which includes quotes from an interview with me and my co-author Ben Bergen:
Bigger, Not Necessarily Better
The inverse scaling issue means larger LLMs sometimes handle things less well.
cacmb4.acm.org
November 10, 2024 at 7:34 PM
1: Training language models on more data generally improves their performance, but is this always the case? We show that inverse scaling can occur not just across models of different sizes, but also in individual models over the course of training:
Emergent Inabilities? Inverse Scaling Over the Course of Pretraining
James Michaelov, Ben Bergen. Findings of the Association for Computational Linguistics: EMNLP 2023. 2023.
aclanthology.org
November 10, 2024 at 7:34 PM