Arianna Bisazza
@arianna-bis.bsky.social
Associate Professor at GroNLP (@gronlp.bsky.social) #NLP | Multilingualism | Interpretability | Language Learning in Humans vs Neural Nets | Mum^2

Head of the InCLow research group: https://inclow-lm.github.io/
Through repeated interactions & shifts in communication needs, the lexicon of a community evolves, eventually leading to language change

We show that NN simulations can help us unravel these complex processes, alongside human experiments & corpus studies

See @yuqing0304.bsky.social’s thread below ⬇️
November 6, 2025 at 9:07 PM
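For readers new to this line of work: this is not the paper's actual setup (see the linked thread for that), but a minimal sketch of the kind of neural-agent signaling game such simulations typically build on. Two small networks repeatedly play a reference game, and the emergent meaning-to-symbol mapping can be inspected as a toy "lexicon". All architecture choices and hyperparameters below are illustrative assumptions.

```python
# Toy Lewis signaling game between two neural agents (illustrative sketch,
# not the paper's setup): a Speaker maps meanings to discrete symbols, a
# Listener maps symbols back to meanings; both learn from communicative success.
import torch
import torch.nn as nn
import torch.nn.functional as F

N_MEANINGS, N_SYMBOLS = 10, 10

class Speaker(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(N_MEANINGS, N_SYMBOLS)
    def forward(self, meaning_onehot):
        return F.log_softmax(self.fc(meaning_onehot), dim=-1)

class Listener(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(N_SYMBOLS, N_MEANINGS)
    def forward(self, symbol_onehot):
        return F.log_softmax(self.fc(symbol_onehot), dim=-1)

speaker, listener = Speaker(), Listener()
opt = torch.optim.Adam(
    list(speaker.parameters()) + list(listener.parameters()), lr=0.01)

for step in range(2000):
    meanings = torch.randint(0, N_MEANINGS, (32,))
    m_onehot = F.one_hot(meanings, N_MEANINGS).float()
    sym_logp = speaker(m_onehot)
    # Sample discrete symbols; train the speaker with REINFORCE
    # (communication success as reward) and the listener with cross-entropy.
    dist = torch.distributions.Categorical(logits=sym_logp)
    symbols = dist.sample()
    s_onehot = F.one_hot(symbols, N_SYMBOLS).float()
    guess_logp = listener(s_onehot)
    reward = (guess_logp.argmax(-1) == meanings).float()
    listener_loss = F.nll_loss(guess_logp, meanings)
    speaker_loss = -(dist.log_prob(symbols) * (reward - reward.mean())).mean()
    opt.zero_grad()
    (listener_loss + speaker_loss).backward()
    opt.step()

# Inspect the emergent lexicon: which symbol each meaning maps to.
with torch.no_grad():
    lexicon = speaker(torch.eye(N_MEANINGS)).argmax(-1)
print(lexicon)
```

Running this with different random seeds typically yields different lexicons, a toy analogue of how separate communities can converge on different conventions; tracking the lexicon across repeated interactions or agent generations is what lets such simulations speak to language change.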
- neural-agent simulations of language change (@yuqing0304.bsky.social)
- child-directed language & syntax learning in LMs (@frap98.bsky.social)
- Turkish benchmark of grammatical minimal pairs (@ezgibasar.bsky.social) & a massively multilingual one, MultiBLiMP (@jumelet.bsky.social)

...and more!
October 31, 2025 at 10:50 PM
InCLow topics #EMNLP2025:

- MT error prediction techniques & their reception by professional translators (@gsarti.com)
- the thinking language of Large Reasoning Models (@jiruiqi.bsky.social)
- the effect of stereotypes on LLMs' implicit personalization (@veraneplenbroek.bsky.social)

....
October 31, 2025 at 10:50 PM
We hope our work will advance the evaluation of LLMs in Turkish and, more broadly, encourage research on the robustness of modern language technologies to typological diversity.
June 19, 2025 at 4:28 PM
Finally, our experimental paradigms reveal that even LLMs that excel on general minimal pairs can be brittle to variation in word order & subordination strategy, unlike human speakers.

See paper for results with 13 LLMs, including mono- and multilingual models of different sizes!
June 19, 2025 at 4:28 PM
We also collect human acceptability judgements & show that, *overall*, phenomena that are harder for LLMs are also harder for people, but there are some notable exceptions.
June 19, 2025 at 4:28 PM
TurBLiMP expands the short list of existing language-specific BLiMPs to a language with 2 important properties: high word-order freedom & agglutination.

To study LLMs' robustness to these properties, we create experimental paradigms testing syntactic skills w/ different word orders & subordination strategies:
June 19, 2025 at 4:28 PM
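For context, benchmarks in the BLiMP family are usually scored by checking whether the LM assigns higher probability to the grammatical member of each minimal pair. A minimal sketch of that scoring scheme (not the TurBLiMP evaluation code: the model, the helper function, and the example pair are illustrative; TurBLiMP's items are Turkish and the paper evaluates 13 LLMs):

```python
# Minimal-pair scoring sketch: the LM "passes" a pair if it assigns higher
# total log-probability to the grammatical sentence than the ungrammatical one.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "gpt2"  # stand-in model, chosen only for illustration
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)
model.eval()

def sentence_logprob(sentence: str) -> float:
    """Sum of token log-probabilities under the causal LM."""
    ids = tok(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)
    # out.loss is the mean NLL over the predicted tokens;
    # multiply back by their count to recover the total log-probability.
    return -out.loss.item() * (ids.size(1) - 1)

# A hypothetical minimal pair (illustrative English example, not a TurBLiMP item)
good = "The keys to the cabinet are on the table."
bad = "The keys to the cabinet is on the table."
print("pass" if sentence_logprob(good) > sentence_logprob(bad) else "fail")
```

Accuracy over a paradigm is then simply the fraction of pairs the model gets right; varying word order & subordination strategy within a paradigm is what probes the robustness discussed above.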
This is hard, slow-paced work that goes well beyond benchmark translation (let alone LLM-assisted benchmark generation!). It requires real *linguistic* expertise & long discussions about what makes a phenomenon representative of a language. Here's our proposal, inspired by the English BLiMP w/ major adaptations:
June 19, 2025 at 4:28 PM
Grammatical benchmarks are essential to drive progress in truly multilingual Language Modeling & to overcome the linguistic biases we inherit from the English-centeredness of our field.

I'm particularly happy to contribute to this for a language I spent years learning and still find fascinating!
June 19, 2025 at 4:28 PM
Happy to hear you find the analysis useful, Marco! If you have any extra questions, don’t hesitate to contact @jiruiqi.bsky.social
June 5, 2025 at 8:59 AM