Computational Linguistics @UPF
colt-upf.bsky.social
Gemma Boleda, Marco Baroni, Thomas Brochhagen, Iria de Dios Flores | Computational Linguistics and Linguistic Theory, Universitat Pompeu Fabra.

upf.edu/web/colt
Barcelona
Do you use a pronoun more often when the entity you’re talking about is more predictable?

Previous work offers diverging answers, so we conducted a meta-analysis, combining data from 20 studies across 8 different languages.

Now out in Language: muse.jhu.edu/article/969615
October 8, 2025 at 1:20 PM
Reposted by Computational Linguistics @UPF
📢 Research seminar organized by COLT-URLING, "LLM and human language: representations, judgments, and historical change".

📆 29/09/2025
🕦 15:30
🎤 Adele Goldberg (Princeton University)
🚩 55.410, Tànger building, Poblenou campus - UPF
ℹ️ ja.cat/wi2t7

@colt-upf.bsky.social
September 26, 2025 at 8:12 AM
Reposted by Computational Linguistics @UPF
📢 Research seminar organized by COLT-URLING, "Associative memory in psycholinguistics and in AI architectures".

📆 01/10/2025
🕦 12:00
🎤 Jakub Dotlačil
🚩 55.410, Tànger building, Poblenou campus - UPF
ℹ️ ja.cat/U5xH2

@colt-upf.bsky.social
September 29, 2025 at 10:45 AM
Reposted by Computational Linguistics @UPF
New paper! 🚨 I argue that LLMs represent a synthesis between distributed and symbolic approaches to language, because, when exposed to language, they develop highly symbolic representations and processing mechanisms in addition to distributed ones.
arxiv.org/abs/2502.11856
September 30, 2025 at 1:16 PM
Reposted by Computational Linguistics @UPF
📢I am hiring a Postdoc to work on post-training methods for low-resource languages. Apply by August 15 employment.ku.dk/faculty/?sho....
Let's talk at #ACL2025NLP in Vienna if you want to know more about the position and life in Denmark.
Postdoc in Natural Language Processing
employment.ku.dk
July 7, 2025 at 12:47 PM
Reposted by Computational Linguistics @UPF
Evaluating topic models (and document clustering methods) is hard. In fact, since our paper critiquing standard evaluation practices four years ago, there hasn't been a good replacement metric.

That ends today (we hope)! Our new ACL paper introduces an LLM-based evaluation protocol 🧵
July 8, 2025 at 12:40 PM
🎉New paper "Prediction Hubs are Context-Informed Frequent Tokens in LLMs" from our lab, accepted at ACL 2025!

If you're interested in representational geometry, come find Beatrix Nielsen and Marco Baroni at the poster :)
Our paper "Prediction Hubs are Context-Informed Frequent Tokens in LLMs" has been accepted at ACL 2025!

Main points:
1. Hubness is not a problem when language models do next-token prediction.
2. Nuisance hubness can appear when other comparisons are made.
July 8, 2025 at 11:33 AM
Today at UPF Campus de la Ciutadella at 2:30 pm! Come slightly earlier to check in!

Sala Polivalent 24S18

maps.app.goo.gl/n1hBxiviKcLW...
June 2, 2025 at 8:38 AM
📢 𝗟𝗼𝗰𝗮𝘁𝗶𝗼𝗻 𝗰𝗵𝗮𝗻𝗴𝗲📢

UPF Campus de la Ciutadella
**Sala Polivalent 24.S18**

Thank you for bearing with us!
May 29, 2025 at 9:46 AM
Last day to sign up for the COLT Symposium!
Register: tinyurl.com/colt-register

📢 𝗟𝗼𝗰𝗮𝘁𝗶𝗼𝗻 𝗰𝗵𝗮𝗻𝗴𝗲📢
June 2nd, 14:30 - 19:00

UPF Campus de la Ciutadella
Room 40.101

maps.app.goo.gl/1216LJRsWmTE...
May 26, 2025 at 10:44 AM
⭐ Registration open til May 27th! ⭐
Website: www.upf.edu/web/colt/sym...

June 2nd, UPF

𝗦𝗽𝗲𝗮𝗸𝗲𝗿 𝗹𝗶𝗻𝗲𝘂𝗽:
Arianna Bisazza (language acquisition with NNs)
Naomi Saphra (emergence in LLM training dynamics)
Jean-Rémi King (TBD)
Louise McNally (pitfalls of contextual/formal accounts of semantics)
May 20, 2025 at 8:13 AM
Announcing the COLT Symposium on June 2nd!

𝗘𝗺𝗲𝗿𝗴𝗲𝗻𝘁 𝗳𝗲𝗮𝘁𝘂𝗿𝗲𝘀 𝗼𝗳 𝗹𝗮𝗻𝗴𝘂𝗮𝗴𝗲 𝗶𝗻 𝗺𝗶𝗻𝗱𝘀 𝗮𝗻𝗱 𝗺𝗮𝗰𝗵𝗶𝗻𝗲𝘀

What properties of language are emerging from work in experimental and theoretical linguistics, neuroscience & LLM interpretability?

Info: tinyurl.com/colt-site
Register: tinyurl.com/colt-register

🧵1/3
May 14, 2025 at 4:56 PM
Announcing the COLT Symposium on June 2nd!

𝗘𝗺𝗲𝗿𝗴𝗲𝗻𝘁 𝗳𝗲𝗮𝘁𝘂𝗿𝗲𝘀 𝗼𝗳 𝗹𝗮𝗻𝗴𝘂𝗮𝗴𝗲 𝗶𝗻 𝗺𝗶𝗻𝗱𝘀 𝗮𝗻𝗱 𝗺𝗮𝗰𝗵𝗶𝗻𝗲𝘀

What properties of language are emerging from work in experimental and theoretical linguistics, neuroscience & LLM interpretability?

Info: tinyurl.com/colt-site
Register: tinyurl.com/colt-register

🧵1/3
May 13, 2025 at 9:00 AM
Please find us at #ICLR2025! We will present our work on intrinsic dimension as a cue for stages of language processing in LLMs.

Saturday morning, Poster session 5
Hall 3 + Hall 2B #563
iclr.cc/virtual/2025...

Arxiv: arxiv.org/abs/2405.15471
April 22, 2025 at 2:37 PM
Reposted by Computational Linguistics @UPF
📢 Upcoming Seminar

Words are weird? On the role of lexical ambiguity in language
🗣 Gemma Boleda (Universitat Pompeu Fabra, Spain)
Why is language so ambiguous? Discover how ambiguity balances cognitive simplicity and communicative complexity through large-scale studies.
📍 UniMiB, Room U6-01C, Milan
March 3, 2025 at 1:41 PM
⚡New position paper from Gemma Boleda: is it time to make peace between symbolic and continuous approaches to language?
February 24, 2025 at 5:19 PM
Reposted by Computational Linguistics @UPF
The project I did with Marco Baroni and Iuri Macocco while I was in Barcelona is now on Arxiv: arxiv.org/abs/2502.10201 🎉

TLDR below 👇
Prediction hubs are context-informed frequent tokens in LLMs
Hubness, the tendency for few points to be among the nearest neighbours of a disproportionate number of other points, commonly arises when applying standard distance measures to high-dimensional data,...
arxiv.org
February 24, 2025 at 8:06 AM
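To make the definition of hubness in the abstract above concrete, here is a minimal Python sketch (not code from the paper). It counts, for each point, how many other points include it among their k nearest neighbours (the k-occurrence), using Euclidean distance on random Gaussian data. The function names, sample sizes, and choice of k are illustrative assumptions.

```python
import math
import random


def k_occurrence(points, k):
    """Return, for each point, how many other points list it among their k nearest neighbours."""
    n = len(points)
    counts = [0] * n
    for i in range(n):
        # Rank every other point by Euclidean distance to point i.
        neighbours = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )
        for j in neighbours[:k]:
            counts[j] += 1
    return counts


def random_points(n, dim, rng):
    """Hypothetical toy data: n points drawn from a standard Gaussian in `dim` dimensions."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
low = k_occurrence(random_points(200, 3, rng), k=5)
high = k_occurrence(random_points(200, 100, rng), k=5)

# Without hubness every point would have a k-occurrence close to k;
# a few points with much larger counts are the "hubs".
print("max k-occurrence in 3 dimensions:", max(low))
print("max k-occurrence in 100 dimensions:", max(high))
```

On toy data like this, the largest k-occurrence tends to grow with dimensionality, which is the effect the abstract describes for standard distance measures.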
Reposted by Computational Linguistics @UPF
This year, CoNLL will be accepting *non-archival* (as well as archival) submissions! www.conll.org #CoNLL2025

Follow CoNLL at
@conll-conf.bsky.social
CoNLL 2025 | CoNLL
www.conll.org
February 5, 2025 at 2:15 PM
Reposted by Computational Linguistics @UPF
Here's our work accepted to #ICLR2025!

We look at how intrinsic dimension evolves over LLM layers, spotting a universal high-dimensional phase.

This ID peak is where:

- linguistic features are built
- different LLMs are most similar,

with implications for task transfer

🧵 1/6
February 2, 2025 at 6:53 PM
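One simple way to estimate intrinsic dimension is the TwoNN estimator, sketched below in plain Python. This is an illustration only: the thread does not say which estimator the paper uses, and the function name and toy data are assumptions. For each point, TwoNN takes the ratio of the distances to its second and first nearest neighbours; the maximum-likelihood estimate is the number of points divided by the sum of the log-ratios.

```python
import math
import random


def two_nn_dimension(points):
    """TwoNN maximum-likelihood estimate of intrinsic dimension: N / sum(log(r2 / r1))."""
    log_ratios = []
    for i, p in enumerate(points):
        # Sorted distances from point i to all other points.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        r1, r2 = dists[0], dists[1]
        if r1 > 0:  # skip exact duplicates, where the ratio is undefined
            log_ratios.append(math.log(r2 / r1))
    return len(log_ratios) / sum(log_ratios)


# Hypothetical toy data: a 2-dimensional plane linearly embedded in 5 dimensions.
rng = random.Random(0)
plane = []
for _ in range(500):
    u, v = rng.random(), rng.random()
    plane.append([u, v, u + v, u - v, 0.0])

estimate = two_nn_dimension(plane)
print(f"ambient dimension: 5, estimated intrinsic dimension: {estimate:.2f}")
```

Applied to LLMs, the same kind of estimate would be computed on the hidden representations of each layer in turn, giving one value per layer.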
Reposted by Computational Linguistics @UPF
What is deep learning?

@marionamec.bsky.social from @neurofregides.bsky.social explains it on the occasion of the Deep Learning Barcelona Symposium 2024 (@dlbcn.ai), this Thursday, December 19.

#deeplearning #ciencia #català #barcelona

www.youtube.com/shorts/R4u_Z...
What is deep learning? - La Dimoni de Maxwell #deeplearning #ciencia #català #barcelona
YouTube video by Deep Learning Barcelona
www.youtube.com
December 16, 2024 at 8:49 AM
🔊New EMNLP paper from Eleonora Gualdoni & @gboleda.bsky.social !

Why do objects have many names?

Human lexicons contain different words that speakers can use to refer to the same object, e.g., purple or magenta for the same color.

We investigate using tools from efficient coding...🧵

1/3
December 2, 2024 at 10:43 AM
Hello🌍! We're a computational linguistics group in Barcelona headed by Gemma Boleda, Marco Baroni & Thomas Brochhagen

We do psycholinguistics, cogsci, language evolution & NLP, with diverse backgrounds in philosophy, formal linguistics, CS & physics

Get in touch for postdoc, PhD & MS openings!
November 25, 2024 at 10:22 AM
⚡Postdoc opportunity w/ COLT

Beatriu de Pinós contract, 3 yrs, competitive call by Catalan government.

Apply with a PI (Marco, Gemma or Thomas)

Reqs: min 2y postdoc experience outside Spain, not having lived in Spain for >12 months in the last 3y.

Application ~December-February (exact dates TBD)
November 25, 2024 at 9:56 AM