Lightnews — Scholar-powered news

Coleman Haley

@colemanhaley.bsky.social

NLP PhD candidate @ University of Edinburgh

Computational Linguistics | Typology | Morphology | Multimodal NLP | Cognitive Science

(Interpretability + Neurosymbolic models sometimes)


Coleman Haley

@colemanhaley.bsky.social

10/ We find our measures diverge from related psycholinguistic norms (concreteness and imageability), but this divergence is largely due to our measure's informativity dimension.

December 20, 2024 at 5:05 PM

Coleman Haley

@colemanhaley.bsky.social

9/ While we expect lexical classes to be grounded, we find functional classes—traditionally viewed as “grammatical” or “abstract”—also carry semantic content.

For example, determiners like "der" or "une" still contribute to meaning, challenging common assumptions in linguistics.

December 20, 2024 at 5:05 PM

Coleman Haley

@colemanhaley.bsky.social

8/ Across 30 typologically diverse languages, we find a groundedness cline: Nouns > Adjectives > Verbs.

This corroborates ideas from cognitive linguistics that suggest these classes lie on a continuum.

December 20, 2024 at 5:05 PM

Coleman Haley

@colemanhaley.bsky.social

7/ To validate our measure, we look at the lexical-functional distinction in word classes:

Lexical classes: nouns, verbs, adjectives—contentful words.
Functional classes: prepositions, determiners—“grammatical” words.

How universal is this distinction? Is there a clear line?

December 20, 2024 at 5:05 PM

Coleman Haley

@colemanhaley.bsky.social

6/ Groundedness turns out to be the *decrease in surprisal* of a word when we see the image it refers to! But ensuring comparability is tricky (see paper for details).

December 20, 2024 at 5:05 PM
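
Written out, "decrease in surprisal" can be read as the following difference. This is a sketch in the thread's own notation from post 5/, with w for the word, c for the context, and m for the image standing in for meaning. It leaves out whatever the paper does to ensure comparability.

```latex
\[
\mathrm{groundedness}(w)
  = \underbrace{-\log p(w \mid c)}_{\text{surprisal without the image}}
  \;-\; \underbrace{\bigl(-\log p(w \mid c, m)\bigr)}_{\text{surprisal with the image}}
  = \log \frac{p(w \mid c, m)}{p(w \mid c)}
\]
```

A positive value means the image made the word more predictable; a value near zero means the image added little.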

Coleman Haley

@colemanhaley.bsky.social

5/ Our use of images makes groundedness straightforward to compute. We need only the log probabilities from:
- a language model p(word | context)
- an image captioning model p(word | context, meaning)

December 20, 2024 at 5:05 PM
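
A minimal sketch of that computation, assuming the per-word log probabilities (natural log) have already been extracted from the two models. The function names and the toy numbers are hypothetical illustrations, not the paper's code or results, and the comparability adjustments mentioned in the thread are not modeled here.

```python
import math


def groundedness(logp_text_only: float, logp_with_image: float) -> float:
    """Decrease in surprisal (in nats) of a word once the image is available."""
    surprisal_text_only = -logp_text_only    # from the language model
    surprisal_with_image = -logp_with_image  # from the captioning model
    return surprisal_text_only - surprisal_with_image


def caption_groundedness(words, logps_text_only, logps_with_image):
    """Pair each word of a caption with its groundedness score."""
    if not (len(words) == len(logps_text_only) == len(logps_with_image)):
        raise ValueError("words and both log-probability lists must align")
    return [
        (word, groundedness(lp_text, lp_image))
        for word, lp_text, lp_image in zip(words, logps_text_only, logps_with_image)
    ]


if __name__ == "__main__":
    # Made-up probabilities, purely for illustration.
    words = ["a", "dog", "runs"]
    text_only = [math.log(0.5), math.log(0.01), math.log(0.05)]
    with_image = [math.log(0.5), math.log(0.2), math.log(0.1)]
    for word, score in caption_groundedness(words, text_only, with_image):
        print(f"{word}: {score:.3f} nats")
```

Returning a list of pairs rather than a dictionary keeps repeated words in a caption separate, since each occurrence has its own context.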

Coleman Haley

@colemanhaley.bsky.social

3/ To align function across languages, we use image captions. The linguistic content of a caption aims to express the contents of an image. An image represents the state of the world in a language-neutral way, and so can serve as an imperfect proxy for meaning.

December 20, 2024 at 5:05 PM

Coleman Haley

@colemanhaley.bsky.social

NEW PREPRINT!

Language is not just a formal system—it connects words to the world. But how do we measure this connection in a cross-linguistic, quantitative way?

🧵 Using multimodal models, we introduce a new approach: groundedness ⬇️

December 20, 2024 at 5:05 PM
