Lightnews — Scholar-powered news

Reposted by Paul Lerner

Leonie Weissweiler

@weissweiler.bsky.social

🧑‍🔬I’m recruiting PhD students in Natural Language Processing @unileipzig.bsky.social Computer Science, together with @scadsai.bsky.social!

Topics include, but aren’t limited to:

🔎Linguistic Interpretability
🌍Multilingual Evaluation
📖Computational Typology

Please share!

#NLProc #NLP

December 11, 2025 at 1:36 PM

Reposted by Paul Lerner

MLIA ISIR

@mlia-isir.bsky.social

This week's team meeting featured a talk by Alexandre Vérine, from PSL, on "Quality and Diversity in generative models through the lens of f-divergences."
Thanks a lot for this interesting talk!

November 24, 2025 at 6:13 PM

Reposted by Paul Lerner

MLIA ISIR

@mlia-isir.bsky.social

Accepted to a Workshop (1/2):

"Self-Retrieval from Distant Contexts for Document-Level Machine Translation", accepted to the Conference on Machine Translation (WMT25), from @ziqianpeng.bsky.social, @rachelbawden.bsky.social, @yvofr.bsky.social

October 28, 2025 at 8:57 AM

Paul Lerner

@lernerp.bsky.social

Come work with @yvofr.bsky.social, @weissweiler.bsky.social, and me at @mlia-isir.bsky.social for an M2 internship on Assessing the Morphological Competence of LLMs! For 5-6 months, starting in February or March 2026. Paid 600€/month.

November 6, 2025 at 9:02 AM

Paul Lerner

@lernerp.bsky.social

What's the plural of "LLM-as-a-Judge"?

October 24, 2025 at 3:19 PM

Paul Lerner

@lernerp.bsky.social

Using a new version of EuroParl, fully multi-parallel and including (political) metadata, we find that LLMs translate some political parties unfairly
hal.science/hal-05328251

Assessing the Political Fairness of Multilingual LLMs: A Case Study based on a 21-way Multiparallel EuroParl Dataset

The political biases of Large Language Models (LLMs) are usually assessed by simulating their answers to English surveys. In this work, we propose an alternative framing of political biases, relying on principles of fairness in multilingual translation. We systematically compare the translation quality of speeches in the European Parliament (EP), observing systematic differences with majority parties from left, center, and right being better translated than outsider parties. This study is made possible by a new, 21-way multiparallel version of EuroParl, the parliamentary proceedings of the EP, which includes the political affiliations of each speaker. The dataset consists of 1.5M sentences for a total of 40M words and 249M characters. It covers three years, 1000+ speakers, 7 countries, 12 EU parties, 25 EU committees, and hundreds of national parties.

hal.science

October 23, 2025 at 4:07 PM

Paul Lerner

@lernerp.bsky.social

introducing 🤔 ppllm, a Python Library to Compute LLM's Perplexity and Surprisal github.com/PaulLerner/p...

GitHub - PaulLerner/ppllm: 🤔 A Python Library to Compute LLM's Perplexity and Surprisal

🤔 A Python Library to Compute LLM's Perplexity and Surprisal - PaulLerner/ppllm

github.com

October 15, 2025 at 5:24 PM

Paul Lerner

@lernerp.bsky.social

make.org/FR/consultat...

Contribute to current consultations - How can AI improve the lives of French people while limiting the risks? - Make.org

Finding proposals is easier when working together. Discover a democratic place where you can discuss the big issues you care about, submit your proposals concerning them and vote on proposals proposed...

make.org

September 4, 2025 at 2:00 PM

Reposted by Paul Lerner

Pasquale Minervini

@neuralnoise.com

"in 2025 we will have flying cars" 😂😂😂

July 5, 2025 at 4:17 PM

Paul Lerner

@lernerp.bsky.social

Last week, I presented my work on "Assessing the Political Biases of Multilingual LLMs" at the EALM workshop @ TALN 2025! Thanks again to the ANR Diké project for organizing the workshop.

July 7, 2025 at 7:59 AM

Reposted by Paul Lerner

MLIA ISIR

@mlia-isir.bsky.social

📢 🎉 The team has one paper accepted to #MTsummit2025!
"Investigating Length Issues in Document-level Machine Translation" by @ziqianpeng.bsky.social, @rachelbawden.bsky.social and @yvofr.bsky.social in collaboration with @inriaparisnlp.bsky.social
📍 Geneva | 🗓️ June 23-27
📕 arxiv.org/abs/2412.17592

Investigating Length Issues in Document-level Machine Translation

Transformer architectures are increasingly effective at processing and generating very long chunks of texts, opening new perspectives for document-level machine translation (MT). In this work, we chal...

arxiv.org

June 10, 2025 at 6:45 PM

Paul Lerner

@lernerp.bsky.social

"meticulously" is so absent from this list (from aclanthology.org/2025.coling-... )

June 16, 2025 at 8:25 AM

Paul Lerner

@lernerp.bsky.social

Am I the only reviewer who actually fills out this "Reviewer Checklist"? And why do Area Chairs never answer when the paper needs to be desk-rejected? And reviews are due in 3 days 🫠

June 16, 2025 at 7:46 AM

Reposted by Paul Lerner

MLIA ISIR

@mlia-isir.bsky.social

For the EALM Workshop
"On Assessing the Political Biases of Multilingual Large Language Models" by @lernerp.bsky.social, Laurène Cave, @haldaume3.bsky.social, Léo Labat, Gaël Lejeune, Pierre-Antoine Lequeu, @bpiwowar.bsky.social, Nazanin Shafiabadi, and @yvofr.bsky.social, in collaboration with the STIH lab

June 10, 2025 at 6:39 PM

Paul Lerner

@lernerp.bsky.social

Amazed at what a COLING paper could look like in the 80's

February 20, 2025 at 9:26 AM

Paul Lerner

@lernerp.bsky.social

Hope you enjoyed our poster at #AISummit! I'm standing next to Pierre-Antoine Lequeu, @salimhafid.bsky.social, and @manonberriche.bsky.social, but there are more people involved! Zoom in to read their names, or learn more about the project here: about.make.org/democratic-c...

February 11, 2025 at 9:54 AM

Paul Lerner

@lernerp.bsky.social

accurate quote for NLP researchers visiting Louvre Abu Dhabi after COLING 2025

January 25, 2025 at 6:30 AM

Paul Lerner

@lernerp.bsky.social

Really appreciate the feedback on this paper! It was mainly inspired by the DagoBERT/"Superbizarre" papers by Valentin Hofmann et al.

January 22, 2025 at 7:08 PM

Paul Lerner

@lernerp.bsky.social

Hope you enjoyed the presentation!

January 22, 2025 at 7:05 PM

Reposted by Paul Lerner

Yoav Goldberg

@yoavgo.bsky.social

RL promises "systems that can adapt to their environment". However, no RL system that I know of actually fulfills anything close to this goal, and, furthermore, I'd argue that all the current RL methodologies are actively hostile to this goal. Prove me wrong.

December 30, 2024 at 10:41 PM

Paul Lerner

@lernerp.bsky.social

best prompt ever

January 8, 2025 at 11:07 AM

Paul Lerner

@lernerp.bsky.social

We offer an M2 internship on Visual Question Answering at LISN (Paris-Saclay University), co-supervised by Thomas Gerald, Sahar Ghannay, and Anne Vilnat

The goal is to create a dataset of questions for education (based on schoolbook content)

Duration of 5 or 6 months (starting in March or April)

January 6, 2025 at 12:26 PM

Paul Lerner

@lernerp.bsky.social

if you're interested in morphological segmentation, I started working on a word-based/lexematic approach where a given word is segmented into a base and affix (e.g. "invaluable" -> "in- valuable" where "valuable" can be further decomposed into "value -able")
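
To make the idea above concrete, here is a minimal sketch of word-based segmentation, in which each step splits a word into one base and one affix and then recurses on the base. It is only an illustration under stated assumptions: the toy lexicon and the `segment` function are hypothetical and are not neoseg's data or API.

```python
# Hypothetical toy lexicon mapping a word to (base, affix, affix kind).
# This is an illustrative assumption, not how neoseg stores or derives segmentations.
TOY_LEXICON = {
    "invaluable": ("valuable", "in-", "prefix"),
    "valuable": ("value", "-able", "suffix"),
}


def segment(word, lexicon=TOY_LEXICON):
    """Split a word into a base and one affix, then recurse on the base.

    Words missing from the lexicon are treated as unanalysable bases.
    """
    if word not in lexicon:
        return [word]
    base, affix, kind = lexicon[word]
    inner = segment(base, lexicon)
    # A prefix goes before the segmented base, a suffix after it.
    return [affix] + inner if kind == "prefix" else inner + [affix]


print(segment("invaluable"))  # ['in-', 'value', '-able']
```

The first call yields "in-" plus "valuable", and the recursive call decomposes "valuable" into "value" and "-able", which mirrors the two-level analysis in the example.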

github.com/PaulLerner/n...

GitHub - PaulLerner/neoseg: A tool for Lexematic Segmentation by Paul Lerner

A tool for Lexematic Segmentation by Paul Lerner. Contribute to PaulLerner/neoseg development by creating an account on GitHub.

github.com

December 23, 2024 at 10:05 AM

Reposted by Paul Lerner

Yoav Goldberg

@yoavgo.bsky.social

but, their summaries are still far better than pretty much all we had before, only that now we are stuck because there is no way of improving the quality in a principled (or even non-principled) way. so kinda stuck, not a great position to be in.

December 17, 2024 at 12:22 AM

Paul Lerner

@lernerp.bsky.social

My brother did an awesome PhD 🤩

December 16, 2024 at 5:22 PM
