SheffieldNLP
sheffieldnlp.bsky.social
Established 1993, the University of Sheffield's #NLProc Group is one of the UK's largest natural language processing research centres.
https://t.co/D8c2TQcavj
Comparing Apples to Oranges: A Dataset & Analysis of LLM Humour Understanding from Traditional Puns to Topical Jokes

Tyler Loakman, William Thorne, Chenghua Lin
arxiv.org/abs/2507.13335
#EMNLP2025 Findings
Humour, as a complex language form, is derived from myriad aspects of life, whilst existing work on computational humour has focussed almost exclusively on short pun-based jokes. In this work, we inve...
August 21, 2025 at 1:33 PM
GreekBarBench: A Challenging Benchmark for Free-Text Legal Reasoning and Citations

Odysseas Chlapanis, Dimitrios Galanis, Nikos Aletras, Ion Androutsopoulos
arxiv.org/abs/2505.17267
#EMNLP2025 Findings
We introduce GreekBarBench, a benchmark that evaluates LLMs on legal questions across five different legal areas from the Greek Bar exams, requiring citations to statutory articles and case facts. To ...
August 21, 2025 at 1:33 PM
Enhancing Logical Reasoning in Language Models via Symbolically-Guided Monte Carlo Process Supervision

Xingwei Tan, Marco Valentino, Mahmud Elahi Akhter, Maria Liakata, Nikos Aletras
arxiv.org/abs/2505.20415
#EMNLP2025 Main
Large language models (LLMs) have shown promising performance in mathematical and logical reasoning benchmarks. However, recent studies have pointed to memorization, rather than generalization, as one...
August 21, 2025 at 1:33 PM
How Private are Language Models in Abstractive Summarization?

Anthony Hughes, Ning Ma, Nikos Aletras
arxiv.org/abs/2412.12040
#EMNLP2025 Main
In sensitive domains such as medical and legal, protecting sensitive information is critical, with protective laws strictly prohibiting the disclosure of personal data. This poses challenges for shari...
August 21, 2025 at 1:33 PM
Beyond Hate Speech: NLP’s Challenges and Opportunities in Uncovering Dehumanizing Language

Hamidreza Saffari, Mohammadamin Shafiei, Hezhao Zhang, Lasana T. Harris, Nafise Sadat Moosavi
arxiv.org/abs/2402.13818
#EMNLP2025 Main
Dehumanization, i.e., denying human qualities to individuals or groups, is a particularly harmful form of hate speech that can normalize violence against marginalized communities. Despite advances in ...
August 21, 2025 at 1:33 PM
Does Multimodal Large Language Model Truly Unlearn? Stealthy MLLM Unlearning Attack

Xianren Zhang, Hui Liu, Delvin Ce Zhang, Xianfeng Tang, Qi He, Dongwon Lee, Suhang Wang
www.arxiv.org/abs/2506.17265
#EMNLP2025 Main
Multimodal Large Language Models (MLLMs) trained on massive data may memorize sensitive personal information and photos, posing serious privacy risks. To mitigate this, MLLM unlearning methods are pro...
August 21, 2025 at 1:33 PM
Label Set Optimization via Activation Distribution Kurtosis for Zero-Shot Classification with Generative Models

Yue Li, Zhixue Zhao, Carolina Scarton
arxiv.org/abs/2410.19195
#EMNLP2025 Main
In-context learning (ICL) performance is known to be sensitive to the prompt design, yet the impact of class label options in zero-shot classification has been largely overlooked. This study presents ...
August 21, 2025 at 1:33 PM
Beyond the Score: Uncertainty-Calibrated LLMs for Automated Essay Assessment

Ahmed Karim, Qiao Wang, Zheng Yuan
openreview.net/forum?id=UiW...
#EMNLP2025 Main
Automated Essay Scoring (AES) systems now attain near–human agreement on public benchmarks, yet real-world adoption—especially in high-stakes examinations—remains limited. A principal obstacle is...
August 21, 2025 at 1:33 PM
Formalizing Complex Mathematical Statements with LLMs: A Study on Mathematical Definitions

Lan Zhang, Marco Valentino (our upcoming Lecturer), Andre Freitas
arxiv.org/abs/2502.12065
#EMNLP2025 Main
Thanks to their linguistic capabilities, LLMs offer an opportunity to bridge the gap between informal mathematics and formal languages through autoformalization. However, it is still unclear how well ...
August 21, 2025 at 1:33 PM