Matthias Hagen
matthias-hagen.bsky.social
Professor of "Databases and Information Systems" at Friedrich-Schiller-Universität Jena, Germany, and member of @webis.de.
Research in information retrieval and natural language processing.
Reposted by Matthias Hagen
Honored to win the ICTIR Best Paper Honorable Mention Award for "Axioms for Retrieval-Augmented Generation"!
Our new axioms are integrated with ir_axioms: github.com/webis-de/ir_...
Nice to see axiomatic IR gaining momentum.
July 18, 2025 at 2:18 PM
Reposted by Matthias Hagen
Now @fschlatt.bsky.social presents "TITE: Token-Independent Text Encoder for Information Retrieval" at #SIGIR2025

Paper: webis.de/publications...
July 16, 2025 at 9:08 AM
Reposted by Matthias Hagen
Want to know how to make bi-encoders more than 3x faster with a new backbone encoder model? Check out our talk on the Token-Independent Text Encoder (TITE) #SIGIR2025 in the efficiency track. It pools vectors within the model to improve efficiency dl.acm.org/doi/10.1145/...
July 16, 2025 at 7:28 AM
Reposted by Matthias Hagen
Happy to share that our paper "The Viability of Crowdsourcing for RAG Evaluation" received the Best Paper Honourable Mention at #SIGIR2025! Very grateful to the community for recognizing our work on improving RAG evaluation.

 📄 webis.de/publications...
July 16, 2025 at 9:04 PM
Reposted by Matthias Hagen
Thank you, Carlos, for the shout-out to Lightning IR in the LSR tutorial at #SIGIR2025

If you want to fine-tune your own LSR models, check out our framework at github.com/webis-de/lig...
July 13, 2025 at 2:42 PM
Reposted by Matthias Hagen
Do not forget to participate in the #TREC2025 Tip-of-the-Tongue (ToT) Track :)

The corpus and baselines (with run files) are now available and easily accessible via the ir_datasets API and the HuggingFace Datasets API.

More details are available at: trec-tot.github.io/guidelines
June 27, 2025 at 2:46 PM
Reposted by Matthias Hagen
🧵 4/4 The shared task continues the research on LLM-based advertising. Participants can submit systems for two sub-tasks: First, generate responses with and without ads. Second, classify whether a response contains an ad.
Submissions are open until May 10th and we look forward to your contributions.
April 30, 2025 at 11:17 AM
Reposted by Matthias Hagen
🧵 3/4 In a lot of cases, survey participants did not notice brand or product placements in the responses. As a first step towards ad-blockers for LLMs, we created a dataset of responses with and without ads and trained classifiers on the task of identifying the ads.
dl.acm.org/doi/10.1145/...
April 30, 2025 at 11:17 AM
Reposted by Matthias Hagen
🧵 2/4 Given the high operating costs of LLMs, they require a business model to sustain them and advertising is a natural candidate.
Hence, we have analyzed how well LLMs can blend product placements with "organic" responses and whether users are able to identify the ads.
dl.acm.org/doi/10.1145/...
April 30, 2025 at 11:17 AM
Reposted by Matthias Hagen
Can LLM-generated ads be blocked? With OpenAI adding shopping options to ChatGPT, this question gains further importance.
If you are interested in contributing to the research on LLM-based advertising, please check out our shared task: touche.webis.de/clef25/touch...

More details below.
April 30, 2025 at 11:17 AM
Reposted by Matthias Hagen
The Workshop on Open Web Search at #ECIR2025 is just starting with a keynote by @claclarke.bsky.social on Annotative Indexing. #WOWS25 #WOWS2025 #ECIR25
April 10, 2025 at 7:16 AM
Reposted by Matthias Hagen
The Workshop on Open Web Search just finished #WOWS2025 #ECIR2025.

It was a very cool experience with many interesting talks. Let's hope we can do it again next year at #ECIR2026 in Delft :)
April 10, 2025 at 3:05 PM
Reposted by Matthias Hagen
Today I had the pleasure to talk about child-safe search at #ECIR2025. We created a Cranfield-style evaluation dataset to contrast relevance with harm in web search scenarios.

Details: webis.de/publications...
April 10, 2025 at 3:14 PM
Reposted by Matthias Hagen
Now we have @fschlatt.bsky.social on the #ECIR2025 stage presenting the research on the Set-Encoder.

The paper is online at: webis.de/publications...
April 9, 2025 at 8:00 AM
Reposted by Matthias Hagen
Short Paper: Rank-DistiLLM: Closing the Effectiveness Gap Between Cross-Encoders and LLMs for Passage Re-ranking webis.de/publications...

Full Paper: Set-Encoder: Permutation-Invariant Inter-Passage Attention for Listwise Passage Re-Ranking with Cross-Encoders webis.de/publications...
April 9, 2025 at 12:37 PM
Reposted by Matthias Hagen
Honored to receive the best short paper award and best paper honourable mention award at #ECIR2025. Thank you to all co-authors @maik-froebe.bsky.social, @hscells.bsky.social, Shengyao Zhuang, @bevankoopman.bsky.social, Guido Zuccon, Benno Stein, @martin-potthast.com, @matthias-hagen.bsky.social 🥳
April 9, 2025 at 12:37 PM
Reposted by Matthias Hagen
I was very happy to talk about corpus subsampling at #ECIR2025 today.

Please find the paper at webis.de/publications...

And last but not least, here are some of my favorite impressions of the first day of ECIR :)
April 7, 2025 at 10:30 PM
Reposted by Matthias Hagen
📢 Our paper "The Viability of Crowdsourcing for RAG Evaluation" has been accepted to #SIGIR2025 !
We compared how good humans and LLMs are at writing and judging RAG responses, assembling 1800+ responses across 3 styles, and 47K+ pairwise judgments in 7 quality dimensions. 🧵➡️
April 7, 2025 at 3:34 PM
Reposted by Matthias Hagen
Interested in joining our research group or do you know someone who might be interested?
We have a new vacancy: Research position at the Webis group on Watermarking for Large Language Models.
More information:
webis.de/for-students...
February 17, 2025 at 8:55 AM