Lightnews — Scholar-powered news

Reposted by Harry Scells

Webis Group

@webis.de

We just released "German Commons", the largest openly-licensed German text dataset for LLM training: 154B tokens with clear usage rights for research and commercial use.

huggingface.co/datasets/coral-nlp/german-commons

coral-nlp/german-commons · Datasets at Hugging Face

huggingface.co

October 27, 2025 at 12:45 PM

Reposted by Harry Scells

IRRJ

@irrj.sigmoid.social.ap.brid.gy

We are proud to announce that we are now indexed by @dblp! Click below for Volume 1, Number 1, 2025

https://dblp.org/db/journals/irrj/irrj1.html

dblp: IRRJ, Volume 1

Bibliographic content of IRRJ, Volume 1

dblp.org

September 9, 2025 at 12:48 PM

Reposted by Harry Scells

ECIR 2026

@ecir2026.eu

The organization of #ECIR2026 has started! We just had our first call with all track chairs. With the calls now finalized, online and distributed across mailing lists, we’re moving on to the rest of the conference preparation!

@ecir2026.eu
📍 Delft, 30 Mar – 2 Apr 2026
👉 ecir2026.eu

September 1, 2025 at 12:21 PM

Reposted by Harry Scells

Webis Group

@webis.de

Honored to win the ICTIR Best Paper Honorable Mention Award for "Axioms for Retrieval-Augmented Generation"!
Our new axioms are integrated with ir_axioms: github.com/webis-de/ir_...
Nice to see axiomatic IR gaining momentum.

July 18, 2025 at 2:18 PM

Reposted by Harry Scells

Webis Group

@webis.de

Congrats to the authors @lgnp.bsky.social @timhagen.bsky.social @maik-froebe.bsky.social @matthias-hagen.bsky.social @benno-stein.de @martin-potthast.com @hscells.bsky.social from @unikassel.bsky.social @hessianai.bsky.social @scadsai.bsky.social @unituebingen.bsky.social @uni-jena.de & Uni Weimar

July 16, 2025 at 9:04 PM

Reposted by Harry Scells

Webis Group

@webis.de

Happy to share that our paper "The Viability of Crowdsourcing for RAG Evaluation" received the Best Paper Honourable Mention at #SIGIR2025! Very grateful to the community for recognizing our work on improving RAG evaluation.

📄 webis.de/publications...

July 16, 2025 at 9:04 PM

Reposted by Harry Scells

Ferdinand Schlatt

@fschlatt.bsky.social

Want to know how to make bi-encoders more than 3x faster with a new backbone encoder model? Check out our talk on the Token-Independent Text Encoder (TITE) at #SIGIR2025 in the efficiency track. It pools vectors within the model to improve efficiency: dl.acm.org/doi/10.1145/...

July 16, 2025 at 7:28 AM

Reposted by Harry Scells

Maik Fröbe

@maik-froebe.bsky.social

Now @fschlatt.bsky.social presents "TITE: Token-Independent Text Encoder for Information Retrieval" at #SIGIR2025

Paper: webis.de/publications...

July 16, 2025 at 9:08 AM

Reposted by Harry Scells

Maik Fröbe

@maik-froebe.bsky.social

Lukas Gienapp presents "The Viability of Crowdsourcing for RAG Evaluation" at #SIGIR2025

The paper is available at: webis.de/publications...

July 15, 2025 at 1:53 PM

Reposted by Harry Scells

Damiano Spina

@damianospina.com

Lucky to witness #IRRJ editor-in-chief @djoerd.idf.social.ap.brid.gy signing a copy of the first edition of @irrj.sigmoid.social.ap.brid.gy for Ian Soboroff, author of the paper “Don’t Use LLMs to Make Relevance Judgments” in the volume.

#SIGIR2025

irrj.org/article/view...

July 15, 2025 at 7:06 AM

Reposted by Harry Scells

Ferdinand Schlatt

@fschlatt.bsky.social

@mrparryparry.bsky.social presenting our work on reproducing TREC DL 2019 judgements and the implications for evaluating modern ranking models on modern collections. Paper: arxiv.org/abs/2502.20937

Variations in Relevance Judgments and the Shelf Life of Test Collections

The fundamental property of Cranfield-style evaluations, that system rankings are stable even when assessors disagree on individual relevance decisions, was validated on traditional test collections. ...

arxiv.org

July 14, 2025 at 2:49 PM

Reposted by Harry Scells

Djoerd Hiemstra 🍉

@djoerd.idf.social.ap.brid.gy

#sigir2025 excellent reviewers

July 14, 2025 at 7:24 AM

Reposted by Harry Scells

Ferdinand Schlatt

@fschlatt.bsky.social

Thank you Carlos for the shout-out of Lightning IR in the LSR tutorial at #SIGIR2025

If you want to fine-tune your own LSR models, check out our framework at github.com/webis-de/lig...

July 13, 2025 at 2:42 PM

Reposted by Harry Scells

ScaDS.AI Dresden/Leipzig

@scadsai.bsky.social

From July 13-17, 2025, @scadsai.bsky.social will join the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval in Padua, Italy. Our researchers have made the following contributions.

Learn more about #SIGIR2025:
👉 https://sigir2025.dei.unipd.it/

July 10, 2025 at 9:46 AM

Reposted by Harry Scells

Webis Group

@webis.de

Congratulations to the authors @lgnp.bsky.social @deckersniklas.bsky.social @martin-potthast.com @hscells.bsky.social !

📄 Preprint: arxiv.org/abs/2407.21515
💻 Code: github.com/webis-de/ada...

Learning Effective Representations for Retrieval Using Self-Distillation with Adaptive Relevance Margins

Representation-based retrieval models, so-called biencoders, estimate the relevance of a document to a query by calculating the similarity of their respective embeddings. Current state-of-the-art bien...

arxiv.org

June 22, 2025 at 12:33 PM

Reposted by Harry Scells

Webis Group

@webis.de

Our paper on self-distillation for training bi-encoders got accepted at #ICTIR2025! By exploiting pretrained encoder capabilities, our approach eliminates expensive teacher models and batch sampling while maintaining the same effectiveness.

June 22, 2025 at 12:33 PM

Reposted by Harry Scells

Universität Tübingen

@unituebingen.bsky.social

@unituebingen celebrates success in the Clusters of Excellence competition 🎉 Joy and cheers at the press conference on the DFG's decision on 22 May 2025, with a good 200 guests. More information and photos are available online 👉 uni-tuebingen.de/universitaet... @cmfi.bsky.social @ml4science.bsky.social

May 26, 2025 at 9:55 AM

Reposted by Harry Scells

Deutsche Forschungsgemeinschaft

@dfg.de

The #Exzellenzcluster (Clusters of Excellence) have been decided: today the Excellence Commission selected 70 projects for funding. 45 clusters will be continued and 25 newly established. Funding starts on 1 Jan. 2026 and runs for 7 years; total funding amounts to €539 million per year. The list: www.dfg.de/resource/blo... 1/3

Map of Germany showing the distribution of Clusters of Excellence across federal states and universities.

May 22, 2025 at 3:05 PM

Reposted by Harry Scells

Chuan Meng

@chuanmeng.bsky.social

Join us for the QPP Workshop today starting at 9 AM in the Sagrestia, IMT Campus!

Chuan Meng @chuanmeng.bsky.social · Apr 4

📢 The final schedule for the ECIR 2025 workshop on Query Performance Prediction in the era of LLMs is now live!
📅 Join us on 10th April 2025: qppworkshop.github.io
🎤 Keynote by @gdebasis.bsky.social: "The Role of Query Performance Prediction in Developing Adaptive Search and RAG Systems"

QPP++ 2025: Query Performance Prediction and its Applications in the Era of Large Language Models

qppworkshop.github.io

April 10, 2025 at 6:33 AM

Reposted by Harry Scells

Ingo Frommholz

@frommholz.org

ESSIR 2025, the European Summer School on Information Retrieval in Wolverhampton, UK, July 7-11! Dive into cutting-edge Information Retrieval & AI, network with experts. Plus, don’t miss the interactive FDIA Symposium! 🎓

👉 2025.essir.eu #IR #AI #ECIR2025

April 9, 2025 at 10:26 AM

Reposted by Harry Scells

Ferdinand Schlatt

@fschlatt.bsky.social

Short Paper: Rank-DistiLLM: Closing the Effectiveness Gap Between Cross-Encoders and LLMs for Passage Re-ranking webis.de/publications...

Full Paper: Set-Encoder: Permutation-Invariant Inter-Passage Attention for Listwise Passage Re-Ranking with Cross-Encoders webis.de/publications...

Webis Publications

Publications by the Webis group

webis.de

April 9, 2025 at 12:37 PM

Reposted by Harry Scells

Maik Fröbe

@maik-froebe.bsky.social

Now we have @fschlatt.bsky.social on the #ECIR2025 stage presenting the research on the Set-Encoder.

The paper is online at: webis.de/publications...

April 9, 2025 at 8:00 AM

Reposted by Harry Scells

Ferdinand Schlatt

@fschlatt.bsky.social

Honored to receive the best short paper award and best paper honourable mention award at #ECIR2025. Thank you to all co-authors @maik-froebe.bsky.social, @hscells.bsky.social, Shengyao Zhuang, @bevankoopman.bsky.social, Guido Zuccon, Benno Stein, @martin-potthast.com, @matthias-hagen.bsky.social 🥳

April 9, 2025 at 12:37 PM

Reposted by Harry Scells

Maik Fröbe

@maik-froebe.bsky.social

I was very happy to talk about corpus subsampling at #ECIR2025 today.

Please find the paper at webis.de/publications...

And last but not least, here are some of my favorite impressions of the first day of ECIR :)

April 7, 2025 at 10:30 PM

Reposted by Harry Scells

Webis Group

@webis.de

🧵 2/4 Key findings:
1️⃣ Humans write best? No! LLM responses are rated better than human ones.
2️⃣ Essay answers? No! Bullet lists are often preferred.
3️⃣ Evaluate with BLEU? No! Reference-based metrics don't align with human preferences.
4️⃣ LLMs as judges? No! Prompted models produce inconsistent labels.

April 7, 2025 at 3:34 PM
