Lightnews — Scholar-powered news

Lukas Gienapp

@lgnp.bsky.social

420 followers 120 following 0 posts

ML/IR research @ hessianAI / Kassel University & ScaDS.AI. RAG & neural retrieval models.

Reposted by Lukas Gienapp

Webis Group

@webis.de

We just released "German Commons", the largest openly-licensed German text dataset for LLM training: 154B tokens with clear usage rights for research and commercial use.

huggingface.co/datasets/coral-nlp/german-commons

October 27, 2025 at 12:45 PM

Reposted by Lukas Gienapp

Webis Group

@webis.de

Happy to share that our paper "The Viability of Crowdsourcing for RAG Evaluation" received the Best Paper Honourable Mention at #SIGIR2025! Very grateful to the community for recognizing our work on improving RAG evaluation.

📄 webis.de/publications...

July 16, 2025 at 9:04 PM

Reposted by Lukas Gienapp

Ferdinand Schlatt

@fschlatt.bsky.social

Want to know how to make bi-encoders more than 3x faster with a new backbone encoder model? Check out our talk on the Token-Independent Text Encoder (TITE) at #SIGIR2025 in the efficiency track. It pools vectors within the model to improve efficiency: dl.acm.org/doi/10.1145/...

July 16, 2025 at 7:28 AM

Reposted by Lukas Gienapp

Maik Fröbe

@maik-froebe.bsky.social

Lukas Gienapp presents "The Viability of Crowdsourcing for RAG Evaluation" at #SIGIR2025

The paper is available at: webis.de/publications...

July 15, 2025 at 1:53 PM

Reposted by Lukas Gienapp

Webis Group

@webis.de

Our paper on self-distillation for training bi-encoders got accepted at #ICTIR2025! By exploiting pretrained encoder capabilities, our approach eliminates expensive teacher models and batch sampling while maintaining the same effectiveness.

June 22, 2025 at 12:33 PM

Reposted by Lukas Gienapp

Webis Group

@webis.de

📢 Our paper "The Viability of Crowdsourcing for RAG Evaluation" has been accepted to #SIGIR2025!
We compared how good humans and LLMs are at writing and judging RAG responses, assembling 1800+ responses across 3 styles, and 47K+ pairwise judgments in 7 quality dimensions. 🧵➡️

April 7, 2025 at 3:34 PM
