Sumit (@reachsumit.com) · reachsumit.com
Senior MLE at Meta. Trying to keep up with the Information Retrieval domain!

Blog: https://blog.reachsumit.com/
Newsletter: https://recsys.substack.com/
Reversing the Retrieval Engine: Query Performance Prediction as an Inverse Learning Task

Introduces an inverse learning framework that reconstructs latent retrieval features from performance scores.

📝 dl.acm.org/doi/10.1145/...
👨🏽‍💻 github.com/recherche198...
Published in ACM Transactions on Information Systems.
Query performance prediction (QPP) aims to estimate the effectiveness of search queries in the absence of explicit relevance judgments. Existing approaches typically rely on hand-crafted features or l...
February 16, 2026 at 4:24 AM
Hi-SAM: A Hierarchical Structure-Aware Multi-modal Framework for Large-Scale Recommendation

Introduces a disentangled semantic tokenizer and hierarchical memory-anchor transformer for multi-modal recommendations.

📝 arxiv.org/abs/2602.11799
Multi-modal recommendation has gained traction as items possess rich attributes like text and images. Semantic ID-based approaches effectively discretize this information into compact tokens. However,...
February 16, 2026 at 4:23 AM
An Industrial-Scale Sequential Recommender for LinkedIn Feed Ranking

LinkedIn presents Feed-SR, a transformer-based sequential ranking model that achieves a +2.10% improvement in time spent through RoPE embeddings, incremental training, and GPU optimizations.

📝 arxiv.org/abs/2602.12354
LinkedIn Feed enables professionals worldwide to discover relevant content, build connections, and share knowledge at scale. We present Feed Sequential Recommender (Feed-SR), a transformer-based seque...
February 16, 2026 at 4:22 AM
Visual RAG Toolkit: Scaling Multi-Vector Visual Retrieval with Training-Free Pooling and Multi-Stage Search

Achieves up to 4x throughput improvement in visual document retrieval by reducing the number of vectors per page through model-aware spatial pooling.

📝 arxiv.org/abs/2602.12510
👨🏽‍💻 github.com/Ara-Yeroyan/...
Multi-vector visual retrievers (e.g., ColPali-style late interaction models) deliver strong accuracy, but scale poorly because each page yields thousands of vectors, making indexing and search increas...
February 16, 2026 at 4:22 AM
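The pooling idea is simple enough to sketch. Here is a minimal, training-free version, assuming a square patch grid and a 2x2 mean-pooling window; function name and shapes are illustrative, not taken from the paper's toolkit:

```python
import numpy as np

def spatial_pool(page_vectors: np.ndarray, grid: int, k: int = 2) -> np.ndarray:
    """Mean-pool a (grid*grid, d) array of patch embeddings with a k x k window,
    reducing the vector count per page by a factor of k*k."""
    d = page_vectors.shape[1]
    patches = page_vectors.reshape(grid, grid, d)
    pooled = patches.reshape(grid // k, k, grid // k, k, d).mean(axis=(1, 3))
    return pooled.reshape(-1, d)

# A 32x32 patch grid (1024 vectors) pooled 2x2 -> 256 vectors, a 4x reduction.
page = np.random.randn(1024, 128)
pooled = spatial_pool(page, grid=32, k=2)
print(pooled.shape)  # (256, 128)
```

Since pooling happens after encoding, the retriever itself stays frozen; only the index gets smaller.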
DiffuRank: Effective Document Reranking with Diffusion Language Models

Introduces diffusion language models for document reranking with three approaches (pointwise, logits-based listwise, and permutation-based listwise).

📝 arxiv.org/abs/2602.12528
👨🏽‍💻 github.com/liuqi6777/Di...
Recent advances in large language models (LLMs) have inspired new paradigms for document reranking. While this paradigm better exploits the reasoning and contextual understanding capabilities of LLMs,...
February 16, 2026 at 4:18 AM
Reasoning to Rank: An End-to-End Solution for Exploiting Large Language Models for Recommendation

Tencent presents a framework that optimizes LLM reasoning for recommendation through reinforcement learning.

📝 arxiv.org/abs/2602.12530
Recommender systems are tasked to infer users' evolving preferences and rank items aligned with their intents, which calls for in-depth reasoning beyond pattern-based scoring. Recent efforts start to ...
February 16, 2026 at 4:17 AM
CAPTS: Channel-Aware, Preference-Aligned Trigger Selection for Multi-Channel Item-to-Item Retrieval

Kuaishou presents a framework that improves multi-channel retrieval by aligning trigger selection with downstream engagement rather than direct feedback.

📝 arxiv.org/abs/2602.12564
Large-scale industrial recommender systems commonly adopt multi-channel retrieval for candidate generation, combining direct user-to-item (U2I) retrieval with two-hop user-to-item-to-item (U2I2I) pipe...
February 16, 2026 at 4:16 AM
RQ-GMM: Residual Quantized Gaussian Mixture Model for Multimodal Semantic Discretization in CTR Prediction

Introduces probabilistic modeling via Gaussian Mixture Models combined with residual quantization for multimodal embedding discretization.

📝 arxiv.org/abs/2602.12593
Multimodal content is crucial for click-through rate (CTR) prediction. However, directly incorporating continuous embeddings from pre-trained models into CTR models yields suboptimal results due to mi...
February 16, 2026 at 4:15 AM
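For context, the residual-quantization backbone that RQ-GMM builds on can be sketched as follows. This shows plain nearest-code assignment with random illustrative codebooks; the paper's contribution replaces this hard assignment with Gaussian-mixture modeling, which is not shown here:

```python
import numpy as np

def residual_quantize(x, codebooks):
    """Quantize x level by level: at each level pick the nearest code for the
    current residual, then subtract that code to form the next residual."""
    residual = x.copy()
    codes = []
    for cb in codebooks:  # cb: (num_codes, d)
        idx = np.argmin(((residual[:, None, :] - cb[None]) ** 2).sum(-1), axis=1)
        codes.append(idx)
        residual = residual - cb[idx]
    return np.stack(codes, axis=1), residual

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 16))                             # 8 embeddings, dim 16
codebooks = [rng.normal(size=(32, 16)) for _ in range(3)]  # 3 levels, 32 codes each
codes, residual = residual_quantize(x, codebooks)
print(codes.shape)  # (8, 3): one code index per level per embedding
```

Each embedding ends up as a short tuple of discrete code IDs, which is what makes the representation usable as tokens in a CTR model.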
Self-EvolveRec: Self-Evolving Recommender Systems with LLM-based Directional Feedback

Evolves recommendation systems through LLM-driven code optimization, integrating user simulator feedback with model diagnosis.

📝 arxiv.org/abs/2602.12612
👨🏽‍💻 github.com/Sein-Kim/sel...
Traditional methods for automating recommender system design, such as Neural Architecture Search (NAS), are often constrained by a fixed search space defined by human priors, limiting innovation to pr...
February 16, 2026 at 4:14 AM
ReFilter: Improving Robustness of Retrieval-Augmented Generation via Gated Filter

Introduces a token-level filtering framework that addresses RAG's scalability bottleneck by suppressing irrelevant content through gated fusion.

📝 arxiv.org/abs/2602.12709
Retrieval-augmented generation (RAG) has become a dominant paradigm for grounding large language models (LLMs) with external evidence in knowledge-intensive question answering. A core design choice is...
February 16, 2026 at 4:13 AM
Training Dense Retrievers with Multiple Positive Passages

Introduces a systematic study of multi-positive optimization objectives for retriever training.

📝 arxiv.org/abs/2602.12727
Modern knowledge-intensive systems, such as retrieval-augmented generation (RAG), rely on effective retrievers to establish the performance ceiling for downstream modules. However, retriever training ...
February 16, 2026 at 4:12 AM
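One natural objective in such a study is an InfoNCE variant whose numerator sums over all positive passages instead of a single one. A minimal sketch (my own illustration of the idea, not necessarily the paper's exact formulation):

```python
import numpy as np

def multi_positive_infonce(q, passages, pos_mask, tau=0.05):
    """InfoNCE generalized to several positives per query: the numerator sums
    exp-similarities over all positive passages rather than one.
    q: (d,), passages: (n, d), pos_mask: (n,) bool marking positives."""
    sims = passages @ q / tau
    sims = sims - sims.max()          # shift for numerical stability
    exps = np.exp(sims)
    return -np.log(exps[pos_mask].sum() / exps.sum())

rng = np.random.default_rng(1)
q = rng.normal(size=32); q /= np.linalg.norm(q)
passages = rng.normal(size=(10, 32))
passages /= np.linalg.norm(passages, axis=1, keepdims=True)
pos_mask = np.zeros(10, dtype=bool); pos_mask[:3] = True  # 3 positives, 7 negatives
loss = multi_positive_infonce(q, passages, pos_mask)
print(float(loss))
```

Other aggregation choices (e.g., averaging per-positive losses instead of summing in the numerator) give different gradients, which is exactly the kind of design axis a systematic study would compare.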
VimRAG: Navigating Massive Visual Context in Retrieval-Augmented Generation via Multimodal Memory Graph

Alibaba presents a framework for multimodal RAG that efficiently handles token-heavy visual data in iterative reasoning.

📝 arxiv.org/abs/2602.12735
👨🏽‍💻 github.com/Alibaba-NLP/...
Effectively retrieving, reasoning, and understanding multimodal information remains a critical challenge for agentic systems. Traditional Retrieval-augmented Generation (RAG) methods rely on linear in...
February 16, 2026 at 4:11 AM
Asynchronous Verified Semantic Caching for Tiered LLM Architectures

Apple introduces an asynchronous LLM-judged caching policy that expands static coverage without changing serving decisions, increasing curated static answer usage by up to 3.9x.

📝 arxiv.org/abs/2602.13165
Large language models (LLMs) now sit in the critical path of search, assistance, and agentic workflows, making semantic caching essential for reducing inference cost and latency. Production deployment...
February 16, 2026 at 4:10 AM
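The lookup underneath any semantic cache is embedding similarity against curated entries. A bare-bones sketch of that layer only — the paper's actual contribution, asynchronous LLM-judged verification of near-hits, is not modeled here, and the class name and threshold are illustrative:

```python
import numpy as np

class SemanticCache:
    """Minimal semantic cache: serve a curated static answer when a new query's
    embedding is close enough (cosine) to a cached query's embedding."""
    def __init__(self, threshold=0.9):
        self.threshold = threshold
        self.keys, self.answers = [], []

    def put(self, emb, answer):
        self.keys.append(emb / np.linalg.norm(emb))
        self.answers.append(answer)

    def get(self, emb):
        if not self.keys:
            return None
        emb = emb / np.linalg.norm(emb)
        sims = np.stack(self.keys) @ emb
        i = int(np.argmax(sims))
        return self.answers[i] if sims[i] >= self.threshold else None

cache = SemanticCache(threshold=0.9)
cache.put(np.array([1.0, 0.0, 0.0]), "static answer")
print(cache.get(np.array([0.99, 0.01, 0.0])))  # near-duplicate query: hit
print(cache.get(np.array([0.0, 1.0, 0.0])))    # unrelated query: None
```

Raising coverage then means safely lowering the effective threshold, which is where an offline LLM judge verifying borderline pairs comes in.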
I published Vol. 143 of "Top Information Retrieval Papers of the Week" on Substack.
🔗 recsys.substack.com/p/semantic-s...
February 15, 2026 at 4:48 PM
AttentionRetriever: Attention Layers are Secretly Long Document Retrievers

Proposes a training-free retrieval model using LLM attention layers and entity-based retrieval for context-aware long document retrieval in RAG.

📝 arxiv.org/abs/2602.12278
Retrieval augmented generation (RAG) has been widely adopted to help Large Language Models (LLMs) to process tasks involving long documents. However, existing retrieval models are not designed for lon...
February 13, 2026 at 5:19 AM
Detecting Overflow in Compressed Token Representations for Retrieval-Augmented Generation

Defines token overflow in soft compression for RAG and proposes lightweight probing classifiers to detect it without LLM inference.

📝 arxiv.org/abs/2602.12235
👨🏽‍💻 github.com/s-nlp/overfl...
Efficient long-context processing remains a crucial challenge for contemporary large language models (LLMs), especially in resource-constrained environments. Soft compression architectures promise to ...
February 13, 2026 at 5:17 AM
Query-focused and Memory-aware Reranker for Long Context Processing

Tencent presents a lightweight listwise reranker using attention scores from retrieval heads.

📝 arxiv.org/abs/2602.12192
🤗 huggingface.co/MindscapeRAG...
Built upon the existing analysis of retrieval heads in large language models, we propose an alternative reranking framework that trains models to estimate passage-query relevance using the attention s...
February 13, 2026 at 5:17 AM
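The core scoring idea, ranking passages by the attention mass that query tokens place on their tokens, can be sketched given an attention matrix. A toy example; the span boundaries and mean aggregation are illustrative, not the paper's exact method:

```python
import numpy as np

def rerank_by_attention(attn, spans):
    """Score each passage by the average attention its tokens receive from the
    query tokens, then rank passages by that score (descending).
    attn: (num_query_tokens, num_context_tokens); spans: list of (start, end)."""
    scores = [attn[:, s:e].mean() for s, e in spans]
    return np.argsort(scores)[::-1], scores

rng = np.random.default_rng(2)
attn = rng.random((4, 30))      # 4 query tokens attending over 30 context tokens
attn[:, 10:20] += 1.0           # passage 1's tokens receive extra attention
order, scores = rerank_by_attention(attn, [(0, 10), (10, 20), (20, 30)])
print(order[0])  # 1
```

In practice the attention would come from specific retrieval heads of the LLM rather than a single generic matrix, which is what the paper's analysis identifies.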
MTFM: A Scalable and Alignment-free Foundation Model for Industrial Recommendation in Meituan

Meituan presents a transformer-based foundation model using heterogeneous tokenization for multi-scenario recommendation without input alignment.

📝 arxiv.org/abs/2602.11235
Industrial recommendation systems typically involve multiple scenarios, yet existing cross-domain (CDR) and multi-scenario (MSR) methods often require prohibitive resources and strict input alignment,...
February 13, 2026 at 5:15 AM
KuaiSearch: A Large-Scale E-Commerce Search Dataset for Recall, Ranking, and Relevance

Kuaishou releases the largest e-commerce search dataset with real user queries, covering recall, ranking, and relevance tasks.

📝 arxiv.org/abs/2602.11518
👨🏽‍💻 github.com/benchen4395/...
E-commerce search serves as a central interface, connecting user demands with massive product inventories and plays a vital role in our daily lives. However, in real-world applications, it faces chall...
February 13, 2026 at 5:14 AM
Recurrent Preference Memory for Efficient Long-Sequence Generative Recommendation

Tencent introduces a framework that compresses long user interaction histories into compact Preference Memory tokens for efficient generative recommendation.

📝 arxiv.org/abs/2602.11605
Generative recommendation (GenRec) models typically model user behavior via full attention, but scaling to lifelong sequences is hindered by prohibitive computational costs and noise accumulation from...
February 13, 2026 at 5:14 AM
Compress, Cross and Scale: Multi-Level Compression Cross Networks for Efficient Scaling in Recommender Systems

Bilibili introduces a framework for efficient feature interaction in CTR prediction that uses 26x fewer parameters.

📝 arxiv.org/abs/2602.12041
👨🏽‍💻 github.com/shishishu/MLCC
Modeling high-order feature interactions efficiently is a central challenge in click-through rate and conversion rate prediction. Modern industrial recommender systems are predominantly built upon dee...
February 13, 2026 at 5:12 AM
Improving Neural Retrieval with Attribution-Guided Query Rewriting

Uses token-level gradient attributions from retrievers to guide LLM-based query rewriting, improving retrieval effectiveness without retraining the model.

📝 arxiv.org/abs/2602.11841
👨🏽‍💻 github.com/anonym-submi...
Neural retrievers are effective but brittle: underspecified or ambiguous queries can misdirect ranking even when relevant documents exist. Existing approaches address this brittleness only partially: ...
February 13, 2026 at 5:10 AM
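Under a mean-pooled dual encoder, gradient-times-input attribution reduces to each query token's exact contribution to the retrieval score, which is the kind of per-token signal such rewriting can feed to an LLM. A sketch under that mean-pooling assumption; the setup and names are mine, not the paper's:

```python
import numpy as np

def token_attributions(token_embs, doc_emb):
    """With a mean-pooled query the score decomposes over tokens:
    score = mean_i(e_i) . d = (1/n) * sum_i(e_i . d). Each term is that token's
    exact contribution (equal to e_i times the gradient of the score wrt e_i)."""
    n = token_embs.shape[0]
    return (token_embs @ doc_emb) / n

rng = np.random.default_rng(3)
doc = rng.normal(size=16)          # document embedding
toks = rng.normal(size=(5, 16))    # 5 query-token embeddings
attr = token_attributions(toks, doc)
score = toks.mean(axis=0) @ doc
print(np.isclose(attr.sum(), score))  # True: attributions sum to the score
```

Tokens with low or negative attribution are candidates for the rewriter to disambiguate or drop, without retraining the retriever.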
IntTravel: A Real-World Dataset and Generative Framework for Integrated Multi-Task Travel Recommendation

Alibaba releases a 4.1B interaction dataset and decoder-only generative framework for multi-task travel recommendation.

📝 arxiv.org/abs/2602.11664
👨🏽‍💻 github.com/AMAP-ML/IntT...
Next Point of Interest (POI) recommendation is essential for modern mobility and location-based services. To provide a smooth user experience, models must understand several components of a journey ho...
February 13, 2026 at 5:10 AM
Self-Evolving Recommendation System: End-To-End Autonomous Model Optimization With LLM Agents

Google presents a system where LLM agents autonomously evolve recommender models by generating hypotheses, writing code & validating changes through A/B testing.

📝 arxiv.org/abs/2602.10226
Optimizing large-scale machine learning systems, such as recommendation models for global video platforms, requires navigating a massive hyperparameter search space and, more critically, designing sop...
February 12, 2026 at 5:38 AM
Spend Search Where It Pays: Value-Guided Structured Sampling and Optimization for Generative Recommendation

Tencent uses value-guided decoding and sibling-relative advantage learning to improve generative recommendation.

📝 arxiv.org/abs/2602.10699
Generative recommendation via autoregressive models has unified retrieval and ranking into a single conditional generation framework. However, fine-tuning these models with Reinforcement Learning (RL)...
February 12, 2026 at 5:37 AM