Sumit
@reachsumit.com
Senior MLE at Meta. Trying to keep up with the Information Retrieval domain!

Blog: https://blog.reachsumit.com/
Newsletter: https://recsys.substack.com/
I published Vol. 135 of "Top Information Retrieval Papers of the Week" on Substack.
🔗 recsys.substack.com/p/understand...
Understanding Stability in Modern Vector Databases, A Generative Paradigm Shift for Click-Through Rate Prediction, and More!
Vol.135 for Dec 15 - Dec 21, 2025
recsys.substack.com
Step-DeepResearch Technical Report

Presents a cost-effective 32B parameter deep research agent achieving expert-level performance through atomic capability training and progressive optimization from mid-training to RL.

📝 arxiv.org/abs/2512.20491
👨🏽‍💻 github.com/stepfun-ai/S...
As LLMs shift toward autonomous agents, Deep Research has emerged as a pivotal metric. However, existing academic benchmarks like BrowseComp often fail to meet real-world demands for open-ended resear...
arxiv.org
December 25, 2025 at 7:46 AM
M³KG-RAG: Multi-hop Multimodal Knowledge Graph-enhanced Retrieval-Augmented Generation

Proposes a framework with multi-hop multimodal knowledge graphs and a pruning mechanism to improve audio-visual reasoning in multimodal large language models.

📝 arxiv.org/abs/2512.20136
Retrieval-Augmented Generation (RAG) has recently been extended to multimodal settings, connecting multimodal large language models (MLLMs) with vast corpora of external knowledge such as multimodal k...
arxiv.org
December 25, 2025 at 7:45 AM
Retrieval-augmented Prompt Learning for Pre-trained Foundation Models

Presents a retrieval-augmented framework that enhances prompt learning by decoupling knowledge from memorization through a knowledge-store of training data.

📝 arxiv.org/abs/2512.20145
👨🏽‍💻 github.com/zjunlp/Promp...
PromptKG/research/RetroPrompt at main · zjunlp/PromptKG
PromptKG Family: a Gallery of Prompt Learning & KG-related research works, toolkits, and paper-list. - zjunlp/PromptKG
github.com
December 25, 2025 at 7:44 AM
Collaborative Group-Aware Hashing for Fast Recommender Systems

Proposes a hashing method that integrates inherent group information to learn hash codes, improving both accuracy in sparse settings and efficiency for large-scale recommendations.

📝 arxiv.org/abs/2512.20172
The fast online recommendation is critical for applications with large-scale databases; meanwhile, it is challenging to provide accurate recommendations in sparse scenarios. Hash technique has shown i...
arxiv.org
December 25, 2025 at 7:41 AM
MemR³: Memory Retrieval via Reflective Reasoning for LLM Agents

Introduces an autonomous memory-retrieval controller that transforms retrieve-then-answer pipelines into closed-loop processes using evidence-gap tracking.

📝 arxiv.org/abs/2512.20237
👨🏽‍💻 github.com/Leagein/memr3
MemR$^3$: Memory Retrieval via Reflective Reasoning for LLM Agents
Memory systems have been designed to leverage past experiences in Large Language Model (LLM) agents. However, many deployed memory systems primarily optimize compression and storage, with comparativel...
arxiv.org
December 25, 2025 at 7:40 AM
Representation-Enhanced Cascading Multi-Level Interest Learning for Multi-Behavior Recommendation

Presents a parallel learning framework that captures multi-level positive and negative feedback signals from behavioral sequences.

📝 dl.acm.org/doi/10.1145/...
👨🏽‍💻 github.com/lhybq/PPN-ARE
Representation-Enhanced Cascading Multi-Level Interest Learning for Multi-Behavior Recommendation | ACM Transactions on Information Systems
Multi-behavior recommendation leverages multiple user-item interaction information to alleviate data sparsity. Although different types of user-item interactions are temporally mutually exclusive, the...
dl.acm.org
December 25, 2025 at 7:39 AM
Laser: Governing Long-Horizon Agentic Search via Structured Protocol and Context Register

Introduces a framework that stabilizes agentic search through symbolic action protocols and compact context management.

📝 arxiv.org/abs/2512.20458
👨🏽‍💻 github.com/ShootingWong...
Recent advances in Large Language Models (LLMs) and Large Reasoning Models (LRMs) have enabled agentic search systems that interleave multi-step reasoning with external tool use. However, existing fra...
arxiv.org
December 25, 2025 at 7:37 AM
Making Large Language Models Efficient Dense Retrievers

Finds that MLP layers in LLM-based retrievers are highly redundant while attention layers remain critical, enabling substantial compression through coarse-to-fine MLP pruning.

📝 arxiv.org/abs/2512.20612
👨🏽‍💻 github.com/Yibin-Lei/Ef...
Recent work has shown that directly fine-tuning large language models (LLMs) for dense retrieval yields strong performance, but their substantial parameter counts make them computationally inefficient...
arxiv.org
December 25, 2025 at 7:35 AM
MegaRAG: Multimodal Knowledge Graph-Based Retrieval Augmented Generation

Presents a multimodal graph-based RAG method that automatically constructs knowledge graphs from visual documents, enabling cross-modal reasoning for better content understanding.

📝 arxiv.org/abs/2512.20626
Retrieval-augmented generation (RAG) enables large language models (LLMs) to dynamically access external information, which is powerful for answering questions over previously unseen documents. Noneth...
arxiv.org
December 25, 2025 at 7:34 AM
How important is Recall for Measuring Retrieval Quality?

Introduces a simple measure independent of total relevant documents and evaluates retrieval quality metrics against LLM-based response quality judgments across multiple datasets.

📝 arxiv.org/abs/2512.20854
In realistic retrieval settings with large and evolving knowledge bases, the total number of documents relevant to a query is typically unknown, and recall cannot be computed. In this paper, we evalua...
arxiv.org
December 25, 2025 at 7:32 AM
Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

NVIDIA releases a 30B parameter MoE model with competitive accuracy and 3.3x higher throughput while supporting 1M token contexts.

📝 arxiv.org/abs/2512.20848
🤗 huggingface.co/nvidia/NVIDI...
nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-FP8 · Hugging Face
huggingface.co
December 25, 2025 at 7:31 AM
NVIDIA Nemotron 3: Efficient and Open Intelligence

NVIDIA introduces a family of models (Nano, Super, Ultra) using hybrid Mamba-Transformer MoE architecture with up to 1M token context and state-of-the-art reasoning performance.

📝 arxiv.org/abs/2512.20856
We introduce the Nemotron 3 family of models - Nano, Super, and Ultra. These models deliver strong agentic, reasoning, and conversational capabilities. The Nemotron 3 family uses a Mixture-of-Experts ...
arxiv.org
December 25, 2025 at 7:29 AM
Accurate and Diverse Recommendations via Propensity-Weighted Linear Autoencoders

Introduces a propensity scoring method using sigmoid functions on logarithmic item frequency to improve recommendation diversity while maintaining accuracy.

📝 arxiv.org/abs/2512.20896
👨🏽‍💻 github.com/cars1015/IPS...
In real-world recommender systems, user-item interactions are Missing Not At Random (MNAR), as interactions with popular items are more frequently observed than those with less popular ones. Missing o...
arxiv.org
December 25, 2025 at 7:27 AM
ReaSeq: Unleashing World Knowledge via Reasoning for Sequential Modeling

Alibaba introduces a reasoning-enhanced framework that leverages LLMs to address knowledge poverty and systemic blindness in recommender systems.

📝 arxiv.org/abs/2512.21257
Industrial recommender systems face two fundamental limitations under the log-driven paradigm: (1) knowledge poverty in ID-based item representations that causes brittle interest modeling under data s...
arxiv.org
December 25, 2025 at 7:26 AM
C2LLM Technical Report: A New Frontier in Code Retrieval via Adaptive Cross-Attention Pooling

Ant Group introduces a family of code embedding models using Pooling by Multihead Attention to break information bottlenecks in code retrieval.

📝 arxiv.org/abs/2512.21332
👨🏽‍💻 github.com/codefuse-ai/...
We present C2LLM - Contrastive Code Large Language Models, a family of code embedding models in both 0.5B and 7B sizes. Building upon Qwen-2.5-Coder backbones, C2LLM adopts a Pooling by Multihead Atte...
arxiv.org
December 25, 2025 at 7:24 AM
Faster Distributed Inference-Only Recommender Systems via Bounded Lag Synchronous Collectives

Huawei proposes a bounded lag synchronous operation for distributed recommender systems that improves both latency and throughput in inference-only DLRM runs.

📝 arxiv.org/abs/2512.19342
Recommender systems are enablers of personalized content delivery, and therefore revenue, for many large companies. In the last decade, deep learning recommender models (DLRMs) are the de-facto standa...
arxiv.org
December 23, 2025 at 7:46 AM
LIR³AG: A Lightweight Rerank Reasoning Strategy Framework for Retrieval-Augmented Generation

Enables non-reasoning models to transfer reasoning strategies by restructuring retrieved evidence into coherent reasoning chains.

📝 arxiv.org/abs/2512.18329
Retrieval-Augmented Generation (RAG) effectively enhances Large Language Models (LLMs) by incorporating retrieved external knowledge into the generation process. Reasoning models improve LLM performan...
arxiv.org
December 23, 2025 at 7:45 AM
Factorized Transport Alignment for Multimodal and Multiview E-commerce Representation Learning

Etsy unifies multimodal and multi-view learning for e-commerce search, using factorized transport to efficiently align primary and non-primary images with text views.

📝 arxiv.org/abs/2512.18117
The rapid growth of e-commerce requires robust multimodal representations that capture diverse signals from user-generated listings. Existing vision-language models (VLMs) typically align titles with ...
arxiv.org
December 23, 2025 at 7:44 AM
Efficient Optimization of Hierarchical Identifiers for Generative Recommendation

Proposes greedy and hybrid tree construction methods that cut identifier building time to 2-8% of the original while maintaining or improving retrieval quality.

📝 arxiv.org/abs/2512.18434
👨🏽‍💻 github.com/joshrosie/re...
SEATER is a generative retrieval model that improves recommendation inference efficiency and retrieval quality by utilizing balanced tree-structured item identifiers and contrastive training objective...
arxiv.org
December 23, 2025 at 7:42 AM
Generative Giants, Retrieval Weaklings: Why do Multimodal Large Language Models Fail at Multimodal Retrieval?

Analyzes why MLLMs underperform in zero-shot multimodal retrieval, revealing text-dominated spaces and misaligned feature components.

📝 arxiv.org/abs/2512.19115
Despite the remarkable success of multimodal large language models (MLLMs) in generative tasks, we observe that they exhibit a counterintuitive deficiency in the zero-shot multimodal retrieval task. I...
arxiv.org
December 23, 2025 at 7:40 AM
QuCo-RAG: Quantifying Uncertainty from the Pre-training Corpus for Dynamic Retrieval-Augmented Generation

Presents a dynamic RAG approach that uses pre-training corpus statistics to determine when to retrieve.

📝 arxiv.org/abs/2512.19134
👨🏽‍💻 github.com/ZhishanQ/QuC...
Dynamic Retrieval-Augmented Generation adaptively determines when to retrieve during generation to mitigate hallucinations in large language models (LLMs). However, existing methods rely on model-inte...
arxiv.org
December 23, 2025 at 7:40 AM
Scalable Distributed Vector Search via Accuracy Preserving Index Construction

Presents a hierarchical vector index system that achieves up to 9.64x higher throughput by using balanced partition granularity and accuracy-preserving recursive construction.

📝 arxiv.org/abs/2512.17264
Scaling Approximate Nearest Neighbor Search (ANNS) to billions of vectors requires distributed indexes that balance accuracy, latency, and throughput. Yet existing index designs struggle with this tra...
arxiv.org
December 22, 2025 at 7:14 AM
DEER: A Comprehensive and Reliable Benchmark for Deep-Research Expert Reports

Introduces a benchmark for evaluating AI-generated research reports with 50 tasks across 13 domains, combining expert-grounded rubrics and document-level fact-checking.

📝 arxiv.org/abs/2512.17776
As large language models (LLMs) advance, deep research systems can generate expert-level reports via multi-step reasoning and evidence-based synthesis, but evaluating such reports remains challenging....
arxiv.org
December 22, 2025 at 7:13 AM
Exploiting ID-Text Complementarity via Ensembling for Sequential Recommendation

Snap introduces a method that trains ID-based and text-based sequential recommendation models independently, then combines them through ensembling to leverage their complementary strengths.

📝 arxiv.org/abs/2512.17820
Modern Sequential Recommendation (SR) models commonly utilize modality features to represent items, motivated in large part by recent advancements in language and vision modeling. To do so, several wo...
arxiv.org
December 22, 2025 at 7:11 AM
A Reproducible and Fair Evaluation of Partition-aware Collaborative Filtering

Presents a benchmark of partition-aware collaborative filtering methods, revealing that FPSR models remain competitive but don't consistently outperform block-aware baselines.

📝 arxiv.org/abs/2512.17015
Similarity-based collaborative filtering (CF) models have long demonstrated strong offline performance and conceptual simplicity. However, their scalability is limited by the quadratic cost of maintai...
arxiv.org
December 22, 2025 at 7:10 AM