Lightnews — Scholar-powered news

@momergul.bsky.social

32 followers 170 following 8 posts

CS PhD Student @Cornell

@momergul.bsky.social

Tons of other insights in the paper. We show that the strength of the helper/search tool is a key consideration. Replacing our retriever with an oracle results in all models converging to always seeking help. The noisiness of the retriever is a feature, not a bug!

October 2, 2025 at 7:40 PM

@momergul.bsky.social

Baseline RL implementations often converge to sub-optimal policies that always or never search. MASH uses a lightweight warm-start data generation & SFT pipeline that induces better search behaviors. MASH models can discover a mix of 0/1/2 searches as needed, while baselines fail.

October 2, 2025 at 7:40 PM

@momergul.bsky.social

For (ii), MASH shows strong abstention behavior off-the-shelf! Its performance is comparable to abstention baselines that require pre-determining knowledge boundaries and model-specific training data. It beats SFT approaches and is competitive with DPO!

October 2, 2025 at 7:40 PM

@momergul.bsky.social

We evaluate MASH under 2 settings: (i) w/ access to search, (ii) w/o search as an abstention model.

For (i), MASH outperforms efficient search baselines, esp. for multi-hop datasets (7.6% accuracy boost), even matching search baselines w/o any search penalties!

October 2, 2025 at 7:40 PM

@momergul.bsky.social

🚨Modeling Abstention via Selective Help-seeking

LLMs learn to use search tools to answer questions they would otherwise hallucinate on. But can this also teach them what they know vs. what they don't?

We introduce MASH, which trains LLMs for search and gets abstentions for free!

October 2, 2025 at 7:40 PM
