Leo Boytsov
srchvrs.bsky.social
Machine learning scientist and engineer (PhD CMU) working on (un)natural language processing, speaking πtorch & C++. Opinions sampled from MY OWN 100T param LM.
The paper presents a fascinating (as described by AI 😆) case study on how a seemingly innocuous modification to a neural net can drastically alter its perceived robustness against gradient-based adversarial attacks.
searchivarius.org/blog/curious...
August 27, 2025 at 3:49 AM
PS: Yes, this is a frontier LLM and it still cannot fully replace an editor.⏹️
August 9, 2025 at 3:05 PM
4. The model complains about its own suggestion.
5. Bonus point: of course, oftentimes the complaints are incorrect. If you poke the model further, it will likely admit to being wrong. Which, in turn, may not mean much, because models are also clearly trained to agree with humans as much as possible.
↩️
August 9, 2025 at 3:05 PM
Results? If I read the tables correctly, there's only a very modest boost in both recall & NDCG, within 2%. Given that the procedure requires a second retrieval round, it does not seem worth the effort.
🟦
dl.acm.org/doi/abs/10.1...
A Large-Scale Study of Reranker Relevance Feedback at Inference | Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval
dl.acm.org
July 18, 2025 at 6:01 PM
PRF was not forgotten in the neural IR era, but how does it really perform? Revanth Gangi Reddy & colleagues ran a rather thorough experiment and published it at SIGIR.
↩️
July 18, 2025 at 6:01 PM
It was doc2query before doc2query, and, in fact, it improved the performance (by a few percent) of the IBM Watson QA system that beat human champions in Jeopardy!
↩️
research.ibm.com/publications...
Statistical source expansion for question answering, CIKM 2011, by Nico Schlaefer et al.
research.ibm.com
July 18, 2025 at 6:01 PM
I think this is a problem with the completely unsupervised and blind approach of adding terms to the query. If we had some supervision signal to filter out potentially bad terms, this would work out better. In fact, a supervised approach was previously used to add terms to documents!
↩️
July 18, 2025 at 6:01 PM
This issue spawned a sub-topic in the IR community devoted to fixing it and to identifying, in advance, the cases where performance degrades substantially. Dozens of approaches were proposed, but I do not think the effort was successful. Why⁉️
↩️
July 18, 2025 at 6:01 PM
PRF tends to improve things on average, but it has a rather nasty property of tanking outcomes for some queries quite dramatically: when things go wrong (i.e., unlucky, unrelated terms are added to the query), they can go very wrong. ↩️
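For readers unfamiliar with PRF: the blind term-addition step works roughly like this. A minimal sketch (all names and the stopword list are illustrative, not from the SIGIR paper discussed in this thread):

```python
# Pseudo-relevance feedback (PRF), minimal sketch: take the top-ranked
# documents from a first retrieval round, count their terms, and blindly
# append the most frequent new terms to the query. No supervision filters
# out bad terms, which is exactly how unlucky expansions sneak in.
from collections import Counter

STOPWORDS = {"the", "a", "of", "and", "to", "in", "for", "with"}

def expand_query(query, top_docs, n_terms=3):
    """Append the most frequent non-stopword terms from the
    top-ranked documents to the original query (blind feedback)."""
    query_terms = set(query.lower().split())
    counts = Counter()
    for doc in top_docs:
        for term in doc.lower().split():
            if term not in STOPWORDS and term not in query_terms:
                counts[term] += 1
    extra = [term for term, _ in counts.most_common(n_terms)]
    return query + " " + " ".join(extra)

docs = ["neural ranking models for retrieval",
        "ranking with neural networks"]
print(expand_query("retrieval models", docs))
```

If the top-ranked documents are actually relevant, the added terms sharpen the query; if they are off-topic, the expansion drags the second round further off-topic, which is the failure mode described above.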
July 18, 2025 at 6:01 PM
If you submitted a messy paper, it's pointless to address every little comment and promise to fix it in the final version. 🟦
July 2, 2025 at 3:04 AM
Instead, think hard about the questions you can ask. What is the main misunderstanding? What will you have to do so that a reviewer accepts your work next time? Which concise questions can you ask to avoid misunderstandings in the future? ↩️
July 2, 2025 at 3:04 AM
Humans are creating AGI and you claim that their intelligence is overrated?
May 22, 2025 at 4:26 AM
Laptop keyboards are close to being unusable. Tremendous productivity hit.
April 27, 2025 at 3:16 AM
However, unlike many others, who see the threat in the form of a "terminator-like" super-intelligence, @lawrennd.bsky.social worries about the unpredictability of automated decision making by an entity that's superior in some ways, inferior in others, but, importantly, disconnected from the needs of humans. ⏹️
April 7, 2025 at 2:24 AM