Alessio Devoto
@alessiodevoto.bsky.social
PhD in ML/AI | Researching Efficient ML/AI (vision & language) 🍀 & Interpretability | @SapienzaRoma @EdinburghNLP | https://alessiodevoto.github.io/ | ex @NVIDIA
Pinned
In Vision & Audio transformers, not all tokens need the same compute resources! We propose “modular learners” to control compute at token-level granularity (MHA & MLP): hard tokens get more, easy ones get less!
w/ @sscardapane.bsky.social @neuralnoise.com @bartoszWojcik
Soon #AAAI25
Link 👇
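The core idea, as a minimal PyTorch sketch (hypothetical names and routing rule, not the exact architecture from the paper): a lightweight router scores each token, and only "hard" tokens take the expensive path.

```python
# Minimal sketch of token-level conditional compute.
# Hypothetical router/threshold; the paper's modular learners
# apply this at both MHA and MLP granularity.
import torch
import torch.nn as nn

class TokenRoutedBlock(nn.Module):
    def __init__(self, dim: int, threshold: float = 0.5):
        super().__init__()
        self.router = nn.Linear(dim, 1)  # scores per-token "difficulty"
        self.mlp = nn.Sequential(        # the expensive path
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
        )
        self.threshold = threshold

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, dim)
        scores = torch.sigmoid(self.router(x))    # (batch, seq, 1)
        mask = (scores > self.threshold).float()  # 1 = hard token
        # Hard tokens get the MLP; easy ones keep only the residual.
        # (A real implementation would skip the compute, not mask it.)
        return x + mask * self.mlp(x)

x = torch.randn(2, 16, 64)
print(TokenRoutedBlock(64)(x).shape)  # torch.Size([2, 16, 64])
```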
Reposted by Alessio Devoto
💡 We compare prompting (zero and multi-shot + explanations) and inference-time interventions (ActAdd, REFT and SAEs).

Following SpARE (@yuzhaouoe.bsky.social @alessiodevoto.bsky.social), we propose ✨ contrastive SAE steering ✨ with mutual info to personalize literary MT by tuning latent features 4/
May 23, 2025 at 12:23 PM
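For readers new to SAE steering, the basic move looks roughly like this (a hedged sketch with made-up dimensions and a random SAE; the paper selects features contrastively via mutual information):

```python
# Sketch of steering with a sparse autoencoder (SAE): boost one
# latent feature, then decode back into the residual stream.
# Random weights for illustration only.
import torch

d_model, d_sae = 512, 4096
W_enc = torch.randn(d_model, d_sae) / d_model**0.5
W_dec = torch.randn(d_sae, d_model) / d_sae**0.5

def steer(resid: torch.Tensor, feature_idx: int, alpha: float) -> torch.Tensor:
    # resid: (seq, d_model) residual-stream activations
    latents = torch.relu(resid @ W_enc)   # sparse feature activations
    latents[:, feature_idx] += alpha      # nudge the chosen feature
    return latents @ W_dec                # decode back to the stream

out = steer(torch.randn(8, d_model), feature_idx=123, alpha=5.0)
print(out.shape)  # torch.Size([8, 512])
```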
Reposted by Alessio Devoto
MMLU-Redux just touched down at #NAACL2025! 🎉
Wish I could be there for our "Are We Done with MMLU?" poster today (9:00-10:30am in Hall 3, Poster Session 7), but visa drama said nope 😅
If anyone's swinging by, give our research some love! Hit me up if you check it out! 👋
May 2, 2025 at 1:00 PM
Reposted by Alessio Devoto
My amazing collaborators will present several works at ICLR and NAACL later this month -- please catch up with them if you're attending! I tried to summarise our recent work in a blog post: neuralnoise.com/2025/march-r...
April 19, 2025 at 8:15 AM
Reposted by Alessio Devoto
Please share it within your circles! edin.ac/3DDQK1o
March 13, 2025 at 11:59 AM
Reposted by Alessio Devoto
🚀 New Paper Alert! 🚀

We introduce Q-Filters, a training-free method for efficient KV Cache compression!

It is compatible with FlashAttention and can compress during generation, which is particularly useful for reasoning models ⚡

TLDR: we make Streaming-LLM smarter using the geometry of attention
March 6, 2025 at 4:02 PM
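For intuition, here is a loose sketch of projection-based KV cache pruning (hypothetical scoring; the actual Q-Filters are derived offline from the geometry of query representations, and the details differ):

```python
# Rough sketch in the spirit of Q-Filters: score cached keys by their
# projection onto a fixed filter direction and evict the rest.
import torch
import torch.nn.functional as F

def prune_kv(keys, values, q_filter, keep: int):
    # keys, values: (seq, d_head); q_filter: (d_head,) unit vector
    scores = keys @ q_filter                       # one score per key
    idx = scores.topk(keep).indices.sort().values  # keep top-k, in order
    return keys[idx], values[idx]

seq, d = 128, 64
k, v = torch.randn(seq, d), torch.randn(seq, d)
f = F.normalize(torch.randn(d), dim=0)
k2, v2 = prune_kv(k, v, f, keep=32)
print(k2.shape, v2.shape)  # torch.Size([32, 64]) torch.Size([32, 64])
```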
Reposted by Alessio Devoto
Live from the CoLoRAI workshop at AAAI
(april-tools.github.io/colorai/)

Nadav Cohen is now giving his talk on "What Makes Data Suitable for Deep Learning?"

Tools from quantum physics are shown to be useful in building more expressive deep learning models by changing the data distribution.
March 4, 2025 at 2:54 PM
Reposted by Alessio Devoto
2018: Saliency maps give plausible interpretations of random weights, triggering skepticism and catalyzing the mechinterp cultural movement, which now advocates for SAEs.

2025: SAEs give plausible interpretations of random weights, triggering skepticism and ...
March 3, 2025 at 6:42 PM
Reposted by Alessio Devoto
Graphical tensor notation for interpretability

www.lesswrong.com/posts/BQKKQi...
February 23, 2025 at 9:16 PM
Reposted by Alessio Devoto
Introducing The AI CUDA Engineer: An agentic AI system that automates the production of highly optimized CUDA kernels.

sakana.ai/ai-cuda-engi...

The AI CUDA Engineer can produce highly optimized CUDA kernels, reaching 10-100x speedup over common machine learning operations in PyTorch.

Examples:
February 20, 2025 at 1:50 AM
Reposted by Alessio Devoto
It's 2025, and I’ve finally updated my Python setup guide to use uv + venv instead of conda + pip!
Here's my go-to recommendation for uv + venv in Python projects, for faster installs and better dependency management: github.com/rasbt/LLMs-f...
(Any additional suggestions?)
February 15, 2025 at 7:14 PM
Cool research on how models memorize data 📝: The 'Manifold Memorization Hypothesis' by Brendan Ross, Hamidreza Kamkari et al. suggests memorization occurs when the model's learned manifold matches the true data manifold but with too small a 'local intrinsic dimensionality'.
February 5, 2025 at 4:11 PM
Massive activations & weights in LLMs, two cool works 🤓:

- The Super Weight: finds that performance can be totally degraded by pruning a *single* weight - Mengxia Yu et al.

- Massive Activations in LLMs: finds that some (crucial) activations have very high norms irrespective of context - Mingjie Sun et al.
February 4, 2025 at 5:55 PM
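Both effects are easy to probe; a minimal sketch with a toy layer and forward hooks (the papers analyze specific LLMs, layers, and tokens):

```python
# Quick probe for outsized activations: hook a transformer layer and
# report the largest hidden-state magnitude per token. Toy setup only.
import torch
import torch.nn as nn

layer = nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)
captured = {}

def hook(module, inputs, output):
    captured["acts"] = output.detach()

layer.register_forward_hook(hook)
layer(torch.randn(1, 32, 64))

acts = captured["acts"][0]             # (seq, d_model)
per_token, _ = acts.abs().max(dim=-1)  # largest magnitude per token
print("top |activation| per token:", per_token.topk(3).values)
```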
Reposted by Alessio Devoto
On the last day before the Spring Festival holiday in China, DeepSeek released a NEW work on @hf.co 🤯

Janus-Pro 🔥 an autoregressive framework that unifies multimodal understanding and generation
huggingface.co/deepseek-ai/...
✨ 1B / 7B
✨ MIT License
deepseek-ai/Janus-Pro-1B · Hugging Face
January 27, 2025 at 5:36 PM
Reposted by Alessio Devoto
Not only that, but much of the science community here is already stronger and larger than it was on X.

On Twitter, my feed of scientists who study climate-related topics topped out at 3300. Here, we’re at 4500 already and it’s still growing.

Pin here: bsky.app/profile/did:...
January 24, 2025 at 8:26 PM
Reposted by Alessio Devoto
*MoE Graph Transformers for Interpretable Particle Collision Detection*
by @alessiodevoto.bsky.social @sgiagu.bsky.social et al.

We propose a MoE graph transformer for particle collision analysis, with many nice interpretability insights (e.g., expert specialization).

arxiv.org/abs/2501.03432
January 10, 2025 at 2:12 PM
Reposted by Alessio Devoto
DeepSeek GGUF just dropped; the smallest version needs 207GB of disk / 40GB of RAM: huggingface.co/collections/...
Deepseek V3 (All Versions) - a unsloth Collection
Deepseek V3 - available in bf16, original, and GGUF formats, with support for 2, 3, 4, 5, 6 and 8-bit quantized versions.
January 8, 2025 at 12:20 AM
LLMs inner representations🔬
Llamas Work in English: LLMs default to English-based concept representations, regardless of input language (@wendlerc.bsky.social et al.)
Semantic Hub: Multimodal models create a single shared semantic space, structured by their primary language (@zhaofengwu.bsky.social et al.)
December 22, 2024 at 5:57 PM
Reposted by Alessio Devoto
I'll get straight to the point.

We trained 2 new models. Like BERT, but modern. ModernBERT.

Not some hypey GenAI thing, but a proper workhorse model, for retrieval, classification, etc. Real practical stuff.

It's much faster, more accurate, longer context, and more useful. 🧵
December 19, 2024 at 4:45 PM
In Vision & Audio transformers, not all tokens need the same compute resources! We propose “modular learners” to control compute at token-level granularity (MHA & MLP): hard tokens get more, easy ones get less!
w/ @sscardapane.bsky.social @neuralnoise.com @bartoszWojcik
Soon #AAAI25
Link 👇
December 19, 2024 at 4:33 PM
Reposted by Alessio Devoto
Josh Tenenbaum on scaling up vs growing up and the path to human-like reasoning #NeurIPS2024
December 15, 2024 at 6:14 PM
Reposted by Alessio Devoto
*Adaptive Computation Modules: Granular Conditional Computation For Efficient Inference*
with @alessiodevoto.bsky.social @neuralnoise.com

Happy to share that our work on distilling efficient transformers with dynamic module activation was accepted at #AAAI2025. 🔥

arxiv.org/abs/2312.10193
December 11, 2024 at 2:25 PM
Reposted by Alessio Devoto
*Sparse Crosscoders for Cross-Layer Features and Model Diffing*
by @colah.bsky.social @anthropic.com

Investigates the stability & dynamics of "interpretable features" with cross-layer SAEs. Can also be used to investigate differences between fine-tuned models.

transformer-circuits.pub/2024/crossco...
December 6, 2024 at 2:02 PM
Very cool work! 👏🚀 Unfortunately, errors in the original dataset will propagate to all new languages 😕

We investigated errors in the original MMLU in
arxiv.org/abs/2406.04127

@aryopg.bsky.social @neuralnoise.com
Is MMLU Western-centric? 🤔

As part of a massive cross-institutional collaboration:
🗽 Finds MMLU is heavily overfit to Western culture
🔍 Professionally annotates data for cultural sensitivity
🌍 Releases improved Global-MMLU in 42 languages

📜 Paper: arxiv.org/pdf/2412.03304
📂 Data: hf.co/datasets/Coh...
December 6, 2024 at 1:57 PM