Apoorv Khandelwal
@apoorvkh.com
CS PhD student at Brown

https://apoorvkh.com
Will be at ACL this week! #ACL2025 #ACL2025NLP

Presenting Tian Yun’s paper on abstract reasoners at CoNLL on Thursday.

I’ve been investigating how LLMs internally compose functions lately. Happy to chat about that (among other things) and hang out in Vienna!
July 28, 2025 at 5:09 AM
Reposted by Apoorv Khandelwal
Tests on USAMO problems immediately after they were posted yield surprisingly bad model performance. This suggests there's much more training on test data than expected.
arxiv.org/abs/2503.219...
March 31, 2025 at 7:08 PM
Reposted by Apoorv Khandelwal
Just read that AI’s energy consumption in data centers is nothing to be worried about because most of the hyperscale data centers running AI are "powered by renewable energy or low-carbon nuclear power."

Let's debunk that, shall we?
March 19, 2025 at 7:24 PM
Reposted by Apoorv Khandelwal
If you're in the northeastern US and you're submitting a paper to COLM on March 27, you should absolutely be sending its abstract to New England NLP on March 28.
New England NLP Meeting Series
nenlp.github.io
March 19, 2025 at 7:59 PM
We built a library (torchrunx) to make multi-GPU / multi-node PyTorch easier, more robust, and more modular! 🧵

github.com/apoorvkh/tor...
Docs: torchrun.xyz

`(uv) pip install torchrunx` today!

(w/ the very talented Peter Curtin, Brown CS '25)
GitHub - apoorvkh/torchrunx: Easily run PyTorch on multiple GPUs & machines
github.com
March 11, 2025 at 4:54 PM
Reposted by Apoorv Khandelwal
✨How does the depth of a transformer affect its reasoning capabilities? New preprint by me and @Ashish_S_AI shows that a little depth goes a long way toward increasing transformers’ expressive power.

We take this as encouraging for further research on looped transformers!🧵
March 7, 2025 at 4:46 PM
Reposted by Apoorv Khandelwal
(1/9) Excited to share my recent work on "Alignment reduces LM's conceptual diversity" with @tomerullman.bsky.social and @jennhu.bsky.social, to appear at #NAACL2025! 🐟

We want models that match our values...but could this hurt their diversity of thought?
Preprint: arxiv.org/abs/2411.04427
February 10, 2025 at 5:20 PM
I started a blog! The first post is everything I know about setting up (fast, reproducible, error-proof) Python project environments using the latest tools. These methods have saved me a lot of grief. Also a short guide to CUDA in the appendix :)

blog.apoorvkh.com/posts/projec...
Managing Project Dependencies
blog.apoorvkh.com
February 7, 2025 at 3:45 PM
Reposted by Apoorv Khandelwal
Can GANs compete in 2025? In 'The GAN is dead; long live the GAN! A Modern GAN Baseline', we show that a minimalist GAN w/o any tricks can match the performance of EDM with half the size and one-step generation - github.com/brownvc/r3gan - work of Nick Huang, @skylion.bsky.social, Volodymyr Kuleshov
January 10, 2025 at 7:08 PM
A couple of sources for academic talks that I really like!

Cohere For AI (www.youtube.com/playlist?lis...)

Simons Institute (www.youtube.com/@SimonsInsti...)
Simons Institute
The Simons Institute for the Theory of Computing is the world's leading venue for collaborative research in theoretical computer science. Established on July 1, 2012, the Institute is housed in Calvin...
www.youtube.com
January 10, 2025 at 8:05 PM
Reposted by Apoorv Khandelwal
Let he who hath not \usepackage[subtle]{savetrees}
December 18, 2024 at 1:27 AM
Reposted by Apoorv Khandelwal
Slides from the tutorial are now posted here!

neurips.cc/media/neurip...
neurips.cc
December 11, 2024 at 4:43 PM
Reposted by Apoorv Khandelwal
“They said it could not be done”. We’re releasing Pleias 1.0, the first suite of models trained on open data (either permissively licensed or uncopyrighted): Pleias-3b, Pleias-1b and Pleias-350m, all based on the two-trillion-token set from Common Corpus.
December 5, 2024 at 4:39 PM
Reposted by Apoorv Khandelwal
Lots of folks talking about scaling LLM inference over this last year

Internally, I’ve been developing and using a library that makes this extremely easy, and I decided to open-source it
Meet the decoding library: github.com/benlipkin/de...

1/7
GitHub - benlipkin/decoding: Composable inference algorithms with LLMs and programmable logic
github.com
November 25, 2024 at 4:19 PM
Reposted by Apoorv Khandelwal
Okay, genius idea to improve the quality of #nlp #arr reviews: literally give gold stars to the best reviewers, visible on OpenReview next to your anonymous ID during the review process.

Here’s why it would work, and why you should RT this fab idea:
November 24, 2024 at 9:01 PM
Reposted by Apoorv Khandelwal
Excited to release Tulu 3! We worked hard to try and make the best open post-training recipe we could, and the results are good!
I was lucky enough to work on almost every stage of the pipeline in one way or another. Some comments + highlights ⬇️
November 21, 2024 at 5:45 PM