Lightnews — Scholar-powered news

Reposted by Sasha Rush

Naomi Saphra

@nsaphra.bsky.social

If you're in Berkeley or like a nice streamed talk, I'm about to give a talk at the Simons Institute today: "You Know It Or You Don’t: Compositionality and Phase Transitions in LMs". Tune in at 4 PM Pacific!

simons.berkeley.edu

February 5, 2025 at 7:49 PM

Sasha Rush

@srushnlp.bsky.social

What to know about DeepSeek

youtu.be/0eMzc-WnBfQ?...

In which we attempt to figure out MoE, o1, scaling, tech reporting, modern semiconductors, microeconomics, and international geopolitics.

How DeepSeek Changes the LLM Story

YouTube video by Sasha Rush 🤗

youtu.be

February 4, 2025 at 3:41 PM

Sasha Rush

@srushnlp.bsky.social

For reasons, I find myself thinking a lot about the history of US/USSR Cold War science, particularly in applied math. Does anyone have a recommendation for a good book on this topic?

January 30, 2025 at 5:04 PM

Sasha Rush

@srushnlp.bsky.social

10 short videos about LLM infrastructure to help you appreciate Pages 12-18 of the DeepSeek-v3 paper (arxiv.org/abs/2412.19437)

www.youtube.com/watch?v=76gu...

Flash LLMs: Pipeline Parallel

YouTube video by Sasha Rush 🤗

www.youtube.com

January 7, 2025 at 3:01 PM

Sasha Rush

@srushnlp.bsky.social

I tried reading this book, and I was just shocked at how little insight it had, and its sheer inability to focus. The fact that it is being recommended to policy makers...

www.gatesnotes.com/The-Coming-W...

www.gatesnotes.com

January 6, 2025 at 10:12 PM

Sasha Rush

@srushnlp.bsky.social

I'm going to do a live coding stream for the next couple of hours. We'll start by running through some WebGPU tutorials. Can also talk about some AI stuff.

www.youtube.com/watch?v=sqKq...

Python + WebGPU

YouTube video by Sasha Rush 🤗

www.youtube.com

December 27, 2024 at 7:00 PM

Reposted by Sasha Rush

Lewis Tunstall

@lewtun.bsky.social

We outperform Llama 70B with Llama 3B on hard math by scaling test-time compute 🔥

How? By combining step-wise reward models with tree search algorithms :)

We're open sourcing the full recipe and sharing a detailed blog post 👇

December 16, 2024 at 5:08 PM

Sasha Rush

@srushnlp.bsky.social

As coding LLMs get faster at inference, iterating verification-in-the-loop tests becomes the bottleneck for coding agents. Probably need quite different programming systems for these settings, or even things like "batchable" runtimes, whatever that means.

December 11, 2024 at 6:43 PM

Reposted by Sasha Rush

antonio vergari ⚔️ short-circuiting

@nolovedeeplearning.bsky.social

We organised a lively poster fest with many students rehearsing for the upcoming @neuripsconf.bsky.social next week and others discussing their cool works!

Thanks to #GAIL, the #Generative #AI lab in #Edinburgh for sponsoring the event!

December 6, 2024 at 8:43 PM

Sasha Rush

@srushnlp.bsky.social

December 6, 2024 at 2:28 AM

Reposted by Sasha Rush

Andrew Gordon Wilson

@andrewgwils.bsky.social

I wanted to make my first post about a project close to my heart. Linear algebra is an underappreciated foundation for machine learning. Our new framework CoLA (Compositional Linear Algebra) exploits algebraic structure arising from modelling assumptions for significant computational savings! 1/4

December 5, 2024 at 3:15 PM

Reposted by Sasha Rush

Margaret Mitchell

@mmitchell.bsky.social

Proud of my amazing colleague @sashamtl.bsky.social for her much deserved recognition on advancing the science of AI energy use.
BBC100: www.bbc.co.uk/news/resourc...
Fast Company: www.fastcompany.com/91233692/why...
Sasha has been working tirelessly moving things fwd--endurance & brilliance in one.

Screenshot of BBC 100 picture of Sasha and blurb; linked in post.

December 3, 2024 at 6:47 PM

Reposted by Sasha Rush

Kempner Institute at Harvard University

@kempnerinstitute.bsky.social

NEW: we have an exciting opportunity for a tenure-track professor at the #KempnerInstitute and the John A. Paulson School of Engineering and Applied Sciences (SEAS). Read the full description & apply today: academicpositions.harvard.edu/postings/14362
#ML #AI

December 3, 2024 at 1:24 AM

Reposted by Sasha Rush

Nathan Lambert

@natolambert.bsky.social

We're hiring another predoctoral researcher for my team at Ai2/OLMo next year. The goal of this position is to mentor and grow future academic stars of NLP/AI over 1-2 years before grad school.

This ends up being people done with BS or MS who want to continue to a PhD soon.
https://buff.ly/49nuggo

December 3, 2024 at 11:52 PM

Sasha Rush

@srushnlp.bsky.social

Is there a community that writes RL-first programming languages? Something like (Num)Pyro that takes seriously the idea of separating the policy specification from the learning process.

December 2, 2024 at 2:24 PM

Reposted by Sasha Rush

Todd Morrill

@toddmorrill.bsky.social

ICYMI, @srushnlp.bsky.social recently gave a nice talk speculating about the methods/data used to train OpenAI's o1 model. The key idea seems to be scaling up chain-of-thought (CoT) generation using auxiliary verifier models that can give feedback on the correctness of the generation. 1/N

Sasha Rush @srushnlp.bsky.social · Nov 12

Talk: Speculations on Test-Time Scaling

A tutorial on the technical aspects behind OpenAI's o1 and open research questions in this space.

youtu.be/6PEJ96k1kiw
Slides+bibliography: github.com/srush/awesom...

Speculations on Test-Time Scaling (o1)

YouTube video by Sasha Rush 🤗

youtu.be

December 1, 2024 at 3:24 AM

Reposted by Sasha Rush

François Fleuret

@francois.fleuret.org

My deep learning course at the University of Geneva is available on-line. 1000+ slides, ~20h of screen-casts. Full of examples in PyTorch.

fleuret.org/dlc/

And my "Little Book of Deep Learning" is available as a phone-formatted pdf (nearing 700k downloads!)

fleuret.org/lbdl/

November 26, 2024 at 6:15 AM

Reposted by Sasha Rush

William Merrill

@lambdaviking.bsky.social

A starter pack for research on Formal Languages and Neural Networks!

go.bsky.app/eKG7Ua

November 26, 2024 at 4:02 PM

Reposted by Sasha Rush

International Conference on 3D Vision

@3dvconf.bsky.social

Hello Bluesky! 🔵 We start our account by having our third guest for Ask Me Anything session #3DV2025AMA!

Noah Snavely @snavely.bsky.social from Cornell & Google DeepMind! 🌟

🕒 You now have 24 HOURS to ask him anything — drop your questions in the comments below!

Keep it engaging but respectful!

November 26, 2024 at 3:33 PM

Sasha Rush

@srushnlp.bsky.social

Rare personal tweet:
Subletting our furnished apartment in Brooklyn for the spring at a significant discount. It's quite nice, in a fun location, and under price. Email me if you are interested, and I will send pictures.

November 25, 2024 at 8:39 PM

Sasha Rush

@srushnlp.bsky.social

Several incredible NeurIPS tutorials this year. Worth navigating through the Swifties.

Edoardo Ponti @edoardo-ponti.bsky.social · Nov 20

Another nano gem from my amazing student
Piotr Nawrot!

A repo & notebook on sparse attention for efficient LLM inference: github.com/PiotrNawrot/...

This will also feature in my #NeurIPS 2024 tutorial "Dynamic Sparsity in ML" with André Martins: dynamic-sparsity.github.io Stay tuned!

A sparse mask of attention scores based on VerticalAndSlashAttention and a plot of loss vs sparsity ratio for various methods.

November 21, 2024 at 10:01 PM

Reposted by Sasha Rush

Edoardo Ponti

@edoardo-ponti.bsky.social

Another nano gem from my amazing student
Piotr Nawrot!

A repo & notebook on sparse attention for efficient LLM inference: github.com/PiotrNawrot/...

This will also feature in my #NeurIPS 2024 tutorial "Dynamic Sparsity in ML" with André Martins: dynamic-sparsity.github.io Stay tuned!

November 20, 2024 at 12:51 PM

Reposted by Sasha Rush

Nathan Lambert

@natolambert.bsky.social

I've spent the last two years scouring all available resources on RLHF specifically and post training broadly. Today, with the help of a totally cracked team, we bring you the fruits of that labor — Tülu 3, an entirely open frontier model post training recipe. We beat Llama 3.1 Instruct.

Thread.

November 21, 2024 at 5:01 PM

Sasha Rush

@srushnlp.bsky.social

Discrete diffusion has become a very hot topic again this year. Dozens of interesting ICLR submissions and some exciting attempts at scaling. Here's a bibliography on the topic from the Kuleshov group (my open office neighbors).

github.com/kuleshov-gro...

GitHub - kuleshov-group/awesome-discrete-diffusion-models: A curated list for awesome discrete diffusion models resources.

github.com

November 21, 2024 at 6:39 PM

Sasha Rush

@srushnlp.bsky.social

A disproportionate number of successful CS academics have some intense cardio hobby. Took me some years to understand.

James Medlock @jdcmedlock.bsky.social · Nov 19

Every time I see someone post this image it goes viral

November 21, 2024 at 12:37 AM
