Shahroz Butt
@shahrozbutt.bsky.social
BS student @UAF | exploring deep learning
Reposted by Shahroz Butt
Automatic differentiation in forward mode computes derivatives by breaking functions down into elementary operations and propagating derivatives alongside values. It's efficient for functions with fewer inputs than outputs and for Jacobian-vector products, using, for instance, dual numbers.
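A minimal sketch of the idea, assuming a toy dual-number class (the `Dual` name and the example function are illustrative, not from the post):

```python
import math


class Dual:
    """A dual number a + b*eps with eps**2 == 0; `deriv` carries the tangent."""

    def __init__(self, value, deriv=0.0):
        self.value = value   # primal value
        self.deriv = deriv   # derivative propagated alongside the value

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # product rule: d(uv) = u*dv + v*du
        return Dual(self.value * other.value,
                    self.value * other.deriv + self.deriv * other.value)

    __rmul__ = __mul__


def sin(x):
    # each elementary operation propagates both the value and its derivative
    return Dual(math.sin(x.value), math.cos(x.value) * x.deriv)


def f(x):
    # example composition of elementary operations: f(x) = x**2 + sin(x)
    return x * x + sin(x)


# Seed the input tangent with 1.0 to get df/dx at x = 2.0 in a single forward pass
out = f(Dual(2.0, 1.0))
print(out.value)  # f(2.0)
print(out.deriv)  # f'(2.0) == 2*2.0 + cos(2.0)
```

Seeding one input tangent at a time yields one column of the Jacobian per pass, which is why forward mode pays off when inputs are few.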
December 13, 2024 at 6:00 AM
Reposted by Shahroz Butt
#ai, #ml or #llm people here, what do you think about the “super weight” paper?

TLDR: deleting a single weight from a 7B model makes it completely incoherent, destroying its ability to generate legible text.

arxiv.org/pdf/2411.07191
December 1, 2024 at 7:06 AM
Reposted by Shahroz Butt
No one can explain stochastic gradient descent better than this panda.
Alt: a panda bear is rolling around in the grass in a zoo enclosure.
November 24, 2024 at 3:04 PM
Reposted by Shahroz Butt
I noticed a lot of starter packs skewed towards faculty/industry, so I made one of just NLP & ML students: go.bsky.app/vju2ux

Students do different research, go on the job market, and recruit other students. Ping me and I'll add you!
November 23, 2024 at 7:54 PM