walterblueu.bsky.social
@walterblueu.bsky.social
Dave is a good dude, sad to see him go on a personal level
December 2, 2024 at 4:29 PM
At least he beat the shit out of the air near Jake Paul. That's more than I've ever done.
November 16, 2024 at 1:32 PM
But in all seriousness, the main breakthrough that led to the architecture behind LLMs was a bit of a happy accident, as Bob Ross would have said. Get a bunch of smart, well-funded people flailing away at a really hard problem and sometimes a solution pops out: arxiv.org/abs/1706.03762 (rough sketch of the core attention op below)
Attention Is All You Need
arxiv.org
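For anyone who wants to see the core of that paper concretely: the whole thing hangs on scaled dot-product attention, softmax(QK^T / sqrt(d_k))V. Here's a minimal numpy sketch of just that one operation, with made-up toy shapes and random data, not the full multi-head Transformer:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V -- the operation the paper is named after."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                    # similarity of every query to every key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax over the keys
    return weights @ V                                 # weighted sum of the values

# toy example: 4 tokens with 8-dimensional embeddings (arbitrary numbers)
rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
print(scaled_dot_product_attention(Q, K, V).shape)     # (4, 8)
```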
November 15, 2024 at 3:45 AM
I think they tried Pepsi in that MIT article; don't think they've stumbled onto Coke yet, though.
November 15, 2024 at 3:23 AM
It's certainly looking like larger and larger LLMs won't get to AGI. I was reading an article on @theinformation.bsky.social today about how companies are looking at other strategies, like test-time training (arxiv.org/abs/2411.07279, toy sketch of the idea below). Seems like there's a new development every week at this point; hard to make predictions.
The Surprising Effectiveness of Test-Time Training for Abstract Reasoning
arxiv.org
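For anyone wondering what "test-time training" actually means: as I read it, the rough idea is to take a few gradient steps on each test problem's own demonstration examples before answering that problem (the paper applies this to ARC-style reasoning puzzles). Below is only a toy PyTorch sketch with a made-up linear model and random data, not their actual recipe:

```python
import copy
import torch
import torch.nn as nn

def test_time_train(base_model, demo_x, demo_y, steps=10, lr=1e-3):
    """Clone the model and fine-tune the copy on one test problem's
    demonstration pairs, then use that adapted copy to answer it."""
    model = copy.deepcopy(base_model)            # keep the base weights untouched
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(steps):
        opt.zero_grad()
        loss = loss_fn(model(demo_x), demo_y)
        loss.backward()
        opt.step()
    return model

# toy stand-in: a tiny model plus one problem's demonstration pairs
base = nn.Linear(16, 16)
demo_x, demo_y = torch.randn(3, 16), torch.randn(3, 16)
adapted = test_time_train(base, demo_x, demo_y)
query = torch.randn(1, 16)
prediction = adapted(query)                      # answer with the per-problem adapted copy
```

The point is that the adaptation is thrown away after each problem, so it's extra compute at inference time rather than a bigger model.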
November 15, 2024 at 1:42 AM