walterblueu.bsky.social
@walterblueu.bsky.social
Dave is a good dude, sad to see him go on a personal level
December 2, 2024 at 4:29 PM
At least he beat the shit out of the air near Jake Paul. That's more than I've ever done.
November 16, 2024 at 1:32 PM
But in all seriousness, the main breakthrough that led to the architecture behind LLMs was a bit of a happy accident, as Bob Ross would have said. Get a bunch of smart, well-funded people flailing away at a really hard problem and sometimes a solution pops out: arxiv.org/abs/1706.03762 (rough sketch of the core attention op below)
Attention Is All You Need
arxiv.org
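For anyone who wants to see the core of that paper concretely: the whole thing hangs on scaled dot-product attention, softmax(QK^T / sqrt(d_k))V. Here's a minimal numpy sketch of just that one operation, with made-up toy shapes and random data, not the full multi-head Transformer:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V -- the operation the paper is named after."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                    # similarity of every query to every key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax over the keys
    return weights @ V                                 # weighted sum of the values

# toy example: 4 tokens with 8-dimensional embeddings (arbitrary numbers)
rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
print(scaled_dot_product_attention(Q, K, V).shape)     # (4, 8)
```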
November 15, 2024 at 3:45 AM
I think they tried Pepsi in that MIT article; don't think they've stumbled onto Coke yet, though.
November 15, 2024 at 3:23 AM
It's certainly looking like larger and larger LLMs won't get to AGI. I was reading an article on @theinformation.bsky.social today about how companies are looking at other strategies, like test-time training (arxiv.org/abs/2411.07279, toy sketch of the idea below). Seems like there's a new development every week at this point; hard to make predictions.
The Surprising Effectiveness of Test-Time Training for Abstract Reasoning
arxiv.org
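For anyone wondering what "test-time training" actually means: as I read it, the rough idea is to take a few gradient steps on each test problem's own demonstration examples before answering that problem (the paper applies this to ARC-style reasoning puzzles). Below is only a toy PyTorch sketch with a made-up linear model and random data, not their actual recipe:

```python
import copy
import torch
import torch.nn as nn

def test_time_train(base_model, demo_x, demo_y, steps=10, lr=1e-3):
    """Clone the model and fine-tune the copy on one test problem's
    demonstration pairs, then use that adapted copy to answer it."""
    model = copy.deepcopy(base_model)            # keep the base weights untouched
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(steps):
        opt.zero_grad()
        loss = loss_fn(model(demo_x), demo_y)
        loss.backward()
        opt.step()
    return model

# toy stand-in: a tiny model plus one problem's demonstration pairs
base = nn.Linear(16, 16)
demo_x, demo_y = torch.randn(3, 16), torch.randn(3, 16)
adapted = test_time_train(base, demo_x, demo_y)
query = torch.randn(1, 16)
prediction = adapted(query)                      # answer with the per-problem adapted copy
```

The point is that the adaptation is thrown away after each problem, so it's extra compute at inference time rather than a bigger model.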
November 15, 2024 at 1:42 AM