mercer-kernel.bsky.social
@mercer-kernel.bsky.social
kernels and graphical models
i find it refreshing compared to the standard comments from established researchers proclaiming "don't work on LLMs" or "don't publish many papers" without acknowledging the complexities and incentives of the real world in which all phd students live.
December 21, 2024 at 8:44 PM
thanks very much for this thoughtful post @kyunghyuncho.bsky.social. i especially liked that your take is very nuanced and acknowledges the opportunity cost students feel when working on a research topic in the modern AI research landscape.
December 21, 2024 at 8:44 PM
<quote start> "Although current deep learning research tends to claim to encompass NLP, I'm (1) much less convinced about the strength of the results, compared to the results in, say, vision; (2) much less convinced, in the case of NLP than, say, vision, that the way to go is to couple huge amounts of data with black-box learning architectures." <quote end>
December 13, 2024 at 7:12 PM
great thoughtful post @jfrankle.com. respectfully, i don't agree that the "bitter lesson is over". sepp hochreiter, no doubt a giant of the field, missed it. below is what another giant of the field (michael i jordan) said about deep learning for NLP in his reddit AMA back in 2015, and here we are ...
December 13, 2024 at 7:12 PM
thanks!
December 11, 2024 at 2:56 AM
this is wonderful. @lampinen.bsky.social do you think that the pre-training of language models is equivalent to meta-learning under some reformulation?

p.s. love your causality work as well.
December 11, 2024 at 12:55 AM
e.g., i like ludwig schmidt's work, which shows that better data is the only thing that substantially moves the needle on adversarial robustness. slideslive.com/39015739/a-d...
A data-centric view on reliable generalization
slideslive.com
December 8, 2024 at 6:02 AM
love this thread @yisongyue.bsky.social! curious if you also believe that treating data as a first-class citizen is another research direction that is very rich and wasn't considered important earlier?
December 8, 2024 at 6:02 AM
totally agree! but do you believe that systems that can train better ML models for us will be able to do better science than us as well? i understand this could be a far-fetched conclusion ...
December 6, 2024 at 8:07 PM
great work! genuinely excited about how this could help neural nets the way your previous work on exploiting linear algebra helped GPs.
December 5, 2024 at 10:41 PM
michael nielsen could be our generation's carl sagan - he just needs to be invited to more podcasts and tv shows.

loved his recent post, so had to say this: https://michaelnotebook.com/optimism/index.html
How to be a wise optimist about science and technology?
michaelnotebook.com
December 5, 2024 at 4:24 AM
great slides! can't help but wonder - are we doing bayesian retrofitting by thinking about in-context learning this way? (see the cool post
probapproxincorrect.substack.com/p/some-thoug... from @desirivanova.bsky.social)
Some thoughts on the role of Bayesianism in the age of Modern AI
It works because it works.
probapproxincorrect.substack.com
December 5, 2024 at 3:16 AM