https://web.mit.edu/phillipi/
We do the simplest thing: just train a model (e.g., a next-token predictor) on all elements of the concatenated dataset [X,Y,Z].
You end up with a better model of dataset X than if you had trained on X alone!
6/9
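A minimal, self-contained illustration of the idea (my own toy example, not the thread's actual experiment): train a count-based bigram next-token model on X alone versus on the concatenation [X, Y, Z], and compare how many transitions each model covers.

```python
# Toy sketch: "just train on the concatenation" with a bigram counter.
# X, Y, Z here are made-up corpora, purely for illustration.
from collections import Counter

def train_bigram(corpus):
    """Count bigram transitions; returns P(next | current) estimates."""
    counts = Counter(zip(corpus, corpus[1:]))
    totals = Counter(corpus[:-1])
    return {pair: c / totals[pair[0]] for pair, c in counts.items()}

X = list("the cat sat on the mat ")
Y = list("the dog sat on the log ")
Z = list("a cat and a dog met ")

model_X = train_bigram(X)          # trained on X alone
model_XYZ = train_bigram(X + Y + Z)  # simplest thing: just concatenate

# The joint model covers every transition the X-only model has, plus
# more; the thread's claim is that this extra data can make it a
# *better* model of X itself, not just of Y and Z.
print(len(model_X), len(model_XYZ))
```

Of course a bigram counter won't show the interesting transfer effects; the point is only the setup: one model, one concatenated dataset, evaluated on X.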
5/9
If you ask it to “imagine hearing,” its representation becomes more like that of an auditory model.
3/9
We are interested in identifying commonalities between different models and modalities, and providing unifications.
2/9
chatgpt.com/share/689364...
This talk says: don't worry, we don't need to equip NNs with special distance measures — these structures emerge "for free" at scale!
I've given this one before, but there will be a bit that's new.
This one is about distance between settings of the *weights*.
Its answer: incorporate knowledge about the neural net *architecture* into how you measure distance.
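One way to see why architecture matters for weight-space distance (a toy sketch of my own, not the talk's actual metric): in a two-layer ReLU net, scaling the first layer by α and the second by 1/α leaves the function unchanged, so a sensible distance should treat such weight settings as identical — which naive Euclidean distance does not.

```python
# Toy example: naive vs. architecture-aware distance between weights.
# The "canonicalize then compare" trick below is illustrative only.
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))  # first-layer weights
W2 = rng.normal(size=(2, 4))  # second-layer weights

def euclidean(a, b):
    """Naive layer-wise Euclidean distance between weight tuples."""
    return sum(np.linalg.norm(x - y) for x, y in zip(a, b))

def canon(w):
    """Fix the ReLU rescaling symmetry: unit-norm first layer,
    push the scale into the second layer."""
    A, B = w
    s = np.linalg.norm(A)
    return A / s, B * s

def arch_aware(a, b):
    return euclidean(canon(a), canon(b))

net_a = (W1, W2)
net_b = (2.0 * W1, W2 / 2.0)  # computes the same function under ReLU

print(euclidean(net_a, net_b))   # naive metric: far apart
print(arch_aware(net_a, net_b))  # symmetry-aware metric: zero
```

The general point: knowing the architecture tells you which directions in weight space are symmetries, and a good distance should quotient them out.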
This paper is about distance between embeddings.
It says: measure how humans perceive distance, then adjust a neural net to match.
This improves transfer to lots of tasks (but not all tasks).
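A sketch of what "match human-perceived distance" can mean in practice (hypothetical data and scoring function, not the paper's exact objective): collect human triplet judgments of the form "a is closer to p than to n," then score how often the embedding's distances agree — and fine-tune the net to raise that agreement.

```python
# Toy example: scoring an embedding against human triplet judgments.
import numpy as np

def triplet_agreement(embed, triplets):
    """Fraction of human triplets (a, p, n) for which the embedding
    places a closer to p than to n."""
    agree = 0
    for a, p, n in triplets:
        d_ap = np.linalg.norm(embed[a] - embed[p])
        d_an = np.linalg.norm(embed[a] - embed[n])
        agree += d_ap < d_an
    return agree / len(triplets)

# Made-up 2-D embedding of four items, and made-up human judgments.
embed = {"cat": np.array([0.0, 0.0]), "dog": np.array([1.0, 0.0]),
         "car": np.array([5.0, 5.0]), "bus": np.array([6.0, 5.0])}
triplets = [("cat", "dog", "car"),  # humans: cat ~ dog, not car
            ("bus", "car", "dog")]  # humans: bus ~ car, not dog

print(triplet_agreement(embed, triplets))  # → 1.0
```

A perceptually aligned model would maximize this kind of agreement; the paper's finding is that doing so helps transfer on many (though not all) downstream tasks.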
They are all about the following question: how to characterize the geometry of deep learning problems, and in particular how to measure *distance*?
Each paper/talk gives a rather different answer, detailed below:
A few items I wasn't sure where to put. You could break it down differently. What did I get wrong?
A big dream in AI is to create world models of sufficient quality that you can train agents within them.
Classic simulators lack visual diversity and realism. GenAI lacks physical accuracy. But combining the two can work pretty well!
Paper: arxiv.org/abs/2411.00083