Head of interpretability research at EleutherAI, but posts are my own views, not Eleuther’s.
it's where Judaism and Christianity got the ideas of ethical monotheism, the afterlife, and final judgment, but without any of their baggage
(no eternal hell, no historically questionable dogmas, etc.)
If we care about the process used to create things, then humans can still have jobs and meaningful lives
The idea that ends can be detached from means is the root of many evils
estimating the causal effect of either learning or unlearning one datapoint (or set of datapoints) on the neural network's behavior on other datapoints
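A minimal sketch of one way such an estimate can work, assuming a first-order, TracIn-style gradient-dot-product approximation (an illustration, not necessarily the method meant here): one SGD step on a training point changes the loss on another point by roughly the learning rate times the inner product of the two points' loss gradients.

```python
import numpy as np

# Sketch of a first-order influence estimate (TracIn-style approximation;
# an assumption, not necessarily the estimator referred to in the post).
# For squared loss L(w; x, y) = 0.5 * (w @ x - y)**2 of a linear model,
# one SGD step on z_train lowers the loss on z_test by approximately
# lr * grad(z_train) @ grad(z_test).

def grad(w, x, y):
    """Gradient of the squared loss of a linear model at (x, y)."""
    return (w @ x - y) * x

def influence(w, z_train, z_test, lr=0.1):
    """Estimated decrease in loss on z_test from one SGD step on z_train."""
    (x_tr, y_tr), (x_te, y_te) = z_train, z_test
    return lr * grad(w, x_tr, y_tr) @ grad(w, x_te, y_te)

rng = np.random.default_rng(0)
w = rng.normal(size=3)
z1 = (rng.normal(size=3), 1.0)
z2 = (rng.normal(size=3), -1.0)

# Influence of a point on itself is lr times a squared gradient norm,
# so it is never negative; cross-influence is symmetric here.
print(influence(w, z1, z1), influence(w, z1, z2))
```

Negating the score gives the corresponding "unlearning" estimate: a gradient *ascent* step on z_train raises the loss on z_test by about the same amount.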
They aren't made of fixed mechanisms.
They have flows of information and intensities of neural activity. They can't be organized into a set of parts with fixed functions.
In the words of Gilles Deleuze, they're bodies without organs (BwO).
https://pytorch.org/docs/stable/generated/torch.nn.functional.embedding_bag.html
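For context, a minimal example of what this op computes (the numbers are purely illustrative): it pools embedding rows per "bag" in one fused call, without materializing the intermediate per-index embedding matrix.

```python
import torch
import torch.nn.functional as F

# Illustrative embedding_bag usage: 5 embeddings of dim 2.
weight = torch.arange(10.0).reshape(5, 2)

indices = torch.tensor([0, 1, 4, 3])  # flat token ids across all bags
offsets = torch.tensor([0, 2])        # bag 0 = indices[0:2], bag 1 = indices[2:4]

# Mean-pool the embeddings within each bag; output shape is (num_bags, dim).
out = F.embedding_bag(indices, weight, offsets, mode="mean")
print(out)  # bag 0 -> mean of rows 0,1 = [1., 2.]; bag 1 -> mean of rows 4,3 = [7., 8.]
```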
First, it sheds light on how deep learning works. The "volume hypothesis" says DL behaves like randomly sampling a network from weight space, conditioned on low training loss. But this can't be tested if we can't measure volume.
And networks which memorize their training data without generalizing have lower local volume (higher complexity) than generalizing ones.
Importance sampling using gradient info helps address this issue by making us more likely to sample outliers.
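As a generic illustration of why importance sampling helps with rare events (a standard textbook setup, not the exact scheme used here): shift the sampling distribution toward the rare region, then reweight each draw by the density ratio to keep the estimate unbiased.

```python
import numpy as np

# Importance-sampling toy example (illustrative, not the paper's scheme):
# estimate P(X > 3) for X ~ N(0, 1), a rare event under naive sampling.
# Proposal: N(3, 1), which lands in the tail about half the time.
# Density ratio p(x)/q(x) for these two Gaussians simplifies to exp(4.5 - 3x).
rng = np.random.default_rng(0)
n = 100_000
x = rng.normal(loc=3.0, scale=1.0, size=n)   # draws from the proposal
weights = np.exp(4.5 - 3.0 * x)              # reweight back to N(0, 1)
est = np.mean((x > 3.0) * weights)
print(est)  # true value is 1 - Phi(3) ~ 0.00135
```

Naive sampling would need millions of draws to see enough tail events; the reweighted proposal hits the tail constantly, so the variance collapses.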
The distance from the anchor to the edge of the region, along the random direction, gives us an estimate of how big (or how probable) the region is as a whole.
You can think of this as a measure of complexity: less probable means more complex.
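A toy version of the ray-based estimate, assuming a star-convex region around the anchor (here an axis-aligned ellipse stands in for a low-loss region; the helper names are made up): the volume equals the mean of r(u)^d over random unit directions u, times the volume of the unit ball, where r(u) is the anchor-to-edge distance.

```python
import math
import numpy as np

# Toy sketch of the ray-based volume estimate (illustrative, not the actual
# implementation). For a star-convex region around an anchor at the origin,
# vol = E_u[r(u)^d] * vol(unit d-ball), with r(u) the distance from the
# anchor to the region's boundary along a random unit direction u.

def boundary_distance(u, axes):
    """Distance from the origin to the edge of the axis-aligned ellipsoid
    {x : sum((x_i / a_i)^2) <= 1} along unit direction u."""
    return 1.0 / np.sqrt(np.sum((u / axes) ** 2))

def estimate_volume(axes, n_rays=20_000, seed=0):
    d = len(axes)
    rng = np.random.default_rng(seed)
    dirs = rng.normal(size=(n_rays, d))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)  # uniform on sphere
    r = np.array([boundary_distance(u, axes) for u in dirs])
    unit_ball = math.pi ** (d / 2) / math.gamma(d / 2 + 1)
    return unit_ball * np.mean(r ** d)

axes = np.array([2.0, 0.5])   # semi-axes; true area = pi * 2 * 0.5 ~ 3.14
print(estimate_volume(axes))
```

For an actual network the boundary isn't analytic, so r(u) would come from a line search along u until the loss leaves the low-loss region; the averaging logic stays the same.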
We crunched the numbers and here's the answer:
you know you're on a roll when arxiv throttles you
Natural selection alone doesn't explain "train-test" or "sim-to-real" generalization, which clearly happens.
At every level of organization, life can zero-shot adapt to novel situations. https://www.youtube.com/watch?v=jJ9O5H2AlWg
But we should accept the existence of perspective-neutral facts about how perspectives relate to one another, to avoid vicious skeptical paradoxes. https://arxiv.org/abs/2410.13819