ENS Paris. (prev ETH Zurich, Edinburgh, Oxford..)
Working on mathematical foundations/probabilistic interpretability of ML (what NNs learn🤷♂️, disentanglement🤔, king-man+woman=queen?👌…)
We take a step towards unravelling its mystery by explaining why the phenomenon of disentanglement arises in generative latent variable models.
Blog post: carl-allen.github.io/theory/2024/...
With José A. Carrillo, @gabrielpeyre.bsky.social and @pierreablin.bsky.social, we tackle this in our new preprint: A Unified Perspective on the Dynamics of Deep Transformers arxiv.org/abs/2501.18322
ML and PDE lovers, check it out!
Fun project with @confusezius.bsky.social, @zeynepakata.bsky.social, @dimadamen.bsky.social and
@olivierhenaff.bsky.social.
Turns out you can, and here is how: arxiv.org/abs/2411.15099
Really excited to share this work on multimodal pretraining for my first bluesky entry!
🧵 A short and hopefully informative thread:
#ICLR2025 @iclr-conf.bsky.social
What does the actual data show? Fewer than 500 farms per year will pay more tax as a result of this change. Possibly as few as 100.