@kmahowald.bsky.social, and it is the outcome of Finlay's Bachelor's thesis! Catch him presenting it at #EMNLP2025 :)
Paper: arxiv.org/abs/2509.26643
Code: github.com/Tr1ple-F/con...
* A slow-reconvergence phase, where predictions slowly become more similar again (especially in larger models).
* A uniform phase, where all seeds output nearly-uniform distributions.
* A sharp-convergence phase, where models align, largely due to unigram frequency learning.
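
One way to picture these phases is to track how far apart the seeds' next-token distributions are at each checkpoint. Below is a minimal sketch, assuming each seed gives a probability distribution over the same vocabulary and using mean pairwise KL divergence as the similarity measure. The choice of KL and the function names `kl` and `mean_pairwise_kl` are illustrative assumptions, not taken from the paper or its code.

```python
import math
from itertools import permutations


def kl(p, q, eps=1e-12):
    """KL(p || q) for two distributions over the same vocabulary.

    eps is a small smoothing constant (an assumption of this sketch)
    that avoids log(0) when a probability is exactly zero.
    """
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def mean_pairwise_kl(dists):
    """Average KL over all ordered pairs of seeds.

    dists: list of per-seed next-token distributions (lists of floats).
    Values near zero mean the seeds agree; larger values mean they diverge.
    """
    pairs = list(permutations(range(len(dists)), 2))
    return sum(kl(dists[i], dists[j]) for i, j in pairs) / len(pairs)


# Hypothetical toy data: three seeds that all output a uniform distribution,
# and two seeds that put most of their mass on different tokens.
uniform_seeds = [[0.25, 0.25, 0.25, 0.25] for _ in range(3)]
diverged_seeds = [[0.7, 0.1, 0.1, 0.1], [0.1, 0.7, 0.1, 0.1]]

print(mean_pairwise_kl(uniform_seeds))
print(mean_pairwise_kl(diverged_seeds))
```

Under this measure, the uniform phase would show a value near zero, and a convergence phase would show the value falling across checkpoints.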
Paper: arxiv.org/abs/2507.08802