Patrick Haller
@phmaker.bsky.social
PhD student at HU Berlin | parameter- and sample-efficient language modeling
Are transformers really all we need? I doubt it. We tested alternative backbones for language models in low-resource scenarios — #Mamba, #xLSTM, and #HGRN2 — and they work surprisingly well!

📄 Paper: aclanthology.org/2024.conll-b...

Thanks for being part of the #BabyLM Challenge! 👶
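For anyone curious what "alternative backbone" means in practice, here is a minimal sketch (not from the paper) of loading a non-transformer language model such as Mamba as a drop-in causal LM. It assumes the Hugging Face transformers library (v4.39+) and the publicly available state-spaces/mamba-130m-hf checkpoint; the models trained for the BabyLM submission may differ.

```python
# Minimal sketch: using a recurrent/SSM backbone (Mamba) as a causal LM.
# Assumes transformers >= 4.39; checkpoint name is an illustrative example,
# not one of the paper's trained models.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "state-spaces/mamba-130m-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Generate a short continuation to confirm the backbone works end to end.
inputs = tokenizer("Are transformers really all we need?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```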
BabyHGRN: Exploring RNNs for Sample-Efficient Language Modeling
Patrick Haller, Jonas Golde, Alan Akbik. The 2nd BabyLM Challenge at the 28th Conference on Computational Natural Language Learning. 2024.
aclanthology.org
March 11, 2025 at 10:05 AM