Anthony Fuller
anthonyfuller.bsky.social
PhD Student at Carleton University (Ottawa, Canada)
https://antofuller.github.io/
Awesome, thanks for the explanation!
February 4, 2025 at 4:55 PM
Cool work! I think the position encoding method looks similar to DeBERTa’s disentangled attention: arxiv.org/abs/2006.03654
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
Recent progress in pre-trained neural language models has significantly improved the performance of many natural language processing (NLP) tasks. In this paper we propose a new model architecture DeBE...
arxiv.org
February 4, 2025 at 2:51 PM