ModernBERT exhibits instabilities in downstream fine-tuning, while DeBERTaV3 offers more stable training dynamics.
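One way to make this kind of instability visible (not the paper's own code) is to repeat the same fine-tuning recipe under several random seeds and look at the spread of dev scores. A minimal sketch, assuming a standard Hugging Face Trainer setup; the dataset, hyperparameters, and epoch count are illustrative, not the paper's exact recipe:

```python
import numpy as np
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments, set_seed)

def run_once(seed: int) -> float:
    # Seed controls classifier-head init, dropout, and data shuffling,
    # which is where fine-tuning variance typically comes from.
    set_seed(seed)
    tok = AutoTokenizer.from_pretrained("answerdotai/ModernBERT-base")
    model = AutoModelForSequenceClassification.from_pretrained(
        "answerdotai/ModernBERT-base", num_labels=2)
    ds = load_dataset("glue", "sst2")
    ds = ds.map(lambda x: tok(x["sentence"], truncation=True), batched=True)
    args = TrainingArguments(output_dir=f"out/seed{seed}", seed=seed,
                             num_train_epochs=1, report_to=[])
    trainer = Trainer(model=model, args=args, train_dataset=ds["train"],
                      eval_dataset=ds["validation"], tokenizer=tok)
    trainer.train()
    preds = trainer.predict(ds["validation"])
    return float((preds.predictions.argmax(-1) == preds.label_ids).mean())

scores = [run_once(s) for s in range(5)]
# A large stdev across seeds (or occasional collapsed runs) signals instability.
print(f"mean={np.mean(scores):.3f}  stdev={np.std(scores):.3f}")
```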
High-quality pretraining data accelerates convergence but offers minimal gains in final performance.
We suggest that current benchmarks may be saturated, limiting their ability to distinguish model improvements.
When trained on identical data, DeBERTaV3 outperforms ModernBERT on benchmark tasks.
ModernBERT's strength is faster training and inference, but it doesn't surpass DeBERTaV3 in accuracy on NLU tasks.
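The speed side of this trade-off can be sanity-checked with a rough timing loop. This is only a sketch, not the paper's benchmark harness; batch shape, iteration count, and CPU-only execution are placeholder choices:

```python
import time
import torch
from transformers import AutoModel, AutoTokenizer

def time_model(name: str, n_iters: int = 20) -> float:
    """Average seconds per forward pass on a fixed toy batch."""
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModel.from_pretrained(name).eval()
    batch = tok(["An example sentence."] * 32, padding=True,
                return_tensors="pt")
    with torch.no_grad():
        model(**batch)  # warm-up pass
        start = time.perf_counter()
        for _ in range(n_iters):
            model(**batch)
    return (time.perf_counter() - start) / n_iters

for name in ("answerdotai/ModernBERT-base", "microsoft/deberta-v3-base"):
    print(name, f"{time_model(name) * 1000:.1f} ms/batch")
```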
The new models vastly outperform their predecessors and even match domain-specific finetunes 🧑‍⚕️.
[5/8]
- vocabulary of 32,768 tokens
- adds newline and tab characters
- supports emojis with zero-width joiners
- splits numbers into two-digit tokens
- supports French elisions (see the sketch below)
[3/8]
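A quick way to see these tokenizer behaviors, sketched with Hugging Face transformers. The checkpoint id is a placeholder for the released tokenizer, and the exact token splits may differ:

```python
from transformers import AutoTokenizer

# Placeholder id: substitute the actual released tokenizer.
tok = AutoTokenizer.from_pretrained("your-org/new-tokenizer")

# Newline and tab should survive as dedicated tokens.
print(tok.tokenize("line1\n\tline2"))

# Numbers split into two-digit tokens, e.g. "2024" -> ["20", "24"].
print(tok.tokenize("It was 2024."))

# Zero-width-joiner emoji sequences (U+200D) should stay intact.
print(tok.tokenize("🧑\u200d⚕️"))

# French elisions (l', qu', d') get dedicated handling.
print(tok.tokenize("l'article qu'elle a lu"))
```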