https://alemiaschi.github.io/
📅 Training data release: 22 September 2026
🧵(5/5)
✅ Morphemes are recognized better than meaningless substrings.
✅ Awareness emerges early for suffixes and roots, later for non-morphemic units.
✅ Productivity, word frequency and tokenization shape this ability.
🧵(4/5)
- substring position and length;
- morphemic vs. non-morphemic substrings;
- pre-training checkpoints.
🧵(3/5)
🧵(2/5)
📂 Code & Dataset: github.com/snizio/Lexic...
🧵(5/5)
🧵(4/5)
✅ A new framework to assess lexical abilities across tasks & word types
✅ A lexical resource for Italian with definitions & examples
✅ Analysis of model size, multilinguality & linguistic features
✅ Human eval via the Optimal Innovation Hypothesis
🧵(3/5)
🧵(2/5)