Lightnews — Scholar-powered news

Ilker Kesen

@ilkerkesen.bsky.social

I'm forcing GPT-5.1 to translate some English text to some target language that I don't know. It decides to use its thinking feature, and then reasons about switching to the target language for the entire output, including explanations and conversational parts. Sigh.

November 14, 2025 at 2:31 PM

Ilker Kesen

@ilkerkesen.bsky.social

This week at #EMNLP2025, I'll present our research on pretraining a multilingual pixel language model. Join the multilinguality session on Friday at 10:30 in Room A301 to learn more about pixel models and their benefits in multilingual settings. (Unfortunately, I’ll be on Zoom.)

November 3, 2025 at 5:39 PM

Reposted by Ilker Kesen

LAGoM NLP

@lagom-nlp.bsky.social

When is a language hard to model? Previous research has suggested that morphological complexity both does and does not play a role, but it does so by relating the performance of language models to corpus statistics of words or subword tokens in isolation.

November 3, 2025 at 11:53 AM

Ilker Kesen

@ilkerkesen.bsky.social

📢New preprint: We introduce 📏Cetvel, a unified benchmark for evaluating language understanding, generation, and cultural capacity of LLMs in Turkish🇹🇷 #AI #LLM #NLProc

Joint work with Abrek Er, @gozdegulsahin.bsky.social, @aykuterdem.bsky.social from KUIS AI Center.

September 5, 2025 at 1:40 PM

Ilker Kesen

@ilkerkesen.bsky.social

Excited to share that our paper "Multilingual Pretraining for Pixel Language Models" has been accepted to the #EMNLP2025 main conference! Please see the thread below and the paper itself for more details.

Ilker Kesen @ilkerkesen.bsky.social · Jun 4

Announcing our recent work “Multilingual Pretraining for Pixel Language Models”! We introduce PIXEL-M4, a pixel language model pretrained on four visually & linguistically diverse scripts: English, Hindi, Ukrainian & Simplified Chinese. #NLProc

August 21, 2025 at 12:42 PM

Ilker Kesen

@ilkerkesen.bsky.social

Announcing our recent work “Multilingual Pretraining for Pixel Language Models”! We introduce PIXEL-M4, a pixel language model pretrained on four visually & linguistically diverse scripts: English, Hindi, Ukrainian & Simplified Chinese. #NLProc

June 4, 2025 at 1:45 PM

Reposted by Ilker Kesen

Isra Salazar

@israsalazar.bsky.social

Today we are releasing Kaleidoscope 🎉

A comprehensive multimodal & multilingual benchmark for VLMs! It contains real questions from exams in different languages.

🌍 20,911 questions and 18 languages
📚 14 subjects (STEM → Humanities)
📸 55% multimodal questions

April 10, 2025 at 10:31 AM
