Lightnews — Scholar-powered news

Oscar Sainz

@osainz.bsky.social

24 followers 55 following 3 posts

Postdoctoral Researcher at the University of the Basque Country (UPV/EHU).

Reposted by Oscar Sainz

HiTZ zentroa (UPV/EHU)

@hitz-zentroa.bsky.social

We have made the #Latxa chatbot available for testing! latxa.hitz.eus

In response to the requests we have received, we have made available to you the most powerful version of Latxa, one that comes close to ChatGPT while producing more polished Basque.

October 31, 2025 at 6:57 AM

Reposted by Oscar Sainz

HiTZ zentroa (UPV/EHU)

@hitz-zentroa.bsky.social

Yesterday one of our researchers, Oscar Sainz (@osainz.bsky.social), was awarded the prize for the best doctoral thesis in Artificial Intelligence by the Asociación Española para la Inteligencia Artificial (AEPIA), the Spanish Association for Artificial Intelligence. Congratulations! 🥳

July 10, 2025 at 7:25 AM

Reposted by Oscar Sainz

BERRIA

@berria.eus

«Hello, I am Latxa. What would you like to know today?» The HiTZ research center at EHU has created a Basque-language chatbot. It has not yet been released to the public, but developers and companies can access it. Texts from BERRIA were used to train Latxa.
t.co/OPVNnBG2xW?utm_...

June 16, 2025 at 9:00 PM

Oscar Sainz

@osainz.bsky.social

Did you know that you can continue pretraining instructed LLMs without losing their instruction-following capabilities?

We did so to teach Basque to Llama models with promising results!

Interestingly, you only need English instructions and target language corpora 🤯

1/3

HiTZ zentroa (UPV/EHU) @hitz-zentroa.bsky.social · Jun 11

[4/7]
Key findings:
1️⃣ Language corpora are essential: models need exposure to plain Basque text
2️⃣ Starting from instructed models beats the standard base→instruct pipeline
3️⃣ English-only instructions work well, but combining them with Basque instructions yields the most robust models

June 11, 2025 at 6:01 PM

Reposted by Oscar Sainz

HiTZ zentroa (UPV/EHU)

@hitz-zentroa.bsky.social

[1/7]
#newHitzPaper

Many languages are underserved by open LLMs, which raises a question: what is the best way to produce open instruction-tuned LLMs for low-resource languages?

We obtained great results for a cost-effective option!

📄Paper: arxiv.org/abs/2506.07597

June 11, 2025 at 10:27 AM
