mgaido91.bsky.social
@mgaido91.bsky.social
Reposted
🔍 We are studying how AI is used in Italy, and to do so we have built a survey!

👉 bit.ly/sondaggio_ai...

(It's anonymous, takes ~10 minutes, and if you take part or share it around, you'll help us a lot 🙏)

We are also interested in reaching people who don't work in AI and aren't AI experts!
June 3, 2025 at 10:24 AM
Reposted
🚀 New tech report out! Meet FAMA, our open-science speech foundation model family for both ASR and ST in 🇬🇧 English and 🇮🇹 Italian.

The models are live and ready to try on @hf.co:
🔗 huggingface.co/collections/...

📄 Preprint: arxiv.org/abs/2505.22759

#ASR #ST #OpenScience #MultilingualAI
FAMA - a FBK-MT Collection: The First Large-Scale Open-Science Speech Foundation Model for English and Italian (huggingface.co)
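For anyone who wants to try it right away, here is a minimal sketch of transcribing audio with one of the checkpoints through the transformers ASR pipeline. The model id and loading options below are assumptions for illustration; check the model cards in the collection for the exact usage.

```python
# Minimal sketch, not official usage: the model id below is assumed for
# illustration; see the FBK-MT collection and model cards for exact names.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="FBK-MT/fama-small",   # hypothetical id; check the collection page
    trust_remote_code=True,      # may be needed for custom architectures
)

print(asr("sample_italian_speech.wav")["text"])
```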
May 30, 2025 at 3:35 PM
Reposted
📢 Come and join our group!
We offer a fully funded 3-year PhD position:

📔 Automatic translation with large multimodal models: iecs.unitn.it/education/ad...

📍Full details for application: iecs.unitn.it/education/ad...

📅 Deadline May 12, 2025

#NLProc #FBK
Reserved topic scholarships | Doctoral Program - Information Engineering and Computer Science (iecs.unitn.it)
April 22, 2025 at 10:13 AM
Interesting to see multimodal LLMs built by combining modality encoders with an LLM through adapters, as in the SFM+LLM paradigm, independently for each modality. This modularity may ease the creation of more multimodal LLMs through collaborations between single-modality experts. arxiv.org/abs/2501.04561
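As a rough sketch of what that modularity looks like in code (all names, shapes, and wiring here are hypothetical, not taken from the paper): each modality contributes a frozen encoder plus a small trained adapter that maps its features into the LLM embedding space, and the projected features are prepended to the text embeddings.

```python
# Hypothetical sketch of the modular SFM+LLM-style design: per-modality
# frozen encoders + small trained adapters feeding one shared LLM.
import torch
import torch.nn as nn

class ModalityAdapter(nn.Module):
    """Maps encoder features into the LLM embedding space."""
    def __init__(self, enc_dim, llm_dim):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(enc_dim, llm_dim), nn.GELU(), nn.Linear(llm_dim, llm_dim)
        )

    def forward(self, feats):
        return self.proj(feats)  # (batch, frames, llm_dim)

class ModularMultimodalLM(nn.Module):
    """One LLM, plus an independently trained (encoder, adapter) pair per modality."""
    def __init__(self, llm, llm_dim, encoders):
        # encoders: {"speech": (encoder_module, enc_dim), "vision": ...}
        super().__init__()
        self.llm = llm
        self.encoders = nn.ModuleDict({m: e for m, (e, _) in encoders.items()})
        self.adapters = nn.ModuleDict(
            {m: ModalityAdapter(d, llm_dim) for m, (_, d) in encoders.items()}
        )

    def forward(self, modality, raw_input, text_embeds):
        with torch.no_grad():                      # encoders stay frozen
            feats = self.encoders[modality](raw_input)
        prefix = self.adapters[modality](feats)    # only the adapter is trained
        # Prepend modality features to text embeddings and run the LLM.
        return self.llm(inputs_embeds=torch.cat([prefix, text_embeds], dim=1))
```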
April 16, 2025 at 1:20 PM
Reposted
📢 The evaluation period of the Instruction-Following task at
@iwslt.bsky.social has just started!

🖥️ Consider submitting your speech-to-text system!

The outputs can be easily uploaded to the SPEECHM platform developed in the Meetween project (www.meetween.eu)!
➡️ iwslt2025.speechm.cloud.cyfronet.pl
April 1, 2025 at 12:39 PM
Reposted
While we look forward to a sunny Geneva, why wait to join the conversation?

We’ve created a starter pack for our #GITT2025 friends!
🕵️ Follow researchers working on gender bias in MT
💬 Stay up to date and dive into the discussion!

All info at sites.google.com/tilburgunive...
February 28, 2025 at 9:22 AM
Very interesting to see more and more methods for closing the length mismatch between speech and text sequences (aka the length adapter -- see arxiv.org/abs/2402.12025) in SFM+LLM models! This one, merging CTC and Q-Former, sounds very cool to me:
arxiv.org/abs/2412.01145
AlignFormer: Modality Matching Can Achieve Better Zero-shot Instruction-Following Speech-LLM
Integrating speech into LLM (speech-LLM) has gained increased attention recently. The mainstream solution is to connect a well-trained speech encoder and LLM with a neural adapter. However, the lengt...
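For a feel of the CTC side, here is a generic sketch of CTC-based compression, one popular length-adapter recipe (not AlignFormer's exact method): frames sharing the same greedy CTC label are averaged, and blank frames are dropped, shrinking the speech sequence toward text length.

```python
# Generic illustration of CTC-based length compression, not AlignFormer's
# actual code: merge same-label frame runs, drop blanks.
import torch

def ctc_compress(feats, ctc_logits, blank=0):
    """feats: (T, d) encoder states; ctc_logits: (T, vocab) CTC outputs."""
    labels = ctc_logits.argmax(dim=-1)  # greedy CTC prediction per frame
    compressed, start = [], 0
    for t in range(1, len(labels) + 1):
        if t == len(labels) or labels[t] != labels[start]:
            if labels[start] != blank:          # drop blank segments
                compressed.append(feats[start:t].mean(dim=0))
            start = t
    # Fall back to a single pooled vector if everything was blank.
    return torch.stack(compressed) if compressed else feats.mean(dim=0, keepdim=True)

# Example: 80 speech frames squeezed to a handful of label-level vectors.
feats, logits = torch.randn(80, 256), torch.randn(80, 32)
print(ctc_compress(feats, logits).shape)
```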
February 14, 2025 at 10:34 AM
Reposted
Next up: simultaneous speech translation!

🎯 Goal: to explore ways of translating speech into another language in real time, as in simultaneous interpreting.

🔗 Link: iwslt.org/2025/simulta...
Simultaneous track (iwslt.org)
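To give a flavor of the setting, here is a toy sketch of the classic wait-k policy, one standard simultaneous strategy (not the track's required approach); `translate_prefix` is a hypothetical stand-in for any incremental translation model whose committed prefix stays stable across calls.

```python
# Toy wait-k policy: read k source segments, then alternate read/write.
# `translate_prefix` is hypothetical; a real system must keep already
# emitted tokens stable when it re-translates a growing prefix.
def wait_k_stream(source_segments, translate_prefix, k=3):
    """Yield target tokens while the source is still arriving."""
    read, written = [], 0
    for segment in source_segments:
        read.append(segment)                 # READ one more source segment
        if len(read) >= k:
            hypo = translate_prefix(read)    # re-translate the current prefix
            # WRITE at most one token per READ once k segments are in.
            while written < len(hypo) and written <= len(read) - k:
                yield hypo[written]
                written += 1
    for token in translate_prefix(read)[written:]:   # flush at source end
        yield token

# Toy usage: "translate" by upper-casing each segment read so far.
demo = lambda prefix: [s.upper() for s in prefix]
print(list(wait_k_stream(["guten", "morgen", "liebe", "freunde"], demo, k=2)))
```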
January 30, 2025 at 7:31 PM
Reposted
First up, a new task for 2025:
*Instruction-following for speech processing!*

Explore instruction-following for speech ⇨
Integrate speech foundation models with LLMs across tasks such as speech translation, recognition, summarization, and QA.

🔗: iwslt.org/2025/instruc...
Instruction-following Speech Processing track (iwslt.org)
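To make the setup concrete, here is a hypothetical sketch of how the same audio can be steered to different tasks purely through the instruction text; the templates are invented for illustration, so see the task page for the official input format.

```python
# Invented instruction templates illustrating the general idea: one speech
# input, many tasks, selected by the natural-language instruction.
INSTRUCTIONS = {
    "asr": "Transcribe the audio.",
    "st": "Translate the audio into German.",
    "sum": "Summarize the audio in one paragraph.",
    "qa": "Answer the question based on the audio: {question}",
}

def build_prompt(task, **kwargs):
    return INSTRUCTIONS[task].format(**kwargs)

print(build_prompt("qa", question="Who is speaking?"))
```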
January 28, 2025 at 6:13 PM
Reposted
Today's task: model compression!!
🆕 New at IWSLT! But no less exciting 🔥

🎯 Goal: Compress a large, general-purpose multimodal model, making speech translation more efficient ⚡️, deployable 📲, and sustainable ♻️, while preserving translation quality ⭐️
#AI #SpeechTech #ModelCompression #LLMcompression
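As one very simple baseline in this spirit, here is a sketch of post-training dynamic int8 quantization with PyTorch; the tiny model below is just a stand-in for the large multimodal speech translation model the track targets.

```python
# Sketch of one basic compression technique: post-training dynamic
# quantization of linear layers to int8 with PyTorch. The model is a
# stand-in, not the track's actual multimodal ST model.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU(), nn.Linear(4096, 1024))

# Replace Linear layers with int8 dynamically-quantized versions.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 1024)
print(quantized(x).shape)  # same interface, smaller weights
```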
January 29, 2025 at 4:47 PM
Reposted
I'm happy to share that our paper "Prepending or Cross-Attention for Speech-to-Text? An Empirical Comparison" has been accepted at @naaclmeeting.bsky.social 2025! #NAACL2025

@mgaido91.bsky.social 👏

📃 Preprint: arxiv.org/abs/2501.02370
⏰ Code will be released soon

#NLProc #Speech
Prepending or Cross-Attention for Speech-to-Text? An Empirical Comparison
Following the remarkable success of Large Language Models (LLMs) in NLP tasks, there is increasing interest in extending their capabilities to speech -- the most common form of communication. To integ...
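For readers new to the two strategies, here is a schematic contrast (shapes and layer choices are illustrative, not the paper's setup): prepending fuses speech and text into one self-attended sequence, while cross-attention lets text queries attend to the speech encoder states.

```python
# Schematic contrast of the two integration strategies; dimensions and
# layers are illustrative only.
import torch
import torch.nn as nn

d_model = 512
speech = torch.randn(1, 200, d_model)  # encoded speech states
text = torch.randn(1, 20, d_model)     # text token embeddings

# (a) Prepending: one long self-attended sequence of speech + text.
self_attn = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
out_prepend = self_attn(torch.cat([speech, text], dim=1))

# (b) Cross-attention: text queries attend to speech keys/values.
decoder_layer = nn.TransformerDecoderLayer(d_model, nhead=8, batch_first=True)
out_xattn = decoder_layer(tgt=text, memory=speech)

print(out_prepend.shape, out_xattn.shape)  # (1, 220, 512) (1, 20, 512)
```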
January 23, 2025 at 8:44 AM
Reposted
Hello world! 👋 We're coming out of hibernation to bring you this happy news:
1) We're organising the 3rd edition of GITT at #MTSummit! Working on #gender & #translation #technology? We'll see you there!
2) We're moving away from Twitter, so share the news and help us find old and new GITT friends!
ALT: a polar bear cub is laying in a pile of branches. (GIF via media.tenor.com)
January 22, 2025 at 12:17 PM
Our #iwslt 2025 task on instruction-following speech models is out! Submission by April 15th. Check it out at: iwslt.org/2025/instruc...
Instruction-following Speech Processing track (iwslt.org)
January 9, 2025 at 9:43 AM