Greta Tuckute
@gretatuckute.bsky.social
Studying language in biological brains and artificial ones at the Kempner Institute at Harvard University.
www.tuckute.com
Reposted by Greta Tuckute
6/
We also wondered: if neuroscientists use functional localizers to map networks in the brain, could we do the same for MiCRo’s experts?

The answer: yes! The very same localizers successfully recovered the corresponding expert modules in our models!
October 20, 2025 at 12:10 PM
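As a rough sketch of what a localizer-style analysis can look like for a mixture-of-experts model (hypothetical helper names, not the MiCRo codebase): contrast each expert's mean activation on localizer stimuli versus control stimuli, mirroring the "condition A > condition B" contrast used in fMRI localizers, and take the most selective expert.

```python
import numpy as np

def localize_expert(localizer_stimuli, control_stimuli, expert_activations):
    """Return the index of the expert most selective for the localizer contrast.

    expert_activations: assumed helper mapping a list of stimuli to an array of
    shape (n_stimuli, n_experts) with each expert's mean activation per stimulus.
    """
    act_loc = expert_activations(localizer_stimuli)   # (n_localizer_stimuli, n_experts)
    act_ctl = expert_activations(control_stimuli)     # (n_control_stimuli, n_experts)
    # Selectivity contrast, analogous to an fMRI "sentences > nonwords" localizer
    contrast = act_loc.mean(axis=0) - act_ctl.mean(axis=0)
    return int(np.argmax(contrast))
```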
We hope that AuriStream will serve as a task-performant model system for studying how language structure is learned from speech.

The Interspeech paper sets the stage—more work building on this idea coming soon! And as always, please feel free to get in touch with comments etc.!
August 19, 2025 at 1:12 AM
3️⃣ Temporally fine-grained → 5ms tokens preserve acoustic detail (e.g. speaker identity).
4️⃣ Unified → AuriStream learns strong speech representations and generates plausible continuations—bridging representation learning and sequence modeling in the audio domain.
August 19, 2025 at 1:12 AM
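For a sense of scale of the 5 ms tokens: that is 200 tokens per second of audio, so a 10-second utterance corresponds to roughly 2,000 tokens.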
4 key advantages of AuriStream:

1️⃣ Causal → allows the study of speech/language processing as it unfolds in real time.
2️⃣ Inspectable → predictions can naturally be decoded into the cochleagram/audio, enabling visualization and interpretation.
August 19, 2025 at 1:12 AM
Examples: audio before the red line = ground-truth prompt; after = AuriStream’s prediction, visualized in time-frequency cochleagram space.

AuriStream shows that causal prediction over short audio chunks (cochlear tokens) is enough to generate meaningful sentence continuations!
August 19, 2025 at 1:12 AM
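A minimal sketch of this prompted-continuation setup, assuming a causal Transformer language model trained with next-token prediction over a discrete cochlear-token vocabulary (names and shapes are illustrative, not the released AuriStream code):

```python
import torch

@torch.no_grad()
def continue_audio(model, prompt_tokens, n_new_tokens, temperature=1.0):
    """Autoregressively sample a continuation of a cochlear-token prompt.

    model: assumed causal LM returning logits of shape (batch, time, vocab)
    prompt_tokens: integer token ids of shape (1, T_prompt)
    """
    tokens = prompt_tokens.clone()
    for _ in range(n_new_tokens):
        logits = model(tokens)[:, -1, :]                 # next-token logits
        probs = torch.softmax(logits / temperature, dim=-1)
        next_token = torch.multinomial(probs, num_samples=1)
        tokens = torch.cat([tokens, next_token], dim=1)  # append sampled token
    return tokens

# Because each token corresponds to a short audio chunk, the sampled ids can be
# decoded back into a cochleagram (and audio) for inspection, as in the examples above.
```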
Complementing its strong representational capabilities, AuriStream learns short- and long-range speech statistics—completing phonemes and common words at short timescales, and generating diverse continuations at longer timescales, as evidenced by the qualitative examples below.
August 19, 2025 at 1:12 AM
We demonstrate that:

🔹 AuriStream embeddings capture information about phoneme identity, word identity, and lexical semantics.
🔹 AuriStream embeddings serve as a strong backbone for downstream audio tasks (e.g., SUPERB benchmark tasks such as ASR and intent classification).
August 19, 2025 at 1:12 AM
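A minimal sketch of the kind of probing analysis behind the first bullet above, assuming frame-level embeddings have already been extracted and pooled per phoneme or word segment (hypothetical inputs, standard scikit-learn probe):

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def probe_accuracy(embeddings, labels):
    """Fit a linear probe on pooled embeddings and return held-out accuracy.

    embeddings: (n_segments, d) array of pooled model embeddings (assumed precomputed)
    labels: phoneme (or word) identity for each segment
    """
    X_tr, X_te, y_tr, y_te = train_test_split(
        embeddings, labels, test_size=0.2, random_state=0
    )
    probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return probe.score(X_te, y_te)
```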