Lightnews — Scholar-powered news

erogol.com

@erogol.com

Here is my take on new DeepSeek-V3.2-Exp

erogol.substack.com/p/model-chec...

Model check - DeepSeek-V3.2-Exp - Fine-Grained Sparse Attention for Efficient Long-Context LLMs

Going over the recently released DeepSeek-V3.2-Exp technical paper, source code and innovations.

erogol.substack.com

October 1, 2025 at 4:11 PM

erogol.com

@erogol.com

My post on MiMo-Audio

open.substack.com/pub/erogol/p...

🔥 Trained on 100M+ hours and shows emergent few-shot learning:
• Voice conversion
• Emotion transfer
• Speech translation
• Cross-modal reasoning

⚡ Key finding: Speech follows same scaling laws as text LLMs

Model Check - MiMo-Audio: Scaling Speech Pre-Training to 100M Hours

Going over the code and the technical report of the new Speech LM model from Xiaomi that rivals GPT4o-audio and Gemini

open.substack.com

September 22, 2025 at 5:18 PM

erogol.com

@erogol.com

Machine Learns #55 is out!

Full of new models… check it out

open.substack.com/pub/erogol/p...

Machine Learns #55

Voice + reasoning releases (Ling‑flash‑2.0, VoxCPM, Kimi K2, ultraVAD) and 2 papers: long‑horizon execution & decay‑free LR schedules.

open.substack.com

September 18, 2025 at 1:01 PM

erogol.com

@erogol.com

machine learns #54 is out
open.substack.com/pub/erogol/p...

Machine Learns #54

🤖 Voice models, long-context tricks, and a token-order loss worth trying. Flashy audio releases + 5 papers (MoC, TOP, FELLE, M2N2, Motif TR)

open.substack.com

September 4, 2025 at 11:07 AM

erogol.com

@erogol.com

My breakdown of VibeVoice - new open-weight TTS model from Microsoft.

open.substack.com/pub/erogol/p...

Model Check - VibeVoice: Next-Token Diffusion Meets Long-Form Speech Generation

Going over the code and the technical report of the new TTS model from Microsoft Research.

open.substack.com

August 26, 2025 at 11:54 AM

erogol.com

@erogol.com

ms released a tts model… nice…

You can create long-form convos and podcasts with 4 distinct voices

huggingface.co/microsoft/Vi...

microsoft/VibeVoice-1.5B · Hugging Face

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

huggingface.co

August 25, 2025 at 5:10 PM

erogol.com

@erogol.com

KyutaiTTS solved streaming text-to-speech with a state machine that generates audio word-by-word as text arrives.

220ms latency, 10-second voice cloning, 32 concurrent users on single GPU.

No more waiting for complete sentences.

Full analysis: erogol.substack.com/p/model-chec...

Model check - KyutaiTTS: Streaming Text-to-Speech with Delayed Streams Modeling

Going over Kyutai's new TTS model and its delayed streaming model.

erogol.substack.com

August 2, 2025 at 7:46 PM

erogol.com

@erogol.com

This is such a great idea

sakanaai.bsky.social @sakanaai.bsky.social · Jun 12

We’re excited to introduce Text-to-LoRA: a Hypernetwork that generates task-specific LLM adapters (LoRAs) based on a text description of the task. Catch our presentation at #ICML2025!

Paper: arxiv.org/abs/2506.06105
Code: github.com/SakanaAI/Tex...

June 12, 2025 at 1:59 PM

erogol.com

@erogol.com

claude is the best coding model

gemini causes frequent syntax errors

openai does not even understand the task at hand

June 10, 2025 at 1:38 PM

erogol.com

@erogol.com

lately spending some time with Diffusion LMs and working on a NanoGPT-style LLaDA model

so far I've not achieved comparable results to AR models but it's a good start

github.com/erogol/BlaGP...

BlaGPT/bla_gpt/llada.py at main · erogol/BlaGPT

Experimental playground for benchmarking language model (LM) architectures, layers, and tricks on smaller datasets. Designed for flexible experimentation and exploration. - erogol/BlaGPT

github.com

June 1, 2025 at 2:12 PM

Reposted

sakanaai.bsky.social

@sakanaai.bsky.social

This work was done in collaboration with Jeff Clune’s lab at UBC, and led by his PhD students Jenny Zhang and Shengran Hu, together with Cong Lu and Robert Lange.

Paper: arxiv.org/abs/2505.22954
Code: github.com/jennyzzt/dgm

May 30, 2025 at 2:33 AM

erogol.com

@erogol.com

⚡ Machine Learns issue 48 is out

🚀 dKV-Cache accelerates diffusion models by up to 10x
🔐 OpenAI's authentication play (think OAuth for AI)
🎯 PaTH Attention beats RoPE on long-context tasks
🤖 Humanoid robot fights have become real

open.substack.com/pub/erogol/p...

Machine Learns #48

OpenAI's 'Sign in with ChatGPT', Meta's AGI ambitions, new models like Gemma 3 & MAGI-1, research breakthroughs in KV caching for diffusion & PaTH Attention, and fresh open-source releases.

open.substack.com

May 28, 2025 at 12:25 PM

erogol.com

@erogol.com

Following the bread crumbs, implemented PLE from Gemma3n.

It gave a significant performance boost and resulted in a new best model with almost no compute overhead.

github.com/erogol/BlaGPT

GitHub - erogol/BlaGPT: Experimental playground for benchmarking language model (LM) architectures, layers, and tricks on smaller datasets. Designed for flexible experimentation and exploration.

Experimental playground for benchmarking language model (LM) architectures, layers, and tricks on smaller datasets. Designed for flexible experimentation and exploration. - erogol/BlaGPT

github.com

May 27, 2025 at 9:36 AM

erogol.com

@erogol.com

My paper notes on 2 new papers

- Model Merging in Pre-training of Large Language Models,
- Do Not Let Low-Probability Tokens Over-Dominate in RL,

open.substack.com/pub/erogol/p...

Paper check: Merging LLMs at Pre-training, Considering Token Probabilities at RL

🔬Two papers in scope: "Model Merging in Pre-training for LLMs" and "Do Not Let Low-Probability Tokens Over-Dominate in RL"

open.substack.com

May 21, 2025 at 12:10 PM

erogol.com

@erogol.com

muon really works. got best results in BlaGPT

```
torchrun --standalone --nproc_per_node=8 train.py --run_name best_model --model_name best
```

github.com/erogol/BlaGPT

GitHub - erogol/BlaGPT: Experimental playground for benchmarking language model (LM) architectures, layers, and tricks on smaller datasets. Designed for flexible experimentation and exploration.

Experimental playground for benchmarking language model (LM) architectures, layers, and tricks on smaller datasets. Designed for flexible experimentation and exploration. - erogol/BlaGPT

github.com

May 8, 2025 at 1:14 PM

erogol.com

@erogol.com

🧵 Here is a small thread with my notes about some of the recent Transformer papers.

- Softpick: an alternative to softmax in Attention
- Canon Layers: mixing states with conv1d
- Parallel Transformer blocks

May 6, 2025 at 12:11 PM

erogol.com

@erogol.com

Machine learns #45 - no fluff AI newsletter - is out!

I normally share bi-weekly but last week was full enough so here we go

open.substack.com/pub/erogol/p...

Machine Learns #45

OpenAI's social network & GPT-4.1, China launches $8.2B AI fund, NVIDIA's US manufacturing push, new GLM-4 & MineWorld models, C3PO expert pathways optimization, GigaTok's 3B visual tokenizer...

open.substack.com

April 16, 2025 at 1:54 PM

erogol.com

@erogol.com

Updated my LLM usage and cancelled ChatGPT sub for now

Coding - Claude, Gemini 2.5
Reading papers - Claude
Research - Gemini 2.5
Daily - Gemini 2.5
Search - Gemini 2.5

erogol.com @erogol.com · Mar 8

Here is my use of LLMs

Coding - Claude (best by far), QwenChat
Reading papers - Claude
Research - ChatGPT (best UI,UX), Gemini (better results)
Daily - ChatGPT
Search - ChatGPT

I'd love to try searching with Claude, but not there yet.

Any suggestions for change?

April 11, 2025 at 9:06 PM

erogol.com

@erogol.com

Machine Learns #44 is out !!

click for no fluff AI newsletter

erogol.substack.com/p/machine-le...

Machine Learns #44

Praxis Sam Altman's tech utopia, Amazon launches Nova Sonic voice AI, Midjourney returns with V7, Llama 4 models debut amid controversy, new brain-to-voice model, NoProp learning ...

erogol.substack.com

April 9, 2025 at 2:18 PM

erogol.com

@erogol.com

Next big thing is Brain-LLMs.

Imagine an LLM compressing all world knowledge attached to your brain and ready to serve your thoughts and questions.

You also update it over the internet and pay for a sub. I don't want to think about the ad business :)

April 1, 2025 at 1:26 PM

erogol.com

@erogol.com

“If these results generalize to real-world software tasks, extrapolation of this trend predicts that within 5 years, AI systems will be capable of automating many software tasks that currently take humans a month.”

arxiv.org/abs/2503.14499

Measuring AI Ability to Complete Long Tasks

Despite rapid progress on AI benchmarks, the real-world meaning of benchmark performance remains unclear. To quantify the capabilities of AI systems in terms of human capabilities, we propose a new me...

arxiv.org

March 21, 2025 at 10:05 AM

erogol.com

@erogol.com

It’s crazy that Gemma3 held up for only about three days

March 18, 2025 at 2:10 PM

erogol.com

@erogol.com

Here is my no-fuss newsletter

open.substack.com/pub/erogol/p...

March 12, 2025 at 1:49 PM

erogol.com

@erogol.com

Here is my use of LLMs

Coding - Claude (best by far), QwenChat
Reading papers - Claude
Research - ChatGPT (best UI,UX), Gemini (better results)
Daily - ChatGPT
Search - ChatGPT

I'd love to try searching with Claude, but not there yet.

Any suggestions for change?

March 8, 2025 at 1:55 PM

erogol.com

@erogol.com

I think diffusion-based LLMs (LLdMs) are better suited as next-generation LLMs

- multiple outputs per iter: faster output generation
- no causal masking: bidirectional attention
- multiple diff steps: reasoning at inference time and revising poor outputs
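To make the first two points concrete, here is a toy sketch of a masked-diffusion decoding loop. It is not LLaDA or any real model: `toy_denoiser`, `diffusion_decode`, and `tokens_per_step` are hypothetical names, and the "model" is a stub that already knows the answer and only invents random confidences. It shows only the control flow: every position is predicted from the whole sequence (no causal mask), and several tokens are committed per iteration. Re-masking low-confidence tokens to revise them is left out.

```python
import random

MASK = "<mask>"


def toy_denoiser(tokens, target, rng):
    """Hypothetical stand-in for a trained bidirectional model.

    A real model would look at the full, partially masked sequence at once
    and return a predicted token plus a confidence for every position.
    This stub returns the target token with a random confidence.
    """
    return [(target[i], rng.random()) for i in range(len(tokens))]


def diffusion_decode(target, tokens_per_step=4, seed=0):
    rng = random.Random(seed)
    tokens = [MASK] * len(target)  # start from a fully masked sequence
    steps = 0
    while MASK in tokens:
        guesses = toy_denoiser(tokens, target, rng)
        # Rank the still-masked positions by confidence, highest first.
        masked = [i for i, tok in enumerate(tokens) if tok == MASK]
        masked.sort(key=lambda i: guesses[i][1], reverse=True)
        # Commit several tokens in one iteration instead of one at a time.
        for i in masked[:tokens_per_step]:
            tokens[i] = guesses[i][0]
        steps += 1
    return tokens, steps


target = "diffusion models fill in many tokens per step".split()
decoded, steps = diffusion_decode(target)
print(" ".join(decoded))
print(f"{len(target)} tokens decoded in {steps} iterations")
```

An AR decoder would need one forward pass per token here; the loop above needs `ceil(len(target) / tokens_per_step)` passes, which is where the speed argument comes from.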

March 3, 2025 at 9:57 AM
