erogol.com
erogol.substack.com
github.com/erogol
open.substack.com/pub/erogol/p...
🔥 Trained on 100M+ hours and shows emergent few-shot learning:
• Voice conversion
• Emotion transfer
• Speech translation
• Cross-modal reasoning
⚡ Key finding: Speech follows the same scaling laws as text LLMs
You can create long-form convos and podcasts with 4 distinct voices
huggingface.co/microsoft/Vi...
220ms latency, 10-second voice cloning, 32 concurrent users on a single GPU.
No more waiting for complete sentences.
Full analysis: erogol.substack.com/p/model-chec...
Paper: arxiv.org/abs/2506.06105
Code: github.com/SakanaAI/Tex...
Gemini causes frequent syntax errors
OpenAI does not even understand the task at hand
So far I've not achieved results comparable to AR models, but it's a good start
github.com/erogol/BlaGP...
Paper: arxiv.org/abs/2505.22954
Code: github.com/jennyzzt/dgm
🚀 dKV-Cache speeds up diffusion models by up to 10x
🔐 OpenAI's authentication play (think OAuth for AI)
🎯 PaTH Attention beats RoPE on long-context tasks
🤖 Humanoid robot fights became real
open.substack.com/pub/erogol/p...
It gave a significant performance boost and resulted in a new best model with almost no compute overhead.
github.com/erogol/BlaGPT
- Model Merging in Pre-training of Large Language Models,
- Do Not Let Low-Probability Tokens Over-Dominate in RL,
open.substack.com/pub/erogol/p...
```
torchrun --standalone --nproc_per_node=8 train.py --run_name best_model --model_name best
```
github.com/erogol/BlaGPT
- Softpick: an alternative to softmax in Attention
- Canon Layers: mixing states with conv1d
- Parallel Transformer blocks
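To make the first item concrete, here is a minimal pure-Python sketch of Softpick next to plain softmax, applied to a single row of attention scores. It assumes the rectified-softmax form as I understand it from the Softpick paper, `ReLU(e^x - 1) / (sum |e^x - 1| + eps)`. The function names, the `eps` value and the choice of shift `m` are illustrative assumptions, not code from BlaGPT.

```python
import math

def softmax(scores):
    """Standard softmax over one row of attention scores."""
    m = max(scores)
    exps = [math.exp(x - m) for x in scores]
    total = sum(exps)
    return [e / total for e in exps]

def softpick(scores, eps=1e-8):
    """Rectified softmax sketch: ReLU(e^x - 1) / (sum |e^x - 1| + eps).

    Assumption: we shift by m = max(max(scores), 0) so every exponent is <= 0.
    e^(x-m) - e^(-m) equals (e^x - 1) / e^m, and the common factor cancels
    in the ratio (up to the tiny eps term).
    """
    m = max(max(scores), 0.0)
    shifted = [math.exp(x - m) - math.exp(-m) for x in scores]
    denom = sum(abs(s) for s in shifted) + eps
    # Scores <= 0 are rectified to exactly zero weight.
    return [max(s, 0.0) / denom for s in shifted]

row = [2.0, 0.0, -1.0]
print("softmax :", softmax(row))
print("softpick:", softpick(row))
```

Unlike softmax, the Softpick weights do not have to sum to 1, and non-positive scores get exactly zero weight. For the row above only the first position keeps any weight (roughly 0.91).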
I normally share bi-weekly, but last week was full enough, so here we go
open.substack.com/pub/erogol/p...
Coding - Claude, Gemini 2.5
Reading papers - Claude
Research - Gemini 2.5
Daily - Gemini 2.5
Search - Gemini 2.5
Coding - Claude (best by far), QwenChat
Reading papers - Claude
Research - ChatGPT (best UI/UX), Gemini (better results)
Daily - ChatGPT
Search - ChatGPT
I'd love to try searching with Claude, but it's not there yet.
Any suggestions for change?
Imagine an LLM that compresses all world knowledge, attached to your brain and ready to serve your thoughts and questions.
You also update it over the internet and pay for a subscription. I don't want to think about the ad business :)
arxiv.org/abs/2503.14499
- multiple outputs per iter: faster output generation
- no causal masking: bidirectional attention
- multiple diff steps: reasoning at inference time and revising poor outputs
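The first two points can be sketched with a toy decoding loop. Everything here is hypothetical: `toy_denoiser` is a stand-in that already knows the target text and only fakes a confidence score, so this illustrates the control flow, not a real diffusion LM. It also leaves out the third point, re-masking and revising poor outputs.

```python
MASK = None  # marks a position that has not been generated yet

def toy_denoiser(tokens, target):
    """Hypothetical model stand-in: returns (token, confidence) for every position.

    Confidence looks at BOTH neighbours (no causal mask): a position is
    scored higher when its left and/or right neighbour is already revealed.
    """
    preds = []
    for i in range(len(tokens)):
        left = i > 0 and tokens[i - 1] is not MASK
        right = i < len(tokens) - 1 and tokens[i + 1] is not MASK
        conf = 0.5 + 0.25 * left + 0.25 * right
        preds.append((target[i], conf))
    return preds

def diffusion_decode(target, tokens_per_step=2):
    """Start fully masked and commit several tokens per denoising step."""
    tokens = [MASK] * len(target)
    steps = 0
    while any(t is MASK for t in tokens):
        preds = toy_denoiser(tokens, target)
        masked = [i for i, t in enumerate(tokens) if t is MASK]
        # Commit the most confident masked positions first.
        masked.sort(key=lambda i: -preds[i][1])
        for i in masked[:tokens_per_step]:
            tokens[i] = preds[i][0]
        steps += 1
    return tokens, steps

tokens, steps = diffusion_decode(list("hello!"), tokens_per_step=2)
print("".join(tokens), "in", steps, "steps")
```

An AR model would need one forward pass per token here. The loop above fills two positions per pass, in whatever order the confidence scores suggest.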