Python survivor, book lover and weird music enjoyer.
Goodfire is at it again!
They developed a method similar to PCA that measures how much of an LLM’s weights are dedicated to memorization
www.goodfire.ai/research/und...
it’s called Composer, it’s an extremely fast model that was previously available under code name Cheetah
it’s an MoE trained in fp8, RL’d on Cursor Agent traces
cursor.com/blog/composer
Nvidia just dropped 1700 hours of public driving data on HuggingFace from over 2500 cities:
huggingface.co/datasets/nvi...
The founding team has impressive Rust credentials. They're targeting a wide range of use cases, not just ML.
Better than Opus 4.1 on almost every benchmark
Still the classic Sonnet prices, $3/$15
The flagship model Qwen3-VL-235B-A22B is released as open-weight and available in both Instruct and Thinking versions
✅ Instruct outperforms Gemini 2.5 Pro on key vision benchmarks
✅ Thinking achieves state-of-the-art (SOTA) performance on multimodal reasoning tasks
🏆 SOTA on 22/36 audio & AV benchmarks
🌍 119L text / 19L speech in / 10L speech out
⚡ 211ms latency | 🎧 30-min audio understanding
🎨 Fully customizable via system prompts
www.spectrallabs.ai/research/SGS-1
@vikhyat.net cooked
moondream.ai/blog/moondre...
x.com/vikhyatk/sta...
www.youtube.com/watch?v=tzvM...
What's this about? No clue
open.spotify.com/track/5B4lRo...
Supported Languages:
Bulgarian, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Greek, Hungarian, Italian, Latvian, Lithuanian, Maltese, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, Swedish, Russian, Ukrainian
- parakeet-tdt-0.6b-v3: blazing fast and accurate ASR inference with PnC and timestamps
huggingface.co/nvidia/parak...
the link to the slide deck in the reply.
Shame all the juicy details are locked down tight...
Genie 3 is a new frontier for world models: its environments remain largely consistent for several minutes, with visual memory extending as far back as 1 minute. These limitations will only decrease with time.
Welcome to the future.🙌
deepmind.google/discover/blo...
120B & 20B variants, both MoE with 4 experts active
openai.com/index/introd...