@aiwlwlttok.bsky.social
My take on #deepseek

The techniques DeepSeek is using, such as Mixture-of-Experts (MoE), aren't new. OpenAI, DeepMind, & others already explored them.
OpenAI itself experimented with MoE models but likely opted for dense models due to performance trade-offs, scalability, & inference-efficiency considerations.
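For anyone unfamiliar with MoE, here's a minimal NumPy sketch of the core idea. Everything in it (sizes, the `router` matrix, top-k of 2) is illustrative, not DeepSeek's actual architecture: a small router scores the experts for each token, & only the top-k experts actually run.

```python
# Minimal Mixture-of-Experts forward pass, illustrative only.
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 16, 4, 2

# Each "expert" is just an independent weight matrix here.
experts = [rng.standard_normal((d_model, d_model)) * 0.02 for _ in range(n_experts)]
# The router (gating network) scores how relevant each expert is to a token.
router = rng.standard_normal((d_model, n_experts)) * 0.02

def moe_forward(x):
    """Route one token vector x through its top-k experts."""
    logits = x @ router                   # (n_experts,) relevance scores
    k_idx = np.argsort(logits)[-top_k:]   # pick the top-k experts
    weights = np.exp(logits[k_idx])
    weights /= weights.sum()              # softmax over the selected experts only
    # Only the chosen experts run; this sparsity is where the
    # training-compute savings come from.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, k_idx))

token = rng.standard_normal(d_model)
print(moe_forward(token).shape)  # (16,)
```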
January 28, 2025 at 6:32 PM
So...

- DeepSeek is impressive. It's a strong move, but it doesn't fundamentally alter the AI trajectory.

- The AI arms race is intensifying. China’s entry into competitive AI is significant (and not unexpected).

- The market’s panic is short-term noise. AI compute demand is not disappearing.
January 28, 2025 at 2:25 PM
4. Market Overreaction

The stock market often overreacts to perceived competitive threats, but the reality is more nuanced. Open-source AI models and new players shake things up, but they don’t suddenly eliminate the need for high-end infrastructure.
January 28, 2025 at 2:25 PM
3. MoE Has Trade-Offs

MoE models are more efficient to train but add complexity at inference. Not every workload benefits from them, and they need more sophisticated routing mechanisms than a dense model; the toy example below shows why.
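A sketch of the inference-side headache, with made-up sizes: within one batch, different tokens pick different experts, so the runtime has to gather & scatter ragged groups of tokens per expert instead of doing one dense matmul.

```python
# Toy top-1 routing over a batch, illustrative only.
import numpy as np

rng = np.random.default_rng(1)
batch, d_model, n_experts = 8, 16, 4

tokens = rng.standard_normal((batch, d_model))
router = rng.standard_normal((d_model, n_experts)) * 0.02

# Route each token to its single best expert (top-1 for simplicity).
choices = np.argmax(tokens @ router, axis=1)  # (batch,)

# The per-expert groups are ragged: this load-balancing / batching
# problem simply doesn't exist for a dense layer.
for e in range(n_experts):
    idx = np.where(choices == e)[0]
    print(f"expert {e}: {len(idx)} tokens -> {idx.tolist()}")
```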
January 28, 2025 at 2:25 PM
2. Compute and Energy Demand Will Continue to Grow

Even if DeepSeek's approach offers a temporary efficiency boost, AI workloads are expanding rapidly. As models get more powerful, the demand for GPUs will still increase. Reducing per-token cost is good, but overall demand will remain high; the back-of-the-envelope arithmetic below makes the point.
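Back-of-the-envelope, with purely hypothetical numbers: if an efficiency gain cuts per-token cost 10x but usage grows 20x (the familiar Jevons-paradox pattern), total compute spend still doubles.

```python
# Hypothetical numbers only: cheaper tokens, much more usage.
cost_per_token_old = 1.0   # arbitrary baseline units
cost_per_token_new = 0.1   # assume a 10x efficiency gain
volume_growth = 20         # assume token demand grows 20x

old_spend = cost_per_token_old * 1
new_spend = cost_per_token_new * volume_growth
print(f"relative total spend: {new_spend / old_spend:.1f}x")  # 2.0x
```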
January 28, 2025 at 2:25 PM
1. Established players can always use any new techniques

OpenAI, Anthropic, and Google can implement similar optimisations if they prove to be superior. The industry is fluid, and breakthroughs spread quickly.
January 28, 2025 at 2:25 PM