John Zila
jzila.com
The market today is oversold on the DeepSeek news.

DeepSeek has made some incredible innovations in model efficiency. But the order-of-magnitude gains are primarily due to their MoE architecture, which scales more favorably in both training and inference when compared to a dense model.
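To make the MoE point concrete, here is a rough sketch of the parameter arithmetic for a single feed-forward layer. All sizes (`d_model`, `d_ff`, `num_experts`, `top_k`) are hypothetical values picked for illustration, not DeepSeek's actual configuration, and the helper names are my own. It assumes a simple top-k router and ignores biases and attention.

```python
def ffn_params(d_model: int, d_ff: int) -> int:
    """Weights in one feed-forward block: up- and down-projection, biases ignored."""
    return 2 * d_model * d_ff


def moe_ffn_params(d_model: int, d_ff: int, num_experts: int, top_k: int) -> tuple[int, int]:
    """Return (total, active-per-token) weights for a top-k routed MoE layer."""
    per_expert = ffn_params(d_model, d_ff)
    router = d_model * num_experts  # linear router that scores every expert
    total = num_experts * per_expert + router
    active = top_k * per_expert + router  # only top_k experts run for each token
    return total, active


# Hypothetical sizes, for illustration only.
total, active = moe_ffn_params(d_model=1024, d_ff=4096, num_experts=64, top_k=4)
print(f"total={total:,} active={active:,} total/active={total / active:.1f}x")
```

A dense layer with the same total weight count would touch every weight for every token, while the MoE layer touches only the active subset. Per-token compute tracks active weights fairly closely, which is where the favorable scaling comes from.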
January 27, 2025 at 8:23 PM