You can also find me on Threads: @sung.kim.mw
ByteDance unveils China’s most affordable AI coding agent at just US$1.30 a month
www.scmp.com/tech/big-tec...
"Dojo 3 chip production is now distributed between TSMC and Samsung Electronics, with packaging operations handled at Intel's Arizona facility."
www.digitimes.com/news/a202511...
"Dojo 3 chip production is now distributed between TSMC and Samsung Electronics, with packaging operations handled at Intel's Arizona facility."
www.digitimes.com/news/a202511...
🚀 Performance: Highly competitive on AIME24/25 & HMMT25 — surpasses DeepSeek R1-0120 on math, and outperforms same-size models in competitive coding.
vLLM demonstrates bitwise-consistent on-policy RL with TorchTitan (training) + vLLM (inference), the first open-source run where training and inference numerics match exactly.
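For a sense of what "bitwise consistent" means here, a minimal sketch (the tensor values are made up and this is not the project's actual test harness): the per-token logprobs the trainer computes must be exactly equal, bit for bit, to the ones the inference engine returned, not merely close.

```python
import torch

# Hypothetical per-token logprobs for the same sampled sequence, one set from
# the training stack (TorchTitan) and one from the inference engine (vLLM).
trainer_logprobs = torch.tensor([-1.203125, -0.457031, -2.890625])
vllm_logprobs = torch.tensor([-1.203125, -0.457031, -2.890625])

# Bitwise consistency means exact floating-point equality, not allclose():
assert torch.equal(trainer_logprobs, vllm_logprobs), "training/inference numerics diverge"
```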
- 60+ arch., up to 2B params
- 10+ datasets
- in-domain training (>DINOv3)
- corr(train loss, test perf)=95%
ernie.baidu.com
Currently, we have an unhealthy ‘upright pyramid’ AI industry structure
- Application Layer
- Model Layer
- Chip Layer
They are shifting to a healthy AI industry structure, which is an ‘inverted pyramid’
- Application Layer
- Model Layer
- Chip Layer
The paper formalizes a Bayesian framework for model control: altering a model's "beliefs" over which persona or data source it's emulating. Context (prompting) and internal representations (steering) become two levers for shifting those beliefs.
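Roughly, my gloss (not the paper's notation): p(persona | context) ∝ p(context | persona) · p(persona). Prompting changes the evidence the posterior conditions on, while steering nudges the internal representation that encodes it.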
Here are a few of the optimizations they made (a rough sketch of the attention piece follows the list):
- MuP-like scaling
- MQA + SWA
- Clamping everywhere to control activations
- KV Cache sharing
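As an illustration of the MQA / KV-cache-sharing piece (shapes and names are my own, not from their code): every query head attends against a single shared K/V head, which is what shrinks the KV cache.

```python
import torch

B, T, n_q_heads, head_dim = 2, 16, 8, 64
q = torch.randn(B, n_q_heads, T, head_dim)  # one query projection per head
k = torch.randn(B, 1, T, head_dim)          # single K head shared by all query heads
v = torch.randn(B, 1, T, head_dim)          # single V head shared by all query heads

# Broadcasting over the head dimension applies the shared K/V to every query head,
# so the KV cache stores 1 head instead of n_q_heads.
scores = (q @ k.transpose(-2, -1)) / head_dim ** 0.5
out = torch.softmax(scores, dim=-1) @ v     # (B, n_q_heads, T, head_dim)
```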
Multi-Vector Retrieval via Fixed Dimensional Encodings is an interesting approach by Google Research. It transforms multi-vector representations into single fixed-size vectors (fixed dimensional encodings).
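A toy sketch of the core idea as I understand it (the function name, bucketing scheme, and sizes are my assumptions, not the paper's exact construction): hash each token vector into a bucket with random hyperplanes, sum what lands in each bucket, and concatenate the bucket sums into one fixed-size vector, so multi-vector similarity can be approximated with a single dot product.

```python
import numpy as np

def fixed_dimensional_encoding(vectors: np.ndarray, n_planes: int = 4, seed: int = 0) -> np.ndarray:
    # vectors: (num_tokens, dim) multi-vector representation of one document.
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((n_planes, vectors.shape[1]))
    # SimHash-style bucket id per token vector (2**n_planes buckets).
    buckets = (vectors @ planes.T > 0) @ (1 << np.arange(n_planes))
    fde = np.zeros((2 ** n_planes, vectors.shape[1]))
    np.add.at(fde, buckets, vectors)   # sum the token vectors in each bucket
    return fde.reshape(-1)             # one vector of size 2**n_planes * dim

doc = np.random.default_rng(1).standard_normal((5, 8))   # 5 token vectors, dim 8
print(fixed_dimensional_encoding(doc).shape)             # (128,)
```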
Available now:
- Dataset of 100K summaries
- 2 fine-tuned LLMs
- 3D visualizer
“If you have one bucket that holds 2 gallons and another bucket that holds 5 gallons, how many buckets do you have?”
The red indicates the percentage of people who got it right. See page 49.
Read more about it here: senate.ucsd.edu/media/740347...
You can buy 8 of these GPUs and cluster them for 192GB of VRAM for under $5,000.
blog.vllm.ai/2025/11/11/i...
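If I'm doing the arithmetic right, that works out to 24 GB of VRAM per card (192 / 8) at roughly $625 or less per card ($5,000 / 8).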
It is only available in China and it is supposed to be pretty good.
In Chinese: exp.volcengine.com/ark?model=do...
These systems aren’t engineered for milliseconds — they’re tuned for microseconds, even nanoseconds. Every component, from the network card to the FPGA bitstream, is obsessed with one goal: shaving latency down to the bare minimum.
- 3B active parameters with enhanced semantic alignment between visual and language modalities
- "Thinking with Images" feature that enables zooming in and out to capture finer details
- Apache License 2.0
Meta’s chief artificial intelligence scientist, Yann LeCun, has reportedly told associates he plans to leave the Silicon Valley company in the coming months.
www.ft.com/content/c586...