Sung Kim
@sungkim.bsky.social
A business analyst at heart who enjoys delving into AI, ML, data engineering, data science, data analytics, and modeling. My views are my own.

You can also find me on Threads: @sung.kim.mw
"Familiarity with advanced packaging techniques (EMIB, FCBGA), test and measurement equipment, and yield enhancement strategies."

careers.qualcomm.com/careers?quer...
Qualcomm Careers | Engineering Jobs and More | Qualcomm
Search open positions at Qualcomm. Learn more about how our culture of collaboration and robust benefits program allow our employees to live well and exceed their potential.
careers.qualcomm.com
November 13, 2025 at 6:32 AM
⚡ Efficiency: Only 1.5B params — 100-600× smaller than giants like Kimi K2 & DeepSeek R1.
💰 Cost: Full post-training for just $7.8K — 30-60× cheaper than DeepSeek R1 or MiniMax-M1.

Model: huggingface.co/WeiboAI/Vibe...
GitHub: github.com/WeiboAI/Vibe...
arXiv: arxiv.org/abs/2511.06221
WeiboAI/VibeThinker-1.5B · Hugging Face
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
huggingface.co
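A minimal sketch for trying the model, assuming the checkpoint loads with the stock Hugging Face causal-LM classes (check the model card for the recommended chat template and sampling settings):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: VibeThinker-1.5B works with the standard causal-LM API.
model_id = "WeiboAI/VibeThinker-1.5B"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

prompt = "Prove that the sum of two even integers is even."
inputs = tok(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=256)
print(tok.decode(out[0], skip_special_tokens=True))
```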
November 13, 2025 at 6:23 AM
1️⃣ Make vLLM batch-invariant (same seq → same output regardless of batching)
2️⃣ Ensure forward passes in training use identical kernels as inference
3️⃣ Add custom backward passes in PyTorch
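As a concrete illustration of step 1️⃣, here is a toy check of the batch-invariance property using the standard vLLM offline API; the model name is an arbitrary stand-in, and the config knob that actually enables batch-invariant kernels is version-specific (see the blog post below):

```python
from vllm import LLM, SamplingParams

llm = LLM(model="Qwen/Qwen2.5-1.5B-Instruct")  # placeholder model
greedy = SamplingParams(temperature=0.0, max_tokens=64)

probe = "Explain why floating-point addition is not associative."
# Same prompt, served alone vs. inside a larger batch.
solo = llm.generate([probe], greedy)[0].outputs[0].text
batched = llm.generate([probe] + ["filler prompt"] * 7, greedy)[0].outputs[0].text

# Batch invariance means these match exactly (bitwise-identical logits,
# hence identical greedy decodes), regardless of batch composition.
print(solo == batched)
```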

blog.vllm.ai/2025/11/10/b...
No More Train-Inference Mismatch: Bitwise Consistent On-Policy Reinforcement Learning with vLLM and TorchTitan
We demonstrate an open-source bitwise consistent on-policy RL run with TorchTitan as the training engine and vLLM as the inference engine. Built on top of vLLM’s recent work on batch-invariant inferen...
blog.vllm.ai
November 13, 2025 at 6:21 AM
The core contribution is that it finally answers a few fundamental questions theoretically:
- what distribution to impose on your embeddings
- how to do distribution matching in high-dim
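Not necessarily the paper's exact objective, but a common trick for making high-dimensional distribution matching tractable is to reduce it to many 1-D problems via random projections. A toy PyTorch sketch matching embeddings to an isotropic Gaussian target:

```python
import torch

def sliced_gaussian_matching_loss(z: torch.Tensor, n_proj: int = 64) -> torch.Tensor:
    """Match the empirical distribution of embeddings z (N, D) to N(0, I)
    by comparing 1-D quantiles along random unit directions."""
    n, d = z.shape
    dirs = torch.randn(d, n_proj, device=z.device)
    dirs = dirs / dirs.norm(dim=0, keepdim=True)   # unit directions
    proj_sorted, _ = (z @ dirs).sort(dim=0)        # empirical quantiles
    p = (torch.arange(1, n + 1, device=z.device) - 0.5) / n
    target = torch.erfinv(2 * p - 1) * 2 ** 0.5    # N(0,1) quantiles
    return ((proj_sorted - target.unsqueeze(1)) ** 2).mean()
```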

Paper: arxiv.org/abs/2511.08544
Code: github.com/rbalestr-lab...
November 13, 2025 at 6:19 AM
🤷‍♂️
November 13, 2025 at 3:10 AM
You should ask them, not me.
November 12, 2025 at 5:58 AM
offer dual mechanisms to alter those "beliefs".

Paper: Belief Dynamics Reveal the Dual Nature of In-Context Learning and Activation Steering (arxiv.org/abs/2511.00617)
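For reference, activation steering in its simplest form just adds a fixed vector to one layer's residual stream at inference time. A sketch in PyTorch, where the layer index, the scale alpha, and how the steering vector v is obtained are all placeholders:

```python
import torch

def make_steering_hook(v: torch.Tensor, alpha: float = 4.0):
    # Forward hook that adds alpha * v to a transformer block's hidden states
    # at every token position.
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        hidden = hidden + alpha * v
        return (hidden,) + output[1:] if isinstance(output, tuple) else hidden
    return hook

# Hypothetical usage with a Hugging Face-style decoder:
# handle = model.model.layers[12].register_forward_hook(make_steering_hook(v))
# ...generate as usual, with every token nudged toward the "belief" v...
# handle.remove()
```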
Belief Dynamics Reveal the Dual Nature of In-Context Learning and Activation Steering
Large language models (LLMs) can be controlled at inference time through prompts (in-context learning) and internal activations (activation steering). Different accounts have been proposed to explain ...
arxiv.org
November 12, 2025 at 5:42 AM
- ReLU^2 activation function
- FSDP + TP + SP
- Int6 gradient communication
- Quantization Aware Training (QAT)
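The first item is simple enough to write down exactly; ReLU^2 just squares the ReLU output:

```python
import torch

class ReLUSquared(torch.nn.Module):
    # relu(x)^2: cheap, and continuously differentiable at zero, unlike plain ReLU.
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(x).square()
```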

blog.character.ai/technical/in...
Inside Kaiju - building conversational models at scale
What made Character.ai's early models so engaging? Before open-source models became the norm, our team built Kaiju - a family of in-house LLMs designed to power millions of fast, expressive conversati...
blog.character.ai
November 12, 2025 at 5:39 AM