Feels like Claude 3.7 while Claude 3.7 feels like GPT-4.5.
I don't think OpenAI wants people using GPT-4.5.
A minimal GPU in Verilog optimized for learning about how GPUs work from the ground up.
Built with <15 files of fully documented Verilog, complete documentation on architecture & ISA, working matrix addition/multiplication kernels, and full support for kernel simulation & execution traces
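To make the idea concrete: a matrix-addition kernel on a GPU is just a tiny program that every thread runs, with each thread picking its element from its thread id. A minimal Python sketch of that concept (illustrative only; this is not the repo's Verilog or its ISA):

```python
# Conceptual sketch of a GPU matrix-addition kernel: one thread per output element.
def matadd_kernel(thread_id, a, b, out, n_cols):
    row, col = divmod(thread_id, n_cols)   # derive this thread's element from its id
    out[row][col] = a[row][col] + b[row][col]

a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
out = [[0, 0], [0, 0]]
for tid in range(4):                       # on real hardware these run in parallel
    matadd_kernel(tid, a, b, out, n_cols=2)
print(out)                                 # [[6, 8], [10, 12]]
```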
Unlike r1, which was trained to "think" in a readable, kinda charming way, r1-zero is the self-trained reasoner that had the *aha moment* about math & produces "thoughts" that are not human readable
veRL is a flexible, efficient and production-ready RL training framework designed for large language models (LLMs).
github.com/volcengine/v...
DeepSeek R1 is just the tip of the iceberg of rapid progress.
People underestimate the long-term potential of “reasoning.”
Model: huggingface.co/deepseek-ai/...
They have also released two Janus Pro models.
Model 1B: huggingface.co/deepseek-ai/...
Model 7B: huggingface.co/deepseek-ai/...
⚡ Performance on par with OpenAI-o1
📖 Fully open-weight model & technical report
🏆 MIT licensed: Distill & commercialize freely!
🌐 Website & API are live now!
Demo: chat.deepseek.com
Models: huggingface.co/deepseek-ai
Here are my notes on the new models, plus how I ran DeepSeek-R1-Distill-Llama-8B on my Mac using Ollama and LLM
simonwillison.net/2025/Jan/20/...
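For reference, here is roughly what running the distilled model looks like from Python via the Ollama client package (the post itself uses the Ollama and LLM command-line tools; the deepseek-r1:8b tag is my assumption for the Llama-8B distill):

```python
# Minimal sketch: chat with a locally pulled DeepSeek-R1 distill through Ollama's Python client.
# Assumes the Ollama server is running and the model has already been pulled.
import ollama

response = ollama.chat(
    model="deepseek-r1:8b",  # assumed tag for DeepSeek-R1-Distill-Llama-8B
    messages=[{"role": "user", "content": "Explain chain-of-thought reasoning in two sentences."}],
)
print(response["message"]["content"])
```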
All evals should move to agentic evals in 2025 in my opinion.
We’re just leaving so much of our models’ capability on the table.
Benchmarked with smolagents: github.com/huggingface/...
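As a sketch of what an agentic eval harness can look like, this is the kind of loop smolagents gives you (adapted from its README at the time; class names may have changed since, so treat it as illustrative):

```python
# Minimal agentic setup with smolagents: the model writes and executes code,
# optionally calling a web-search tool, instead of answering in a single shot.
from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel

agent = CodeAgent(tools=[DuckDuckGoSearchTool()], model=HfApiModel())
result = agent.run("How many seconds would it take a leopard at full speed to run through Pont des Arts?")
print(result)
```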
Customer Service: 💎+❤️ = Empathetic Gemma😊
Marketing: 💎+💡 = Idea Generator Gemma🚀
Coding: 💎+💻 = Code Guru Gemma👩‍💻
Multiple LoRA adapters on the same GCP endpoint!
Customize your AI and maximize your resources
medium.com/google-cloud...
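The post is about a Vertex AI endpoint on GCP, but the underlying trick is generic: keep one copy of the base Gemma weights loaded and switch LoRA adapters per request. Here is a sketch of the same idea using vLLM's multi-LoRA support (adapter names and paths are made up):

```python
# One base model, several LoRA adapters selected per request.
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

llm = LLM(model="google/gemma-2b", enable_lora=True)
params = SamplingParams(max_tokens=128)

# Route a customer-service prompt to the "empathetic" adapter...
support = llm.generate(
    "Write a friendly reply to a customer asking about a late refund.",
    params,
    lora_request=LoRARequest("empathetic-gemma", 1, "/adapters/empathetic"),  # hypothetical path
)

# ...and a coding prompt to the "code guru" adapter, without reloading the base weights.
coding = llm.generate(
    "Write a Python function that reverses a linked list.",
    params,
    lora_request=LoRARequest("code-guru-gemma", 2, "/adapters/code_guru"),  # hypothetical path
)
```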
Link: arxiv.org/pdf/2401.023...
#ReinforcementLearning #ICLR2025 #ACL2025 #NAACL2025 #NeurIPS2024 #ICML2025 #DeepRL #DeepReinforcementLearning
So an hour of streaming Netflix is energy-equivalent to roughly 70,000-90,000 tokens from a 65B-parameter model. arxiv.org/pdf/2310.03003
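A quick back-of-envelope check of that comparison (both constants below are my own rough assumptions for illustration, not figures taken from the linked paper):

```python
# Rough energy comparison: an hour of video streaming vs. 65B-model token generation.
netflix_kwh_per_hour = 0.08                      # assumed energy for one streaming hour
joules_per_hour = netflix_kwh_per_hour * 3.6e6   # 1 kWh = 3.6 million joules
joules_per_token = 3.5                           # assumed inference energy per token, 65B model
print(f"{joules_per_hour / joules_per_token:,.0f} tokens")  # ≈ 82,000, inside the 70-90k range
```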