hifromv.bsky.social
@hifromv.bsky.social
ML/AI, web and app dev
Reposted
OpenAI launches new tools to help developers build AI agents
OpenAI’s new Responses API comes with web search, the ability to look through files, and computer use out of the box.
buff.ly
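For a sense of what those "out of the box" tools look like in practice, here is a rough sketch of a Responses API call with built-in web search; the tool name and fields follow OpenAI's announced Python SDK usage as I recall it and should be checked against the current docs.

```python
# Rough sketch of the new Responses API with the built-in web search tool.
# Tool/field names are assumptions based on the announcement, not verified docs.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.responses.create(
    model="gpt-4o",
    tools=[{"type": "web_search_preview"}],  # built-in web search, no custom tool code
    input="What did OpenAI announce for AI agents this week?",
)

print(response.output_text)  # convenience accessor for the final text output
```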
March 11, 2025 at 5:10 PM
Reposted
Wrote up my first impressions of the new GPT-4.5 - it's quite slow, VERY expensive and doesn't appear to be a notable leap forward from GPT-4o or o3-mini simonwillison.net/2025/Feb/27/...
Introducing GPT-4.5
GPT-4.5 is out today as a "research preview" - it's available to OpenAI Pro ($200/month) customers but also to developers with an API key. OpenAI also published [a GPT-4.5 system …
simonwillison.net
February 27, 2025 at 9:26 PM
Reposted
Been using GPT-4.5 for a few days and it is a very odd and interesting model. It can write beautifully, is very creative, and is occasionally oddly lazy on complex projects.

It feels like Claude 3.7, while Claude 3.7 feels like GPT-4.5.
February 27, 2025 at 8:30 PM
Reposted
OpenAI GPT API pricing.

I don't think OpenAI wants people using GPT-4.5.
February 27, 2025 at 8:45 PM
Reposted
Some initial notes on Claude 3.7 Sonnet, including SVGs of pelicans on bicycles (which it does very well) simonwillison.net/2025/Feb/24/...
Claude 3.7 Sonnet and Claude Code
Anthropic released **Claude 3.7 Sonnet** today - skipping the name "Claude 3.6" because the Anthropic user community had already started using that as the unofficial name for their [October update …
simonwillison.net
February 24, 2025 at 8:51 PM
Reposted
Claude 3.7, the latest model from Anthropic, can be instructed to engage in a specific amount of reasoning to solve hard problems.
Anthropic Launches the World’s First ‘Hybrid Reasoning’ AI Model
buff.ly
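A minimal sketch of that "specific amount of reasoning" control, assuming Anthropic's extended-thinking parameter with a token budget; the model id and budget values below are illustrative, not taken from the article.

```python
# Sketch: dial the amount of reasoning via a thinking-token budget.
# Model id and budget are assumptions for illustration.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

message = client.messages.create(
    model="claude-3-7-sonnet-latest",
    max_tokens=2048,
    thinking={"type": "enabled", "budget_tokens": 1024},  # cap on reasoning tokens
    messages=[{"role": "user", "content": "Prove that the square root of 2 is irrational."}],
)

# The response interleaves "thinking" blocks with the final "text" blocks.
for block in message.content:
    if block.type == "text":
        print(block.text)
```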
February 24, 2025 at 6:50 PM
Reposted
Claude 3.7 Sonnet
February 24, 2025 at 6:57 PM
Reposted
tiny-gpu

A minimal GPU in Verilog optimized for learning about how GPUs work from the ground up.

Built with <15 files of fully documented Verilog, complete documentation on architecture & ISA, working matrix addition/multiplication kernels, and full support for kernel simulation & execution traces
February 11, 2025 at 5:22 AM
Reposted
Surprised we haven't seen more about DeepSeek r1-zero (no one seems to host it?)

Unlike r1, which was trained to "think" in a readable, kinda charming way, r1-zero is the self-trained reasoner that had the *aha moment* about math & produces "thoughts" that are not human-readable.
January 31, 2025 at 2:17 AM
Reposted
ByteDance's veRL: Volcano Engine Reinforcement Learning for LLM

veRL is a flexible, efficient and production-ready RL training framework designed for large language models (LLMs).

github.com/volcengine/v...
GitHub - volcengine/verl: veRL: Volcano Engine Reinforcement Learning for LLM
veRL: Volcano Engine Reinforcement Learning for LLM - volcengine/verl
github.com
January 30, 2025 at 5:10 AM
Reposted
The story Nathan tells here is more nuanced than the headline implies. He thinks chain-of-thought abilities learned from RL will generalize beyond domains like math and code that are easy to verify. But generalization might be uneven and might require bootstrapping new verifiers. +
Why reasoning models will generalize
DeepSeek R1 is just the tip of the iceberg of rapid progress.
People underestimate the long-term potential of “reasoning.”
buff.ly
January 29, 2025 at 1:42 PM
Reposted
DeepSeek has released the Janus model.

Model: huggingface.co/deepseek-ai/...

They have also released two Janus Pro models as well.

Model 1B: huggingface.co/deepseek-ai/...
Model 7B: huggingface.co/deepseek-ai/...
January 27, 2025 at 5:57 PM
Reposted
DeepSeek R1 appears to be a VERY strong model for coding - examples for both C and Python here: simonwillison.net/2025/Jan/27/...
ggml : x2 speed for WASM by optimizing SIMD
PR by Xuan-Son Nguyen for `llama.cpp`: > This PR provides a big jump in speed for WASM by leveraging SIMD instructions for `qX_K_q8_K` and `qX_0_q8_0` dot product functions. > > …
simonwillison.net
January 27, 2025 at 6:33 PM
Reposted
OpenAI's Canvas feature got a big upgrade today, turning it into a direct competitor for Anthropic's excellent Claude Artifacts feature - my notes here: simonwillison.net/2025/Jan/25/...
OpenAI Canvas gets a huge upgrade
[Canvas](https://openai.com/index/introducing-canvas/) is the ChatGPT feature where ChatGPT can open up a shared editing environment and collaborate with the user on creating a document or piece of co...
simonwillison.net
January 25, 2025 at 1:26 AM
Reposted
DeepSeek-R1!

⚡ Performance on par with OpenAI-o1
📖 Fully open-weight model & technical report
🏆 MIT licensed: Distill & commercialize freely!

🌐 Website & API are live now!
Demo: chat.deepseek.com
Models: huggingface.co/deepseek-ai
January 20, 2025 at 3:12 PM
Reposted
DeepSeek released a whole family of inference-scaling / "reasoning" models today, including distilled variants based on Llama and Qwen

Here are my notes on the new models, plus how I ran DeepSeek-R1-Distill-Llama-8B on my Mac using Ollama and LLM

simonwillison.net/2025/Jan/20/...
DeepSeek-R1 and exploring DeepSeek-R1-Distill-Llama-8B
DeepSeek are the Chinese AI lab who dropped the best currently available open weights LLM on Christmas day, DeepSeek v3. That model was trained in part using their unreleased R1 …
simonwillison.net
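The linked post uses the llm CLI; as a rough local-only sketch, the same distilled model can also be queried through Ollama's Python client, assuming the model tag below is correct and has already been pulled.

```python
# Not the exact commands from the post (which uses the llm CLI): a small sketch
# of querying the distilled model via Ollama's Python client, assuming
# `ollama pull deepseek-r1:8b` has been run first.
import ollama

response = ollama.chat(
    model="deepseek-r1:8b",  # assumed tag for DeepSeek-R1-Distill-Llama-8B
    messages=[{"role": "user", "content": "Explain what a distilled reasoning model is."}],
)

# The R1 distills emit their chain of thought inside <think> tags before the answer.
print(response["message"]["content"])
```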
January 20, 2025 at 3:22 PM
Reposted
Mind blown 👇 when people ask whether you need an agent framework at all!

All evals should move to agentic evals in 2025 in my opinion.

We’re just leaving so much of our models’ capabilities on the table.

Benchmarked with smolagents: github.com/huggingface/...
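For context on how little code that takes, here is a minimal smolagents sketch in the shape of the library's quickstart; class names may differ between versions.

```python
# Minimal agentic-eval-style sketch with smolagents: a model, one tool, one run() call.
# Class names follow the early smolagents releases and may have changed since.
from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel

agent = CodeAgent(
    tools=[DuckDuckGoSearchTool()],  # web search as the only tool
    model=HfApiModel(),              # default Hugging Face Inference API model
)

answer = agent.run("What is the current stable version of Python?")
print(answer)
```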
January 15, 2025 at 10:40 AM
Reposted
Unlock a universe of AI personalities with ONE 💎 Gemma model! 🤯

Customer Service: 💎+❤️ = Empathetic Gemma😊
Marketing: 💎+💡 = Idea Generator Gemma🚀
Coding: 💎+💻 = Code Guru Gemma👩‍💻

Multiple LoRA adapters on the same GCP endpoint!
Customize your AI and maximize your resources

medium.com/google-cloud...
Open Models on Vertex AI with Hugging Face: Serving multiple LoRA Adapters on Vertex AI
This blog post provides a practical example of how to deploy a Gemma 2 model with multiple LoRA adapters on Vertex AI using custom…
medium.com
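The linked post covers the Vertex AI deployment itself; as a rough sketch of the underlying idea, one base Gemma model can serve several LoRA adapters and switch between them per request, for example with peft. The adapter repo ids below are placeholders, not real checkpoints.

```python
# Sketch: one base model, multiple LoRA "personalities", switched per request.
# Base model id is assumed; adapter repo ids are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "google/gemma-2-2b-it"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id)

# Register each adapter under its own name.
model = PeftModel.from_pretrained(base, "your-org/gemma-support-lora", adapter_name="support")
model.load_adapter("your-org/gemma-marketing-lora", adapter_name="marketing")
model.load_adapter("your-org/gemma-coding-lora", adapter_name="coding")

def generate(prompt: str, adapter: str) -> str:
    model.set_adapter(adapter)  # route this request to one personality
    inputs = tokenizer(prompt, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=128)
    return tokenizer.decode(out[0], skip_special_tokens=True)

print(generate("Draft a friendly reply to a refund request.", adapter="support"))
```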
January 14, 2025 at 9:53 AM
Reposted
My notes so far on Codestral 25.01 - the new code-focused API-only LLM released today by @mistralai.bsky.social simonwillison.net/2025/Jan/13/...
Codestral 25.01
Brand new code-focused model from Mistral. Unlike [the first Codestral](https://simonwillison.net/2024/May/30/codestral/) this one isn't ([yet](https://twitter.com/sophiamyang/status/18789084748114046...
simonwillison.net
January 13, 2025 at 9:45 PM
Reposted
I recently wrote a survey of deep reinforcement learning. The paper is a compact guide to some of the key concepts in reinforcement learning.

Link: arxiv.org/pdf/2401.023...

#ReinforcementLearning #ICLR2025 #ACL2025 #NAACL2025 #NeurIPS2024 #ICML2025 #DeepRL #DeepReinforcementLearning
January 12, 2025 at 4:21 PM
Reposted
Most of the talk around AI and energy use refers to an older 2020 estimate of GPT-3's energy consumption, but a more recent paper directly measures the energy use of Llama 65B at 3-4 joules per decoded token.

So an hour of streaming Netflix is equivalent to roughly 70,000-90,000 tokens from a 65B model. arxiv.org/pdf/2310.03003
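A back-of-envelope check of that comparison, assuming roughly 0.08 kWh for an hour of streaming; that figure is an assumption consistent with the post, not taken from the paper.

```python
# Back-of-envelope check: streaming-hour energy divided by joules per decoded token.
streaming_joules = 0.08 * 3.6e6          # assumed 0.08 kWh per streaming hour, in joules
for joules_per_token in (3.0, 4.0):      # measured range for Llama 65B decoding
    tokens = streaming_joules / joules_per_token
    print(f"{joules_per_token} J/token -> {tokens:,.0f} tokens per streaming hour")
# Prints roughly 96,000 and 72,000, i.e. the 70,000-90,000 range in the post.
```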
January 13, 2025 at 2:43 AM
Reposted
If you have 15 minutes to read and learn something important about #Energy and #AI, please read this Christmas present from @mliebreich.bsky.social to the world.
Liebreich: Generative AI – The Power and the Glory | BloombergNEF
This year will go down in history as the year the energy sector woke up to AI. This is also the year AI woke up to energy. Is the data center power frenzy just the latest of a long line of energy sect...
about.bnef.com
January 12, 2025 at 9:00 AM
Reposted
Nvidia announces $3,000 personal AI supercomputer called Digits
It’s the size of a desktop.
buff.ly
January 7, 2025 at 5:40 AM