www.futureofai.mit.edu
www.futureofai.mit.edu
This is the best result so far!
This is the best result so far!
Evals include:
1. AIME 2024, 2025
2. CODEFORCES CODE COMPLETION
3. GPQA DIAMOND SCIENCE QUESTIONS
4. SWE BENCH
5. AIDER POLYGOT
And more..
Evals include:
1. AIME 2024, 2025
2. CODEFORCES CODE COMPLETION
3. GPQA DIAMOND SCIENCE QUESTIONS
4. SWE BENCH
5. AIDER POLYGOT
And more..
This happened during the testing of a pre-release version of the o3 model by Transluce AI.
(Complete details 👇)
This happened during the testing of a pre-release version of the o3 model by Transluce AI.
(Complete details 👇)
Posted in reddit by u/lucid_sky_
Posted in reddit by u/lucid_sky_
- smarter, lower latency
- way better at coding
- stronger instruction following
- 1M long context support
with 4.1 (smart & effective)
- smarter, lower latency
- way better at coding
- stronger instruction following
- 1M long context support
with 4.1 (smart & effective)
Google just launched DolphinGemma: a foundational AI model trained to generate audio-to-audio for dolphin communication!
Google just launched DolphinGemma: a foundational AI model trained to generate audio-to-audio for dolphin communication!
Pretty impressive.
Pretty impressive.
Impressive results in coding btw
Looks like @GoogleDeepMind is cooking,
livebench.ai/#/
aistudio.google.com/prompts/new_chat
Impressive results in coding btw
Looks like @GoogleDeepMind is cooking,
livebench.ai/#/
aistudio.google.com/prompts/new_chat
It scores pretty impressive btw in 17 out of 20 benchmarks compared to Openai's GPT-4o, 9 out of 20 against claude 3.5 sonnetv2, and 19 out of 20 againts gemini 1.5 pro.
It scores pretty impressive btw in 17 out of 20 benchmarks compared to Openai's GPT-4o, 9 out of 20 against claude 3.5 sonnetv2, and 19 out of 20 againts gemini 1.5 pro.
When done, it’ll be five times bigger than the one used for Claude 3.5 Sonnet, featuring hundreds of thousands of Amazon’s latest AI chip, Trainium 2.
Source: www.wired.com/story/amazon...
When done, it’ll be five times bigger than the one used for Claude 3.5 Sonnet, featuring hundreds of thousands of Amazon’s latest AI chip, Trainium 2.
Source: www.wired.com/story/amazon...
His concern is that Bad actors could fine-tune such models for all sorts of bad things.
His concern is that Bad actors could fine-tune such models for all sorts of bad things.
It allows you to apply different motions to various subjects within the same scene, offering greater control, smooth camera movements, and quality.
It allows you to apply different motions to various subjects within the same scene, offering greater control, smooth camera movements, and quality.
However, i think the model's censorship is quite strange.
However, i think the model's censorship is quite strange.
The San Francisco-based company, called /dev/agents plans to build a cloud-based operating system for AI Agents that can work across phones, laptops, and even cars.
#AI
The San Francisco-based company, called /dev/agents plans to build a cloud-based operating system for AI Agents that can work across phones, laptops, and even cars.
#AI
Qwen's QwQ-32B-Preview.
Qwen's QwQ-32B-Preview.