AIME
@aime-hq.bsky.social
AIME provides GPU cloud compute and develops AI-machines for deep learning and model inference (Multi-GPU workstations & HPC servers). We are in Berlin, Germany.
🚀 Baidu just released **ERNIE-4.5-VL-28B-A3B-Thinking** — open-source (Apache 2.0)!

✅ 3B active params
✅ 100% multimodal reasoning
✅ Visual reasoning, STEM, video understanding & “Thinking with Images”
✅ Tool use, precise grounding, dynamic zoom & search

👉 github.com/PaddlePaddle...
GitHub - PaddlePaddle/ERNIE: The official repository for ERNIE 4.5 and ERNIEKit – its industrial-grade development toolkit based on PaddlePaddle.
November 14, 2025 at 11:48 AM
MotionStream manipulates AI-generated videos in real time.

This is an exciting step towards a more intuitive, responsive, and creative future of AI content creation.

joonghyuk.com/motionstream...
MotionStream: Real-Time Video Generation with Interactive Motion Controls
MotionStream is a real-time, motion-controlled video generation system that enables streaming generation of arbitrarily long videos for interactive applications.
November 11, 2025 at 1:38 PM
DeepSeek-OCR (new, LLM-centric, research-focused) vs. PaddleOCR (established, production-ready, multilingual).

Two different approaches to Document AI. Check them out!

github.com/deepseek-ai/...
GitHub - deepseek-ai/DeepSeek-OCR: Contexts Optical Compression
October 22, 2025 at 11:06 AM
🚀 China’s InclusionAI (Ant Group/Alibaba) drops Ling-1T—a trillion-parameter open-source LLM with only 50B active per token!

✅ Beats Kimi-K2 & DeepSeek-V3
✅ Top in math (AIME’25)
✅ Efficient MoE design
✅ Strong multimodal & tool-use (~70% BFCL V3)

github.com/inclusionAI/Ling-V2
ZenMux
The Enterprise LLM Platform. Get a Unified API for all models, intelligent routing, and AI Model Insurance to eliminate hallucination risk.
zenmux.ai
October 21, 2025 at 1:16 PM
Samsung released the Tiny Recursion Model (TRM), a parameter-efficient approach to recursive reasoning.

👉 Key insight: it demonstrates that high-level reasoning on challenging tasks can be attained without large-scale foundation models.

github.com/SamsungSAILM...
GitHub - SamsungSAILMontreal/TinyRecursiveModels
October 9, 2025 at 4:58 PM
Human3R is a unified, feed-forward framework for online 4D human-scene reconstruction, in the world frame, from casually captured monocular videos.

fanegg.github.io/Human3R/
Human3R: Everyone Everywhere All at Once
October 8, 2025 at 12:35 PM
DeepSeek AI released DeepSeek-V3.2-Exp, an experimental version of its LLM. It builds on V3.1-Terminus by introducing DeepSeek Sparse Attention, which is designed to explore and validate optimizations for training and inference efficiency in long-context scenarios.

github.com/deepseek-ai/...
GitHub - deepseek-ai/DeepSeek-V3.2-Exp
September 29, 2025 at 1:28 PM
Tencent released HunyuanImage-3.0, a powerful native multimodal model for image generation.

The model has 80 billion parameters and is described as the largest and most powerful open-source image-generation model available to date.

github.com/Tencent-Huny...
GitHub - Tencent-Hunyuan/HunyuanImage-3.0: HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation
September 29, 2025 at 10:47 AM
The Chinese research group BICLab has announced what it describes as the world’s first “brain-like” large language model, an AI system built to consume less power, deliver higher performance, and run without relying on Nvidia hardware.

arxiv.org/abs/2509.05276
SpikingBrain Technical Report: Spiking Brain-inspired Large Models
Mainstream Transformer-based large language models face major efficiency bottlenecks: training computation scales quadratically with sequence length, and inference memory grows linearly, limiting long...
September 25, 2025 at 4:06 PM
Wan-Animate is a unified framework for character animation and replacement.

humanaigc.github.io/wan-animate/
Wan-Animate
Wan-Animate: Unified Character Animation and Replacement with Holistic Replication
September 25, 2025 at 3:33 PM
Lucy Edit Dev is the first open-source video editing model that performs instruction-guided edits on videos using free-text prompts.

github.com/DecartAI/luc...
GitHub - DecartAI/Lucy-Edit-ComfyUI
September 25, 2025 at 3:23 PM
Meta released MapAnything: Universal Feed-Forward Metric 3D Reconstruction, a simple, end-to-end trained transformer model that directly regresses the factored metric 3D geometry of a scene given various types of inputs (images, calibration, poses, or depth).

github.com/facebookrese...
GitHub - facebookresearch/map-anything: MapAnything: Universal Feed-Forward Metric 3D Reconstruction
September 25, 2025 at 3:18 PM
Meta released "Code World Model" (CWM), a 32-billion-parameter open-weights LLM designed to advance research on code generation with world models.

The release includes model weights, technical report, model card, and starter code.

github.com/facebookrese...
GitHub - facebookresearch/cwm: Research code artifacts for Code World Model (CWM) including inference tools, reproducibility, and documentation.
September 25, 2025 at 3:16 PM
Alibaba released Qwen3-Omni, a natively end-to-end, multilingual omni-modal (text, images, audio, video) foundation model that responds as a real-time stream in both text and natural speech. It is available under an open-source license.

github.com/QwenLM/Qwen3...
GitHub - QwenLM/Qwen3-Omni: Qwen3-omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generat...
September 23, 2025 at 11:25 AM
Which GPU is best suited for AI models?
Find a brief breakdown of current GPU types, sorted by performance, in our blog article: www.aime.info/blog/en/deep...
September 11, 2025 at 2:00 PM
Looking for an entry-level AI workstation to run LLMs locally?

👉 The AIME G500E is designed as a maintainable, efficient multi-GPU workstation with enough cooling and PSU capacity to host up to four high-end GPUs.

📺 Have a look: www.aime.info/en/shop/prod...
September 11, 2025 at 10:00 AM
🇨🇭 EPFL, ETH Zurich, & the Swiss National Supercomputing Centre (CSCS) released Apertus, Switzerland’s first large-scale open, multilingual LLM.

Link to Paper: raw.githubusercontent.com/swiss-ai/ape...

Link to GitHub: github.com/swiss-ai/

Link to weights: huggingface.co/collections/...
September 11, 2025 at 8:44 AM
DeepSeek V3.1 is out.

It is described as a transformer-based model with 560 billion parameters and a 1 million token context window. Its multimodal capabilities are said to include text, code, and image understanding, with support for over 100 languages.

deepseek.ai/blog/deepsee...
DeepSeek AI | Leading AI Language Models & Solutions
DeepSeek AI is the leading provider of advanced AI language models and enterprise solutions. Experience state-of-the-art artificial intelligence technology for your business needs.
September 3, 2025 at 4:11 PM
The Chinese company Z.ai released its models GLM-4.1V-Thinking and GLM-4.5V: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

The models are vLLM- and SGLang-ready!

github.com/zai-org/GLM-V/
Chat with Z.ai - Free AI Chatbot powered by GLM-4.5
Start a free chat with your AI expert for code and smart tools. Tell Z.ai what you need—a complete full-stack application, a stunning presentation, or professional-grade writing—and get instant result...
Z.ai
September 3, 2025 at 4:09 PM
Alibaba Cloud released Qwen Image, a 20B MMDiT image foundation model that achieves significant advances in complex text rendering and precise image editing.

The model is now natively supported in ComfyUI.

It’s said to outperform FLUX and comparable models.

github.com/QwenLM/Qwen-...
GitHub - QwenLM/Qwen-Image: Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.
September 3, 2025 at 4:08 PM
The AIME G500 is now even more powerful, supporting the Threadripper Pro 99xxWX CPUs!

www.aime.info/de/shop/prod...
AIME G500 - Multi GPU Workstation | AIME
AIME G500 - Workstation. The AIME G500 is designed as a maintenance-friendly high-end GPU workstation, with outstanding cooling performance and power supply capacity to run up to four high-end GPUs...
August 1, 2025 at 4:26 PM
The Chinese company Z.ai released GLM-4.5 as open source, a series of foundation models designed for intelligent agents.

👉 GLM-4.5: 355B total / 32B active parameters

👉 GLM-4.5-Air: 106B total / 12B active parameters

github.com/zai-org/GLM-...
Chat with Z.ai - Free AI for Presentations, Writing & Coding
Start a free chat with your AI assistant. Tell Z.ai what you need—a stunning presentation, professional-grade writing, or a complex code script—and get instant results.
Z.ai
July 29, 2025 at 12:39 PM
The Chinese company Moonshot AI released Kimi K2, a state-of-the-art mixture-of-experts (MoE) language model with 32 billion activated parameters and 1 trillion total parameters.

github.com/MoonshotAI/K...
GitHub - MoonshotAI/Kimi-K2: Kimi K2 is the large language model series developed by Moonshot AI team
July 15, 2025 at 9:08 AM
TotalSegmentator is a tool for segmenting most major anatomical structures in any CT or MR image, created by the Department of Research and Analysis at University Hospital Basel.

github.com/wasserth/Tot...
GitHub - wasserth/TotalSegmentator: Tool for robust segmentation of >100 important anatomical structures in CT and MR images
July 1, 2025 at 9:04 AM
Black Forest Labs released FLUX.1 Kontext [dev], which delivers proprietary-level image editing performance in a 12B parameter model that can run on consumer hardware.

bfl.ai/announcement...
Black Forest Labs - Frontier AI Lab
Amazing AI models from the Black Forest.
June 27, 2025 at 9:06 AM