Lightnews — Scholar-powered news

MLCommons

@mlcommons.org

MLCommons is an AI engineering consortium, built on a philosophy of open collaboration to improve AI systems. Through our collective engineering efforts, we continually measure and improve AI technologies' accuracy, safety, speed, and efficiency.

MLCommons

@mlcommons.org

The next generation of AI won't just be innovative—it'll be resilient.

Access the benchmark and full findings: mlcommons.org/ailuminate/j...

Join the conversation!
6/6
#AIRiskandReliability #AISecurity

October 15, 2025 at 7:33 PM

MLCommons

@mlcommons.org

Why this matters:
→ Developers get standardized metrics to find and fix vulnerabilities
→ Policymakers get transparent, reproducible data
→ Users get systems they can actually trust

We're making hidden risks visible and measurable.
5/6

October 15, 2025 at 7:33 PM

MLCommons

@mlcommons.org

The Jailbreak Benchmark v0.5 tests AI resilience across:
- Text-to-text scenarios
- Multimodal scenarios
- 12 hazard categories (violent crimes, CBRNE, child exploitation, suicide/self-harm, and more)

Built on our AILuminate safety benchmark methodology.
4/6

October 15, 2025 at 7:33 PM

MLCommons

@mlcommons.org

What is jailbreaking?

It's when users manipulate AI systems to bypass safety filters and produce harmful, unintended, or policy-violating content.

It's not theoretical. It's happening now.
3/6

October 15, 2025 at 7:33 PM

MLCommons

@mlcommons.org

The gap between AI safety and security is real—and dangerous.

89% of models showed degraded safety performance when exposed to common jailbreak techniques.

As AI powers healthcare, finance, and critical infrastructure, this vulnerability can't be ignored.
2/6

October 15, 2025 at 7:33 PM

MLCommons

@mlcommons.org

Nebius, NVIDIA, Oracle, Quanta Cloud Technology, Red Hat, Supermicro, TheStage.AI, University of Florida, Vultr

Results:
Datacenter: mlcommons.org/benchmarks/i...
Edge: mlcommons.org/benchmarks/i...
#MLPerf

September 9, 2025 at 6:15 PM

MLCommons

@mlcommons.org

Llama 2 70B shows remarkable progress - best systems now 5x faster than v4.0.
Thanks to all submitters: AMD, Amitash Nanda, ASUSTeK, Broadcom, Cisco, CoreWeave, Dell, GATEOverflow, GigaComputing, Google, HPE, Intel, KRAI, Lambda, Lenovo, MangoBoost, Microsoft Azure, MiTAC,

September 9, 2025 at 6:15 PM

MLCommons

@mlcommons.org

6/6
Congrats to all contributors and working group members for advancing industry benchmarking! #MLPerf #Automotive #ADAS #AutonomousVehicles #AI #Cognata #Motional #NVIDIA #AVCC #MLCommons

August 27, 2025 at 6:15 PM

MLCommons

@mlcommons.org

5/6
The results are designed to help OEMs, suppliers, and the whole ecosystem make informed decisions for next-generation, safety-critical automotive AI systems. See results: mlcommons.org/benchmarks/mlperf-automotive/

Benchmark MLPerf Automotive MLCommons V0.5

The MLPerf Automotive benchmark suite measures the performance of computers intended for automotive, both for Advanced Driving Assistance System/Autonomous Driving (ADAS/AD) and In-Vehicle Infotainmen...

mlcommons.org

August 27, 2025 at 6:15 PM

MLCommons

@mlcommons.org

4/6
MLPerf Automotive v0.5 covers 2D object recognition & segmentation and 3D object recognition using high-res datasets from Cognata (8-megapixel imagery) and Motional (nuScenes).

August 27, 2025 at 6:15 PM

MLCommons

@mlcommons.org

3/6
Special thanks to submitters GateOverflow and NVIDIA, and dataset partners Cognata_Ltd and Motional for making these benchmarks possible.

August 27, 2025 at 6:15 PM

MLCommons

@mlcommons.org

2/6
This milestone is powered by collaboration across Ambarella, ARM, Bosch, C-Tuning Foundation, CeCaS, Cognata, Motional, NVIDIA, Qualcomm, Red Hat, Samsung, Siemens EDA, UC Davis, and ZF Group.

August 27, 2025 at 6:15 PM

MLCommons

@mlcommons.org

Simplyblock, TTA, UBIX, IBM, WDC, and YanRong.
Check the results here:
mlcommons.org/benchmarks/s...

#MLPerf #AI #Storage #Benchmarking #MachineLearning #MLCommons

Benchmark MLPerf Storage | MLCommons V1.1 Results

The MLPerf Storage benchmark suite measures how fast storage systems can supply training data when a model is being trained. Below is a short summary of the workloads and metrics from the latest round...

mlcommons.org

August 4, 2025 at 5:36 PM

MLCommons

@mlcommons.org

5/ Congratulations and thanks to all submitters!
Alluxio, Argonne National Lab, DDN, ExponTech, FarmGPU, H3C, Hammerspace, HPE, JNIST/Huawei, Juicedata, Kingston, KIOXIA, Lightbits Labs, MangoBoost, Micron, Nutanix, Oracle, Quanta Cloud Technology, Samsung, Sandisk,

MLCommons - Better AI for Everyone

MLCommons aims to accelerate AI innovation to benefit everyone. Its philosophy of open collaboration and collaborative engineering seeks to improve AI systems by continually measuring and improving t...

MLCommons.org

August 4, 2025 at 5:36 PM
