#AgentLab
November 5, 2025 at 2:17 PM
The BrowserGym Ecosystem for Web Agent Research

Thibault Le Sellier de Chezelles, Maxime Gasse, Alexandre Lacoste et al.

Action editor: Lingpeng Kong

https://openreview.net/forum?id=5298fKGmv3

#agentlab #agent #agents
April 7, 2025 at 12:07 AM
WebArena by Zhou et al.; AgentLab and BrowserGym by @servicenow.bsky.social allowed us to explore the latest agents; @gradio-hf.bsky.social enabled us to design UIs for implementing our ARIA framework, and @hf.co provided a hosting platform for 100GB+ artifacts.

bsky.app/profile/xhlu...
Agents like OpenAI Operator can solve complex computer tasks, but what happens when users use them to cause harm, e.g. spread misinformation?

To find out, we introduce SafeArena (safearena.github.io), a benchmark to assess the capabilities of web agents to complete harmful web tasks. A thread 👇
March 10, 2025 at 5:45 PM
New #Expert Certification:

The BrowserGym Ecosystem for Web Agent Research

Thibault Le Sellier de Chezelles, Maxime Gasse, Alexandre Lacoste et al.

https://openreview.net/forum?id=5298fKGmv3

#agentlab #agent #agents
March 9, 2025 at 1:12 AM
Really glad that this work is out!

AgentLab and BrowserGym will, in my opinion, become key components of web agent research and part of the toolkit of most web agent researchers.

Read the paper if you are interested in learning more about what the platform covers!
We’re really excited to release this large collaborative work for unifying web agent benchmarks under the same roof.

In this TMLR paper, we take an in-depth look at #BrowserGym and #AgentLab. We also report some unexpected performance results from Claude 3.5 Sonnet.
December 14, 2024 at 1:16 AM
Visit our paper
📃https://arxiv.org/abs/2412.05467
Or our open-source tools:
🤖https://github.com/ServiceNow/AgentLab
💪https://github.com/ServiceNow/BrowserGym
🎯https://github.com/ServiceNow/WorkArena
December 12, 2024 at 5:55 PM
Very excited to see this work coming out from @servicenowresearch.bsky.social. Can't wait to test a trained model in #AgentLab
🎉 Excited to introduce BigDocs!
An open, transparent multimodal dataset designed for:
📄 Documents
🌐 Web content
🖥️ GUI understanding
👨‍💻 Code generation from images
We’re also launching BigDocs-Bench:
➡️ Document, Web, GUI Visual reasoning
➡️ Converting images into JSON, Markdown, LaTeX, SVG, and more!
December 10, 2024 at 10:55 PM
Oops sorry it’s actually called AgentLab but it’s from the same people.

From what I understand it’s a way to run parallel experiments using benchmarks for web agents.

I’m not familiar with agent benchmarks so I’m not sure how good the included ones are.
December 8, 2024 at 1:13 AM
Discover AgentLab, an open-source framework for developing and evaluating web agents

➡️ www.actuia.com/actualite/ag... via @ActuIA
AgentLab, an open-source framework for developing and evaluating web agents
Launched by ServiceNow, AgentLab is an open-source framework that aims to facilitate the development and evaluation of web agents. Its main goal is
www.actuia.com
December 7, 2024 at 9:00 AM
Thoughts on this? >> ServiceNow Releases AgentLab: A New Open-Source Python Package for Developing and Evaluating Web Agents: Developing web agents is a challenging area of AI research that has attracted significant attention in recent… >> Comment below! #mhealth #AI #IoT #healthtech #industry40
ServiceNow Releases AgentLab: A New Open-Source Python Package for Developing and Evaluating Web Agents
Developing web agents is a challenging area of AI research that has attracted significant attention in recent years. As the web becomes more dynamic and complex, it demands advanced capabilities from agents that interact autonomously with online…
dlvr.it
December 5, 2024 at 8:17 AM
🔍 Analyse your agent's behavior using AgentLab-XRay, a custom UI allowing you to navigate all your experiments.
December 3, 2024 at 9:02 PM
AgentLab: github.com/ServiceNow/AgentLab/
🚀 Easy large-scale parallel agent experiments
🔧 Building blocks for crafting agents over BrowserGym
🤖 Unified LLM API for seamless integration
🔁 Reproducibility features for consistent results
🏆 Unified Leaderboard across multiple benchmarks
December 3, 2024 at 9:02 PM
🧵-1
We are thrilled to release #AgentLab, a new open-source package for developing and evaluating web agents. This builds on the new #BrowserGym package which supports 10 different benchmarks, including #WebArena.
December 3, 2024 at 9:02 PM