II. Benchmark overview
Reference · livebenchmark:webarena
WebArena overview
Realistic web environment benchmark for autonomous agents, covering e-commerce, GitLab, Reddit, and CMS sites.
Attributes
displayName: WebArena
homepageUrl:
kind: web-agent
targetsKind: AgentVersion
description: Realistic web environment benchmark for autonomous agents, covering e-commerce, GitLab, Reddit, and CMS sites.
Outgoing edges
applies_to (1)
- domain:web-development · Domain · Web Development
covers (1)
- skill-area:browser-automation · SkillArea · Browser Automation
Incoming edges
bounds_subject (1)
- scope-boundary:webarena.scope · ScopeBoundary
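
The record above can be read as a small typed graph: one node with attributes, two outgoing edges, and one incoming edge. The following is a minimal sketch of that reading, not the schema of the system that produced this page. The class names (`BenchmarkNode`, `Edge`), the snake_case field names, and the helpers `outgoing` and `incoming` are all hypothetical. `homepage_url` is left as `None` because the record lists no value for it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Edge:
    """A directed, labelled edge between two node ids (hypothetical shape)."""
    relation: str  # edge label, e.g. "applies_to"
    source: str    # id of the node the edge starts from
    target: str    # id of the node the edge points to


@dataclass(frozen=True)
class BenchmarkNode:
    """Attributes of a benchmark record (hypothetical field names)."""
    node_id: str
    display_name: str
    kind: str
    targets_kind: str
    description: str
    homepage_url: Optional[str] = None  # no value is given in the record


WEBARENA = BenchmarkNode(
    node_id="livebenchmark:webarena",
    display_name="WebArena",
    kind="web-agent",
    targets_kind="AgentVersion",
    description=(
        "Realistic web environment benchmark for autonomous agents, "
        "covering e-commerce, GitLab, Reddit, and CMS sites."
    ),
)

# Edges exactly as listed in the record; direction follows the
# "Outgoing edges" / "Incoming edges" headings.
EDGES: List[Edge] = [
    Edge("applies_to", WEBARENA.node_id, "domain:web-development"),
    Edge("covers", WEBARENA.node_id, "skill-area:browser-automation"),
    Edge("bounds_subject", "scope-boundary:webarena.scope", WEBARENA.node_id),
]


def outgoing(node_id: str, edges: List[Edge]) -> List[Edge]:
    """Return the edges that start at node_id."""
    return [e for e in edges if e.source == node_id]


def incoming(node_id: str, edges: List[Edge]) -> List[Edge]:
    """Return the edges that end at node_id."""
    return [e for e in edges if e.target == node_id]


if __name__ == "__main__":
    for e in outgoing(WEBARENA.node_id, EDGES):
        print(f"{e.relation} -> {e.target}")
    for e in incoming(WEBARENA.node_id, EDGES):
        print(f"{e.relation} <- {e.source}")
```

Storing edges in one flat list and deriving the outgoing and incoming views by filtering keeps each edge stated once, so the two views cannot disagree.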