II.
Benchmark overview
Reference · livebenchmark:the-agent-company
TheAgentCompany overview
CMU benchmark simulating a real software-company environment (Gitea, RocketChat, Plane, OwnCloud, etc.) where agents complete consequential workplace tasks across tools.
Attributes
displayName: TheAgentCompany
homepageUrl: (no value listed)
kind: full-stack
targetsKind: AgentVersion
description: CMU benchmark simulating a real software-company environment (Gitea, RocketChat, Plane, OwnCloud, etc.) where agents complete consequential workplace tasks across tools.
Outgoing edges
applies_to (2)
- domain:software-engineering · Domain · Software Engineering
- domain:operations · Domain · Operations
covers (2)
- skill-area:multi-app-orchestration · SkillArea · Multi-App Orchestration
- skill-area:bug-fixing-from-issues · SkillArea · Bug Fixing from Issue Descriptions
Incoming edges
None.
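For readers who want to work with this record programmatically, the entry above can be modeled as a node with attributes and labeled edges. The sketch below is illustrative only: the `BenchmarkNode` class, its field names, and the `edge_count` helper are hypothetical and do not reflect any actual schema or API behind this reference. Only the identifiers and values are taken from the record itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class BenchmarkNode:
    """Hypothetical container for one reference entry (not an official schema)."""
    node_id: str
    attributes: Dict[str, Optional[str]]
    # Edge label -> list of target node ids.
    outgoing: Dict[str, List[str]] = field(default_factory=dict)
    incoming: Dict[str, List[str]] = field(default_factory=dict)

    def edge_count(self, label: str) -> int:
        """Number of outgoing edges with the given label (0 if the label is absent)."""
        return len(self.outgoing.get(label, []))


# Values copied from the record above; the description attribute is omitted for brevity.
the_agent_company = BenchmarkNode(
    node_id="livebenchmark:the-agent-company",
    attributes={
        "displayName": "TheAgentCompany",
        "homepageUrl": None,  # no value is listed in the record
        "kind": "full-stack",
        "targetsKind": "AgentVersion",
    },
    outgoing={
        "applies_to": ["domain:software-engineering", "domain:operations"],
        "covers": [
            "skill-area:multi-app-orchestration",
            "skill-area:bug-fixing-from-issues",
        ],
    },
)
```

Calling `the_agent_company.edge_count("applies_to")` counts the targets under that label, which should match the count shown next to the edge name in the listing.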