| id | displayName | cluster |
|---|---|---|
| benchmark:android-world | AndroidWorld | benchmarks |
| benchmark:os-world | OSWorld | benchmarks |
| benchmark:swe-bench | SWE-bench | benchmarks |
| benchmark:swe-bench-multimodal | SWE-bench Multimodal | benchmarks |
| benchmark:swe-bench-verified | SWE-bench Verified | benchmarks |
| benchmark:swe-lancer | SWE-Lancer | benchmarks |
| benchmark:the-agent-company | TheAgentCompany | benchmarks |