displayName
SWE-Lancer
homepageUrl
https://github.com/openai/SWELancer-Benchmark
kind
full-stack
targetsKind
AgentVersion
description
OpenAI benchmark of more than 1,400 paid freelance software-engineering
tasks scraped from Upwork, each with its associated bounty. Measures an
agent's end-to-end ability to deliver shippable freelance work.