Agentic AI Atlas · SWE-bench Verified
Benchmark overview

benchmark:swe-bench-verified

Reference · live

SWE-bench Verified overview

Human-verified subset of SWE-bench (500 cases) with cleaned task statements and verified-solvable issues.

Benchmark · Outgoing: 2 · Incoming: 29

Attributes

displayName: SWE-bench Verified
homepageUrl: https://www.swebench.com/
kind: full-stack
targetsKind: AgentVersion
description: Human-verified subset of SWE-bench (500 cases) with cleaned task statements and verified-solvable issues.

Outgoing edges

covers (1)
  • skill-area:bug-fixing-from-issues · SkillArea · Bug Fixing from Issue Descriptions
refines (1)
  • benchmark:swe-bench · Benchmark · SWE-bench

Incoming edges

belongs_to_benchmark (1)
  • test-set:swe-bench-verified-2024-12 · TestSet · SWE-bench Verified 2024-12
bounds_subject (1)
  • scope-boundary:swe-bench-verified.scope · ScopeBoundary
for_benchmark (12)
  • eval-run:swe-bench-verified.claude-haiku-4-5.2025-10 · EvalRun
  • eval-run:swe-bench.deepseek-v3.2024-12 · EvalRun
  • eval-run:swe-bench-verified.gemini-2-5-flash.2025-06 · EvalRun
  • eval-run:swe-bench-verified.llama-4-405b.2024-07 · EvalRun
  • eval-run:swe-bench.llama-3-1-405b.2024-07 · EvalRun
  • eval-run:swe-bench-verified.claude-opus-4-5.2025-09 · EvalRun
  • eval-run:swe-bench-verified.claude-opus-4-7.2026-01 · EvalRun
  • eval-run:swe-bench-verified.o3.2025-04 · EvalRun
  • eval-run:swe-bench-verified.gemini-2-5-pro.2025-06 · EvalRun
  • eval-run:swe-bench.claude-code@1.x.2025-04-29 · EvalRun
  • eval-run:swe-bench-verified.claude-sonnet-4-5.2025-09 · EvalRun
  • eval-run:swe-bench-verified.gpt-5.2025-08 · EvalRun
scored_against (15)
  • eval-result:swe-bench-verified.claude-haiku-4-5.001 · EvalResult
  • eval-result:swe-bench.deepseek-v3.001 · EvalResult
  • eval-result:swe-bench-verified.gemini-2-5-flash.001 · EvalResult
  • eval-result:swe-bench-verified.llama-4-405b.001 · EvalResult
  • eval-result:swe-bench.llama-3-1-405b.001 · EvalResult
  • eval-result:swe-bench-verified.claude-opus-4-5.001 · EvalResult
  • eval-result:swe-bench-verified.claude-opus-4-7.001 · EvalResult
  • eval-result:swe-bench-verified.gpt-5.headline · EvalResult
  • eval-result:swe-bench-verified.o3.001 · EvalResult
  • eval-result:swe-bench-verified.gemini-2-5-pro.001 · EvalResult
  • eval-result:swe-bench.claude-code.001 · EvalResult
  • eval-result:swe-bench-verified.claude-sonnet-4-5.high-compute.001 · EvalResult
  • eval-result:swe-bench-verified.claude-sonnet-4-5.001 · EvalResult
  • eval-result:swe-bench-verified.gpt-5.headline.001 · EvalResult
  • eval-result:swe-bench-verified.gpt-5.001 · EvalResult
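
The edge groups listed on this record can be tallied against the header totals (2 outgoing, 29 incoming). The following is a minimal sketch, not the Atlas API: the dictionaries are hand-copied from the counts shown on this page, and `split_node_id` is a hypothetical helper that assumes every identifier follows the `kind:slug` shape seen in the edge lists.

```python
from typing import Tuple

# Edge-group counts copied by hand from this record (label -> number of edges).
outgoing = {"covers": 1, "refines": 1}
incoming = {
    "belongs_to_benchmark": 1,
    "bounds_subject": 1,
    "for_benchmark": 12,
    "scored_against": 15,
}


def split_node_id(node_id: str) -> Tuple[str, str]:
    """Hypothetical helper: split a 'kind:slug' identifier at the first colon."""
    kind, _, slug = node_id.partition(":")
    return kind, slug


# Totals that should match the "Outgoing" and "Incoming" figures in the header.
outgoing_total = sum(outgoing.values())
incoming_total = sum(incoming.values())
print(outgoing_total, incoming_total)
print(split_node_id("benchmark:swe-bench-verified"))
```

Splitting at the first colon only keeps slugs that contain dots or `@` (such as `swe-bench.claude-code@1.x.2025-04-29`) intact.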
