Atlas Graph Explorer
SWE-bench Verified 2024-12
test-set:swe-bench-verified-2024-12 · TestSet · benchmarks/test-sets/swe-bench-verified-2024-12.yaml
Attributes
displayName: SWE-bench Verified 2024-12
benchmarkId: benchmark:swe-bench-verified
caseCount: 500
releasedAt: 2024-12-01
description: The December 2024 release of the SWE-bench Verified test set.
Outgoing edges (1)
belongs_to_benchmark (1)
benchmark:swe-bench-verified · Benchmark · SWE-bench Verified
Incoming edges (12)
uses_test_set (12)
eval-run:swe-bench-verified.claude-haiku-4-5.2025-10 · EvalRun
eval-run:swe-bench.deepseek-v3.2024-12 · EvalRun
eval-run:swe-bench-verified.gemini-2-5-flash.2025-06 · EvalRun
eval-run:swe-bench-verified.llama-4-405b.2024-07 · EvalRun
eval-run:swe-bench.llama-3-1-405b.2024-07 · EvalRun
eval-run:swe-bench-verified.claude-opus-4-5.2025-09 · EvalRun
eval-run:swe-bench-verified.claude-opus-4-7.2026-01 · EvalRun
eval-run:swe-bench-verified.o3.2025-04 · EvalRun
eval-run:swe-bench-verified.gemini-2-5-pro.2025-06 · EvalRun
eval-run:swe-bench.claude-code@1.x.2025-04-29 · EvalRun
eval-run:swe-bench-verified.claude-sonnet-4-5.2025-09 · EvalRun
eval-run:swe-bench-verified.gpt-5.2025-08 · EvalRun