Atlas Graph Explorer
EvalRun
eval-run:gpqa-diamond.gpt-5-4-mini.2026-03-17
benchmarks/eval-runs/eval-runs-openai.yaml
Attributes
target       model:gpt-5.4-mini@current
benchmarkId  benchmark:gpqa
testSetId    test-set:gpqa-diamond-2024
targetId     model:gpt-5.4-mini@current
runAt        2026-03-17T00:00:00Z
runBy        openai
configHash   sha256:openai-gpt-5-4-mini-gpqa-diamond-2026-03-17
Outgoing edges (3)
evaluates_target (1): model:gpt-5.4-mini@current · ModelVersion
for_benchmark (1): benchmark:gpqa · Benchmark · GPQA
uses_test_set (1): test-set:gpqa-diamond-2024 · TestSet · GPQA Diamond — 2024 release
Incoming edges (1)
belongs_to_eval_run (1): eval-result:gpqa-diamond.gpt-5-4-mini.2026-03-17.accuracy · EvalResult