Atlas Graph Explorer
eval-run:gpqa-diamond.gemini-3-pro.2025-11-18
EvalRun
benchmarks/eval-runs/eval-runs-google.yaml
Attributes
target: model:gemini-3-pro@current
benchmarkId: benchmark:gpqa
testSetId: test-set:gpqa-diamond-2024
targetId: model:gemini-3-pro@current
runAt: 2025-11-18T00:00:00Z
runBy: google-deepmind
configHash: sha256:google-gemini-3-pro-gpqa-diamond-2025-11-18
Outgoing edges (3)
evaluates_target (1): model:gemini-3-pro@current · ModelVersion
for_benchmark (1): benchmark:gpqa · Benchmark · GPQA
uses_test_set (1): test-set:gpqa-diamond-2024 · TestSet · GPQA Diamond — 2024 release
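On this page, the three outgoing edges appear to mirror the targetId, benchmarkId, and testSetId attributes. The following is a minimal sketch of that relationship, not the explorer's actual implementation: the EvalRun class, its snake_case field names, and the outgoing_edges helper are hypothetical, and the attribute-to-edge mapping is inferred from this record alone.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class EvalRun:
    """Hypothetical container for the attributes listed on this page."""

    id: str
    target_id: str
    benchmark_id: str
    test_set_id: str
    run_at: datetime
    run_by: str
    config_hash: str

    def outgoing_edges(self) -> list[tuple[str, str]]:
        # Assumed mapping from attribute to edge type, inferred from
        # this single record; the real graph schema may differ.
        return [
            ("evaluates_target", self.target_id),
            ("for_benchmark", self.benchmark_id),
            ("uses_test_set", self.test_set_id),
        ]


# Values copied verbatim from the Attributes list.
run = EvalRun(
    id="eval-run:gpqa-diamond.gemini-3-pro.2025-11-18",
    target_id="model:gemini-3-pro@current",
    benchmark_id="benchmark:gpqa",
    test_set_id="test-set:gpqa-diamond-2024",
    # Replace the trailing "Z" so fromisoformat also works before Python 3.11.
    run_at=datetime.fromisoformat("2025-11-18T00:00:00Z".replace("Z", "+00:00")),
    run_by="google-deepmind",
    config_hash="sha256:google-gemini-3-pro-gpqa-diamond-2025-11-18",
)

for edge_type, node_id in run.outgoing_edges():
    print(edge_type, node_id)
```

Keeping the edges derived from the attributes, rather than stored separately, means the two cannot drift apart in this sketch.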
Incoming edges (1)
belongs_to_eval_run (1): eval-result:gpqa-diamond.gemini-3-pro.2025-11-18.accuracy · EvalResult