Atlas Graph Explorer
GAIA
benchmark:gaia
Benchmark
benchmarks/benchmarks/gaia.yaml
Attributes
displayName: GAIA
homepageUrl: https://huggingface.co/gaia-benchmark
kind: agent-reasoning
targetsKind: AgentVersion
description: General AI Assistants benchmark: real-world agent reasoning tasks.
Outgoing edges (1)
covers (1): skill-area:agentic-loops · SkillArea · Agentic Loops
Incoming edges (4)
bounds_subject (1): scope-boundary:gaia.scope · ScopeBoundary
evaluated_by (1): eval-run:gaia.claude-code.2025 · EvalRun
for_benchmark (1): eval-run:gaia.claude-code.2025 · EvalRun
split_of (1): test-set:gaia-validation · TestSet · GAIA validation split
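
The edge listing above can be read as a set of directed (source, label, target) triples around benchmark:gaia. The following is a minimal sketch of that reading, not Atlas's own data model or API: the Edge type, the EDGES list, and the group_by_label helper are hypothetical names introduced only for illustration, and the edge directions are assumed from the page's "Outgoing" and "Incoming" groupings.

```python
from collections import defaultdict
from typing import NamedTuple


class Edge(NamedTuple):
    """Hypothetical triple form of one edge shown on this page."""
    source: str
    label: str
    target: str


NODE_ID = "benchmark:gaia"

# Edges transcribed from the listing above; direction is an assumption
# based on the "Outgoing edges" / "Incoming edges" groupings.
EDGES = [
    Edge("benchmark:gaia", "covers", "skill-area:agentic-loops"),
    Edge("scope-boundary:gaia.scope", "bounds_subject", "benchmark:gaia"),
    Edge("eval-run:gaia.claude-code.2025", "evaluated_by", "benchmark:gaia"),
    Edge("eval-run:gaia.claude-code.2025", "for_benchmark", "benchmark:gaia"),
    Edge("test-set:gaia-validation", "split_of", "benchmark:gaia"),
]


def group_by_label(edges, node_id, direction):
    """Group the neighbours of node_id by edge label.

    direction is "out" (node_id is the source) or "in" (node_id is the target).
    """
    grouped = defaultdict(list)
    for edge in edges:
        if direction == "out" and edge.source == node_id:
            grouped[edge.label].append(edge.target)
        elif direction == "in" and edge.target == node_id:
            grouped[edge.label].append(edge.source)
    return dict(grouped)


outgoing = group_by_label(EDGES, NODE_ID, "out")
incoming = group_by_label(EDGES, NODE_ID, "in")
```

Grouping by label mirrors how the page presents edges: one outgoing label (covers) and four incoming labels, each with a single neighbour.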