Agentic AI Atlas · GSM8K test split
test-set:gsm8k-test · a5c.ai
II. TestSet overview

test-set:gsm8k-test

Reference · live

GSM8K test split overview

The canonical GSM8K test split, widely used in published reasoning evaluations since 2022.

TestSet · Outgoing: 1 · Incoming: 0

Attributes

displayName: GSM8K test split
benchmarkId:
caseCount: 1319
releasedAt: 2021-10-27
composition: The held-out test split of GSM8K: 1,319 grade-school math word problems, each requiring 2 to 8 reasoning steps. It is the standard split published alongside the OpenAI GSM8K release.
homepageUrl:
description: The canonical GSM8K test split, widely used in published reasoning evaluations since 2022.

Outgoing edges

belongs_to_benchmark · 1

Incoming edges

None.
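
The attributes and edge counts above can be sketched as a small typed structure, which makes it easier to see which fields carry values and which are blank. This is a minimal illustration only: the class name `AtlasTestSetRecord`, its field names, and the `missing_fields` and `validate` helpers are hypothetical and do not come from the Atlas itself. The two blank attributes (`benchmarkId`, `homepageUrl`) are kept as `None` rather than guessed.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class AtlasTestSetRecord:
    """Hypothetical in-memory shape for a TestSet record (not the Atlas schema)."""

    record_id: str
    display_name: str
    case_count: int
    released_at: date
    composition: str
    description: str
    benchmark_id: Optional[str] = None   # blank in the record, so left unset
    homepage_url: Optional[str] = None   # blank in the record, so left unset
    outgoing_edges: Tuple[str, ...] = ()
    incoming_edges: Tuple[str, ...] = ()

    def missing_fields(self) -> List[str]:
        """Names of optional attributes that have no value."""
        missing = []
        if not self.benchmark_id:
            missing.append("benchmarkId")
        if not self.homepage_url:
            missing.append("homepageUrl")
        return missing

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the basic checks pass."""
        problems = []
        if not self.record_id.startswith("test-set:"):
            problems.append("record_id should start with 'test-set:'")
        if self.case_count <= 0:
            problems.append("case_count must be positive")
        return problems


# Values copied from the record above; nothing here is fetched or inferred.
GSM8K_TEST = AtlasTestSetRecord(
    record_id="test-set:gsm8k-test",
    display_name="GSM8K test split",
    case_count=1319,
    released_at=date.fromisoformat("2021-10-27"),
    composition=(
        "The held-out test split of GSM8K: 1,319 grade-school math word "
        "problems, each requiring 2 to 8 reasoning steps."
    ),
    description="The canonical GSM8K test split.",
    outgoing_edges=("belongs_to_benchmark",),
)

if __name__ == "__main__":
    print(GSM8K_TEST.display_name, GSM8K_TEST.case_count)
    print("blank attributes:", GSM8K_TEST.missing_fields())
    print("problems:", GSM8K_TEST.validate())
```

Keeping blank attributes as `None` means a consumer can tell "no value recorded" apart from an empty string, which matters if the record is later completed.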