displayName
BIG-Bench Hard (BBH)
homepageUrl
https://github.com/suzgunmirac/BIG-Bench-Hard
kind
reasoning
targetsKind
ModelVersion
description
A suite of 23 challenging BIG-Bench tasks on which prior language models did not
outperform the average human rater; widely used as a reasoning benchmark.