displayName
EleutherAI lm-evaluation-harness
harnessKind
lm-eval-harness
homepageUrl
https://github.com/EleutherAI/lm-evaluation-harness
description
Reference harness for running benchmarks such as MMLU, ARC, HellaSwag, and
GSM8K. Evaluation backend for the Hugging Face Open LLM Leaderboard.