Record
Agentic AI Atlas · EleutherAI lm-evaluation-harness
eval-harness:lm-eval-harness · a5c.ai
EvalHarness JSON




Inspect the normalized record payload exactly as the atlas UI reads it.

File · benchmarks/eval-harnesses/eval-harnesses.yaml
Cluster · benchmarks
Record JSON
{
  "id": "eval-harness:lm-eval-harness",
  "_kind": "EvalHarness",
  "_file": "benchmarks/eval-harnesses/eval-harnesses.yaml",
  "_cluster": "benchmarks",
  "attributes": {
    "displayName": "EleutherAI lm-evaluation-harness",
    "harnessKind": "lm-eval-harness",
    "homepageUrl": "https://github.com/EleutherAI/lm-evaluation-harness",
    "description": "Reference harness for benchmark suites such as MMLU, ARC, HellaSwag,\nGSM8K. Drives the Open LLM Leaderboard.\n"
  },
  "outgoingEdges": [],
  "incomingEdges": [
    {
      "from": "eval-run:gaia.claude-code.2025",
      "to": "eval-harness:lm-eval-harness",
      "kind": "uses_harness",
      "attributes": {}
    }
  ]
}
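As a minimal sketch of how a consumer might read this record, the snippet below loads the payload shown above and collects the ids of incoming `uses_harness` edges. The field names (`_kind`, `incomingEdges`, `from`, `kind`) are taken directly from the payload; the loading code and the `edges_of_kind` helper are illustrative, not the atlas UI's actual implementation.

```python
import json

# The record payload shown above, verbatim.
RECORD = json.loads("""
{
  "id": "eval-harness:lm-eval-harness",
  "_kind": "EvalHarness",
  "_file": "benchmarks/eval-harnesses/eval-harnesses.yaml",
  "_cluster": "benchmarks",
  "attributes": {
    "displayName": "EleutherAI lm-evaluation-harness",
    "harnessKind": "lm-eval-harness",
    "homepageUrl": "https://github.com/EleutherAI/lm-evaluation-harness",
    "description": "Reference harness for benchmark suites such as MMLU, ARC, HellaSwag,\\nGSM8K. Drives the Open LLM Leaderboard.\\n"
  },
  "outgoingEdges": [],
  "incomingEdges": [
    {
      "from": "eval-run:gaia.claude-code.2025",
      "to": "eval-harness:lm-eval-harness",
      "kind": "uses_harness",
      "attributes": {}
    }
  ]
}
""")

def edges_of_kind(record: dict, kind: str) -> list[str]:
    """Return the source ids of incoming edges matching the given kind."""
    return [e["from"] for e in record.get("incomingEdges", []) if e["kind"] == kind]

runs = edges_of_kind(RECORD, "uses_harness")
print(runs)  # -> ['eval-run:gaia.claude-code.2025']
```

Because edges carry an explicit `kind`, the same helper works for any other edge type the atlas introduces without changing the record schema.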