displayName
Evaluation (meta)
clusterNumber
11
scope
Conceptual aggregation of the NodeKinds that cover benchmark and evaluation machinery: Benchmark and TestSet (the test scaffold), EvalRun (one execution against a target), EvalHarness (the runnable harness), Judge (an LLM, human, or programmatic grader), Rubric (the scoring criteria), and SkillArea (the named expertise area that benchmarks are commonly bound to). Most members live in editorial cluster 11-benchmarks; SkillArea lives in 9-domain. Each MetaNodeKind records its actual editorial slug.
parentClusterId
null