displayName
Code-quality rubric
scaleKind
likert
criteria
name
correctness
description
Does the code implement the intended behavior?
scale
1-5
weight
0.4
name
readability
description
Is the code clear and well-named?
scale
1-5
weight
0.2
name
idiomaticity
description
Is the code idiomatic for the target language?
scale
1-5
weight
0.2
name
performance
description
Does the code avoid obvious algorithmic inefficiencies?
scale
1-5
weight
0.2
description
Four-dimension code rubric for SWE-bench-style judging.
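
The rubric lists four weights (0.4, 0.2, 0.2, 0.2) that sum to 1.0 but does not say how per-criterion ratings are combined. A minimal sketch, assuming a plain weighted sum of the 1-5 ratings, might look like the following. The names `CRITERIA_WEIGHTS` and `weighted_score` are hypothetical and do not come from the rubric itself.

```python
# Weights copied from the rubric's four criteria.
CRITERIA_WEIGHTS = {
    "correctness": 0.4,
    "readability": 0.2,
    "idiomaticity": 0.2,
    "performance": 0.2,
}


def weighted_score(ratings: dict[str, int]) -> float:
    """Combine per-criterion 1-5 ratings into a single score.

    Assumption: the aggregate is a weighted sum. The rubric does not
    state an aggregation rule, so this is only one plausible choice.
    """
    if set(ratings) != set(CRITERIA_WEIGHTS):
        raise ValueError("ratings must cover exactly the rubric criteria")
    for name, rating in ratings.items():
        if not 1 <= rating <= 5:
            raise ValueError(f"{name} rating {rating} is outside the 1-5 scale")
    return sum(CRITERIA_WEIGHTS[name] * ratings[name] for name in CRITERIA_WEIGHTS)


if __name__ == "__main__":
    example = {
        "correctness": 5,
        "readability": 3,
        "idiomaticity": 4,
        "performance": 2,
    }
    print(round(weighted_score(example), 2))
```

Because the weights sum to 1.0, the result under this assumption stays on the same 1-5 scale as the individual ratings, with correctness counting twice as much as any other criterion.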