Agentic AI Atlas

test-set:bigcode-evalplus

Reference · live

BigCode EvalPlus overview

Canonical EvalPlus HumanEval+ release used in many post-2023 code-LLM evaluations.

TestSet · Outgoing: 1 · Incoming: 0

Attributes

displayName

BigCode EvalPlus

benchmarkId

benchmark:bigcode-evalplus

caseCount

164

releasedAt

2023-05-08

composition

EvalPlus extends HumanEval and MBPP with far more test cases (about 80x for HumanEval+ and 35x for MBPP+), generated from LLM-produced seed inputs and type-aware mutation. The added tests expose functional bugs in solutions that pass the original tests. This entry represents the HumanEval+ portion.
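
To make the idea of type-aware mutation concrete, here is a minimal Python sketch. It is not EvalPlus's implementation: the function names (`mutate`, `find_counterexample`, `buggy_abs`) and the specific mutation rules are assumptions chosen for illustration. It mutates seed inputs while preserving their type, then compares a candidate solution against a reference on each new input.

```python
import random


def mutate(value, rng):
    """Return a mutated copy of `value` that keeps the same type.

    Hypothetical mutation rules for illustration only.
    """
    # bool is a subclass of int, so it must be checked first.
    if isinstance(value, bool):
        return not value
    if isinstance(value, int):
        return rng.choice([value + 1, value - 1, -value])
    if isinstance(value, str):
        if value and rng.random() < 0.5:
            i = rng.randrange(len(value))
            return value[:i] + value[i + 1:]  # drop one character
        return value + rng.choice("abc")      # append one character
    if isinstance(value, list):
        if not value:
            return []
        out = list(value)
        i = rng.randrange(len(out))
        out[i] = mutate(out[i], rng)          # mutate one element recursively
        return out
    return value  # unknown types are left unchanged


def find_counterexample(seeds, reference, candidate, rounds=200, seed=0):
    """Grow an input pool by mutation; return the first input on which
    `candidate` disagrees with `reference`, or None if none is found."""
    rng = random.Random(seed)
    pool = list(seeds)
    for _ in range(rounds):
        new_input = mutate(rng.choice(pool), rng)
        pool.append(new_input)
        if reference(new_input) != candidate(new_input):
            return new_input
    return None


def buggy_abs(x):
    # Passes a sparse test such as buggy_abs(3) == 3, but is wrong for x < 0.
    return x


if __name__ == "__main__":
    print(find_counterexample([3], abs, buggy_abs))
```

In this sketch the single seed input `3` does not reveal the bug, while mutated inputs quickly include a negative integer on which `buggy_abs` and `abs` disagree. That is the kind of failure the extended test cases are meant to surface.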

homepageUrl

https://github.com/evalplus/evalplus

Outgoing edges

belongs_to_benchmark (1)

benchmark:bigcode-evalplus · Benchmark · EvalPlus

Incoming edges

None.