subjectId
inScope
Repository-level code completion benchmark covering Python and Java; tests retrieval and completion within a multi-file project context.
outOfScope
Single-file code completion (use HumanEval/MBPP), agentic tool-use evaluations, and runtime test-pass scoring of full repositories.
outOfScopeReasonIds