Agentic AI Atlas by a5c.ai
Agentic AI Atlas · eval-harness
LibrarySkill overview

lib-skill:shared--eval-harness

Reference · live

eval-harness overview

Evaluation harness for testing agent and skill quality through structured benchmarks, regression tests, and quality scoring.

LibrarySkill · Outgoing: 7 · Incoming: 0

Attributes

displayName
eval-harness
description
Evaluation harness for testing agent and skill quality through structured benchmarks, regression tests, and quality scoring.
libraryPath
library/methodologies/everything-claude-code/skills/eval-harness/SKILL.md
contentSummary
- Define test cases with known-correct outputs
- Run agent against each test case
- Score: accuracy, completeness, relevance
- Compare against baseline performance
- Track performance over time

### 2. Skill Quality Testing
- Verify skill instructions produce expected outcomes
- Test edge ca
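
The first steps in the summary above (define cases with known-correct outputs, run the agent, score, compare against a baseline) can be sketched roughly as follows. This is a minimal illustration and not the skill's actual implementation: `EvalCase`, `score_accuracy`, `run_eval`, `compare_to_baseline`, and `toy_agent` are hypothetical names, and only an exact-match accuracy score is shown; completeness and relevance scoring would need their own, likely more subjective, metrics.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    """One test case: an input prompt and its known-correct output (hypothetical type)."""
    prompt: str
    expected: str


def score_accuracy(output: str, expected: str) -> float:
    """Exact-match accuracy, ignoring surrounding whitespace and letter case."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0


def run_eval(agent: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Run the agent on every case and return the mean accuracy."""
    if not cases:
        raise ValueError("at least one test case is required")
    scores = [score_accuracy(agent(case.prompt), case.expected) for case in cases]
    return sum(scores) / len(scores)


def compare_to_baseline(score: float, baseline: float, tolerance: float = 0.0) -> bool:
    """Return True when the score has not dropped below baseline minus tolerance."""
    return score >= baseline - tolerance


def toy_agent(prompt: str) -> str:
    """Stand-in for a real agent: a fixed lookup table (assumption for the demo)."""
    answers = {"2 + 2": "4", "capital of France": "Paris"}
    return answers.get(prompt, "unknown")


cases = [
    EvalCase("2 + 2", "4"),
    EvalCase("capital of France", "paris"),
    EvalCase("capital of Peru", "Lima"),
]
score = run_eval(toy_agent, cases)            # two of the three cases match
passed = compare_to_baseline(score, baseline=0.6)
```

To track performance over time, each run's `score` could be stored alongside a timestamp and the previous run used as the next baseline; that storage layer is left out here.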

Outgoing edges

lib_applies_to_domain (1)
  • domain:software-engineering · Domain · Software Engineering
lib_covers_topic (1)
  • topic:developer-experience · Topic · Developer Experience (DX)
lib_implements_workflow (1)
  • workflow:feature-development · Workflow
lib_involves_role (2)
  • role:tech-lead · Role · Tech Lead
  • role:backend-engineer · Role · Backend Engineer
lib_requires_skill_area (2)
  • skill-area:agentic-loops · SkillArea · Agentic Loops
  • skill-area:orchestration-loop · SkillArea · Orchestration Loop Engineering

Incoming edges

None.

Related pages

No related wiki pages for this record.
