Agentic AI Atlas by a5c.ai
Agentic AI Atlas · AI Evaluation
SkillArea overview

skill-area:ai-evaluation

Reference · live

AI Evaluation overview

Systematic evaluation of AI model outputs — benchmark design, human preference collection, automated scoring pipelines, and red-teaming for quality, safety, and alignment assessment.

SkillArea · Outgoing: 1 · Incoming: 12

Attributes

displayName
AI Evaluation
description
Systematic evaluation of AI model outputs — benchmark design, human preference collection, automated scoring pipelines, and red-teaming for quality, safety, and alignment assessment.
domains
  • domain:ml-ops
expertiseLevels
  • intermediate
  • expert

Outgoing edges

applies_to (1)
  • domain:ml-ops · Domain · MLOps

Incoming edges

prerequisite_for_learning (1)
  • skill-area:ai-agent-development · SkillArea · AI Agent Development
requires_skill_area (1)
  • stack-profile:prompt-engineering-workbench · StackProfile · Prompt Engineering Workbench (TypeScript, React, PostgreSQL, LLM APIs, Redis)
tool_used_by (4)
  • tool:skillachi · Tool · Skillachi
  • tool:langsmith · Tool · LangSmith
  • tool:langfuse · Tool · Langfuse
  • tool:ragas · Tool · Ragas
used_for (6)
  • tool:jupyter · Tool · Jupyter
  • tool:vllm · Tool · vLLM
  • tool:tensorrt · Tool · TensorRT
  • tool:triton-inference · Tool · Triton Inference Server
  • tool:onnx-runtime · Tool · ONNX Runtime
  • tool:ragas · Tool · Ragas
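
The incoming edges listed above can be checked against the record's stated total of 12. The following is a minimal sketch, assuming a plain dictionary keyed by edge type. The dictionary layout and the helper names `edge_counts` and `repeated_sources` are hypothetical and are not the Atlas JSON schema; only the node IDs are taken from this record.

```python
from collections import Counter

# Incoming edges copied from the listing above, keyed by edge type.
# This layout is an illustrative assumption, not the Atlas JSON format.
INCOMING_EDGES = {
    "prerequisite_for_learning": ["skill-area:ai-agent-development"],
    "requires_skill_area": ["stack-profile:prompt-engineering-workbench"],
    "tool_used_by": [
        "tool:skillachi",
        "tool:langsmith",
        "tool:langfuse",
        "tool:ragas",
    ],
    "used_for": [
        "tool:jupyter",
        "tool:vllm",
        "tool:tensorrt",
        "tool:triton-inference",
        "tool:onnx-runtime",
        "tool:ragas",
    ],
}


def edge_counts(edges):
    """Return the number of source nodes per edge type."""
    return {edge_type: len(sources) for edge_type, sources in edges.items()}


def repeated_sources(edges):
    """Return source node IDs that appear under more than one edge type."""
    seen = Counter(src for sources in edges.values() for src in sources)
    return sorted(src for src, n in seen.items() if n > 1)


counts = edge_counts(INCOMING_EDGES)
total = sum(counts.values())
print(counts)
print("total incoming edges:", total)
print("sources linked more than once:", repeated_sources(INCOMING_EDGES))
```

Counting edges rather than distinct nodes matters here: `tool:ragas` points at this record through both `tool_used_by` and `used_for`, so the 12 incoming edges come from 11 distinct source nodes.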
