SkillArea overview
Reference · live · skill-area:retrieval-evaluation
Retrieval Evaluation overview
Measuring and improving RAG pipeline quality — evaluation metrics (faithfulness, answer relevance, context precision, context recall), evaluation frameworks (Ragas, DeepEval, TruLens), building golden evaluation datasets, A/B testing retrieval configurations, monitoring retrieval quality in production, and the distinction between component-level evaluation (retriever quality) and end-to-end evaluation (final answer quality).
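Two of the component-level metrics named above, context precision and context recall, can be computed directly against a golden evaluation dataset. The sketch below is illustrative only; the function names and data shapes are assumptions, not the API of Ragas, DeepEval, or TruLens.

```python
# Illustrative component-level retrieval metrics over a golden dataset.
# Function names and record shapes are hypothetical, not a framework API.

def context_precision(retrieved: list[str], relevant: set[str]) -> float:
    """Fraction of retrieved chunks that are labelled relevant."""
    if not retrieved:
        return 0.0
    return sum(1 for c in retrieved if c in relevant) / len(retrieved)

def context_recall(retrieved: list[str], relevant: set[str]) -> float:
    """Fraction of the golden relevant chunks the retriever surfaced."""
    if not relevant:
        return 0.0
    hits = set(retrieved)
    return sum(1 for c in relevant if c in hits) / len(relevant)

# A golden dataset pairs each query with hand-labelled relevant chunk IDs.
golden = [
    {"query": "what is context recall?",
     "relevant": {"doc-3", "doc-7"},
     "retrieved": ["doc-3", "doc-1", "doc-7", "doc-9"]},
]

for row in golden:
    p = context_precision(row["retrieved"], row["relevant"])
    r = context_recall(row["retrieved"], row["relevant"])
    print(f"{row['query']!r}: precision={p:.2f} recall={r:.2f}")
```

Metrics like faithfulness and answer relevance, by contrast, score the generated answer and so belong to end-to-end evaluation; they typically require an LLM or human judge rather than simple set arithmetic.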
Attributes
displayName
Retrieval Evaluation
description
Measuring and improving RAG pipeline quality — evaluation metrics
(faithfulness, answer relevance, context precision, context recall),
evaluation frameworks (Ragas, DeepEval, TruLens), building golden
evaluation datasets, A/B testing retrieval configurations, monitoring
retrieval quality in production, and the distinction between
component-level evaluation (retriever quality) and end-to-end
evaluation (final answer quality).
domains
expertiseLevels
- intermediate
- expert
Outgoing edges
applies_to (1)
- specialization:ai-agents-conversational · Specialization
uses_tool (1)
- tool:haystack · Tool (Haystack)
Incoming edges
prerequisite_for_learning (1)
- skill-area:embedding-optimization · SkillArea (Embedding Optimization)
tool_used_by (1)
- tool:skillachi · Tool (Skillachi)