displayName
LLM Evaluation Pipeline
workflowKind
data
triggerType
scheduled
typicalCadence
weekly
complexity
cross-team
description
Operates the continuous evaluation pipeline for LLM-powered features —
maintaining eval datasets, running benchmark suites across model versions,
tracking quality trends, and producing comparative reports. Excludes model fine-tuning.