Responsibility overview
Reference · live · responsibility:inference-latency-sla
Inference latency SLA overview
Ensure ML model inference meets latency targets — monitor P50/P99 response times, optimize serving infrastructure, and enforce performance budgets for model endpoints.
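The monitoring half of this responsibility can be sketched in a few lines: compute P50/P99 latency percentiles from a window of samples and compare them against a budget. The budget values, function names, and nearest-rank percentile method below are illustrative assumptions, not part of this reference.

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile (p in [0, 100]) of a non-empty sample list."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

# Hypothetical per-endpoint latency budget, in milliseconds.
SLA_BUDGET_MS = {"p50": 50.0, "p99": 200.0}

def check_sla(latencies_ms):
    """Map each SLA metric to (observed latency, within_budget flag)."""
    report = {}
    for name, budget in SLA_BUDGET_MS.items():
        p = float(name[1:])  # "p99" -> 99.0
        observed = percentile(latencies_ms, p)
        report[name] = (observed, observed <= budget)
    return report
```

In production this check would typically run continuously (matching the `continuous` cadence below) against a rolling window of request latencies, alerting or blocking deploys when a percentile exceeds its budget.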
Attributes
displayName
Inference latency SLA
cadence
continuous
description
Ensure ML model inference meets latency targets — monitor P50/P99
response times, optimize serving infrastructure, and enforce
performance budgets for model endpoints.
Outgoing edges
held_by (2)
- role:ml-engineer · Role · Machine Learning Engineer
- role:machine-learning-ops-engineer · Role · Machine Learning Ops Engineer
requires_expertise (2)
- skill-area:model-serving · SkillArea · Model Serving
- skill-area:inference-optimization · SkillArea · Inference Optimization
Incoming edges
holds_responsibility (2)
- role:machine-learning-ops-engineer · Role · Machine Learning Ops Engineer
- role:speech-engineer · Role · Speech Engineer