displayName
Full-Stack System Reliability Review
workflowKind
governance
triggerType
scheduled
typicalCadence
quarterly
complexity
cross-team
description
Reviews end-to-end system reliability across the full technology
stack -- analyzing frontend error rates, API latency percentiles,
and database query performance together as correlated signals,
evaluating SLO attainment across web, API, and data tiers, reviewing
cloud infrastructure capacity headroom and auto-scaling policy
effectiveness, auditing observability coverage for blind spots in
tracing, logging, and metrics pipelines, assessing CI/CD pipeline
reliability and deployment success rates, stress-testing failover
procedures across availability zones, and correlating incident
frequency with recent change velocity. Produces cross-stack
reliability scorecard, SLO attainment report, and prioritized
hardening backlog. Excludes feature development.