II. Domain overview
Reference · live · domain:site-reliability
Site Reliability Engineering overview
Reliability practices at scale — SLO/SLI definition, error budgets, incident management, toil reduction, and production readiness reviews. Bridges software engineering and operations for highly available services.
Attributes
displayName
Site Reliability Engineering
description
Reliability practices at scale — SLO/SLI definition, error budgets,
incident management, toil reduction, and production readiness reviews.
Bridges software engineering and operations for highly available services.
Outgoing edges
contains (2)
- specialization:sre · Specialization
- specialization:devops-sre-platform · Specialization
Incoming edges
applies_to (14)
- methodology:Google-SRE-methodology · Methodology · Google SRE Methodology
- methodology:chaos-engineering-methodology · Methodology · Chaos Engineering
- skill-area:SLI-definition · SkillArea · SLI Definition
- skill-area:SLO-policy-design · SkillArea · SLO Policy Design
- skill-area:error-budget-management · SkillArea · Error Budget Management
- skill-area:runbook-automation · SkillArea · Runbook Automation
- skill-area:on-call-optimization · SkillArea · On-Call Optimization
- skill-area:post-incident-review · SkillArea · Post-Incident Review
- skill-area:incident-communication · SkillArea · Incident Communication
- skill-area:status-page-management · SkillArea · Status Page Management
- skill-area:testing-in-production · SkillArea · Testing in Production
- skill-area:synthetic-monitoring · SkillArea · Synthetic Monitoring
- skill-area:blameless-postmortem-facilitation · SkillArea · Blameless Postmortem Facilitation
- role:chaos-engineer · Role · Chaos Engineer
applies_to_domain (6)
- workflow:chaos-experiment · Workflow · Chaos Experiment
- workflow:on-call-handoff · Workflow · On-Call Handoff
- workflow:rollback-execution · Workflow · Rollback Execution
- workflow:load-testing-pipeline · Workflow · Load Testing Pipeline
- workflow:post-mortem · Workflow · Post-Mortem Review
- workflow:disaster-recovery-drill · Workflow · Disaster Recovery Drill