II. StackProfile overview
Reference · livestack-profile:disaster-recovery
Disaster Recovery (Terraform, Kubernetes, Prometheus, PostgreSQL, S3) overview
A disaster recovery and business continuity stack that uses Terraform to provision standby infrastructure across multiple cloud regions. Kubernetes workloads are backed up via declarative state snapshots stored in S3, with PostgreSQL point-in-time recovery configured for database resilience. Prometheus and Grafana monitor replication lag and failover readiness, while automated runbooks handle DNS cutover and traffic rerouting during actual incidents. The stack is targeted at operations teams responsible for RPO/RTO SLAs on critical production systems. The tradeoff is the cost of maintaining warm standby resources and the complexity of testing failover procedures without impacting production.
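The failover-readiness idea in the description can be illustrated with a minimal Python sketch. It is not part of the profile: the `StandbyStatus` type, the `failover_blockers` and `is_failover_ready` functions, and both thresholds are hypothetical names chosen for illustration. It assumes replication lag and snapshot age have already been collected (for example from Prometheus and from S3 object metadata) and only shows how they might be compared against an RPO target.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class StandbyStatus:
    """Hypothetical health signals for one standby region (illustrative only)."""

    region: str
    replication_lag_seconds: float    # assumed to come from a monitoring query
    last_snapshot_age_seconds: float  # assumed age of the newest state snapshot


def failover_blockers(
    status: StandbyStatus,
    rpo_seconds: float,
    max_snapshot_age_seconds: float,
) -> list[str]:
    """Return human-readable reasons why failing over now would breach targets."""
    blockers: list[str] = []
    # Data written within the lag window would be lost on failover,
    # so lag above the RPO means the RPO cannot be met.
    if status.replication_lag_seconds > rpo_seconds:
        blockers.append(
            f"{status.region}: replication lag {status.replication_lag_seconds:.0f}s "
            f"exceeds RPO {rpo_seconds:.0f}s"
        )
    # A stale snapshot means workload state restored from it may be out of date.
    if status.last_snapshot_age_seconds > max_snapshot_age_seconds:
        blockers.append(
            f"{status.region}: newest snapshot is "
            f"{status.last_snapshot_age_seconds:.0f}s old "
            f"(limit {max_snapshot_age_seconds:.0f}s)"
        )
    return blockers


def is_failover_ready(
    status: StandbyStatus,
    rpo_seconds: float,
    max_snapshot_age_seconds: float,
) -> bool:
    """A standby counts as ready only when no blocker is reported."""
    return not failover_blockers(status, rpo_seconds, max_snapshot_age_seconds)
```

A check like this could run before a failover drill or feed a readiness dashboard; returning the list of blockers rather than a bare boolean keeps the reason visible to whoever is following the runbook.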
Attributes
displayName
Disaster Recovery (Terraform, Kubernetes, Prometheus, PostgreSQL, S3)
description
A disaster recovery and business continuity stack that uses Terraform to
provision standby infrastructure across multiple cloud regions. Kubernetes
workloads are backed up via declarative state snapshots stored in S3,
with PostgreSQL point-in-time recovery configured for database resilience.
Prometheus and Grafana monitor replication lag and failover readiness,
while automated runbooks handle DNS cutover and traffic rerouting during
actual incidents. Targeted at operations teams responsible for RPO/RTO
SLAs on critical production systems. The tradeoff is the cost of
maintaining warm standby resources and the complexity of testing failover
procedures without impacting production.
composes
Outgoing edges
applies_to (2)
- domain:cloud-infra · Domain · Cloud Infrastructure
- domain:infrastructure · Domain · Infrastructure
composed_of (8)
- tool:terraform · Tool · Terraform
- tool:kubernetes · Tool · Kubernetes
- tool:prometheus · Tool · Prometheus
- tool:grafana · Tool · Grafana
- language:hcl · Language · HCL
- language:yaml · Language · YAML
- library:boto3 · Library · Boto3
- language:python · Language · Python
follows_workflow (2)
- workflow:disaster-recovery-failover-drill · Workflow · Disaster Recovery Failover Drill
- workflow:backup-recovery-drill · Workflow · Backup Recovery Drill
requires_skill_area (5)
- skill-area:cloud-infrastructure · SkillArea · Cloud Infrastructure
- skill-area:incident-response · SkillArea · Incident Response
- skill-area:capacity-planning-ops · SkillArea · Capacity Planning
- skill-area:runbook-authoring · SkillArea · Runbook Authoring
- skill-area:terraform-infrastructure · SkillArea · Terraform Infrastructure as Code
used_by_role (3)
- role:sre · Role
- role:platform-engineer · Role
- role:cloud-architect · Role
Incoming edges
None.