Disaster Recovery (DR)
Important Metrics
Recovery Point Objective (RPO)
- The maximum acceptable data loss, measured as the time between the last backup point and the disaster
- The closer the RPO is to real time, the higher the cost
Recovery Time Objective (RTO)
- The maximum acceptable time for a failed system to become operational again
- The faster the recovery, the more cost
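As a rough illustration of the two metrics, the sketch below computes the data-loss window and the downtime for a single incident and compares each against a target. The timestamps, the one-hour targets, and the function names are all hypothetical and chosen only for the example.

```python
from datetime import datetime, timedelta


def achieved_rpo(last_backup: datetime, disaster: datetime) -> timedelta:
    """Data-loss window: time between the last backup point and the disaster."""
    return disaster - last_backup


def achieved_rto(disaster: datetime, restored: datetime) -> timedelta:
    """Downtime: time between the disaster and the system being operational again."""
    return restored - disaster


def meets_objective(achieved: timedelta, objective: timedelta) -> bool:
    """An objective is met when the achieved value does not exceed it."""
    return achieved <= objective


# Hypothetical incident timeline (assumed values, not real data).
last_backup = datetime(2024, 1, 1, 2, 0)
disaster = datetime(2024, 1, 1, 2, 45)
restored = datetime(2024, 1, 1, 5, 15)

rpo = achieved_rpo(last_backup, disaster)   # 45 minutes of data lost
rto = achieved_rto(disaster, restored)      # 2 hours 30 minutes of downtime

print("RPO met:", meets_objective(rpo, timedelta(hours=1)))
print("RTO met:", meets_objective(rto, timedelta(hours=1)))
```

In this made-up timeline the backup interval is short enough for a one-hour RPO, but the recovery is too slow for a one-hour RTO, which would point toward a faster (and more expensive) recovery option.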
Recovery Options
Backup & Restore
- Only a replicated copy of the backup is kept
- If the primary infrastructure is down, the new infrastructure must be configured from scratch
- Infrastructure as Code (IaC) can be helpful in this scenario
Pilot Light
- Core workload infrastructure is replicated and provisioned
- All the compute instances can be turned off while not in use
- Recovery is a matter of turning on the stopped instances and redirecting traffic to the secondary infrastructure
Warm Standby
- Core workload infrastructure is replicated and provisioned
- Instead of turning off all instances, only a minimal number of instances are kept running
- Recovery is fast because only the traffic has to shift
Multi-site Active/Active
- The full copy of the core workload infrastructure is provisioned and active to serve traffic
- If one is down, the other takes over while the failed infrastructure recovers
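The four options differ mainly in what is already in place at the secondary site, which determines what is left to do during a failover. The sketch below models that as a small lookup. The enum, the step wording, and the scale-up step for Warm Standby are assumptions made for illustration, not part of any AWS API.

```python
from enum import Enum
from typing import Dict, List


class Standby(Enum):
    """State of the secondary site before a disaster (illustrative names)."""
    BACKUP_AND_RESTORE = "only backups are replicated"
    PILOT_LIGHT = "infrastructure provisioned, instances turned off"
    WARM_STANDBY = "infrastructure provisioned, minimal instances running"
    ACTIVE_ACTIVE = "full copy provisioned and serving traffic"


REDIRECT = "redirect traffic to the secondary infrastructure"

# Steps left to perform at failover time, derived from the descriptions above.
FAILOVER_STEPS: Dict[Standby, List[str]] = {
    Standby.BACKUP_AND_RESTORE: [
        "provision new infrastructure (for example from IaC templates)",
        "restore data from the replicated backup",
        REDIRECT,
    ],
    Standby.PILOT_LIGHT: [
        "start the stopped instances",
        REDIRECT,
    ],
    Standby.WARM_STANDBY: [
        REDIRECT,
        # Assumption: the minimal fleet is scaled up once it takes full load.
        "scale up the running instances",
    ],
    Standby.ACTIVE_ACTIVE: [
        "stop routing traffic to the failed site",
    ],
}


def failover_steps(option: Standby) -> List[str]:
    """Return a copy of the remaining failover steps for a recovery option."""
    return list(FAILOVER_STEPS[option])


for option in Standby:
    print(f"{option.name}: {option.value}")
    for step in failover_steps(option):
        print(f"  - {step}")
```

The more that is already provisioned and running, the shorter the list of failover steps, which mirrors the trade-off between recovery time and cost.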
EC2 Recovery
- If a CloudWatch alarm detects a failure of an EC2 instance, the instance can be recovered automatically with some of its state preserved
- Preserved state: IP addresses, metadata, and placement group
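One way to set up this kind of recovery is a CloudWatch alarm on the `StatusCheckFailed_System` metric whose action is the EC2 recover action. The sketch below builds the parameters for boto3's `put_metric_alarm`. The alarm name, instance ID, region, period, and evaluation settings are assumptions picked for the example, and the call itself needs boto3 and AWS credentials.

```python
from typing import Any, Dict


def recovery_alarm_params(instance_id: str, region: str) -> Dict[str, Any]:
    """Build put_metric_alarm parameters that recover an instance on system failure."""
    return {
        "AlarmName": f"auto-recover-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed_System",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,                # seconds per evaluation window (assumed)
        "EvaluationPeriods": 2,      # consecutive failing windows (assumed)
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # Built-in alarm action that triggers EC2 instance recovery.
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:recover"],
    }


def create_recovery_alarm(instance_id: str, region: str) -> None:
    """Create the alarm in AWS. Requires boto3 and valid credentials."""
    import boto3  # third-party; imported here so the builder works without it

    cloudwatch = boto3.client("cloudwatch", region_name=region)
    cloudwatch.put_metric_alarm(**recovery_alarm_params(instance_id, region))


# Hypothetical instance ID and region, used only to show the generated parameters.
params = recovery_alarm_params("i-0123456789abcdef0", "us-east-1")
print(params["AlarmActions"])
```

Keeping the parameter builder separate from the AWS call makes it possible to inspect or test the alarm definition without touching a real account.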