Chapter 11: High Availability
High Availability
A system is considered to be available when it is in operation. A system has high availability when its availability meets or exceeds its service level agreements.
Measuring High Availability
A requirement that a database be available 24/7/365 must be put in the context of the cost of deploying and maintaining such a solution. An examination of the complexity and cost of very high levels of availability will sometimes lead to compromises that reduce requirements for this level of availability.
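An availability target translates directly into a yearly downtime budget, which helps frame the cost discussion. The following is a minimal sketch, assuming a 365-day year; the function name allowed_downtime_minutes is illustrative and does not come from the chapter.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a non-leap year


def allowed_downtime_minutes(availability_pct: float) -> float:
    """Return the yearly downtime budget, in minutes, for a target availability."""
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability_pct must be between 0 and 100")
    return MINUTES_PER_YEAR * (1.0 - availability_pct / 100.0)


if __name__ == "__main__":
    # Each extra "nine" cuts the downtime budget by a factor of ten.
    for target in (99.0, 99.9, 99.99, 99.999):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% -> {minutes:.1f} minutes/year")
```

Comparing the budgets for successive targets shows why each additional level of availability tends to cost much more than the last.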
The System Stack and Availability
There are many different causes of unplanned downtime. You can prevent some very easily, while others require significant investments in site infrastructure, including software, servers, storage, networks, and appropriate employee skills.
Server Hardware, Storage, and Database Instance Failure
A server may crash because of hardware problems, such as the failure of a power supply, or because of software problems, such as a process that begins to consume all the machine’s CPU resources.
Protecting Against System Failure
Providing component redundancy
Deploying Data Guard to provide an alternate site in case of primary site failure
Deploying Real Application Clusters for database continuity in the event of failure of an instance
Deploying application continuity or Transparent Application Failover software services
Component Redundancy: The major system components that should have redundancy include the following:
Disk drives
Disk controllers
Flash memory
CPUs
Power supplies
Cooling fans
Network cards
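The benefit of duplicating a component can be estimated with basic probability: if n identical copies fail independently and the system needs only one of them to work, the combined availability is 1 - (1 - a)^n. The sketch below rests on that independence assumption, which real hardware often violates (for example, a shared power feed); the function name is illustrative.

```python
def redundant_availability(component_availability: float, copies: int) -> float:
    """Availability of a group of redundant components.

    Assumes failures are independent and that a single working
    copy is enough to keep the system running.
    """
    if copies < 1:
        raise ValueError("copies must be at least 1")
    if not 0.0 <= component_availability <= 1.0:
        raise ValueError("component_availability must be between 0 and 1")
    # The group is down only when every copy is down at the same time.
    return 1.0 - (1.0 - component_availability) ** copies


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, "copies ->", redundant_availability(0.99, n))
```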
Instance Recovery
All committed transactions will be recovered.
In-flight transactions will be rolled back or undone.
Phases of Instance Recovery
Roll forward
Roll back
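The two phases can be illustrated with a toy model. This is a simplified sketch, not Oracle's actual recovery mechanism: it assumes each redo record carries both the old and the new value, that a value of None means the key did not exist, and that row locking prevents two open transactions from changing the same key. The names RedoRecord and recover are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Set


@dataclass(frozen=True)
class RedoRecord:
    txn_id: int
    key: str
    old_value: Any  # undo information (None means the key did not exist)
    new_value: Any  # redo information


def recover(datafile: Dict[str, Any],
            redo_log: List[RedoRecord],
            committed: Set[int]) -> Dict[str, Any]:
    """Return the recovered state without modifying the input datafile."""
    data = dict(datafile)

    # Phase 1: roll forward. Reapply every logged change, committed or not.
    for rec in redo_log:
        data[rec.key] = rec.new_value

    # Phase 2: roll back. Undo in-flight transactions, newest change first.
    for rec in reversed(redo_log):
        if rec.txn_id not in committed:
            if rec.old_value is None:
                data.pop(rec.key, None)
            else:
                data[rec.key] = rec.old_value
    return data


if __name__ == "__main__":
    log = [
        RedoRecord(1, "a", 1, 2),     # transaction 1 committed before the crash
        RedoRecord(2, "b", None, 5),  # transaction 2 was still in flight
    ]
    print(recover({"a": 1}, log, committed={1}))
```

After recovery, the committed change to "a" survives and the in-flight insert of "b" is gone, matching the two guarantees listed above.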
Site and Computer Server Failover: Protection from primary site failure involves monitoring of and redundancy controls for the following:
Data center power supply
Data center climate control facilities
Database server redundancy
Database redundancy
Data redundancy
Oracle Data Guard and Site Failures
There are three possible causes of lost data in the event of primary site failure when deploying a physical standby database:
Archived redo logs have not been shipped to the standby site.
Filled online redo logs have not been archived yet.
The current online redo log is not a candidate for archiving until a log switch occurs.
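The three causes above correspond to three states a redo log can be in before it reaches the standby site. The sketch below is a toy model for reasoning about exposure, assuming redo reaches the standby only through shipped archived logs; LogState and redo_at_risk are hypothetical names and do not correspond to any Oracle view or API.

```python
from enum import Enum
from typing import Dict, List


class LogState(Enum):
    CURRENT = "current online redo log, not yet switched"
    FILLED_NOT_ARCHIVED = "filled online redo log, not yet archived"
    ARCHIVED_NOT_SHIPPED = "archived redo log, not yet shipped"
    SHIPPED = "archived redo log received by the standby site"


def redo_at_risk(logs: Dict[int, LogState]) -> List[int]:
    """Return the log sequence numbers whose redo would be lost
    if the primary site failed right now."""
    return sorted(seq for seq, state in logs.items()
                  if state is not LogState.SHIPPED)


if __name__ == "__main__":
    logs = {
        101: LogState.SHIPPED,
        102: LogState.ARCHIVED_NOT_SHIPPED,
        103: LogState.FILLED_NOT_ARCHIVED,
        104: LogState.CURRENT,
    }
    print(redo_at_risk(logs))
```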
Recovery Manager
Backs up one or more datafiles to disk or tape
Backs up archived redo logs to disk or tape
Backs up automatically the control file and server parameter file to disk or tape
Backs up CDBs and PDBs to disk or tape
Restores datafiles from disk or tape
Restores and applies archived redo logs to perform recovery
Restores the control file and server parameter file
Restores CDBs and PDBs from disk or tape
Guard
Oracle Active Data Guard and Zero Data Loss
Oracle GoldenGate and Replication
Real Application Clusters and Instance Failures
The phases of Real Application Clusters recovery are the following:
Cluster reorganization
Lock database rebuild
Instance recovery
Oracle Transparent Application Failover
The high availability benefits of TAF include the following:
Transparent reconnection
Automatic resubmission of queries
Callback functions
Failover-aware applications
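The first three benefits can be pictured with an application-level analogy. TAF itself works inside the Oracle client layer, so the code below is not how TAF is implemented or configured; it is a hedged sketch of the behavior, in which ConnectionLost, run_query_with_failover, and the connect and on_failover parameters are all hypothetical.

```python
class ConnectionLost(Exception):
    """Hypothetical error raised when the database instance becomes unreachable."""


def run_query_with_failover(connect, query, max_attempts=2, on_failover=None):
    """Run a read-only query, reconnecting and resubmitting it on failure.

    connect     -- callable returning an object with an execute(query) method
    on_failover -- optional callback invoked with the failed attempt number
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    conn = connect()
    for attempt in range(1, max_attempts + 1):
        try:
            return conn.execute(query)
        except ConnectionLost:
            if attempt == max_attempts:
                raise  # no surviving instance answered; surface the error
            if on_failover is not None:
                on_failover(attempt)  # callback: lets the application react
            conn = connect()  # reconnection, hidden from the caller
```

Only queries are resubmitted in this sketch; in-flight changes would have been rolled back by instance recovery, so the application has to reissue those itself.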
Flashback
Flashback Query
Flashback Version Query
Flashback Transaction
Flashback Transaction Query
Flashback Drop
Flashback Table
Flashback Restore Points
Flashback Database
Flashback Logs and Block Media Recovery
Flashback Data Archive