C376 IT Compliance and Risk Management
Business Continuity Management (BCM):
P12 & P13
P12: Business Continuity Management (BCM) Bootcamp
Business Continuity Management (BCM)
BUSINESS CONTINUITY (BC)
The capability of the organisation to
continue delivery of products
or services
At
acceptable predefined levels
Following a
disaster event
DISASTER
A sudden event that causes
damage
and destruction
And carries the risk of forcing the company to
stop operations permanently
BCM is a
management process
that
identifies potential threats and impacts
to an organisation
It provides a
framework
for building
organisational resilience
With the capability of an effective response that safeguards the interests of its key stakeholders, reputation, brand and value-creating activities
BCM planning should be a
continuous process
BCM also helps during a disaster event
By providing guidance on how the company should
repair itself
and
recover to a state of normal operations
BCM should include:
Business Impact Analysis
Risk Assessment
Business Impact Analysis (BIA)
Purpose: To identify the
criticality of each business function
Steps to Perform BIA:
Form a team of experienced staff who know the company's operations well
List down the
business functions, dependencies, and information assets
Determine the
maximum tolerable downtime (MTD)
for each business function
Reorder the business functions
Shortest MTD first
(as they are the most critical)
Why Shortest MTD First?
They are the most critical!
Failure to respond to them will
kill the company the fastest
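The reordering step can be sketched in a few lines of Python. This is only an illustration, assuming each business function is recorded with its MTD in hours; the `BusinessFunction` class, the function names, and the MTD values are all made up and are not from the module.

```python
from dataclasses import dataclass


@dataclass
class BusinessFunction:
    name: str
    mtd_hours: float  # maximum tolerable downtime, in hours


def order_by_criticality(functions):
    """Return the functions sorted with the shortest MTD (most critical) first."""
    return sorted(functions, key=lambda f: f.mtd_hours)


# Hypothetical BIA output; names and MTD values are illustrative only.
bia = [
    BusinessFunction("Payroll", 72),
    BusinessFunction("Online payments", 2),
    BusinessFunction("Staff newsletter", 720),
    BusinessFunction("Customer helpdesk", 24),
]

ranked = order_by_criticality(bia)
for rank, function in enumerate(ranked, start=1):
    print(rank, function.name, function.mtd_hours)
```

The function at the top of the list is the one to recover first.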
BCM Strategy
Should be formed after BIA
Strategy should include:
Protection of high priority activities
Continuation of those activities and supporting resources
Mitigating impact
Include approved RTO and RPO
Resources needed to implement the strategy must be considered
Measures needed to mitigate risks should be specified to
Lower possibility of disruption
Lower time of disruption
Lower impact of disruption
Exercise and Testing
BC should be tested to ensure consistency with objectives
Testing should ensure that
RTO is met
Be consistent with scope of BCM
Be based on relevant scenarios across all BCM arrangements
Minimise the chance of disrupting current activities
Incorporate post-test reviews with the aim of continual improvement
Be conducted regularly, or when there is a significant change to the business
International Standards and Guidelines
SS ISO 22301:2012 Societal Security - Business Continuity
SS 540:2008 Singapore Standard for Business Continuity Management
MAS Guidelines for BCM
BC Organisation Chart
This chart defines the people responsible for BC
Some roles include:
BCM Chairman
Disaster Declaration Officer
BCM Manager
Having a defined organisation chart is important
A lack of it will cause confusion when disaster strikes
Criteria for Activation
There should be a clearly defined metric to determine when to activate the BC Plan
Unclear criteria will cause confusion when disaster happens
-----------------------------------------------
Example
The criterion for activation of this BC Plan is an event/incident that may result in denial of access, or potential denial of access, to RP's operating site for more than 24 hours
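A clearly defined criterion like the example above can be written down as a simple yes/no check. A minimal sketch, assuming the only input is the expected duration of denied access in hours; the function and constant names are hypothetical.

```python
ACTIVATION_THRESHOLD_HOURS = 24  # taken from the example criterion above


def should_activate_bc_plan(expected_denial_hours):
    """True when access to the operating site is (or may be) denied
    for more than the threshold."""
    return expected_denial_hours > ACTIVATION_THRESHOLD_HOURS


print(should_activate_bc_plan(6))   # short outage, plan stays inactive
print(should_activate_bc_plan(48))  # two-day denial of access, plan is activated
```

Because the criterion says "more than 24 hours", an outage of exactly 24 hours does not trigger activation in this sketch.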
BCM in Business and IT Strategy
BCM is part of business risk strategy
And IT risk strategy supports business risk strategy
Disaster Recovery Plan (DRP)
The DRP is a
subset
of the overarching BCP
After a disaster is formally declared, personnel involved are activated to
recover the most critical business functions
in a timely manner
This is to keep the company alive in the midst of the disaster
According to approved documentation and procedures
DRP and BCP focus on...
Human safety
Avoid losses
Plan and prepare the disaster response so that the company
will not be unrecoverable
RECOVERY SOLUTIONS
A solution that lets people keep working is required so that the critical business functions can continue during disaster recovery
LEASED SITES
HOT SITE
Can begin operations in a
few hours, max up to a day
All IT equipment (network, laptops, and servers) is in place
MOST EXPENSIVE site
WARM SITE
Can begin operations
in a few days
Much, but not all, of the IT equipment is in place
Less costly than hot site
COLD SITE
Can begin operations in a few weeks or longer
Has almost no IT equipment; an empty building with basic facilities (such as water, power, and communications)
CHEAPEST site
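The cost versus speed trade-off between the three leased sites can be sketched as "pick the cheapest site that can still start up within the RTO". The start-up hours below are assumptions chosen to match the rough ranges in these notes (3 days for "a few days" and 3 weeks for "a few weeks" are made-up figures), and the function name is hypothetical.

```python
# (site type, assumed worst-case hours before operations can begin),
# listed from cheapest to most expensive.
SITES_CHEAPEST_FIRST = [
    ("cold", 21 * 24),  # "a few weeks or longer": 3 weeks is an assumption
    ("warm", 3 * 24),   # "a few days": 3 days is an assumption
    ("hot", 24),        # "a few hours, max up to a day"
]


def cheapest_site_for(rto_hours):
    """Return the cheapest site type whose start-up time fits within the RTO,
    or None if even a hot site is too slow."""
    for name, startup_hours in SITES_CHEAPEST_FIRST:
        if startup_hours <= rto_hours:
            return name
    return None


print(cheapest_site_for(48))
print(cheapest_site_for(100))
```

If the function returns None, a leased site alone cannot meet the RTO under these assumptions.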
RECIPROCAL AGREEMENT
Agreement between two companies
If one of the two companies suffers the loss of its facility,
the other company will provide its space
within its compound
ISSUES WITH RA
Agreement is not a legally binding agreement (no enforcement)
Either company can default on the agreement and not help the other party when disaster strikes
If both companies are suffering from the same disaster, the business needs of both companies cannot be fulfilled
RPO, RTO, MTD…WTH?!
RECOVERY POINT OBJECTIVE (RPO)
The maximum amount of data the company is willing to lose (or can tolerate losing).
The value is measured in
time
The time between "Business as Usual" (the last point when all data was backed up) and "Disaster Strikes" (when all data not yet backed up is lost)
Example
If a company does daily full backups, the RPO for the company is 1 day
Having
real-time copies
is one technique to continuously update the backup copy
Having
scheduled daily backups
is one way of periodically creating backup copies
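Since RPO is measured in time, the data lost in a disaster is simply the gap between the last backup and the moment disaster strikes. A rough sketch, assuming backups complete instantly at a known timestamp; the timestamps and function names are hypothetical.

```python
from datetime import datetime, timedelta


def data_loss_window(last_backup, disaster):
    """Time between the last completed backup and the disaster."""
    if disaster < last_backup:
        raise ValueError("disaster must not be earlier than the last backup")
    return disaster - last_backup


def meets_rpo(backup_interval, rpo):
    """Worst case, disaster strikes just before the next backup is due,
    so the backup interval must not exceed the RPO."""
    return backup_interval <= rpo


# Hypothetical timestamps for illustration.
last_backup = datetime(2024, 1, 1, 0, 0)
disaster = datetime(2024, 1, 1, 18, 30)
lost = data_loss_window(last_backup, disaster)
print(lost)  # 18:30:00

# Daily backups satisfy an RPO of 1 day, but not an RPO of 4 hours.
print(meets_rpo(timedelta(days=1), timedelta(days=1)))
print(meets_rpo(timedelta(days=1), timedelta(hours=4)))
```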
RECOVERY TIME OBJECTIVE (RTO)
The amount of
time required to restore backup data and critical systems
The time between "Disaster Strikes" and "Recovery (Completed)"
RTO must be aligned with MTD
This is to ensure that there is sufficient acceptable time to recover the lost data
If the MTD is shorter than the RTO, then there will be insufficient
acceptable
time to recover the data
MAXIMUM TOLERABLE DOWNTIME (MTD)
The
total maximum time that a business operation can be disrupted without causing irreparable damage
Typical MTDs
Critical :: Minutes to hours
Urgent :: Up to 1 day
Important :: Up to 3 days
Normal :: Up to 1 week
Non-essential :: Up to 30 days (or more)
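The "RTO must be aligned with MTD" rule can be checked per business function: recovery has to finish before the MTD runs out. A minimal sketch, with made-up function names and hour values.

```python
def rto_is_aligned(rto_hours, mtd_hours):
    """RTO is aligned when the planned recovery finishes within the MTD."""
    return rto_hours <= mtd_hours


def recovery_slack_hours(rto_hours, mtd_hours):
    """Hours to spare between planned recovery and the MTD
    (a negative value means a shortfall)."""
    return mtd_hours - rto_hours


# Hypothetical plan: (business function, RTO in hours, MTD in hours).
plan = [
    ("Online payments", 4, 8),
    ("Customer helpdesk", 36, 24),
]

misaligned = [name for name, rto, mtd in plan if not rto_is_aligned(rto, mtd)]
print(misaligned)
```

Any function that shows up in `misaligned` needs either a faster recovery solution or a re-examined MTD.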
P13 BCM Readiness
BCM Plans Testing
BC and DR plans should be tested
at least once a year
This will ensure that they are accurate, effective, and complete, even when the environment changes
There are various ways to test the plans
Call-Tree Testing :phone:
A call tree is a predefined list
containing the contacts of key personnel
responsible for disaster recovery
This is important because each person has an important role to play and needs to be contactable
How the Call Tree Generally Works
BCP Coordinator activates department call tree
BCP coordinator provides the following information:
'H' hour
Declaration of disaster
Activation of BC Plan
Other special instructions
Start contacting key personnel
Record start and end timing of call tree testing
Check that the time taken to reach everyone is within the predefined test target
If the target is not met, come up with countermeasures and conduct a post-test review
The 'H' hour is the
time when the incident occurred
Challenge of Call Tree Testing
Organisations may have call trees that contain many personnel
Imagine one person calling 50 other people in the company #ggwp
This would take too long to be efficient
Suggestion
Divide the workload
BCP Coordinator calls the respective department heads
Department heads will then call their subordinates
Once done, dept. heads will report back to BCP Coordinator
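A back-of-envelope comparison shows why dividing the workload helps. This sketch assumes every call takes the same number of minutes, one person makes calls one after another, and department heads call their teams in parallel; the report-back step is not modelled. All numbers and function names are made up.

```python
def flat_call_time(num_people, minutes_per_call):
    """One person calls everyone, one call after another."""
    return num_people * minutes_per_call


def divided_call_time(team_sizes, minutes_per_call):
    """The coordinator calls each department head in turn; each head starts
    calling their own team as soon as they have been reached."""
    finish_times = []
    for position, team_size in enumerate(team_sizes, start=1):
        head_reached_at = position * minutes_per_call
        finish_times.append(head_reached_at + team_size * minutes_per_call)
    return max(finish_times, default=0)


# 50 people in total: 5 department heads, each with 9 subordinates.
flat = flat_call_time(50, minutes_per_call=3)
divided = divided_call_time([9] * 5, minutes_per_call=3)
print(flat, divided)  # 150 42
```

Under these assumptions the divided call tree reaches everyone in 42 minutes instead of 150.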
Testing the Plans
CONSISTENCY TESTING
Test to ensure that the BCP
is effective and efficient for responding to disaster
Steps to Conduct Consistency Testing
BCP Coordinator splits the BCP document into
various sections
BCP Coordinator sends different sections to the respective departments in charge of that particular section
Department heads review their sections, make amendments (if any), and send them back to the BCP Coordinator
BCP Coordinator reassembles the updated plan
Benefit: Inexpensive and low risk to conduct
---
Limitation: No assurance that the BCP will actually work (until disaster really happens)
STRUCTURED WALKTHROUGH
(TABLE-TOP TESTING)
Testing the recovery plan
without actually carrying out the actions
Steps to Conduct Tabletop Testing
Coordinator calls for a meeting with team members and
gives a disaster scenario
Each team member then describes/walks through his/her responsibilities and tasks with the others
Benefit: Any integration issues between teams will start to surface (good, because they can be resolved now and not when the disaster actually happens)
---
Limitation: More costly, as personnel are taken away from their daily obligations and activities
SIMULATION TESTING
Testing the recovery plan by actually carrying out the actions, but without restoring data
Steps to Conduct Simulation Testing
Coordinator calls for a meeting with team members and
gives a disaster scenario
Team members then go and
carry out their respective tasks
Tasks may include going to the secondary facility and booting up the secondary servers
Restoration of data is NOT done
PARALLEL TESTING
Testing the recovery plan by actually carrying out the actions and restoring data
BUT... restoring data is only done at an alternate (secondary) site
Steps to Conduct Parallel Testing
Simulation testing is carried out
However, the disaster recovery procedure (including data restoration) is done in the secondary facility
The procedure may include initialising the systems before allowing users to use them
FULL INTERRUPTION TESTING
Similar to parallel testing, but this time the primary facility is shut down
This is to really test and verify that the secondary site is capable of carrying out critical business functions and keeping the company operational
Benefit: The test is as close to a real disaster as it gets; it is the best way to verify the reliability of the DRP
---
Limitation: The test is the costliest and riskiest of the 5 types
Grab Bag/Ready Bag
This bag is normally a small bag or sealed envelope
The bag/envelope typically contains essential items for a disaster, including...
Up-to-date BCP document
Recovery procedures
Contact lists
This bag is distributed to key members of the company for them to
take home
It is important to prepare this bag beforehand to ensure that when disaster happens, everything important is already in one centralised place and ready to go
Bring home the bag
in case the primary site is inaccessible
during disaster
One-Page Wallet Plan
The wallet plan is a piece of paper that can be kept in your wallet
The plan is meant to be
easily accessible
when it's needed
Key items in the wallet plan include:
Who you are to contact
during call tree activation
Key contacts' details
BCP checklist
(steps to take immediately on being notified of a potential incident)
Recovery site
directions
(map)
The wallet plan is important because it will give you important details to refer to, and remove the burden of having to remember this information in a time of distress and panic.
Training
Staff should be
regularly trained
To ensure that they are prepared to carry out their BC responsibilities
They should also be cross-trained in additional areas
So that they can cover for their teammates in case they are MIA
Staff should be made aware of the BC policy and their roles
The BCP should be readily and easily accessible (e.g. stored on a website, Google Drive, or SharePoint)
Post-Incident Analysis
Analysing a major disruption is critical to determining how well your organisation responded to and recovered from the event.
Benefits of PIA
Provides
comprehensive record
of an incident
Assessment of
RTO
Assessment of
communications
to internal and external stakeholders
Assessment of
training needs
for personnel
Assessment of
alternate work sites
and
strategies
PIA Report
The report should cover the following areas:
Date and time
of disaster
Location
of disaster
Type
of disaster
Situation upon activation of the BCP process
Include a
brief description
of the situation encountered by the first personnel who assessed or arrived at the scene
Final outcome of the disaster
List the
extent of damage
to business operations
Also include personnel
injuries
or casualties (touchwood)
Strategy - List down the strategies chosen to respond and recover
Common
obstacles
faced
Recommendations - List down any recommendations for corrections or reduction of these obstacles
What operations worked well and why?
List out the procedures that were successful
so that they may be applied to similar situations in the future
Stakeholder communications
Were communications adequate, proactive, and regularly updated?
Once completed, the report has to be
reviewed and approved
To ensure
accuracy and comprehensiveness
Recommended to complete the report within
14 days of the disaster
Note that this report
should not be made to discipline anyone or criticise any actions taken
during the event
Should be written and read as an
opportunity to learn from the incident
---
#lookwhatyoumademedo