Discussion 3
Khader7884
Principles of Incident Response and Disaster Recovery, 2nd Edition
Chapter 03
Contingency Strategies for IR/DR/BC
Objectives
Discuss the relationships between the overall use of contingency planning and the subordinate elements of incident response, business resumption, disaster recovery, and business continuity planning
Describe the techniques used for data and application backup and recovery
Explain the strategies employed for resumption of critical business processes at alternate and recovered sites
Introduction
Contingency planning (CP)
Preparing for the unexpected
Keeping the business alive
Incident response (IR) process
Detecting, evaluating, and reacting to an incident
Keeping business functioning if physical plant destroyed or unavailable
Business resumption plan
Used when IR process cannot contain and resolve an incident
Introduction (cont’d.)
Business resumption plan (BR plan) elements
Disaster recovery plan (DR plan)
Lists and describes efforts to resume normal operations at primary business places
Business continuity plan (BC plan)
Steps for implementing critical business functions until normal operations resume at primary site
Primary site
Location(s) where organization executes its functions
Introduction (cont’d.)
BRP, DRP and BCP
Distinct place, role, timing, and planning requirements
Introduction (cont’d.)
Organizations require:
Reliable method of restoring information and reestablishing all operations
Five key procedural mechanisms
Delayed protection
Real-time protection
Server recovery
Application recovery
Site recovery
Data and Application Resumption
Data backup: recovery from an incident
Snapshot of data from a specific point in time
Data considered volatile and subject to change
Online backup, disk backup, and tape backup
Archive: recovery from threat to on-site backups
Long-term document or data file storage
Usually for legal or regulatory purposes
Data backup policy
Data files and critical system files: daily
Nonessential files: weekly
Data and Application Resumption (cont’d.)
Retention schedule
Guides replacement frequency and storage duration
May be dictated by law
Routine critical data
Retain one or two most recent daily backup copies
Retain at least one off-site copy
Full backups of entire systems
Store at least one copy in a secure location
NIST backup and recovery strategies
Alternatives should be considered
Online Backups and the Cloud
Online backup to third-party data storage vendor
Referred to as data storage “in the cloud”
Commonly associated with leasing resources
Raises security challenges
Descriptions
Software as a Service (SaaS)
Platform as a Service (PaaS)
Infrastructure as a Service (IaaS)
Cloud deployment
Public cloud, community cloud, private cloud
Disk to Disk to Other: Delayed Protection
Organizations create massive arrays
Independent, large-capacity drives
Store information at least temporarily
Example: home users
Add external USB-mounted SATA 1–2 terabyte drives
Advantages
Avoids the time-consuming nature of tape backup
Avoids tape costs and implementation challenges
At the individual-user level
Allows quick and easy recovery
Disk to Disk to Tape
Solves problem with massively connected storage area networks
Lack of redundancy if both online and backup versions fail
Uses secondary disk series to avoid the need to take the primary set offline for duplication
Reduces resource usage on the primary systems
Disk-to-disk initial copies
Can be made efficiently and simultaneously with other system processes
Disk to Disk to Cloud
Also called disk-to-disk-to-online
Aggregate all local backups to a central repository
Then back up repository to an online vendor
Benefits
Reduced risk to the confidentiality, integrity, and availability of stored online data
Users can back up their data to a central location
Most providers use an encryption process
Can easily access data from Internet
Can automate the cloud backup process
Types of Backup
Full: complete system backup
Differential: files changed or added since the last full backup
Incremental: only files modified since the last backup of any type
Requires less space and time than differential
Copy: set of specified files
Daily: only files modified on that day
All on-site and off-site storage must be secured
Fireproof safes or filing cabinets to store tapes
Encryption to protect online or cloud data storage
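The difference between full, differential, and incremental selection can be sketched in a few lines. This is a minimal illustration, assuming each file is represented only by a name and a last-modified time; the function name select_files and the sample data are hypothetical and not taken from any backup product.

```python
from datetime import datetime

def select_files(files, kind, last_full, last_backup):
    """Return the names of files a backup of the given kind would copy.

    files       -- dict mapping file name to last-modified datetime
    last_full   -- time of the most recent full backup
    last_backup -- time of the most recent backup of any kind
    """
    if kind == "full":
        return sorted(files)                      # everything
    if kind == "differential":
        cutoff = last_full                        # changes since the last full
    elif kind == "incremental":
        cutoff = last_backup                      # changes since the last backup
    else:
        raise ValueError(f"unknown backup kind: {kind}")
    return sorted(name for name, mtime in files.items() if mtime > cutoff)

files = {
    "payroll.db": datetime(2024, 1, 3, 9, 0),   # changed Wednesday
    "notes.txt": datetime(2024, 1, 2, 14, 0),   # changed Tuesday
    "logo.png": datetime(2023, 12, 1, 8, 0),    # unchanged since before the full
}
last_full = datetime(2024, 1, 1, 0, 0)     # Monday full backup
last_backup = datetime(2024, 1, 3, 0, 0)   # Tuesday-night backup

differential = select_files(files, "differential", last_full, last_backup)
incremental = select_files(files, "incremental", last_full, last_backup)
```

In this sample the differential set holds both changed files while the incremental set holds only the file changed since Tuesday night, which is why incremental backups tend to be smaller but need every intermediate set at restore time.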
Tape Backups and Recovery: General Strategies
Traditional: cost-effective for large data quantities
Digital audio tapes (DATs), quarter-inch cartridge (QIC) drives, 8-mm tape, digital linear tape (DLT)
Tape-based backup and recovery process
Schedule backup coupled with storage arrangement
Six-tape rotation method: media used in rotation
Grandparent/Parent/Child method: retains four full weekly (Friday) backups and adds a full monthly backup
Drawbacks: equipment cost and time
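One way to picture the Grandparent/Parent/Child rotation is as a rule that labels each calendar day. The sketch below is only an illustration under stated assumptions: backups run Monday through Friday, the weekly full backup falls on Friday, and the last Friday of the month is treated as the monthly full backup. The function name gfs_role is hypothetical.

```python
import calendar
from datetime import date

def gfs_role(day):
    """Classify a calendar day under a simple Grandparent/Parent/Child scheme.

    Assumptions for this sketch: backups run Monday through Friday, the
    weekly full backup is taken on Friday, and the last Friday of the
    month doubles as the monthly full backup.
    """
    weekday = day.weekday()                 # Monday is 0, Friday is 4
    if weekday > 4:
        return "none"                       # no weekend backup in this sketch
    if weekday < 4:
        return "child"                      # daily backup, Monday to Thursday
    days_in_month = calendar.monthrange(day.year, day.month)[1]
    if day.day + 7 > days_in_month:
        return "grandparent"                # last Friday: monthly full backup
    return "parent"                         # other Fridays: weekly full backup

roles = {d: gfs_role(date(2024, 3, d)) for d in (22, 27, 29, 30)}
```

An organization's actual schedule may differ, for example by taking the monthly backup on the first business day instead.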
Redundancy-Based Backup and Recovery Using RAID
Redundant array of independent disks (RAID)
Uses multiple hard drives to store information
Provides operational redundancy by spreading out data and using checksums
RAID implementations
Failure Resistant Disk Systems (FRDSs)
Failure Tolerant Disk Systems (FTDSs)
Disaster Tolerant Disk Systems (DTDSs)
Does not address need for off-site storage
Redundancy-Based Backup and Recovery Using RAID (cont’d.)
RAID Level 0
Not a form of redundant storage
Creates one larger logical volume across several available hard disk drives
Disk striping
Data segments written in turn to each disk drive in the array
Disk striping without parity
Occurs when multiple drives combined in order to gain large capacity without data redundancy
Increased risk: losing data from a single drive failure
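Disk striping without parity can be illustrated with a toy model in which each drive is just a list of segments. This is a sketch only; the function names stripe and read_back and the segment size are assumptions chosen for the example.

```python
def stripe(data, num_drives, segment_size):
    """Split data into segments and deal them round-robin across drives."""
    drives = [[] for _ in range(num_drives)]
    for i in range(0, len(data), segment_size):
        segment = data[i:i + segment_size]
        drives[(i // segment_size) % num_drives].append(segment)
    return drives

def read_back(drives):
    """Reassemble the original data by reading segments in stripe order."""
    out = []
    depth = max(len(d) for d in drives)
    for row in range(depth):
        for drive in drives:
            if row < len(drive):
                out.append(drive[row])
    return b"".join(out)

drives = stripe(b"ABCDEFGHIJ", num_drives=3, segment_size=2)
restored = read_back(drives)
```

Because every drive holds a slice of every large file, losing any one list in drives leaves gaps that cannot be filled from the others, which is the increased risk noted above.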
Redundancy-Based Backup and Recovery Using RAID (cont’d.)
RAID Level 1
Disk mirroring
Uses twin drives in a computer system
Computer records data to both drives simultaneously
Provides a backup if the primary drive fails
Expensive and inefficient media use
Same drive controller manages both drives
Disk duplexing
Each drive has its own controller
Can create mirrors and split disk pairs to create highly available copies of critical system drives
Redundancy-Based Backup and Recovery Using RAID (cont’d.)
RAID Level 2
Specialized form of disk striping with parity
Uses the Hamming code
Specialized parity coding mechanism
Stores stripes of data on multiple data drives
Stores corresponding redundant error correction on separate error-correcting drives
Allows data reconstruction
If some data or redundant parity information lost
No commercial implementations
Not widely used
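To show how a Hamming code allows reconstruction, the sketch below uses the small Hamming(7,4) code: four data bits protected by three parity bits. It illustrates the coding idea only and is not a model of how any RAID 2 controller laid out bits across drives; the function names are hypothetical.

```python
def hamming74_encode(d):
    """Encode four data bits [d1, d2, d3, d4] as a 7-bit codeword.

    Positions 1, 2, and 4 hold parity bits; positions 3, 5, 6, 7 hold data.
    """
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4        # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4        # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4        # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]

def hamming74_correct(code):
    """Return (corrected codeword, error position), with 0 meaning no error."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4      # syndrome gives the 1-based position
    if position:
        c[position - 1] ^= 1             # flip the bad bit back
    return c, position

def hamming74_data(code):
    """Extract the four data bits from a codeword."""
    return [code[2], code[4], code[5], code[6]]

codeword = hamming74_encode([1, 0, 1, 1])
damaged = list(codeword)
damaged[4] ^= 1                          # corrupt position 5
repaired, position = hamming74_correct(damaged)
```

The code corrects any single flipped bit per codeword, whether the damaged bit carried data or parity.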
Redundancy-Based Backup and Recovery Using RAID (cont’d.)
RAID Levels 3 and 4
RAID 3 uses byte-level striping of data
RAID 4 uses block-level striping of data
Data segments stored on dedicated data drives
Parity information stored on a separate drive
One large volume used for data
Parity drive operates independently
Provides error recovery
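The error recovery offered by a dedicated parity drive rests on XOR: the parity block is the XOR of the data blocks, so any one missing block equals the XOR of everything that survives. A minimal sketch, assuming three data drives with equal-length blocks and made-up byte values:

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

data_drives = [b"\x10\x20\x30", b"\x0f\x0e\x0d", b"\xaa\xbb\xcc"]
parity_drive = xor_blocks(data_drives)        # stored on the dedicated drive

# Simulate losing data drive 1 and rebuilding it from the survivors.
survivors = [data_drives[0], data_drives[2], parity_drive]
rebuilt = xor_blocks(survivors)
```

The same calculation applies whether the striping unit is a byte (RAID 3) or a block (RAID 4); only the granularity differs.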
Redundancy-Based Backup and Recovery Using RAID (cont’d.)
RAID Level 5
Balances safety and redundancy
Against costs of acquiring and operating systems
Similar to RAID 3 and 4 striping data across drives
Difference: no dedicated parity drive
Data segments interleaved with parity data
Written across all drives in the set
RAID 5 drives can be hot swapped
Replaced without taking entire system down
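The interleaving of data and parity in RAID 5 can be visualized as a grid of drives by stripes. The sketch below assumes one particular rotation (parity starts on the last drive and moves one drive to the left per stripe); real controllers may rotate differently, and the function name raid5_layout is hypothetical.

```python
def raid5_layout(num_drives, num_stripes):
    """Return a grid of labels showing where data and parity blocks land.

    Assumption for this sketch: parity starts on the last drive and moves
    one drive to the left on each successive stripe.
    """
    layout = []
    block = 0
    for stripe in range(num_stripes):
        parity_at = (num_drives - 1 - stripe) % num_drives
        row = []
        for drive in range(num_drives):
            if drive == parity_at:
                row.append(f"P{stripe}")
            else:
                row.append(f"D{block}")
                block += 1
        layout.append(row)
    return layout

for row in raid5_layout(num_drives=4, num_stripes=4):
    print("  ".join(row))
```

Each column of the printed grid is one drive. No drive holds all of the parity, so no single drive becomes the write bottleneck that a dedicated parity drive can be.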
Redundancy-Based Backup and Recovery Using RAID (cont’d.)
RAID Level 6
Extends RAID 5 by adding a second, independent parity block to each stripe
Block-level striping with double-distributed parity
Systems recover from two simultaneous drive failures
RAID Level 7 (proprietary)
Array works as a single virtual drive
May run special software over RAID 5 hardware
RAID Level 0+1
RAID 0 for performance; RAID 1 for fault tolerance
Striping, then mirroring
Redundancy-Based Backup and Recovery Using RAID (cont’d.)
RAID Level 1+0
RAID 0 for performance; RAID 1 for fault tolerance
Mirroring, then striping
RAID Level 5+1
RAID 5 used for robustness
Adds a separate data parity drive not found in RAID 5
Also known as RAID 53
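The order of striping and mirroring in RAID 0+1 and RAID 1+0 matters once drives start failing. The sketch below is a simplified model, assuming four drives arranged as two groups of two, that counts how many of the possible two-drive failures each arrangement survives. The function names are hypothetical.

```python
from itertools import combinations

def survives_0_plus_1(failed, stripe_sets):
    """RAID 0+1: striped sets are mirrored, so one whole set must be intact."""
    return any(not (set(s) & failed) for s in stripe_sets)

def survives_1_plus_0(failed, mirror_pairs):
    """RAID 1+0: mirrored pairs are striped, so every pair needs a survivor."""
    return all(set(p) - failed for p in mirror_pairs)

groups = [(0, 1), (2, 3)]          # four drives arranged as two groups of two
two_drive_failures = [set(c) for c in combinations(range(4), 2)]

ok_01 = sum(survives_0_plus_1(f, groups) for f in two_drive_failures)
ok_10 = sum(survives_1_plus_0(f, groups) for f in two_drive_failures)
```

In this model RAID 0+1 survives two of the six possible two-drive failures and RAID 1+0 survives four, although both deliver the same usable capacity.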
Database Backups
Considerations
May or may not back up using operating system utilities
May or may not interrupt database use
Must properly safeguard database
Special journal file requirements: run-unit journals or after-image journals
Applications to protect databases in near real time
Legacy backup applications (lock and copy)
Online backup applications (to online vendor)
Continuous database protection (near real time)
Application Backups
Applications using file systems and databases
Some may invalidate customary backup and recovery
Include application support and development team members
In the planning process, and in training, testing, and rehearsal activities
Advances in cloud computing
Example: an organization leasing SaaS
Using applications on someone else’s systems
Service agreement should include recovery contingencies
Backup and Recovery Plans
Backups must successfully restore systems
To an operational state
Backup and recovery settings
Must be accompanied by complete recovery plans
Periodically
Develop plans
Test plans
Rehearse plans
Backup and Recovery Plans (cont’d.)
Developing backup and recovery plans
How and when will backups be created?
Who will be responsible for creation of the backups?
How and when will backups be verified so that they are known to be correct and reliable?
Who is responsible for the verification of the backup?
Where will backups be stored and for how long?
How often will the backup plan be tested?
When will the plan be reviewed and revised?
How often will the plan be rehearsed, and who will take part in the rehearsal?
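One common way to answer the verification question is to compare cryptographic hashes of the source and the backup copy. The sketch below uses SHA-256 from the Python standard library; the file names are throwaway examples, and a real plan would also schedule periodic test restores, since a matching hash only shows that the copy is intact.

```python
import hashlib
import os
import tempfile

def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large backups do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_backup(source_path, backup_path):
    """Return True when the backup copy matches the source byte for byte."""
    return sha256_of(source_path) == sha256_of(backup_path)

# Demonstration with throwaway files standing in for real data.
with tempfile.TemporaryDirectory() as workdir:
    source = os.path.join(workdir, "ledger.dat")
    good_copy = os.path.join(workdir, "ledger.bak")
    bad_copy = os.path.join(workdir, "ledger.corrupt")
    for path, payload in [(source, b"2024 ledger"), (good_copy, b"2024 ledger"),
                          (bad_copy, b"2024 ledgex")]:
        with open(path, "wb") as handle:
            handle.write(payload)
    good_ok = verify_backup(source, good_copy)
    bad_ok = verify_backup(source, bad_copy)
```

Recording the hash at backup time also lets the copy be rechecked later, after the source has changed.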
Real-Time Protection, Server Recovery, and Application Recovery
Mirroring
Provides real-time protection and data backup
Duplicates server data using multiple volumes
RAID level 1 achieved with software or hardware
Can write to drives located on other systems
Can be extended to vaulting and journaling
Hot, warm, and cold servers
Hot server provides services to support operations
Warm server provides services if primary busy/down
Cold server used for administrator’s test platform
Real-Time Protection, Server Recovery, and Application Recovery (cont’d.)
Bare metal recovery technologies
Replace failed operating systems and services
Reboot affected system from CD-ROM or other remote drive
Quickly restore operating system
Providing images backed up from known stable state
Linux and UNIX versions abound
Windows just developing stand-alone bootable CD
Windows 7 can create a system repair disk
Windows systems can use setup disk to facilitate recovery and restoration
Real-Time Protection, Server Recovery, and Application Recovery (cont’d.)
Application recovery or clustering plus replication
Software replication provides increased protection against data loss
Clustering services and application recovery
Similar to hot, warm, and cold redundant server model
Common to install applications on multiple servers
Application recovery software
Detects primary application server failure
Activates secondary application server
Vaulting and journaling
Dramatically increase protection
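The detect-and-activate behavior of application recovery software can be sketched as a small selection rule. The class AppServer and the function select_active below are hypothetical stand-ins; real clustering products add health probes, quorum, and data replication that this sketch leaves out.

```python
class AppServer:
    """Hypothetical stand-in for an application server in a cluster."""
    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.active = False

def select_active(primary, secondary):
    """Keep the primary active while healthy; otherwise activate the secondary."""
    primary.active = primary.healthy
    secondary.active = (not primary.healthy) and secondary.healthy
    if primary.active:
        return primary
    if secondary.active:
        return secondary
    return None        # both down: escalate to the DR plan

primary = AppServer("app-1")
secondary = AppServer("app-2")
before = select_active(primary, secondary).name
primary.healthy = False                      # simulate a primary failure
after = select_active(primary, secondary).name
```

Returning None when both servers are down makes the hand-off to the disaster recovery plan an explicit case.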
Electronic Vaulting
Bulk transfer of data in batches to an off-site facility
Via leased lines or data communications services
Primary selection criteria
Service costs, bandwidth, stored data security, recovery, and continuity
Data transfer without affecting other operations
Scale purchases according to needs
Vendor-managed solutions use a software agent
Initiates a full backup, then continuously copies data
Data accessed via Web interface or software
Remote Journaling
Transfers live transactions to an off-site facility
Only transactions transferred (not archived data)
Transfer performed online; much closer to real time
Involves online activities on a systems level
Data written to two locations simultaneously
Can be performed asynchronously
Facilitates key transaction recovery in near real time
Journaling may be enabled for an object
Operating system creates record of object’s behavior
Stored in a journal receiver
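The recovery value of remote journaling comes from replaying journaled transactions on top of the last backup. The sketch below models this with an in-memory list standing in for the off-site journal receiver; the class and function names, and the account data, are hypothetical.

```python
def apply_transaction(state, txn):
    """Apply one journal entry of the form (account, amount) to a balances dict."""
    account, amount = txn
    state[account] = state.get(account, 0) + amount

class JournaledStore:
    """Primary store that records every transaction in an off-site journal.

    The remote_journal list is a hypothetical stand-in for a journal
    receiver at the off-site facility.
    """
    def __init__(self, remote_journal):
        self.state = {}
        self.remote_journal = remote_journal

    def commit(self, txn):
        self.remote_journal.append(txn)     # ship the transaction off-site
        apply_transaction(self.state, txn)  # then apply it locally

def recover(last_backup, journal_since_backup):
    """Rebuild current state from the last backup plus journaled transactions."""
    state = dict(last_backup)
    for txn in journal_since_backup:
        apply_transaction(state, txn)
    return state

journal = []
store = JournaledStore(journal)
store.commit(("alice", 100))
backup = dict(store.state)          # nightly backup taken here
mark = len(journal)                 # journal position at backup time
store.commit(("bob", 40))
store.commit(("alice", -25))

recovered = recover(backup, journal[mark:])
```

Only the transactions after the backup mark travel to the off-site journal between backups, which is what keeps the transfer close to real time.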
Database Shadowing
Combines e-vaulting with remote journaling (RJ)
Writes multiple database copies simultaneously in two separate locations
Used with multiple databases on a single drive in a single system or with databases in remote locations, across a public or private carrier
Generally used for immediate data recovery
Works well for read-only functions
Data warehousing and mining, batch reporting cycles, complex SQL queries, local online access at the shadow site, load balancing
Database Shadowing (cont’d.)
Database replication
Backup of multiple copies of the database for recovery purposes
Three types
Snapshot replication
Merge replication
Transactional replication
E-vaulting, RJ, and database shadowing
Quickly becoming functions of various backup applications rather than services unto themselves
Organizations increasingly focus on availability
Network-Attached Storage and Storage Area Networks
NAS uses a single device or server attached to a network with common communications methods to provide an online storage environment
Good for general file sharing or data backup use
SANs use Fibre Channel direct connections between the systems needing additional storage and the storage devices themselves
Good for high-speed and higher-security solutions
Virtualization
Development and deployment of virtual rather than physical implementations of systems and services
“Virtual machine”
Virtualized environment operating in or on a host platform
Host platform (host machine)
Physical server (and operating system)
Virtualization application and all virtual machines run on it
Virtualization (cont’d.)
Virtual machine (guest)
Hosted operating system or platform running on the host machine
Hypervisor or virtual machine monitor
Specialized software that enables the virtual machine to operate on the host platform
Types
Hardware-level virtualization
Operating system-level virtualization
Application-level virtualization
Virtualization (cont’d.)
Three applications dominate virtualization market
Microsoft’s Virtual Server
VMware’s VMware Server
Oracle VM VirtualBox
Virtualization is important to contingency planning
Allows easy and accurate backup of entire systems
Can create snapshot backups and load them into a new host running the same virtualization application
No need to purchase and set up multiple pieces of hardware
Site Resumption Strategies
Items requiring alternate processing capability
Disaster recovery plan implemented because primary site temporarily unavailable
Business continuity strategy to institute operations at an alternate site
Contingency planning management team (CPMT)
Chooses strategy often based on cost
Exclusive control options
Hot sites, warm sites, and cold sites
Popular shared-use options
Timeshare, service bureaus, and mutual agreements
Exclusive Site Resumption Strategies
Hot Sites
Fully configured computer facilities with all services, communications links, and physical plant operations
Can establish operations at a moment’s notice
Can be staffed around the clock to transfer control almost instantaneously
Requires e-vaulting, RJ, or database shadowing
Disadvantages: most expensive alternative
Must provide maintenance for all systems, equipment
Ultimate hot site: mirrored site identical to primary site
Warm Sites
Provide similar services and options as a hot site
Software applications not included, installed, or configured
Frequently includes computing equipment and peripherals with servers; no client workstations
Has connections to facilitate quick data recovery
Some advantages of a hot site, but at a lower cost
May require hours, perhaps days for full functionality
Customized costs
Range upward of several thousand dollars per month
Cold Sites
Provide only rudimentary services and facilities
No computer hardware or peripherals provided
All communication services must be installed after site occupied
No quick recovery or data duplication functions
Empty room with standard heating, air conditioning, and electrical service
Advantages
Better than nothing; reduced contention for floor space
Cost: few thousand dollars per month
Mobile Sites and Other Options
Rolling mobile sites
Storing resources externally
Rental storage area containing duplicate or second-generation equipment can be used
Similar to Prepositioning of Overseas Materiel Configured to Unit Sets (POM-CUS) Cold War sites
Might arrange with a prefabricated building contractor
Provide immediate, temporary facilities (mobile offices) on site in the event of a disaster
Shared-Site Resumption Strategies
Time-share
Operates like hot/warm/cold site
Leased in conjunction with a business partner or sister organization
Provides DR/BC option while reducing overall cost
Disadvantages
Facility may be needed simultaneously by more than one organization
Need to stock facility with equipment and data from all involved organizations
Complex negotiating
Party may exit agreement or sublease their options
Shared-Site Resumption Strategies (cont’d.)
Service bureaus
Service agency that provides a service for a fee
Service in the case of DR/CP
Provision of physical facilities in the event of a disaster
Agencies frequently provide off-site data storage (fee)
Service bureau contracts
Specify exactly what the organization needs under what circumstances; guarantees space when needed
Disadvantages
Expensive option
Must be renegotiated periodically
Shared-Site Resumption Strategies (cont’d.)
Mutual agreements
Contract between two organizations
Assist the other in the event of a disaster
Obligation to provide necessary facilities, resources, services until receiving organization recovers
Other agreements provide cost-effective solutions
Between divisions of the same parent company
Between subordinate and senior organizations
Between business partners
Memorandum of agreement (MOA)
Defined expectations and capabilities for alternate site
Service Agreements
Contractual documents guaranteeing certain minimum levels of service provided by vendors
Must be reviewed and, in some cases, mandated to support incident, disaster, and continuity planning
Should contain information on:
What the provider is promising
How the provider will deliver on those promises
Who will measure delivery and how
What happens if provider fails to deliver as promised
How the service level agreement (SLA) will change over time
Refer to sample at end of chapter
Definition of Applicable Parties
Introductory paragraph in any legal document
Serves to identify to whom the document applies
Contractual legal documents
Long formal names of the parties may be replaced with abbreviated names
Example: “the Client,” “the Vendor,” or “the Service Provider”
Services to be Provided by the Vendor
Vendor or service provider specifies exactly what the client receives in exchange for payment
If not explicitly identified, vendor not required to provide it
Verbal agreements, compromises, or special arrangements must be fully documented
Specifies protection and restoration of services if incident or disaster occurs
May include contingency operations
Refer to Sample Service Agreement at end of chapter
Fees and Payments for These Services
Indicates what vendor receives in exchange for the services rendered
Most common exchange: financial
May see exchange of services, goods, or other securities
Contract terms and any special fees specified
Common inclusion: “2/10 net 30”
Two percent discount if paid within 10 days
Net payment due in 30 days
Usually for shipped goods paid by invoice
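The arithmetic behind "2/10 net 30" can be worked through with a short calculation. The sketch below is illustrative only; the function name amount_due, the status labels, and the invoice total are assumptions, and any late-fee handling would depend on the actual contract.

```python
from decimal import Decimal

def amount_due(invoice_total, days_since_invoice,
               discount_rate=Decimal("0.02"), discount_days=10, net_days=30):
    """Return (amount, status) for an invoice under '2/10 net 30' style terms."""
    total = Decimal(invoice_total)
    if days_since_invoice <= discount_days:
        return total * (1 - discount_rate), "discounted"
    if days_since_invoice <= net_days:
        return total, "net"
    return total, "overdue"     # late fees, if any, depend on the contract

early = amount_due("5000.00", 7)
on_time = amount_due("5000.00", 25)
```

On a 5,000.00 invoice, paying on day 7 costs 4,900.00, while paying on day 25 costs the full 5,000.00.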
Statements of Indemnification
Statements indicating vendor not liable for actions taken by the client
If vendor incurs financial liability based on the client's use of the vendor’s services
Client responsible for those costs
Failure to include such statements
May result in additional legal fees for both parties as vendor sues to recoup its losses
Nondisclosure Agreements and Intellectual Property Assurances
Covers information confidentiality from everyone unless court mandated
Vendors certify document validity
Provides information as required
Client and vendor must formalize expectations
Regarding protection of confidentiality of the services and business information to be shared
Laws permit provider to view clients’ system contents in routine business conduct and maintenance
Noncompetitive Agreements (Covenant Not to Compete)
Not essential to a service agreement
Customary client agreements
Not to use the vendor’s services to compete directly with the vendor
Not to use vendor information to gain a better deal with another vendor
Chapter Summary
CP: prepare for the unexpected, keep business alive
Business resumption (BR) elements: DR, BC plans
Components come into play at specific times
Five key procedural mechanisms
Delayed protection, real-time protection, server recovery, application recovery, and site recovery
Backup plan is essential
Types: full, differential, and incremental
Determine how long data should be stored
RAID systems overcome tape backup limitations
Chapter Summary (cont’d.)
Cloud backups ensure data availability for quick restoration
Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS)
Databases require special considerations when planning backup and recovery procedures
Must restore system to operational state
Server support features
Mirroring and duplication of server data storage with RAID techniques
Chapter Summary (cont’d.)
Methods of transferring data off-site
Electronic vaulting (e-vaulting), remote journaling, and database shadowing
Business resumption strategies
Hot sites, warm sites, cold sites, time-share, service bureaus, and mutual agreements
Service agreements
Contractual documents guaranteeing certain minimum service levels provided by vendors