Business Continuity and Disaster Recovery: Four Key Questions
We examine approaches to business continuity and disaster recovery, and what data-driven enterprises can do to keep the business running in the event of an outage.
On Thursday, August 14, 2003, at 4:10 p.m., a massive power outage swept across the northeastern United States and parts of Canada. Minutes later, Jack Wolf, chief information officer for Montefiore Medical Center, sat back in his chair, looked up at the lights, and breathed a sigh of relief. His hospital didn’t skip a beat. Wolf and his team had built redundancy into everything Montefiore did, from the power and network to the infrastructure and data. All of that planning and preparation, with elements built into the system to prevent unforeseen outages, had certainly paid off. Although the blackout lasted for nearly 72 hours, Montefiore didn’t lose a single communication link or patient record. What could have been a crippling disaster costing billions of dollars was avoided thanks to careful planning. It is a shining example of a successful business continuity and disaster recovery plan in action.
New governance regulations, a growing number of corporate acquisitions, and increased competition have made business continuity and disaster recovery a “must have” in today’s enterprise. In addition, terrorism threats, wide-area power outages, and natural disasters have put the global spotlight squarely on disaster recovery.
Analyst firm IDC predicts that companies will increase security and business continuity spending twice as fast as the growth of IT budgets, with spending to balloon from $70 billion this year to over $116 billion in 2007. With organizations taking a closer look at business continuity, what does a solid, successful plan such as the one deployed by Montefiore look like?
First, we must clarify the difference between “business continuity” and “disaster recovery,” as the terms are often used interchangeably. There are distinct differences between the two. Business continuity focuses on keeping everyday operations running with minimal interruptions, while disaster recovery focuses on measures to restore operations after an interruption occurs. A well-executed business continuity plan incorporates and defines the business requirements for the disaster recovery plan. Both must be carefully planned and consistently practiced within an enterprise. An organization must establish priorities and operational parameters; then it can establish goals for business continuity and examine the approaches to best meet its objectives.
A key prerequisite for disaster-recovery planning is establishing priorities and operational parameters. As part of this process, an organization needs to ask itself the following four key questions:
- What data is most important to business operations?
Not all data is created equal. Decide which data is most important to survival. For example, a bank’s ATM and debit transaction data is considerably more crucial to maintain than its HR information, because transaction data is the lifeblood of the business. Within an enterprise, data is kept in both structured formats (stored in databases) and unstructured formats (stored as text, sound, image, and video files). Structured data is grouped with, and related to, other data, and its validity depends on these groupings and structures. Disaster recovery of structured data therefore has to preserve the overall integrity of the data, not just individual records.
- How much downtime is acceptable without affecting the business?
Determine the expected availability and mean time to failure (MTTF). Every system faces failure. The MTTF, the average time a system or device runs before it fails, is a useful measure of uptime and an essential metric for calculating the expected availability of a system. Likewise, mean time to recovery (MTTR) is the average time taken to restore a given device after a hardware or software problem. Expected availability is commonly estimated as MTTF divided by the sum of MTTF and MTTR, so shortening recovery raises availability just as lengthening the time to failure does. IT departments should always strive to improve MTTF. However, for purposes of disaster recovery, reducing MTTR is more important than improving MTTF or expected availability.
- How much data can be potentially lost in the event of a failure?
Be realistic and consider all the possible ways data can be lost and how each event could affect your business. For each scenario, whether equipment loss or technology failure, an inaccessible facility, or the loss of a region to a widespread outage, make a realistic estimate of the likely duration of downtime and the amount of data that could be lost, so that an appropriate contingency plan can be developed.
- How long will it take to recover?
There are several high-availability and disaster-recovery technologies, including tape backups, hard disk backups, electronic vaulting, and replication technologies (data written on two separate systems to their respective storage devices concurrently). Many of the traditional solutions are inadequate for recovery because they fail to meet stringent recovery time objectives (RTOs) and recovery point objectives (RPOs). A suitable solution must either provide for more elaborate recovery or be much faster, preferably sub-second, and it must significantly improve RPO, that is, reduce data loss.
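The second and fourth questions lend themselves to simple arithmetic. The following Python sketch is illustrative only. It assumes the common steady-state estimate of availability, MTTF / (MTTF + MTTR), and it compares candidate recovery options against a recovery time objective and a recovery point objective. The function names, the `RecoveryOption` class, and every number in it are hypothetical placeholders rather than measurements of any real product.

```python
from dataclasses import dataclass


def expected_availability(mttf_hours: float, mttr_hours: float) -> float:
    """Estimate steady-state availability as MTTF / (MTTF + MTTR)."""
    if mttf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTTF must be positive and MTTR must not be negative")
    return mttf_hours / (mttf_hours + mttr_hours)


@dataclass(frozen=True)
class RecoveryOption:
    """A candidate recovery approach (hypothetical figures, in hours)."""
    name: str
    recovery_hours: float   # time needed to restore service
    data_loss_hours: float  # age of the newest data that can be restored


def meets_objectives(option: RecoveryOption,
                     rto_hours: float,
                     rpo_hours: float) -> bool:
    """Return True if the option satisfies both the RTO and the RPO."""
    return (option.recovery_hours <= rto_hours
            and option.data_loss_hours <= rpo_hours)


if __name__ == "__main__":
    # Same MTTF, two different recovery times: only MTTR changes.
    slow = expected_availability(mttf_hours=1000, mttr_hours=10)
    fast = expected_availability(mttf_hours=1000, mttr_hours=1)
    print(f"MTTR 10 h: {slow:.4%} available")
    print(f"MTTR  1 h: {fast:.4%} available")

    # Assumed objectives for a critical system: back within 1 hour,
    # losing at most 5 minutes of data.
    rto, rpo = 1.0, 5 / 60

    # Placeholder options; replace with figures measured in recovery tests.
    options = [
        RecoveryOption("nightly backup (assumed)", 24.0, 24.0),
        RecoveryOption("continuous copy (assumed)", 0.1, 0.001),
    ]
    for option in options:
        verdict = "meets" if meets_objectives(option, rto, rpo) else "misses"
        print(f"{option.name}: {verdict} RTO/RPO")
```

Holding MTTF constant and shrinking MTTR is enough to raise the availability estimate, which is one way to see why recovery time deserves the attention in a disaster-recovery plan. The objectives themselves should come from the business, and the option figures from actual recovery tests.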
Best Practices for Disaster Recovery
In addition to selecting an appropriate high-availability and disaster-recovery solution, consider some of the following best practices for creating an effective disaster recovery plan:
- Standardize on backup products
- Focus on recovery
- Build a dedicated data management organization
- Define service level agreements for data recovery
- Develop backup process-deployment standards
- Document procedures for backup operations
- Centralize backup-process monitoring and administration
- Document procedures for recovery operations
- Integrate with change control and application development cycles
- Test recovery
Conclusion
Today’s businesses are more dependent on information technology than ever before, making business continuity and disaster recovery a core concern. Disruptions to IT infrastructure and unpredictable events can have catastrophic consequences that threaten the bottom line.
We have insurance for everything from houses to pets, so why shouldn’t businesses invest in a plan that protects data so crucial to the survival of the business? A successful business continuity and disaster-recovery plan that incorporates mission-critical business requirements and is continually tested and updated can mean the difference between life and death for an organization. Now isn’t that an insurance policy worth having?
About the Author
Tim Rathbun is executive vice president of marketing and product management for GoldenGate Software, a transactional data management solutions vendor. The company provides technology for capturing, transforming, and moving transactional data in real time across major databases and environments, helping organizations mitigate risk, reduce costs, and increase revenues.