Disaster Recovery Planning: Lessons From the Recent Past
In terms of disasters, it's been an eventful few years with hurricanes, floods, earthquakes, and winter storms. However, the silver lining of every dark cloud is a lesson that can help you prepare for the next incident. Experts share valuable advice to bolster your business continuity plans and face the inevitable force majeure that could cripple your company.
by Lamont Wood
Though compliance regulations such as Sarbanes-Oxley do not explicitly address disaster recovery, auditors increasingly expect a workable business continuity and recovery plan to be part of your internal controls. Experts in the field reveal some painful yet valuable lessons. Some are as simple as not trusting your cell phone, or as basic as assigning a specific person to a designated site during a disaster to perform a precise function.
Not Seeing the Obvious
Obviously, having both primary data storage and backup sites in the same city is a mistake, especially if that city is New Orleans. Less obvious was the situation of a Florida client of Philip Jan Rothstein, head of Rothstein Associates in Brookfield, CT, and a fellow of the Business Continuity Institute. During one hurricane, they discovered the hard way that their primary and backup data facilities were in the same flood plain. "We are getting a lot of inquiries about how far away a secondary site should be," agrees Stephanie Balaouras, an analyst at Forrester Research.
But nearsightedness can also be subtle. Rothstein had a client who compiled a list of specific threats for which they were preparing. But nowhere on the list was there mention of the eight-inch gas pipeline that ran through the computer room and into the parking lot where cars were parked right against it, threatening catastrophe. "What I don't see is people taking preventative measures," adds Balaouras. "The most common causes of downtime are not disasters but mundane things like power outages, human error, software failures, and hardware failures—all preventable."
Taking Infrastructure for Granted
During Hurricane Katrina, "People were going to rely on cell phones, that didn't work, and satellite phones, that turned out not to work indoors," recalls Steve Davis, head of All Hands Consulting. But the social structure of the community is also part of the infrastructure, and as was seen in parts of New Orleans after Katrina, the social structure, too, can rapidly fail. "Now security is part of standard planning—there are plenty of contractors who supply security," says Davis. "But it is something that has to be pre-arranged, not something you do the day that the storm hits."
"The first priority is to get everyone safe," adds Kristen Noakes-Fry, an analyst at Gartner. "When people are dying, asking when the hard drive was backed up is not the conversation to be having." She suggests using an automated, multimodal emergency notification system to communicate with the staff, and also to call roll.
Flexibility is Key
Pandemic planning has brought the need for flexible work arrangements to the forefront, notes Noakes-Fry. Not only does an enterprise need to arrange for its employees to be able to work from home, but managers must know which employees have similar skills so they can cover for each other.
Authorities are likely to close subways to reduce personal contact, and to close bridges to seal off infected areas, she notes, preventing key people from getting to work.
Beyond that, "You need to make sure that the function you want carried out can be handled by people who are in a state of shock," cautions Noakes-Fry. "Some people can respond first and process their emotional reactions later, but I have seen English majors in a crisis forget how to read. You never know how an individual is going to act."
Planning Must Go Beyond IT Recovery
"A big mistake was the failure to see the broader issues," says Michael Rasmussen at the Yankee Group. "You might cross all your Ts, but if a business partner doesn't have adequate plans, when they fail you fail. You need provisions in your contracts that allow you to audit their continuity plans."
Actually, doing any meaningful planning may put you ahead of the pack. "Maybe 60 percent of enterprises do have a formal continuity plan in place and 40 percent don't, but even for those with a plan in place the plan may just be a Word document that was written once and never updated," says Balaouras. "You need to do an impact analysis and then a local threat assessment, and then select the most appropriate and cost-effective technology. Don't just call up a provider and pay them to house a disaster recovery site."
"At least I'm no longer hearing from people who say, 'We were told to do disaster recovery planning—what is it?'" adds Noakes-Fry.
Not Exercising the Plan
Rothstein distinguishes between testing a plan (making sure components work) and exercising a plan (walking through it to make sure it serves its purpose). After 9-11, many operations in lower Manhattan had disaster plans that called for them to recover to hot sites in New Jersey, Long Island, or Pennsylvania, he recalls. But that required key personnel to leave Manhattan and go to those places, which they couldn't do because the bridges were closed to vehicular traffic and the subways were shut down. Exercising their plans might have revealed those flaws and allowed them to be rectified, he indicates.
Thinking the Plan is Done
"The disasters of the last five years show that you are never done, that disaster planning needs to be an organic process that changes with the times," says Alex Tabb, partner with the Tabb Group. "Threats change, technologies change, and business procedures change." "Continuity needs to be built into the overall business process," continues Tabb. "After all, most businesses these days would be lost without access to information technology."
"There is a general assumption that a continuity plan has to make the enterprise whole again, but that is both a costly and typically inappropriate assumption," Rothstein says. "Planners must decide a scope for their plans, and there will be events both inside and outside that scope." He describes a multinational bank that concluded a nuclear attack on New York City was within the scope of their planning. But he also encountered an explosives maker that did not bother to plan for accidental explosions, since they weren't considered disasters.
But just like other compliance initiatives, setting the scope requires senior management support and involvement, and should always be an executive decision, rather than something decided by the IT department.
Lamont Wood is a freelance writer based in San Antonio, Texas, who has been covering the technology arena for more than two decades. He can be reached at firstname.lastname@example.org