On The Road To Recovery Planning
<P><I>Today's High-Availability Requirements Expand The Traditional Hot-Site Model</I></P><P>Recovery time for one 24-hour outage could be a week or more. Data generated between your last backup and that outage may be gone forever. Until that data is recovered, it's unlikely that daily production can resume. Advanced recovery programs may be what's needed.</P>
Ten years ago, you had the luxury of several hours after the close of business to do batch processing and make system changes. But not today. No way.
Think about it: Do you know what the impact would be to your business if your IT systems were unavailable for just one day? How current would your data be once the systems are restored? How long would it take to manually re-enter all of the data generated during that 24-hour period? Would your business survive? Would you?
If you've come to believe that the "transaction is the data," then understanding the Recovery Time Objective (RTO, how quickly systems must be back in service) and Recovery Point Objective (RPO, how recent the recovered data must be) for protecting IT systems and data is absolutely essential in identifying which systems require advanced recovery solutions and which advanced recovery alternatives are best for your organization. Establishing the RTO and RPO requires a realistic assessment of the roles that technologies and applications play in your business, and a quantification of what it would cost your particular business model if those systems were unavailable for any given amount of time.
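To make the RTO and RPO comparison concrete, here is a minimal Python sketch. The class name, field names and every figure in it are hypothetical and chosen only for illustration; a real assessment would use estimates measured in your own environment.

```python
from dataclasses import dataclass


@dataclass
class RecoveryOption:
    """Hypothetical description of one recovery approach (all figures illustrative)."""
    name: str
    recovery_point_interval_hours: float  # time between recovery points (e.g. backups)
    offsite_delay_hours: float            # lag before a recovery point is safely off-site
    restore_hours: float                  # estimated time to bring the system back up

    def worst_case_data_loss_hours(self) -> float:
        # A disaster just before the newest copy reaches off-site storage
        # loses one full interval plus the transport delay.
        return self.recovery_point_interval_hours + self.offsite_delay_hours

    def meets(self, rto_hours: float, rpo_hours: float) -> bool:
        # The option qualifies only if it satisfies both objectives.
        return (self.restore_hours <= rto_hours
                and self.worst_case_data_loss_hours() <= rpo_hours)


# Made-up numbers for illustration only.
nightly_tape = RecoveryOption("nightly tape, courier pickup", 24.0, 8.0, 72.0)
journaled = RecoveryOption("remote journaling with standby database", 0.25, 0.0, 4.0)

for option in (nightly_tape, journaled):
    print(option.name,
          option.worst_case_data_loss_hours(),
          option.meets(rto_hours=8.0, rpo_hours=1.0))
```

The point of the sketch is that the two objectives are checked separately: an option can restore quickly and still lose too much data, or the reverse.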
Quick Recoveries
Numerous advanced recovery solutions exist, and the following options are all available for HP environments. Each solution has its own benefits depending on your organization's business goals, such as better customer service or shipping products on time, which matter especially in e-commerce.
Remote Vaulting. For organizations with a recovery site located far from their home site or those that have thousands of backup volumes supporting Very Large Databases (VLDB), remote vaulting of tape-based data to an automated tape library can substantially reduce recovery time. Vaulting to a tape library also aids with the recovery point by positioning backups off-site immediately, not hours later when the off-site storage delivery service arrives.
Standby Operating Systems. Maintaining a remote copy of the operating system on disk that is directly attachable to the recovery processor gives an organization the ability to bring systems up immediately at the recovery site at the time of a test or disaster. The standby operating system solution is best used in conjunction with other options, such as a standby database; otherwise, the system is up and available but has to wait for other resources before the recovery can continue.
Remote Journaling. If you're concerned about improving RPO rather than RTO, consider remote journaling. This solution involves intercepting the writes to a local log or journal and transmitting a copy of those writes off-site in real time, providing for recovery to a point extremely close to the point of failure.
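The idea can be sketched in a few lines of Python. This is a toy in-memory model, not any vendor's product: `JournaledStore`, `replay` and the `ship_offsite` callback are hypothetical names, and a plain list stands in for the network link to the remote site.

```python
from typing import Callable, List


class JournaledStore:
    """Toy key-value store whose journal writes are copied off-site as they occur."""

    def __init__(self, ship_offsite: Callable[[tuple], None]) -> None:
        self.data = {}
        self.local_journal: List[tuple] = []
        self._ship_offsite = ship_offsite  # stand-in for a network link (assumption)

    def put(self, key: str, value: str) -> None:
        entry = (len(self.local_journal) + 1, key, value)  # (sequence, key, value)
        self.local_journal.append(entry)  # normal local journal write
        self._ship_offsite(entry)         # copy sent off-site immediately
        self.data[key] = value


def replay(journal: List[tuple]) -> dict:
    """Rebuild state at the recovery site by applying entries in sequence order."""
    state = {}
    for _seq, key, value in sorted(journal):
        state[key] = value
    return state


remote_journal: List[tuple] = []
store = JournaledStore(ship_offsite=remote_journal.append)
store.put("order-1001", "received")
store.put("order-1001", "shipped")
store.put("order-1002", "received")

# If the primary site were lost now, the remote journal alone recovers the state.
recovered = replay(remote_journal)
print(recovered == store.data)
```

Because every write is shipped as it happens, the recovery point trails the failure only by whatever was still in transit. Recovery time is a separate matter, since the journal still has to be replayed against a usable copy of the data.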
Database Shadowing. Database shadowing is the combination of a point-in-time copy of a database on disk (standby database), remote journaling and the regular interval application of the log/journal updates to the database. Database shadowing is a flexible option for managing to an application-specific RTO, allowing application updates to be shadowed as often as required to meet the RTO.
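As a rough illustration of how the apply interval relates to recovery time, the following Python sketch models a standby copy that receives journal entries continuously but applies them only when told to. `ShadowDatabase` and its methods are hypothetical names, and the data is made up.

```python
from typing import List


class ShadowDatabase:
    """Toy standby copy: a point-in-time base plus journal entries applied in batches."""

    def __init__(self, base_copy: dict) -> None:
        self.state = dict(base_copy)     # point-in-time copy of the database
        self.pending: List[tuple] = []   # journal entries received but not yet applied
        self.applied_through = 0         # highest sequence number already applied

    def receive(self, entry: tuple) -> None:
        # Entries arrive via remote journaling as (sequence, key, value).
        self.pending.append(entry)

    def apply_pending(self) -> int:
        """Apply queued entries in order; returns how many were applied."""
        applied = 0
        for seq, key, value in sorted(self.pending):
            if seq > self.applied_through:  # skip anything already applied
                self.state[key] = value
                self.applied_through = seq
                applied += 1
        self.pending.clear()
        return applied


shadow = ShadowDatabase({"balance": "100"})
shadow.receive((1, "balance", "120"))
shadow.receive((2, "balance", "90"))

backlog = len(shadow.pending)      # work left to do if a disaster struck right now
applied = shadow.apply_pending()   # the scheduled "shadowing" step
print(backlog, applied, shadow.state)
```

The backlog at the moment of failure is what must be replayed before the standby is usable, so applying updates more often keeps that backlog, and therefore the recovery time, smaller.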
Remote Mirroring. Remote mirroring is one of the hottest topics in advanced recovery today. With this solution, another copy of an organization's data is maintained at a remote location. Organizations with the most to gain from remote mirroring are those that can easily segregate critical applications or those with applications critical enough to warrant the added expense of remote mirroring for the entire enterprise.
There are two ways to achieve remote mirroring: in software or in hardware/microcode. A major advantage that hardware mirroring has over other advanced recovery techniques is that a single solution can protect several platforms and any data type that can be stored on disk.
Because IS personnel only have to manage a single solution in this scenario, it's likely to require fewer resources and generate savings that should be factored in when weighing the costs.
System Replication. System replication provides a continuous operating environment by duplicating systems, data and network at a remote location. System replication is the most comprehensive solution for addressing RTO and RPO. However, it is also the most costly.
Hot Network Node. Establishing network communications at the time of a disaster can be complex and time consuming; pre-staging of the configuration eliminates error and excess recovery time impact. One way to do this is to locate a hot network production node in the same location as the recovery capability. The hot network node is continually monitored and in use, thereby minimizing the failure potential.
On Again Off Again
It's important to note that each of the above solutions requires an off-site location. On-site solutions are also available. For example, HP's MetroCluster and MC ServiceGuard both help ensure high availability. However, these solutions have distance limitations, meaning they can only be co-located with or located very near the production environment.
If you're evaluating on-site alternatives, be aware that while they can be of tremendous help in restoring after hardware or software failures, an organization still remains highly vulnerable. Should a fire, flood or other sudden disaster make the primary site or general area of the primary location inaccessible, these solutions probably also become inaccessible.
Hanging In The Balance
As organizations increasingly rely on distributed systems and the cost of downtime becomes higher, understanding and implementing the right balance of traditional and advanced recovery solutions based on the organization's recovery time and point objectives becomes critical to ensuring business continuity. After all, how can you begin business today if you don't know where you ended yesterday?
--Steve Turner is program manager, distributed availability solutions, for Comdisco Inc. in Pittsburgh, Pa.