Special Report: Enterprise Storage
Sans SANs, Backup and Recovery is an Uphill Battle

It's no secret: We are creating and storing more information than we can effectively back up and recover. Terabytes of data are at risk because they are spread across countless systems in countless islands across the enterprise. Add to this a tidal wave of Internet and data warehousing applications that are opening new floodgates of data that need to be backed up. "The amount of data that users are collecting has grown exponentially over the last couple of years, and there's no end in sight," says Robert Abraham, president of Freeman Reports (www.freeman.com), a storage analyst firm. "It's not going to stop, it's not going to slow down, and it's not going to plateau."

"End-users and department managers are simply too narrowly focused on business processes to have the time to fully assess and manage the risks being taken," says Robert Gray, analyst with IDC (www.idc.com).

This is particularly true in many Windows NT shops, where backups for disaster recovery typically take place through server LAN or WAN connections. And because of its distributed nature, Windows NT is notoriously inefficient in terms of storage management in general. A survey by GartnerGroup's Dataquest (www.dataquest.com) division finds that 56 percent of IT executives at Windows NT sites are unable to use most of their organization's RAID capacity, and that Windows NT storage consumes twice the staff time of Unix configurations and eight times the staff time of OS/390 sites.

"You're forced to schedule backups at low-usage times, or risk clogging up the network trying to push that information from server to server," says Paul Williams, product marketing manager at Inrange Corp. (www.inrange.com).

Plus, if your company is entering the e-business realm, that adds more difficulties. Since most Web traffic from remote sites and business partners connects to LANs and WANs, these networks are already being taxed to their limits, Williams adds. "Think how much increased traffic backups and storage recovery adds."

This has not gone unnoticed. About 56 percent of executives participating in a recent survey sponsored by EMC Corp. (www.emc.com) say disaster recovery is a key Internet management challenge, second only to security, at 77 percent. These executives agree that behind transaction processing, backup and recovery are the leading drivers of data movement.

Logical Choice

Enter storage area networks (SANs), which promise to move data at up to 100 MB per second over Fibre Channel networks separate from server networks. SANs are a compelling answer to disaster recovery and business continuity challenges because they make remote vaulting more efficient. Acceptance of the technology is growing: IDC forecasts that more than $11 billion in SAN-based storage arrays will be purchased by 2003. All vendors in the storage arena now either offer SAN-enabled products or have alliances to offer connectivity to SANs. Backup and recovery through a SAN provides enhanced data security, scalability, and performance. Backups from SAN-based intelligent storage servers facilitate disaster recovery from an application server failure.
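To put the 100 MB per second figure in perspective, here is a rough back-of-the-envelope sketch in Python. Only the Fibre Channel peak rate comes from the text above. The 1 TB backup set, the choice of a 100 Mbit/s Fast Ethernet LAN for comparison, and the function name `transfer_hours` are illustrative assumptions, and real-world throughput would fall short of these theoretical peaks.

```python
# Rough backup-window arithmetic. Only the 100 MB/s Fibre Channel figure
# comes from the article; everything else is an illustrative assumption.

MB_PER_TB = 1_000_000  # decimal units: 1 TB = 1,000,000 MB


def transfer_hours(data_tb: float, rate_mb_per_s: float) -> float:
    """Hours needed to move data_tb terabytes at a sustained rate."""
    if rate_mb_per_s <= 0:
        raise ValueError("rate must be positive")
    return data_tb * MB_PER_TB / rate_mb_per_s / 3600


FIBRE_CHANNEL_MB_S = 100.0    # peak rate quoted in the article
FAST_ETHERNET_MB_S = 100 / 8  # assumed LAN: 100 Mbit/s = 12.5 MB/s

if __name__ == "__main__":
    data_tb = 1.0  # hypothetical backup set
    for name, rate in [("Fibre Channel SAN", FIBRE_CHANNEL_MB_S),
                       ("Fast Ethernet LAN", FAST_ETHERNET_MB_S)]:
        print(f"{name}: {transfer_hours(data_tb, rate):.1f} hours")
```

Under these assumptions, the same terabyte takes roughly eight times longer over the shared LAN than over the dedicated storage network, before accounting for the regular traffic the LAN must also carry.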

"With Fibre Channel SANs, you can take your storage off of the LAN and put it on its own network," Williams says. "A SAN will separate network traffic and put backups into a separate storage area network." Connections to storage are made through a switch, rather than to a server, he notes.

In backups, "large amounts of data must go over the same network topology that your daily transactions are going over," agrees Randy Settergren, manager for product marketing at Storage Technology Corp. (StorageTek, www.storagetek.com). "Backup jobs flood the network with data. SAN moves backup into its own back-end storage network."

SANs enable companies to simultaneously develop both fault-tolerant storage and disaster recovery solutions, says Edward Broderick, senior research analyst at Robert Francis Group (RFG, www.rfgonline.com). While a fault-tolerant storage system addresses data corruption, data loss, and application failure, disaster recovery includes hot-standby systems, multiple backups or multilevel mirroring, remote vaulting, off-site storage, and accessibility to a hot-standby site.

Because SANs can cover a wider distance than SCSI attachments -- 10 kilometers for a SAN, 25 meters for SCSI -- SANs allow storage devices with removable media to be located at remote sites and connected electronically through high-speed lines and switches to a data center. With Fibre Channel networks and switching, backup and recovery solutions can be spread over a wide area. "If you're in a metropolitan area where you can vault your data either to another internal site, or to a service provider, the costs are much lower over Fibre Channel," says Donna Scott, analyst at GartnerGroup (www.gartner.com). While this type of solution is still rare, many companies will be looking at such high-speed connections between remote servers over the next decade.

SANs can address disaster recovery solutions that are particularly tedious in Windows NT environments. "NT tends to have multiple points of backup," Settergren says. "Each server might use its own, third-party or direct type system to do the backup."

Disaster recovery becomes an unmanageable process as a result. "Imagine going to a hotsite to restore all different media types -- floppy drives, 4mm tape, 8mm tape, and high-speed tape," Settergren says. SANs simplify the procedure by providing a common point of backup and a common interface to manage those backups.

SANs can also address islands of backups that exist within Windows NT organizations. Most Windows NT disaster recovery scenarios result from minor, everyday occurrences, such as head crashes, hardware failures, file corruptions, and bad database loads, Settergren explains. "It takes too long to get that information from the off-site vault back to a primary facility. To get around this, many people have created localized sets of backups for these operational kinds of failures."

Sans SANs

It appears, however, that it's going to take some time before the marriage between SAN technology and disaster recovery takes place. Many CIOs look at SAN and Fibre Channel implementations as a huge investment. Since disaster recovery only represents a part of data movement and storage needs, most SAN implementations will be driven by high-availability and fault-tolerance requirements. Companies have only begun to consider the role of SANs in facilitating disaster recovery.

"SANs have yet to meet the promise of disaster recovery because they are still point solutions that don't address the complete enterprise infrastructure issue," says Philip J. Tsihlis, solutions marketing manager for enterprise storage networks at EMC. SAN vendors still tend to have separate backup and remote mirroring schemes for the different platforms they cover. "Therefore, if I have multiple SANs deployed within multiple computing environments, I will have multiple disaster recovery schemes -- one for each SAN and its associated computing platform," Tsihlis states. "This approach is too complex."

The estimated total cost of ownership for a SAN implementation at a typical site is about $300,000, according to IDC. This accounts for hardware costs, minus downtime costs.

An EMC survey finds that 13 percent of companies are implementing SAN technology; however, another 44 percent are evaluating SANs for the future.

As with disaster recovery in general, SAN implementation involves significant management challenges. The same management issues that stand in the way of disaster recovery with SCSI-attached devices also may slow down backup and recovery solutions with SANs, Gartner's Scott says. "The process is still the same. You have to plan for recovery of applications and prioritize," she explains. "As you get more SANs out there, with terabytes worth of data, you can't just back up all the volumes as it's done today. You have to back data up application by application. That calls for prioritizing which applications and data are critical and need to be recovered first."
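The application-by-application prioritization Scott describes can be pictured with a minimal Python sketch. The application names, priority rankings, and data sizes below are hypothetical, as are the `Application` class and the `recovery_order` function. The tie-breaking rule (smaller data sets first) is an assumption made for illustration, not a recommendation from the analysts quoted here.

```python
from dataclasses import dataclass


@dataclass
class Application:
    name: str
    priority: int   # 1 = most critical (hypothetical ranking)
    data_gb: float  # size of the application's data set


def recovery_order(apps):
    """Sort most critical first; on ties, restore smaller data sets first."""
    return sorted(apps, key=lambda a: (a.priority, a.data_gb))


if __name__ == "__main__":
    # Hypothetical inventory; a real plan would come from a business review.
    apps = [
        Application("data warehouse", priority=3, data_gb=900.0),
        Application("order entry", priority=1, data_gb=120.0),
        Application("e-mail", priority=2, data_gb=60.0),
        Application("Web storefront", priority=1, data_gb=40.0),
    ]
    for step, app in enumerate(recovery_order(apps), start=1):
        print(f"{step}. {app.name} (priority {app.priority}, {app.data_gb:.0f} GB)")
```

The point of the sketch is simply that the recovery sequence is driven by business criticality rather than by volume layout, which is the shift in planning Scott is calling for.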

It will take time for the technical bugs to be worked out in SAN implementations, as well. "We're at the infant stage of the industry," Abraham says. "While standards are generally in place, there still are glitches. There will be for years, until an industry infrastructure has settled down and matured."

The evolution from SCSI-attached backup to a higher-performance SAN will be slow for many organizations. As organizations move from traditional storage systems to SANs, it may take some time before performance increases. Native SCSI tape drives attached to SANs through SCSI-to-Fibre-Channel bridges and routers, for example, will be slow, since throughput is limited to SCSI data rates. Larger sites will need to invest in SAN-based intelligent storage servers to speed up this evolution, Gray says. The alternative, however, is to keep application servers in their roles as expensive data movers. Fortunately, SAN usage is expected to grow, competition will increase, and prices will plummet. "We will see a day when SAN is as common an implementation as Windows NT," Abraham says. "That won't be for another five years or so. Then, price points will drop dramatically and hardware and functions will be integrated."