In-Depth

Networks: How to Avoid the “SAN Trap”

Information has become the new-world currency, emerging as the most important asset for a new breed of companies shaping the information economy. The successful players in this new business environment will be the organizations that can keep up with rapidly changing market requirements and customer demands.

The e-driven business environment, however, is unforgiving: it not only generates a deluge of information, both incoming and outgoing, but also gobbles up storage capacity and network bandwidth at a frightening pace.

These information-intensive requirements are raising a growing chorus of demands for more storage and more bandwidth. Organizations are racing to provide fast, reliable, continuously available information access. At the same time, they are struggling to accommodate the growing volumes of bulk data that must be moved across fast-growing, often global enterprise networks. These pressures are driving organizations to adopt Storage Area Networks (SANs).

State of the SAN

The promise of SANs is to eliminate isolated "islands" of server-bound storage and consolidate them into a single, heterogeneous storage infrastructure. The plan is to simplify data management and provide unlimited access to the data by multiple servers and operating systems through a dedicated, high-speed enterprise network. Additionally, SANs promise dynamic allocation and reallocation of storage assets, as needed, throughout the enterprise. SANs also reduce much of the bulk server-to-server data movement that has been clogging the corporate data network.

At the most basic level, however, SANs are falling short of mainstream enterprise adoption because most SANs remain server-specific. Limited interoperability among multiple server platforms and operating systems undercuts the anticipated benefits. Organizations may find themselves wrestling with islands of homogeneous SANs - disconnected pools of networked storage - instead of islands of server-bound storage. Given these limitations, SANs can further complicate, rather than simplify, management of the data storage infrastructure.

Also, SANs fail to include Network Attached Storage (NAS) in the promise of a single storage infrastructure. NAS delivers stored data to applications directly over the IP data network, eliminating the need to attach the storage to a server or to a dedicated Fibre Channel storage network. NAS is quickly becoming a critical application infrastructure component. Organizations are likely to deploy both SAN and NAS, ultimately forcing the enterprise to encompass both in a single, easily managed storage infrastructure.

Enterprise Storage Networks

The Enterprise Storage Network (ESN) fully delivers on the promised benefits of SANs by providing a single infrastructure that exploits the power of information, regardless of its location. ESNs are completely altering the relationship between servers and storage and, by extension, redefining the relationship between users and their information. ESNs begin with - and go far beyond - the current limitations of SANs.

ESNs comprise a dedicated, high-speed network that uses standard networking protocols, such as IP, and standard channel technologies, such as Fibre Channel, SCSI and ESCON, to weave together heterogeneous storage devices, switches, hubs and servers into a single, easily managed, centralized information infrastructure. ESNs consolidate distributed information from departments and business units throughout the enterprise, as well as from remote locations over wide area networks.

ESNs were designed with the enterprise infrastructure in mind from their inception, unlike the SAN, which began as a way to consolidate distributed servers and their attached storage. As a result, an ESN is able to support highly heterogeneous enterprise environments, which include everything from proprietary mainframe systems to distributed UNIX and Windows NT servers from different vendors. With an ESN, organizations avoid the problem of islands of SANs, where each platform and operating system requires a separate SAN of its own. The ESN encompasses both SAN and NAS (see "Demystifying SANs and NAS") technologies, enabling organizations to meet increasingly challenging information delivery needs. Finally, ESNs deliver on the enterprisewide need for a single, easily managed storage infrastructure.

The enterprise environment is highly heterogeneous. ESNs combine myriad components - NAS and SAN devices, hubs and switches, host bus adapters, servers and storage devices, all from different vendors - which is made possible only through comprehensive interoperability testing. SANs, by comparison, usually work with only a limited range of devices and systems certified as interoperable.

For IT managers, effective storage management ultimately may prove to be the biggest advantage when storage is consolidated on its own network. Storage management traditionally has been a black hole sucking up inordinate amounts of time and resources. Administrators must juggle myriad servers and their attached storage devices. And not just any administrator will do. Some need mainframe skills. Some need Windows NT skills. Others require UNIX skills. Others need networking skills.

With their enterprise roots, ESNs excel at storage resource management. The ESN is centrally managed, enabling companies to avoid the waste and expense of managing information separately across multiple operating environments. Enterprise management simplifies the operation of what will quickly evolve into a highly complex network of storage devices, servers and connectivity hardware. Such central management will reduce the cost of ownership of storage by reducing the amount of administrative work (and the number of administrators) required to maintain a large, complex storage infrastructure.

An ESN also enables administrators to back up and restore information centrally across the ESN. This moves backup traffic off the corporate network, immediately relieving it of a large load. Centralized backup further reduces the cost of ownership.

Finally, the ESN's centralized management allows administrators to define and automate responses to common system conditions, which further reduces the amount of administrative support required. It also improves system performance because the system can respond to conditions before they reach the point where they impact systems, applications and users.
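A minimal sketch of what such condition-driven automation might look like - the rule names, thresholds and actions below are hypothetical illustrations, not any vendor's management API:

```python
# Hypothetical sketch of automated responses to common storage
# conditions: each rule pairs a predicate over current metrics with
# the action to take when it fires.

def check_and_respond(metrics, rules):
    """Return the list of actions triggered by the current metrics."""
    actions = []
    for rule in rules:
        if rule["predicate"](metrics):
            actions.append(rule["action"])
    return actions

# Example: act before a volume fills up or the spare pool runs dry.
rules = [
    {"predicate": lambda m: m["volume_used_pct"] >= 90,
     "action": "expand_volume"},
    {"predicate": lambda m: m["hot_spares"] == 0,
     "action": "page_administrator"},
]

triggered = check_and_respond({"volume_used_pct": 93, "hot_spares": 1}, rules)
print(triggered)  # ['expand_volume']
```

The point of the pattern is the one the article makes: the response runs as soon as the condition appears, before it degrades applications or reaches users.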

ESN Benefits

An ESN simplifies data storage through the consolidation of storage resources. The organization ends up with a single ESN, not islands of SANs and NAS. With the server and storage consolidated, the organization can effectively optimize its technology infrastructure.

The ESN facilitates access to all of the organization's stored information assets. When information is stored locally on servers, or even when it is consolidated on a SAN or NAS, it may not be easily accessible to all users who need it. With an ESN, all stored information is accessible to all authorized applications and systems.

The ESN ensures the availability and recoverability of valuable corporate information assets. Through centrally managed and automated high availability capabilities and backup and recovery, the ESN ensures that critical information assets are available and protected.

The ESN effectively removes storage traffic and its associated bulk data movement from the corporate data network. This improves network performance for the critical applications that must use the corporate network.

The benefits of ESNs have proven to be compelling. The FibreAlliance, an industrywide consortium of storage and networking vendors, is coalescing around ESNs. Much of the work of the FibreAlliance, which focuses on ensuring interoperable SAN management, is being adopted by the SAN players and will likely become part of future SAN standards. As a result, the ESN will encompass both SAN and NAS and move beyond, raising both to new levels of performance.

To deliver these benefits, an ESN solution must meet several key requirements:

• Single information infrastructure

• Heterogeneous host attachment (mainframe, multiple UNIX, Windows)

• Intelligent, automated, centralized storage management

• Choice of connectivity (IP, Fibre Channel, ESCON, SCSI)

• Data protection/disaster tolerance

• Security, scalability, dynamic flexibility

• Open standards adherence

• Tight integration, proven interoperability.

Mirroring Your Way to a Fault-Tolerant Storage System: Beyond RAID 5

By Joel Leider

Most network and system managers prefer RAID disk arrays because they provide a measure of protection against drive failures. However, those who must keep the shop running without interruption typically require servers and storage with a higher standard of fault tolerance: "no single point of failure." Unfortunately, standard RAID systems simply do not provide this measure of protection. Further, these enterprises typically run business-critical database applications 24x7 that are seldom, if ever, taken offline - even for backup.

Mechanical failure is inevitable. Disks, power supplies, fans and computers all fail. It is the network manager's job to anticipate the costs of these predictable failures and compare them to the costs of prevention. How can you add extra elements of protection for network server environments where the cost of data loss and downtime is very high? In these situations, the cost of extra equipment redundancy is low compared to the anticipated cost of downtime or data loss. We all know equipment will fail - we want it to fail gracefully and not take the enterprise or our data with it.

Becoming Better Than No Single Point of Failure

To reduce downtime due to component failures, consider one of three methods for mirroring your RAID 5 storage systems - a cost-effective way to protect critical environments. Each method protects your data even if an entire RAID array fails. To this end, these methods offer "no single point of failure" at the storage array and, optionally, at the server.

Level 1: Storage Redundancy - Drives Can Fail

The first method, Level 1 - Storage Redundancy, provides a storage architecture designed for full fault tolerance through component redundancy, avoiding the single points of failure commonly found in typical storage arrays. This no-single-point-of-failure design uses a pair of RAID 5 systems, each connected to a server that supports host-based mirroring. Operating systems that perform mirroring include NetWare, Windows NT, Solaris, HP-UX, AIX, OpenVMS (Volume Shadowing), Digital UNIX, SGI IRIX, and others.
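Conceptually, host-based mirroring means the host writes every block to both arrays and can read from either surviving side. The following is an illustrative sketch of that idea, not any operating system's actual mirroring implementation:

```python
# Toy model of host-based mirroring across two RAID 5 arrays.
# Each array is modeled as a dict; real mirroring happens in the
# operating system's volume manager, not application code.

class MirroredVolume:
    def __init__(self, array_a, array_b):
        self.arrays = [array_a, array_b]   # two independent RAID 5 arrays

    def write(self, block_no, data):
        # A write lands on every surviving side, so each copy stays current.
        for array in self.arrays:
            if not array["failed"]:
                array["blocks"][block_no] = data

    def read(self, block_no):
        # Read from the first surviving array.
        for array in self.arrays:
            if not array["failed"]:
                return array["blocks"][block_no]
        raise IOError("both mirror sides have failed")

a = {"failed": False, "blocks": {}}
b = {"failed": False, "blocks": {}}
vol = MirroredVolume(a, b)
vol.write(0, b"payroll")
a["failed"] = True        # an entire RAID 5 array fails...
print(vol.read(0))        # ...data is still served from the mirror: b'payroll'
```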

Unlike a standard RAID 5 array, this method can withstand a minimum of three simultaneous drive failures and still continue to run properly - all transparently to system users. Each RAID 5 array can sustain a drive failure and continue operating with parity information. If a hot spare is present or a replacement drive is inserted, the system continues with either one or even both data rebuilds in progress simultaneously. During this critical time, another drive can fail, taking down an entire RAID 5 array. This would normally disable the server and all the users and risk data loss. However, this Level 1 system keeps running. One RAID 5 array with a failed drive is still sufficient to run the server and keep data continuously accessible to users. This provides ample time to fix the disabled RAID 5 array and avoid potential data loss. With optional "hot-spare" drives installed, the Level 1 array can withstand the subsequent failure of up to two additional drives.
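Under the simplifying assumption that a RAID 5 array (without hot spares) survives at most one failed drive, and that the mirrored volume survives while at least one array does, the three-simultaneous-failure claim above can be checked exhaustively:

```python
# Exhaustive check of the failure-tolerance argument for a mirrored
# pair of RAID 5 arrays. The array size is an assumption for the
# demonstration; the conclusion does not depend on it.

from itertools import combinations

DRIVES_PER_ARRAY = 5   # assumed size; drives 0-4 in array A, 5-9 in array B

def volume_survives(failed_drives):
    per_array = [0, 0]
    for d in failed_drives:
        per_array[d // DRIVES_PER_ARRAY] += 1
    # A RAID 5 array survives with at most one failed drive;
    # the mirrored volume survives while either array does.
    return any(n <= 1 for n in per_array)

# Every possible combination of three simultaneous drive failures
# leaves the mirrored volume running (at worst 2 + 1 across the pair):
assert all(volume_survives(c)
           for c in combinations(range(2 * DRIVES_PER_ARRAY), 3))

# Some four-drive patterns do not survive (two failures on each side):
print(volume_survives({0, 1, 5, 6}))  # False
```

This matches the worst case described above: two failures take down one RAID 5 array entirely, while the remaining array keeps running in degraded mode with its third failed drive.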

This Level 1 array can also withstand the failure of multiple fans and multiple power supplies. Each RAID 5 array is typically connected to a separate UPS and has redundant AC connections. An AC power line to each array can fail, a UPS can fail during a power outage, or even an entire RAID 5 array can fail completely, and the mirrored RAID 5 arrays will continue to operate.

Level 2: Server Redundancy - Server Can Fail

The multi-host Level 2 configuration takes advantage of a RAID array's multi-hosting capabilities. Each server is connected to both RAID 5 arrays. In most cases, the second server is a standby server - ready to take over if the first server fails. This configuration also withstands the failure of a bus, since the two RAID arrays connect to each server via separate buses. Bus hang-ups do occur. While rebooting can easily reset these hang-ups, this practice may prove unacceptable in environments where users expect the system to be in full operation. Thus, the environment continues operating despite the loss of a server, bus or storage element. Level 2 adds the ability to withstand a server failure and multiple host bus failures to the already formidable list of simultaneous failures covered by Level 1.
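The Level 2 topology can be sketched as a small reachability check: a mirrored array remains usable as long as some live server has a working bus to a working array. All structures below are illustrative, not a real cluster manager:

```python
# Toy reachability model of the Level 2 configuration: two servers,
# each with its own bus to each of the two mirrored RAID 5 arrays.

def reachable_arrays(servers, buses, arrays):
    """Arrays still usable given which components are alive."""
    usable = set()
    for (server, array), bus_ok in buses.items():
        if servers[server] and bus_ok and arrays[array]:
            usable.add(array)
    return usable

servers = {"primary": True, "standby": True}
arrays  = {"raid5_a": True, "raid5_b": True}
buses   = {("primary", "raid5_a"): True, ("primary", "raid5_b"): True,
           ("standby", "raid5_a"): True, ("standby", "raid5_b"): True}

# Lose the primary server, one bus, and one whole array at once:
servers["primary"] = False
buses[("standby", "raid5_a")] = False
arrays["raid5_a"] = False
print(reachable_arrays(servers, buses, arrays))  # {'raid5_b'}
```

Even with a server, a bus and an entire array down simultaneously, the standby server still reaches a current copy of the data - the combination of Level 1 and Level 2 failures the text describes.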

Level 3: Clustering - Hosts Share Data

Sharing data between multiple servers enables network managers to distribute workloads across multiple servers without having to decide arbitrarily how to split up the data. When combined with mirrored RAID 5 storage, this Level 3 capability offers no single point of failure in the entire environment. It also eliminates the obvious idleness of a standby server. This environment is supported by the Digital UNIX, OpenVMS, HP-UX, AIX and Solaris operating systems. This architecture creates a server environment with true no-single-point-of-failure fault tolerance, maximizes the utility of all installed components and promotes ease of access to shared data.

About the Author: Joel Leider is the CEO of Winchester Systems (Woburn, Mass.; www.winsys.com).
