The Promise of SANs: Managing Storage-Centric Information
As business needs shift from system to data availability, information must be managed "storage-centrically," where data can be accessed on a global scale with higher levels of control.
The computing industry is going through another major change – the move toward a SAN environment. As the focus shifts from "system availability" to "data and application availability," business information must be organized and managed in a new way – a storage-centric way. In a SAN "storage-centric" environment, access to data can be achieved on a global scale, higher levels of control and management can be enforced and organizations can provide a level of availability never before achieved.
The widespread deployment of mission-critical applications, coupled with the explosive growth of corporate data, is quickly outpacing the ability of current computing architectures to efficiently handle scalability demands and overall storage management issues.
Data growth rates have exceeded backup window allowances. Bandwidth improvements are progressing, but still cannot sustain the growth of today’s organizations. Data may be stored in very large databases, but it is also likely that critical information is widely dispersed. The sharing of this information is typically handled with data feeds and replication efforts. Unmanaged data duplication results in inaccurate information, while isolated data makes it impossible to get a true business "picture" in a timely manner. A SAN provides access to isolated "islands" of data through high-speed network communications.
In today’s environment, each shared storage device, such as a disk or tape drive, is typically managed by and physically attached to a single server. Both physically and logically, clients have to go through that server to get to that particular storage device.
A SAN is a separate high-speed network that establishes a direct connection between storage devices and servers or clients. The SAN is an extended "sub-network" that can be interconnected using routers, hubs and switches found in the LAN or WAN environment. SANs enable storage to be "externalized" from the server in the LAN to this separate sub-network, bypassing traditional bottlenecks and supporting direct, high-speed data movement.
SANs provide any-to-any connectivity, which means that potentially any server can access any storage device in the SAN. This allows users to share storage resources, eases the task of centralized management, improves performance and allows for longer connectivity distances via fibre channel.
NAS vs. SAN
Network Attached Storage (NAS) is generally a disk array that connects directly to the messaging network or LAN through an interface such as Ethernet, using common communications protocols (TCP/IP). It often functions as a server: it has a processor and an OS or micro-kernel, and it processes file I/O protocols.
SAN, on the other hand, is a shared storage repository directly attached to multiple servers via a high-speed interconnect, such as a fibre channel switch. Storage is separated from the LAN and is shared by many servers. This allows more control over the management of the data for backup and business continuity purposes.
Fibre channel can transfer data faster than any other commonly used storage technology. The base speed of fibre channel is 100 MB/sec. The current generation of Ultra-SCSI, in contrast, runs at 40 MB/sec, and next-generation Ultra-SCSI II runs at 80 MB/sec. Compared to SCSI, SANs built on fibre channel deliver increased distance (theoretically 10 kilometers vs. 3 to 25 meters), increased throughput (100 MB/sec vs. 40 to 80 MB/sec), similar cabling and, most importantly, mass-market focus from all storage vendors.
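To make the throughput comparison concrete, the following minimal sketch estimates how long a backup would take at each of the peak rates quoted above. The 500 GB data size and the function name `transfer_hours` are illustrative assumptions, and real transfers rarely sustain peak interconnect speed.

```python
def transfer_hours(data_gb: float, rate_mb_per_sec: float) -> float:
    """Hours needed to move data_gb gigabytes at a sustained rate.

    Assumes 1 GB = 1000 MB and that the peak rate is sustained throughout.
    """
    seconds = data_gb * 1000 / rate_mb_per_sec
    return seconds / 3600


# Peak interconnect rates quoted in the article, in MB/sec.
RATES = {"Ultra-SCSI": 40, "Ultra-SCSI II": 80, "Fibre channel": 100}

if __name__ == "__main__":
    data_gb = 500  # hypothetical backup size, not a figure from the article
    for name, rate in RATES.items():
        print(f"{name}: {transfer_hours(data_gb, rate):.2f} hours")
```

Under these assumptions the same data set takes 2.5 times as long over 40 MB/sec Ultra-SCSI as over fibre channel, which is the gap that matters when the backup window is fixed.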
Fibre channel networks can have a large geographical extent, supporting very large configurations. This can be particularly appealing for disaster recovery, because it allows mirrored disks or remote tape vaults to be located in different data centers. The fibre channel standard supports distances of up to 10 kilometers with optic cabling. SCSI generally supports only 3 meters, and differential SCSI about 25 meters.
A fibre channel loop can address up to 127 ports (126 devices plus one connection to a switch), with a practical limit today of 512 devices on a switch. This is more than triple the devices achievable with ESCON or SCSI. Traditionally, a SCSI configuration consists of two servers sharing two SCSI buses, for a total of about 30 devices.
Hubs and Switches
At the physical level, the current technology of choice for SANs is fibre channel (ANSI X3T11), because of its high bandwidth potential and low latency in comparison with SCSI or gigabit Ethernet. A fibre channel-based SAN has the potential to be a high-performance network. Other technologies, such as SCSI and gigabit Ethernet, cost less and are more widely available than fibre channel, but they are also more limited in their performance potential.
Fibre channel was designed to support both shared loop and switched architectures, or a mixture of both. SCSI was designed for a bus architecture, and although there are workarounds, the limitations can increase installation, configuration and support costs. Non-blocking fibre channel switches provide the ...
[Figure label: Primary LAN Network (TCP/IP 10MB/sec)]
Shared Loop Architecture. A fibre channel loop connects up to 126 devices on a shared medium, usually created using a hub. Shared loop architectures implement a Fibre Channel Arbitrated Loop (FC-AL). FC-AL was developed with peripheral connectivity in mind. With the capability to natively map SCSI commands using the SCSI protocol over fibre channel, FC-AL becomes an ideal technology for high-speed storage connectivity. One potential problem with an arbitrated loop occurs when a connection is lost on one of the devices in the loop.
In arbitrated topologies, nodes are daisy-chained together, with each fibre channel port acting as a repeater for all other ports on the network. Because a single failed node or lost connection can therefore break the entire loop, it is recommended to cable fibre channel connections through a hub. The hub "heals" the loop in the event of a node failure, bypassing the non-operational node and passing signals directly between operating nodes. Hubs are limited in comparison to switches. One limitation is that hubs generally support only 8-bit addresses, meaning that most hubs cannot support more than 256 devices. (This is not a serious limitation, since shared access to such a large number of devices on a single hub is not generally desirable.)
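The difference a hub makes can be sketched as a toy model. This is not an implementation of the FC-AL protocol; the function name `surviving_path` and the node names are hypothetical, and the model only captures the idea that a dead repeater stops the signal unless a hub bypasses its port.

```python
from typing import Iterable, List, Optional


def surviving_path(nodes: List[str], failed: Iterable[str], hub: bool) -> Optional[List[str]]:
    """Return the nodes a signal visits going once around the loop.

    Returns None if the loop is broken. Without a hub, every node must
    repeat the signal, so any failed node breaks the loop. With a hub,
    failed ports are bypassed and the signal passes between the
    remaining operational nodes.
    """
    failed_set = set(failed)
    path = []
    for node in nodes:
        if node in failed_set:
            if not hub:
                return None  # a dead repeater stops the signal
            continue  # the hub bypasses the non-operational port
        path.append(node)
    return path


if __name__ == "__main__":
    loop = ["server1", "disk1", "disk2", "tape1"]  # hypothetical devices
    print(surviving_path(loop, {"disk2"}, hub=False))
    print(surviving_path(loop, {"disk2"}, hub=True))
```

In this model, the daisy-chained loop is lost entirely when `disk2` fails, while the hub-cabled loop keeps running with the three remaining devices.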
Switched Architecture. Switched architectures can connect multiple servers, storage devices or FC-AL rings using either area switches or fabric switches. The term "fabric" is used to mean a cross-point switched network between nodes. Both area switches and fabric switches can participate in creating a fabric. A fabric, in general, contrasts with a shared loop implementation in that it provides an any-to-any topology that gives each pair of ports a transparent or virtual point-to-point connection. This allows multiple "conversations" to take place simultaneously and results in less congestion and faster response times.
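A rough way to see why simultaneous "conversations" reduce congestion is to compare aggregate bandwidth under an idealized model. The sketch below assumes the 100 MB/sec base link speed quoted earlier, ignores arbitration and protocol overhead, and uses hypothetical function and port names.

```python
from typing import List, Tuple

LINK_MB_PER_SEC = 100  # base fibre channel speed quoted in the article


def loop_aggregate(conversations: int) -> int:
    """Shared loop: all conversations take turns on one medium,
    so total bandwidth never exceeds a single link."""
    return LINK_MB_PER_SEC if conversations > 0 else 0


def fabric_aggregate(pairs: List[Tuple[str, str]]) -> int:
    """Idealized non-blocking fabric: each port pair gets its own
    point-to-point connection at full link speed.

    Assumes no port takes part in more than one conversation.
    """
    ports = [port for pair in pairs for port in pair]
    if len(ports) != len(set(ports)):
        raise ValueError("each port may appear in only one conversation")
    return LINK_MB_PER_SEC * len(pairs)


if __name__ == "__main__":
    pairs = [("server1", "disk1"), ("server2", "disk2"), ("server3", "tape1")]
    print("loop:  ", loop_aggregate(len(pairs)), "MB/sec total")
    print("fabric:", fabric_aggregate(pairs), "MB/sec total")
```

With three conversations, the loop in this model still delivers 100 MB/sec in total, while the fabric delivers 300 MB/sec because the pairs do not contend with each other.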
So, what does this mean for your business? SANs enable you to operate in a radically new way, especially if your company is geographically dispersed.
Higher Application Availability. Moving storage from the server to the SAN frees up server processing power and LAN/WAN networking bandwidth for business applications.
Improved Performance. Because backup and recovery processes no longer require server access, and because storage interconnects are faster, backup performance is dramatically improved.
Centralized and Consolidated Storage. SANs will play a critical role in the consolidation and management of storage assets. A SAN environment offers the opportunity to relocate backup, recovery and data replication processes that are tied to individual servers to a more open and sharable environment. This allows data to move directly between storage devices within the SAN.
Increased Scalability. SANs ease the burden of growth: enterprises will be able to add capacity, manage data more easily and connect to multiple devices, all at a lower cost.
Availability and Failover. One of the biggest advantages of SANs is that clients are not denied storage if a particular server fails; they can access shared storage through another server. Without SANs, the failing server would render any attached storage unavailable. In the future, it will be possible to configure storage as a "pool" shared by many servers. If storage fails, servers continue to function, and if servers fail, storage remains accessible via another route.
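The failover argument can be sketched as a simple reachability check. The data layout below (a mapping from each storage device to the servers that can reach it) and all names are assumptions made for illustration, not a description of any particular product.

```python
from typing import Dict, Set


def storage_available(device: str, attachments: Dict[str, Set[str]], servers_up: Set[str]) -> bool:
    """A device is available if at least one server attached to it is running."""
    return bool(attachments.get(device, set()) & servers_up)


# Server-attached storage: each device hangs off exactly one server.
direct_attached = {"disk1": {"serverA"}, "tape1": {"serverB"}}

# SAN: any server on the SAN can reach any device.
san_attached = {"disk1": {"serverA", "serverB"}, "tape1": {"serverA", "serverB"}}

if __name__ == "__main__":
    servers_up = {"serverB"}  # serverA has failed
    print("direct:", storage_available("disk1", direct_attached, servers_up))
    print("SAN:   ", storage_available("disk1", san_attached, servers_up))
```

When `serverA` fails, `disk1` becomes unreachable in the server-attached layout but stays available through `serverB` in the SAN layout.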
As with any storage environment, storage management disciplines must be enforced in a SAN to ensure high data availability and data integrity. These include tasks such as performance management, backup and recovery configuration, device and port allocation and policy-based media management.
LAN vs. SAN
A SAN is not a replacement for all storage attached to servers in a LAN; in some cases, moving storage to the SAN simply does not make sense. Business applications requiring large database or application server backups are a natural fit for SAN implementation and deployment. PC/workstation clients and small departmental servers, however, may not need to be directly connected to the SAN, because their data volumes do not necessitate a fibre channel connection. Although it is possible for these clients to back up directly to SAN devices, it may be more practical for them to continue to back up over the LAN, as long as this does not put unnecessary strain on network bandwidth.
Given these considerations, it is likely that most organizations will have a mixed environment of interconnected LAN and SAN technologies. The architecture of the storage management software product will need to centrally manage all servers and storage devices, whether they are physically connected to the LAN, WAN or the SAN.
Ensuring High Availability. By "externalizing" the storage from large servers traditionally connected to the LAN, and placing this storage in a SAN environment, backup traffic is removed from the production networks, ensuring higher availability for critical business applications. High availability is generally achieved through a combination of techniques, including disk-based replication, incremental backups and online database backups.
Heterogeneous SAN Environments. Because the SAN concept is relatively new and somewhat immature, initial implementations will be deployed with single hardware, fibre connectivity and server configurations to ensure a high degree of integrity and interoperability.
Ultimately, SAN environments will need to accommodate heterogeneous environments, including multiple vendor servers, operating systems, disk sub-systems, tape libraries and fibre connectivity – all managed within the same SAN. Therefore, storage management software must accommodate this mixed SAN environment as well.
High-Performance Backup and Recovery. A key and early benefit of SAN is the performance boost provided by fibre channel connectivity, with speeds of up to 100 MB/sec. SAN performance must be evaluated on two levels: the physical interconnect level, and the logical flow of data. Though you may think that implementing high-speed interconnects will by itself increase the speed of backup and recovery tasks, consideration must also be given to the efficiency with which the storage devices are used.
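One simplified way to reason about the two levels is to treat a backup as a pipeline whose speed is set by its slowest stage. The sketch below is an assumption-laden model rather than a measurement method: the function name and the source and tape drive rates are hypothetical, and only the 100 MB/sec link speed comes from the article.

```python
from typing import List


def effective_backup_rate(link_mb_s: float, source_mb_s: float, drive_rates_mb_s: List[float]) -> float:
    """Estimate sustained backup throughput in MB/sec.

    The stream can go no faster than the interconnect, the rate at which
    the source can be read, or the combined rate of the target drives.
    """
    return min(link_mb_s, source_mb_s, sum(drive_rates_mb_s))


if __name__ == "__main__":
    link = 100          # fibre channel base speed from the article
    source = 60         # hypothetical disk read rate
    drives = [10, 10]   # hypothetical: two tape drives at 10 MB/sec each
    print(effective_backup_rate(link, source, drives), "MB/sec")
```

In this example the tape drives, not the interconnect, limit the backup; adding drives or streaming several backups in parallel would help more than a faster link.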
Consolidated Administration. Centralized and consolidated administration is more important in a SAN environment. The connectivity distance that fibre channel provides allows for remote vaulting, mirroring and administration. Storage management software must take into account this way of managing information and must make it easier for administrators to deploy and manage remote sites.
Storage On-Demand. When applications run out of disk space, business operations can come to a halt. All too often, storage administrators are paged in the middle of the night and called in to work to fix a problem. In addition, SCSI-based systems may require the storage administrator to bring down the system before adding more storage capacity.
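As a small illustration of catching the problem before operations halt, the sketch below checks how full a file system is using Python's standard `shutil.disk_usage`. The 90 percent threshold and the function names are assumptions chosen for the example; a real storage management product would apply its own policies.

```python
import shutil


def used_fraction(total_bytes: int, used_bytes: int) -> float:
    """Fraction of capacity in use, between 0.0 and 1.0."""
    return used_bytes / total_bytes


def needs_more_storage(path: str, threshold: float = 0.9) -> bool:
    """Return True if the file system holding `path` is at or above the threshold.

    The 0.9 default is an arbitrary example value.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    return used_fraction(usage.total, usage.used) >= threshold


if __name__ == "__main__":
    if needs_more_storage("."):
        print("Capacity threshold reached: allocate more storage")
    else:
        print("Capacity is within limits")
```

A check like this, run on a schedule, gives administrators warning during working hours rather than a page in the middle of the night.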
Intelligent Storage. As SAN hardware technology evolves, SANs and related clustering technology will need the ability to "swap out" and quickly replace components without impacting the environment. This will allow upgrades and maintenance tasks to occur on the fly. To achieve this level of high availability, the hardware components will become more "intelligent" and will begin to embed software functionality as part of the storage device.
SAN Planning (Assessing the Impact). Assessing the impact of implementing SAN technologies in any environment requires knowledge at multiple levels, including an understanding of the business applications and how they function, the system software in use, network and connectivity products and hardware speeds and feeds.
Interoperability between device drivers, OS levels and hardware/software configurations can also add complexity. Before implementing any SAN technology, organizations should seek the guidance of professionals experienced in integrating these technologies.
About the Author: Denise Reier is Vice President of Marketing for SCH Technologies (Cincinnati). She can be reached via e-mail at firstname.lastname@example.org.