Enterprise Storage: Let's Get Virtual: Who Said Storage Isn't Sexy?

Anyone with a teenaged daughter is doubtless familiar with an all-important and impending change for 2001. According to my resident 15-year-old, this year will witness the end of the "retro 1960s" fashion craze in favor of a resurgence of the styles that were popular "way back in the 1980s." Concurrent with this shift, I suppose we can expect a "new wave" (pun intended) of 1980s music in the coming year. Look for Olivia Newton-John's 1982 hit single "Let's Get Physical" to be remade by Britney Spears or Christina Aguilera or *NSYNC or one of the many other artistes du jour so popular with the fickle teen music-buying set.

Producers may even rename Olivia's tune to something more appropriate for the new teen consumer, giving the new mix a catchy title like "Let's Get Virtual." When this happens, somebody is going to make a ton of money.

Then, the song will be co-opted by technology vendors, who will license it for use with their television commercials. Of course, the co-opted version of "Let's Get Virtual" will not be about physical fitness or metaphorical sexuality. Technology vendors will spin the lyrics in a completely different direction - toward storage virtualization. "Let's Get Virtual" will become the anthem of data storage technology in 2001.

Don't laugh. It is already happening.

Storage Virtualization

Based on an assortment of interviews conducted with technology vendors over the past couple of months, it is obvious that everyone who's anyone is working on a storage virtualization scheme. The problem is that most of these schemes are coming to market at approximately the same time, bringing with them considerable confusion.

Virtualization is not new, of course. As with the "new" version of Olivia's song, vendors are simply recasting old concepts for use in new roles.

In fact, many PC users made their first virtual storage purchase - often in the form of a memory card from Intel that could be configured as a virtual disk drive - about the time that Olivia's original tune came out. The virtual hard disk was a useful tool for speeding program execution and for surmounting the vicissitudes of life in the era of dual floppy disk computers.

In large-scale disk arrays, virtualization technology has been used for many years both to segregate an array into several partitions so that individual partitions can be made available to different host systems and also to create large virtual disk volumes from numerous physical drives attached in the array.
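To make the second idea concrete, here is a minimal sketch of how a large virtual volume might be built by concatenating physical drives. It is an illustration only, not any vendor's microcode: the class name VirtualVolume, the locate method and the drive sizes are all hypothetical, and real controllers typically add striping, parity and caching on top of this basic address translation.

```python
from bisect import bisect_right
from itertools import accumulate


class VirtualVolume:
    """Hypothetical virtual volume that concatenates physical drives."""

    def __init__(self, drive_sizes):
        # drive_sizes: number of blocks on each physical drive, in order.
        if not drive_sizes or any(size <= 0 for size in drive_sizes):
            raise ValueError("every drive needs a positive block count")
        self.drive_sizes = list(drive_sizes)
        # ends[i] is the first logical block that lies beyond drive i.
        self.ends = list(accumulate(self.drive_sizes))

    @property
    def capacity(self):
        """Total number of logical blocks presented to the host."""
        return self.ends[-1]

    def locate(self, logical_block):
        """Translate a logical block into (drive index, block on that drive)."""
        if not 0 <= logical_block < self.capacity:
            raise IndexError("logical block is outside the virtual volume")
        drive = bisect_right(self.ends, logical_block)
        start = self.ends[drive - 1] if drive else 0
        return drive, logical_block - start


volume = VirtualVolume([100, 200, 50])  # three drives of assumed sizes
print(volume.capacity)     # 350
print(volume.locate(100))  # (1, 0): first block of the second drive
```

The host sees one 350-block device. Only the translation layer knows that block 100 actually lives on the second physical drive.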

In general, these virtualization schemes relied on complex array controller microcode - an array operating system, if you will - to accomplish their goals. Over time, the controller microcode delivered to vendors an important "value-add" that helped them to differentiate their products from those of competitors.

In the world of tape, virtualization took hold first in the form of tape pooling. Pooling enabled several tape devices to be aggregated and represented to host systems as a unified tape resource. It enabled backup and restore operations to be shared among multiple devices, presumably shortening time-to-data.

More recently, tape virtualization referred to the use of a disk drive cache as a target for a tape backup operation. Data was written to the virtual tape (disk drive cache), then transferred to a physical tape cartridge once enough data had accumulated to fill the cartridge completely. The objective was to make the most efficient use of expensive tape units, though sales of virtual tape systems have been below most vendor expectations, apparently because tape cartridges are not regarded as a terribly expensive commodity in the first place.
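The staging behavior described above can be sketched in a few lines. This is a simplified model under stated assumptions, not a description of any shipping virtual tape product: the VirtualTape class, its write method and the capacity units are hypothetical, and the sketch tracks only how much data sits in the disk cache and how many full cartridges have been written.

```python
class VirtualTape:
    """Hypothetical disk cache that stages backups until a cartridge fills."""

    def __init__(self, cartridge_capacity):
        if cartridge_capacity <= 0:
            raise ValueError("cartridge capacity must be positive")
        self.cartridge_capacity = cartridge_capacity
        self.staged = 0              # data currently held in the disk cache
        self.cartridges_written = 0  # full cartridges written so far

    def write(self, size):
        """Stage a backup of `size` units; return cartridges filled by it."""
        if size < 0:
            raise ValueError("backup size cannot be negative")
        self.staged += size
        # Move data to physical tape only in whole-cartridge amounts.
        filled, self.staged = divmod(self.staged, self.cartridge_capacity)
        self.cartridges_written += filled
        return filled


tape = VirtualTape(cartridge_capacity=100)
print(tape.write(40))  # 0: not enough staged data to fill a cartridge
print(tape.write(70))  # 1: 110 units staged, one cartridge filled
print(tape.staged)     # 10 units remain in the disk cache
```

Every cartridge that leaves the cache is full, which is the efficiency argument in a nutshell. Whether that efficiency is worth paying for is, as noted, another matter.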

Today, virtualization refers to the aggregation of physical storage devices into a smaller number of high-capacity virtual devices that can be represented to an operating system as a logical storage resource. Virtualization along these lines promises to 1) reduce the complexity of the storage infrastructure, 2) improve the manageability of the storage infrastructure, and 3) maximize uptime by reducing the business application downtime attributable to storage-related issues.

Reducing Complexity

Storage virtualization seeks to reduce the complexity of the storage infrastructure. That is, it promises to take many devices and consolidate them into a smaller number of virtual resources. In the case of tape, this may translate to taking a number of tape libraries, then representing them as a single large tape drive or a small number of logical drives that can be used efficiently by backup and restore applications.

StorageTek provides an excellent vehicle for such consolidation in its SN6000 Storage Domain Manager. The SN6000 provides network-based virtualization of tape. Tape devices are cabled to the SN6000, which maintains their physical addresses and configurations. The SN6000, in turn, represents logical tape devices to connected host systems that have need of tape services. The hosts are shielded from the configuration complexities of the back-end tape pool: the SN6000 provides a centralized location for tape configuration management, expeditious recovery from failures in the tape pool, and efficient use of tape resources overall.

In the realm of disk, a similar function is provided by both Vicom Technologies and DataCore Software. Vicom offers Storage Virtualization (SV) Routers that may be deployed in a SAN and used as virtualization points for connected disk devices. Using a software-based Storage Virtualization Engine, administrators configure the disk storage pool as desired - consolidating physical disks into larger virtual disks - and download the information to their routers. The software may then be turned off as the routers continue to represent virtual drives in the SAN for use by hosts.

DataCore's solution is also interesting. The company employs a Storage Domain Server - an NT server enabled with its SANsymphony software - to provide a centralized administration point for configuring physical storage devices in a SAN into virtual drives for use by SAN-connected hosts.

All of these products are distinguished from other approaches by virtue of their location: the virtualization services they provide are located "in the storage network," rather than on storage devices themselves or on connected hosts. This architectural choice has significant advantages from the standpoint of efficiency and manageability, not to mention scalability.

Improving Manageability

When management costs are figured into the equation, storage virtualization "in the network" is dramatically superior to virtualization via software installed on the host. To justify this view, one need only examine the alternatives: virtualization using host-based software and virtualization using storage device controllers.

In a host-centric virtualization scheme, virtualization (or pooling) software must be installed, configured and maintained on every host system. Depending on the number of hosts sharing devices, this approach can be very cumbersome and labor intensive.

By contrast, network-based virtualization provides a means to centralize the virtualization management function in a smaller number of devices (domain managers or routers), facilitating the management of complex storage infrastructures by a limited number of IT staff personnel. This key point was anticipated by early SAN pioneers, including the authors of the Enterprise Network Storage Architecture (ENSA) white paper developed in the late 1990s that recommended the deployment of storage management servers within the SAN to facilitate management and security.

While host software-based virtualization advocates, including Veritas, might argue that multiple hosts could be maintained via a shared management utility or framework, deploying and maintaining such a framework is, itself, a time-consuming and laborious effort.

Moreover, a host software-based virtualization architecture does not shield users against the numerous problems that may develop on general purpose servers that share resources between storage virtualization and other important business applications.

Advocates of storage device-based virtualization, such as large array manufacturers, may also argue against network-based virtualization. They observe that significant investments have already been made in array controller development and that controllers, which may be managed via their own utilities, provide an established technology for managing virtualization efficiently.

In response, one could debate the wisdom of vendor dependence created by such controller technologies - companies are entirely dependent on the vendor to fix any controller problems that arise - but, dependency issues aside, the fact is that controller-based virtualization has not solved all of the important problems.

In most controller-based virtualization schemes, data must be replicated from partition to partition to permit its sharing among heterogeneous host systems. Sharing data by copying it is a key driver of the exponential storage growth being witnessed in many companies today and a multiplier of management expense. With network-based virtualization, neither data nor storage resources need to be replicated in order to be shared. Moreover, with network-based virtualization, companies are not locked into the storage technologies offered by any one vendor. Best-of-breed products can be added to the SAN as they become available.

Improved Availability

Network-based approaches to storage virtualization also offer the advantages associated with networking generally. They take advantage of the ability to design networks for self-healing in the event of a device failure or link interruption.

StorageTek, DataCore and Vicom have all demonstrated tremendous resiliency in the face of unplanned interruptions. Domain managers and routers can be deployed in fault-tolerant and failover configurations, usually at a much lower price than comparable array-based virtualization solutions. In contrast to host software-based virtualization solutions, there are fewer "moving parts" in a network-based virtualization solution. Hence, there are fewer things to go wrong and less downtime due to service interruptions.

Network-based storage virtualization also has the advantage over host software-based virtualization of masking problems from end users. When a physical device fails in a network-based virtual disk or tape pool, the problem can usually be rectified while work proceeds using other available devices.
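A rough sketch shows why the failure stays hidden from the host. It assumes, purely for illustration, that the virtual device is backed by interchangeable (mirrored) physical devices. The VirtualDevice class, its method names and the device names are hypothetical and do not reflect how any of the products mentioned here are implemented.

```python
class VirtualDevice:
    """Hypothetical virtual device backed by interchangeable physical devices."""

    def __init__(self, device_names):
        if not device_names:
            raise ValueError("at least one physical device is required")
        # Every physical device starts out healthy.
        self.healthy = {name: True for name in device_names}

    def mark_failed(self, name):
        self.healthy[name] = False

    def mark_repaired(self, name):
        self.healthy[name] = True

    def route(self):
        """Pick the first healthy physical device to service a request."""
        for name, is_healthy in self.healthy.items():
            if is_healthy:
                return name
        raise RuntimeError("no healthy physical device is available")


pool = VirtualDevice(["disk-a", "disk-b"])
print(pool.route())          # disk-a
pool.mark_failed("disk-a")
print(pool.route())          # disk-b: the host keeps working
pool.mark_repaired("disk-a")
print(pool.route())          # disk-a again, once the repair is complete
```

The host addresses only the virtual device, so the switch from one physical device to the other never reaches the end user.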

Let's Get Virtual

Network-based virtualization strategies represent the first efforts to develop a sound and intelligent operating system for SANs. In a sense, these virtualization pioneers are making SANs sexier - elevating the architecture above the rudimentary concerns of plumbing, LUN masking, and block and file system protocols. Appropriately, these solutions are hitting the street at a time when the fashion trends are swinging back toward the 1980s. I can already hear the storage administrator humming that old/new tune, "Let's get virtual ... virtual ... Let me hear the storage talk ... The storage talk..." (Okay. So, I can't sing.)