Tough Choices in Enterprise Storage
Newer networked storage topologies that capitalize on the simplicity of NAS and the scalability of a SAN will soon appear in the marketplace.
Intelligent purchase decisions in the storage arena are difficult. Given the financial ups and downs of the industry lately, there's a strong likelihood that the proprietary products you select today will be orphans tomorrow. However, the future of networked storage is emerging from the hype. With evolving standards, newer networked storage topologies will soon appear that provide the means to capitalize on the plug-and-play simplicity of NAS, in combination with the scalability of a SAN. These NAS/SAN hybrids may in fact steal some of the wind from the sails of the "SAN as universal storage pool" idea by allocating storage to specific applications in a more utilitarian, but network-based, model.
Networked Storage 101
Vendors hail networked storage as a strategic storage infrastructure: a platform capable of scaling capacity beyond the server or disk array cabinet, while affording greater performance, accessibility and manageability. As a term, networked storage refers to a set of alternative technologies (some call them "competing") that are commonly referred to by their acronyms: NAS and SAN.
From a product standpoint, network-attached storage (NAS) predates storage area networks (SAN) by several years. In essence, NAS is server-attached storage in which the server uses a storage-optimized operating system kernel, sometimes called a "thin server," rather than a general-purpose server operating system or "fat server." According to vendors, NAS storage-optimized operating systems have fewer "moving parts" and fewer potential failure points than general-purpose operating systems. This decreases the likelihood of OS-related failures that limit access to the attached storage. Support for standardized protocols for file system access and retrieval across a network, such as the Network File System (NFS) and the Common Internet File System (CIFS), make NAS-based storage relatively easy to deploy and accessible to any authorized individuals or applications on the network.
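This accessibility is worth making concrete: because the operating system's NFS or CIFS client speaks the protocol, applications see a NAS export as an ordinary directory. A minimal sketch (the mount point and file names below are hypothetical):

```python
from pathlib import Path

def read_shared_file(mount_point: str, relative: str) -> bytes:
    """Read a file from a mounted NAS export.

    Once an NFS or CIFS export is mounted, clients use ordinary
    file I/O; the OS client code translates each call into
    protocol requests over the network, so no special storage
    API is needed in the application.
    """
    return (Path(mount_point) / relative).read_bytes()

# Usage (assumes the filer's export is mounted at this hypothetical path):
# data = read_shared_file("/mnt/filer1", "reports/q3.txt")
```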
SAN, by contrast, is an architecture for deploying storage devices and servers in a back-end "fabric" or loop topology. Most current SAN fabrics are created from multiple point-to-point Fibre Channel links that are switched at high speed. In theory, placing storage in its own "network" provides the benefits of limitless scalability, reduces the burden of large data transfers across front-end networks (LANs), facilitates accessibility by any server connected to the SAN, and enables performance suitable to high speed, block-level data reads and writes.
SANs are more complicated than NAS and much more difficult and costly to set up. A lack of standards impairs plug-and-play deployment and device interoperability. Organizations like the Fibre Channel Industry Association (FCIA) are working with standards organizations at the American National Standards Institute (ANSI) to add the necessary functionality required to move Fibre Channel toward a more network-like operational model.
At the same time, the Internet Engineering Task Force (IETF) IP Storage Working Group is developing a way to carry native SCSI commands over IP, which will enable SANs to be created using any IP network operating at gigabit speed or better. Advocates of this still-evolving "iSCSI" standard tout it as an alternative to Fibre Channel that will enable the architecture to deliver on its value proposition.
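The core idea, carrying block-level SCSI commands inside ordinary TCP/IP traffic, can be sketched with a toy framing. The field layout below is illustrative only; it is not the real iSCSI PDU format, which the IETF drafts define in detail:

```python
import struct

# Toy framing for a block-read request: opcode, LUN, starting logical
# block address and block count, packed big-endian. (Illustrative only;
# real iSCSI wraps SCSI commands in a defined PDU structure.)
CMD_FMT = ">BBQI"          # 1 + 1 + 8 + 4 = 14 bytes
SCSI_READ10 = 0x28         # the standard SCSI READ(10) opcode

def encode_read(lun: int, lba: int, blocks: int) -> bytes:
    """Pack a read command into bytes suitable for a TCP payload."""
    return struct.pack(CMD_FMT, SCSI_READ10, lun, lba, blocks)

def decode_read(frame: bytes) -> tuple:
    """Unpack the command on the target side."""
    return struct.unpack(CMD_FMT, frame)
```

The point of the exercise: once block commands ride on IP, any gigabit Ethernet network can, in principle, serve as SAN plumbing.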
Fibre Channel and iSCSI can be thought of as standards for the "plumbing" of a SAN. Allocating SAN storage to applications and managing it will require some virtualization above this layer of functionality. Also important is a layer of software best thought of as a "SAN operating system." This layer provides rudimentary capabilities such as data and file locking, as well as other "fabric services" such as security and storage resource management functions. Ultimately, meeting the application's needs will require that still other layers of intelligence be added to support the SAN's automatic recognition of applications and automatic allocation of storage.
Considerable work is still needed to reduce the implementation complexity and expense of SANs and to identify the most cost-effective strategy for SAN operation and management. Thus, you can generally regard SANs as the most costly platform for data deployment.
With corporate data growing at an estimated annual rate of 80 percent to 100 percent, the effectiveness of internal server disk and direct-attached storage arrays is under fire from both technical and business experts. Since the late 1990s, the acquisition of captive data storage platforms has been criticized as a purely tactical measure and called a short-term stopgap with higher long-term costs. Critics argued that ultimately, the economics of captive storage would prove too costly and short-sighted as an enterprise strategy. The obvious alternative was "networked storage."
SANs are intended as storage utilities that can provide whatever kind of storage is required by an application or end user. If file system-based storage is required, a SAN is supposed to allocate block storage from its pool and provide the necessary file system support. If block-based access is required, it's supposed to serve up blocks.
For more complex applications, like databases with multiple elements, SANs are supposed to be able to recognize requirements and to apportion the right kind of storage in the right quantities (as set by policy-based management) to meet the needs of the various application elements. Similarly, if an application has a specific physical storage requirement, like a streaming multimedia system that requires only the outermost tracks of disk drives, a SAN is supposed to provide this type of storage, intelligently and on the fly.
Most experts agree that such SAN intelligence is still years away. The SAN products offered by vendors today tend to be, in the words of one critic, "glorified block-level disk arrays" providing less stability, more difficult scalability and lower performance than a high-end storage array. This characterization is less true of homogeneous SANs, where all products are secured from a single vendor and therefore provide greater interoperability and intelligence "right out of the box." However, the glorified disk array is generally considered a fair view of heterogeneous SANs made up of multiple storage and switch products from multiple vendors.
A problem with homogeneous SANs, however, is that they commit an IT manager to a single vendor and its products. While this may have caused advocates of open standards-based approaches to bristle a few years ago, it's even more disconcerting today, given the financial ups and downs of the storage industry.
However, the future of networked storage is not all dismal. Newer networked storage topologies which capitalize on the plug-and-play simplicity of NAS in combination with the scalability of a SAN are being prepared for release into the market. These NAS/SAN hybrids may even be used to allocate storage to specific applications in a network-based model.
Will Software Vendors Call the Shots on Storage?
Microsoft Corp.'s omission of any Network-Attached Storage (NAS) platforms (including its own NAS software development kit) from the list of recommended storage platforms for use in hosting Exchange Mail 2000 [See "The View from Redmond" sidebar further down. Ed.] sent some shock waves through the NAS industry. Support for Exchange Mail had become a bread-and-butter offering of many NAS vendors, including industry leader Network Appliance Inc. Microsoft's move was disconcerting because vendors feared it would discourage companies that purchase storage on a "checklist" basis from buying a storage platform that wasn't on the approved list.
It's unclear whether Microsoft's move was deliberate. Regardless, the issue raises an interesting question about the role that software vendors might play in determining storage platforms.
According to Chris Bennett, director of product marketing management, and Mike Alvarado, storage networking marketing manager, both with Network Appliance, the Microsoft omission has not hampered Network Appliance sales. The company is well-known for its solution in Exchange mail hosting, which delivers capabilities that aren't otherwise available, according to Tim Bosserman, a consulting research engineer at long-time Network Appliance customer, Earthlink Inc.
Bosserman says that the instability of Exchange Mail is "well known in the industry and favors Network Appliance and its SnapShot technology. Some people could not even get Exchange Mail to run until they bought Network Appliance."
In contrast with Microsoft, many application vendors, including Oracle Corp., appear to believe that the smart move is to embrace as many storage options for their software as possible. Oracle, an active participant within the Direct Access File System (DAFS) Collaborative, has added capabilities to its 9i database product to facilitate its deployment on storage platforms other than server-attached RAID arrays and other block-storage platforms.
In June, the vendor demonstrated the use of DAFS-enabled NAS platforms to host its flagship database product. The demonstration showed Oracle's 9i database system accessing data files stored on a Network Appliance Filer over a Fibre Channel link using the DAFS protocol. DAFS was transparently integrated with Oracle 9i using the new Oracle Disk Management (ODM) interface.
According to Bill Bridge, an architect for cluster and parallel storage technology for Oracle, "Oracle created the ODM interface to enable hardware and software vendors to integrate their products with the 9i database. We made the ODM spec available to the DAFS Collaborative to ensure that the DAFS protocol takes advantage of the performance and manageability features exposed by the ODM interface."
Dave Dale, co-chair of the collaborative, observes that numerous software vendors have been rallying around DAFS as a mechanism for providing end users with a standard method for file sharing and file access that is truly platform-agnostic. In addition to Oracle, Dale notes that IBM's DB2 database offering has also been instrumented to support diverse block and file system access-based deployment alternatives using DAFS.
Driving the Hybrid Vision Forward
SANs currently lack a standard file access protocol, according to Dave Dale, co-chair of the Direct Access File System (DAFS) Collaborative and industry evangelist for Network Appliance, which makes NAS systems. However, the 75 member companies of the DAFS Collaborative have been working hard to develop a file access protocol, analogous to NFS in the NAS space, "to enable transport-independent heterogeneous file sharing by a broadly supported file access protocol on a SAN."
According to Dale, DAFS will level the playing field between the performance of NAS and the performance of block-level storage (whether SAN or RAID array-based) by reducing the number of instructions required to manipulate a file. Such technology comes at the right time, because more database software vendors, including Oracle Corp. and IBM Corp., are seeking to improve their ease-of-use by adopting a file access model.
"SANs lack a standard way to accomplish heterogeneous file sharing and they lack a standard file access protocol," says Dale. "And there is a need for this capability right now."
Dale says that he sees DAFS as providing the capability to integrate NAS and SAN, but he declined to speculate on product-specific implementations. Other members of the collaborative, including Tim Bosserman, a consulting research engineer with Internet Service Provider, Earthlink Inc., are less worried about politics.
Bosserman says that Earthlink uses primarily NAS products for its data storage, because "NFS and NAS are very scalable. A key advantage of the technology," according to Bosserman, is scalability with minimal downtime: "We can add more storage appliances and re-balance the load on the fly without causing an outage."
"We use NFS for distributed fault tolerance in our back-end storage, which is used for e-mail, Web pages and personal start pages. Our biggest application is e-mail, which is hosted on approximately 1.8 TB of NFS NAS consisting of 20 filers from Network Appliance," he explains, noting that Earthlink adopted NAS products in the late 1990s and was thus one of the first to do so.
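The article doesn't describe how Earthlink distributes mailboxes across its pool of filers, but one common way to spread load over many appliances, and to add appliances without remapping everything, is a consistent-hash ring. A minimal sketch of that general technique (the class and parameters are illustrative, not Earthlink's actual mechanism):

```python
import bisect
import hashlib

class FilerRing:
    """Map mailbox names to filers with consistent hashing, so that
    adding a filer relocates only a small fraction of mailboxes.
    (A sketch of one common approach; not a description of any
    vendor's or ISP's actual implementation.)"""

    def __init__(self, filers, replicas=64):
        # Place each filer at many pseudo-random points on the ring.
        self._ring = []
        for filer in filers:
            for i in range(replicas):
                self._ring.append((self._hash("%s:%d" % (filer, i)), filer))
        self._ring.sort()
        self._keys = [h for h, _ in self._ring]

    @staticmethod
    def _hash(key):
        # Stable 64-bit hash (Python's built-in hash() is randomized).
        return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

    def lookup(self, mailbox):
        """Return the filer responsible for this mailbox: the first
        ring point at or after the mailbox's hash, wrapping around."""
        idx = bisect.bisect(self._keys, self._hash(mailbox)) % len(self._keys)
        return self._ring[idx][1]
```

With 20 filers, adding a 21st moves only roughly 1/21 of the mailboxes, which is what makes on-the-fly rebalancing practical.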
However, NAS has limitations, according to Bosserman, that need to be overcome before Earthlink can entrust all of its data to the platform. One sore point is latency that impedes efficient access to an Oracle database-based "authorization" file.
"We are running into latency issues directly related to NFS file locking, and in some cases to TCP/IP, when we do our end-of-month updates to the authorization file," Bosserman says. The troublesome file will "lock up when we are trying to make a one-byte update to remove access for those customers who haven't paid their bills."
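The kind of one-byte update Bosserman describes can be sketched with POSIX byte-range locks, which is what NFS file locking presents to client applications; over NFS, each lock and unlock is a round trip to the server's lock manager, which is where the latency accumulates. The byte offset and the revocation encoding below are hypothetical:

```python
import fcntl
import os

def revoke_access(path, offset):
    """Flip a single byte in an authorization file under an exclusive
    byte-range lock. Locking only the byte being changed, rather than
    the whole file, keeps other readers unblocked; over NFS the
    lock/unlock pair still costs network round trips.
    """
    with open(path, "r+b") as f:
        # Exclusive lock on exactly 1 byte starting at `offset`.
        fcntl.lockf(f, fcntl.LOCK_EX, 1, offset)
        try:
            f.seek(offset)
            f.write(b"\x00")   # 0x00 = access revoked (hypothetical encoding)
            f.flush()
            os.fsync(f.fileno())
        finally:
            fcntl.lockf(f, fcntl.LOCK_UN, 1, offset)
```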
Earthlink has been seeking a way to host its request-flooded portal service applications on NFS-accessed NAS. The goal is to reduce latency and enhance performance.
Bosserman was "very impressed" by the demonstrations of DAFS capabilities at the organization's June 2001 developer conference. Side-by-side testing showed that DAFS processed file requests using about a tenth of the number of machine instructions required by NFS, he reports. Says Bosserman, "Throughput in megabits per second and CPU load were displayed side by side for both NFS and DAFS. I don't recall specific numbers, but the bar representing DAFS throughput was much longer than the NFS bar, while the bar representing CPU load was much shorter than the NFS bar."
To Bosserman and others, the movement of even high-performance, transaction-based applications to a file system, rather than a block-based access model, is driving the need for DAFS. Bosserman expects the protocol to enable hybrid NAS-SAN storage products in which a DAFS-enabled NAS "head end" provides manageable, network-attached access to a scalable back-end fabric or loop of storage devices.
Even Network Appliance, he notes, currently uses a Fibre Channel Arbitrated Loop to attach the disk drives in its cabinet. While he admits that he is "not instantly knowledgeable" about any products in the market that deliver a NAS box with a back-end SAN, he says that the companies providing demonstrations at the DAFS Developer Conference strongly suggest that such products are in development. Organizations demonstrating DAFS technology included Broadband Storage, Fujitsu Ltd., Duke University and the University of British Columbia. Current NAS and SAN technology vendors include Brocade Communications Systems Inc., Emulex Corp., IBM, Network Appliance and Troika Networks Inc.
The View from Redmond
Storage vendors are understandably concerned when a major independent software vendor like Microsoft does not "approve" their storage platforms to host its application data. Could Microsoft's decision not to certify any NAS platforms (including its own NAS software development toolkit) to host Microsoft Exchange Mail be the beginning of a trend? Will we see software companies, rather than storage vendor marketing departments, cast the deciding vote in the determination of the best storage technology for an organization?
I asked Microsoft some specific storage questions and received the following responses. The first question was addressed by David Siroky, Exchange Product Manager; the balance were handled by Phil Shigo, lead product manager in the Embedded and Appliance Platforms Group, both at Microsoft headquarters in Redmond, Wash.
Is it true that no NAS platforms are approved/certified for use by Microsoft for Exchange Server? If so, why? [It is] true. Microsoft is concerned with the best interest of its customers and therefore advises against Exchange customers placing database files on network drives or on any vendor's NAS system for purposes of ensuring data integrity. The NAS storage method is neither supported nor recommended by Microsoft due to several Exchange design specs that may affect communication with the NAS storage system. While there may be work-arounds/solutions developed by NAS storage system vendors that address these communication issues, [none of them] have been tested by Microsoft, and accordingly, Microsoft is unable to recommend or support their use [with] Exchange 2000 Server at this time.
Does Microsoft embrace a particular philosophy regarding data storage that militates against its support for NAS-based and SAN-based storage? Microsoft believes that all products should be tested prior to being certified for customer scenarios. Currently, there are potential issues with Exchange and NAS and therefore we cannot endorse or support such deployments. Once we have worked out the process and certified the application for NAS deployment, then the vendors with the certified product will be supported by us.
Will this philosophy be extended to other products as well? Yes, this philosophy of testing and ensuring the stability of a scenario before endorsing it for our customers is one we will apply to all situations. Microsoft's priority is to ensure that its customers install solutions that do not involve any possible known issues. Until a solution can be fully tested and proven reliable, Microsoft will not suggest it.
Does Microsoft have any intention of supporting the Direct Access File System or VI architecture with respect to its database products? Microsoft doesn't have any comment on support for DAFS or VI architecture.
The DAFS-enabled storage infrastructure is a natural outgrowth of development of next-generation servers around Virtual Interface (VI) architecture and Infiniband, the I/O infrastructure directly associated with VI, according to Jim Pappas, Intel Corp.'s director of initiative marketing. DAFS leverages the capabilities of VI, including direct memory mapping, and will be able to leverage Infiniband "as a ubiquitous feature of next generation servers to open the door for new solutions such as high performance NAS."
"DAFS is designed to work on Infiniband," Pappas says, "though it can also use other networks and interconnects to which VI has been mapped. You can buy a VI-enhanced TCP/IP network card or a VI-enhanced Fibre Channel interface card today, but in the future Infiniband will provide a better and more ubiquitous technology."