In-Depth
A Tale of Two Architectures
Storage nirvana remains the undiscovered country of the true SAN.
In a previous column (see http://www.esj.com/news/article.asp?EditorialsID=3077), I quoted an advocate of Fibre Channel over Ethernet (FCoE) to the effect that, in the data center, nobody cares about storage routing. In agreement, a reader forwarded this comment to me: "The guy who says 'nobody routes storage in the data center' is spot on. For most folks, routing is a non-issue."
The reader went on to state that he disagreed with the Emulex spokesperson about server virtualization reducing the performance of the storage interconnect. He wrote, "I think a lot of vendors are trying to sell a lot of product with this 'virtualization is driving storage' hysteria, and that's the heart of it … NPIV [is] a good example of the hysteria in action."
For readers unfamiliar with it, N_Port ID Virtualization (NPIV) is a Fibre Channel facility that allows multiple N_Port IDs to share a single physical N_Port. This enables multiple Fibre Channel initiators to make use of a single physical port. In theory, this capability simplifies some hardware requirements in Fibre Channel fabric design, especially where so-called "virtual SANs" are used. NPIV, like all FC standards, is defined by INCITS Technical Committee T11, an ANSI-accredited body dominated by FC fabric product vendors.
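For a concrete picture of what "sharing a physical N_Port" looks like in practice, the sketch below uses the Linux Fibre Channel transport class, which -- on NPIV-capable HBAs, switches, and kernels -- exposes a vport_create attribute in sysfs; writing a WWPN:WWNN pair to it asks the fabric for an additional N_Port ID over the same physical link. Treat this as a minimal sketch under those assumptions: the host entry and WWN values shown are placeholders, not working ones.

```python
# Minimal sketch: creating an NPIV virtual port on Linux through the FC
# transport class in sysfs. Assumes an NPIV-capable HBA, switch, and kernel;
# the host entry and WWNs below are placeholders for illustration only.
from pathlib import Path

physical_host = "host5"                    # assumed sysfs entry for the HBA
vport_wwpn = "2101001b32a9d5e4"            # placeholder virtual port WWPN
vport_wwnn = "2001001b32a9d5e4"            # placeholder virtual node WWNN

vport_create = Path(f"/sys/class/fc_host/{physical_host}/vport_create")
vport_create.write_text(f"{vport_wwpn}:{vport_wwnn}")

# If the fabric accepts the request, a new fc_host appears alongside the
# physical one and can be zoned and LUN-masked as if it were its own HBA --
# which is how an individual virtual machine ends up with its own visible
# initiator identity on the SAN.
```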
Back to the reader's e-mail: "How many sites really need individual virtual machines to have visibility into the SAN, which is what NPIV provides? In some ways, it's probably MORE secure, and definitely simpler, to have the Host OS (hypervisor) provide the storage. I can see how with NPIV it would be easier to migrate systems around to/from virtualized hosting, but beyond that what is there? For vendors it's great, because a completely new infrastructure must be installed to support it. Virtualized environments seem to run fine from a SAN when LUNs are presented to the host OS and disks are served to VMs via the generic (for example VMware) driver."
The same sharp reader went on to say, "To get back to [Emulex's] comment -- correct me if I'm wrong, but there's no unique 'speeds and feeds' concern that arises as a result of server virtualization. This is marketing smoke. Larger physical servers can generate more maximum I/O rates; this is nothing new. Perhaps most x86 virtualization users are not accustomed to anything larger than a 2U box, but this is not a Fibre Channel problem. Users will need to consider all aspects of the storage subsystem and size appropriately, of course."
He concluded, "FCoE WILL be compelling for most large system users simply because of the 10G data rate (and maybe the promise of cheaper hardware (but will this materialize?)), not because it addresses any concerns peculiar to virtualization with Fibre Channel."
I quote this e-mail in its entirety because it is one of the most coherent and intelligent bits of commentary about FCoE, NPIV, and server virtualization that I have read anywhere. Kudos to the writer.
The comments were also relevant to what I have observed lately when visiting corporate clients and chatting with attendees at the forums, seminars, and conferences where I speak. It appears that two fairly well-defined camps are developing within the storage world: those who seek to leverage storage as a system-controlled resource in a direct-attached topology and those who seek to build real storage networks. The former tend to be data center-centric in their outlook; the latter are distributed computing- or network-centric.
I suspect this duality has always existed, but it is now being brought into sharper resolution by economic factors, technology changes, and other trends. Any way you slice it, we are now hearing a tale of two storage architectures from consumers and from vendors.
Back to DASD
Several weeks ago, IBM announced the availability of the z10 mainframe. For less than a million dollars, companies could consolidate all -- or at least 1,500 or so -- of those pesky distributed servers into bullet-proof virtualized environments: logical partitions (LPARs) on the mainframe.
Big Blue's announcement was a shot across the bow of would-be "virtual mainframe" enablers, such as VMware, from a performance and cost perspective. Working through IBM's numbers, you could build up to 1,500 virtual machines on the mainframe for $600 apiece, versus using 2U servers with sufficient horsepower and throughput to host your apps (minimum $10K per server) plus $3K per server for a VMware license. IBM stopped short of formally disparaging x86 servers, but it made a clear case that, even with a generous and optimistic 20:1 server consolidation ratio using VMware, x86 server-virtualization schemes were just too costly compared to the mainframe alternative.
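To see how those numbers stack up, here is a quick back-of-the-envelope calculation using only the figures cited above (1,500 VMs, $600 per VM on the z10, $10K per 2U server, $3K per VMware license, 20:1 consolidation); it is a sketch of the comparison IBM was drawing, not an independent price survey.

```python
# Back-of-the-envelope comparison using the figures cited in the column.
vms = 1500

# Mainframe route: IBM's quoted per-VM cost on the z10.
mainframe_total = vms * 600                  # $900,000 -- under a million

# x86 route: hosts needed at the generous 20:1 consolidation ratio.
hosts = -(-vms // 20)                        # ceiling division -> 75 servers
x86_total = hosts * (10_000 + 3_000)         # hardware plus VMware license

print(f"Mainframe:                   ${mainframe_total:,}")
print(f"x86 + VMware ({hosts} hosts):     ${x86_total:,}")   # $975,000
```

Even at that optimistic consolidation ratio, the x86 route comes in above the sub-million-dollar mainframe figure, which is exactly the case IBM was making.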
Storage, too, could become much less expensive if you let the mainframe run it for you. IBM has great tools in the form of DFSMS for system-managed storage and DFHSM for system-managed hierarchical storage management that, when properly deployed, can improve resource-utilization efficiency to about 75 percent of optimum. In his own way, the reader above was saying the same thing about VMware server control over storage. Letting the VMware hypervisor and services (VMotion and VMware File System or VMFS) run storage would probably also increase storage-allocation efficiency (though VMware has no facility like DFHSM at present within its own kit).
The effect of either the IBM or the VMware approach on the storage industry would be the same. Storage vendors would see the abandonment of proprietary technology features on their gear in favor of a centralized and system-controlled management scheme. Storage would need to behave the way that the mainframe OS or server virtualization resource manager required -- or else.
Much of what you hear in the storage industry today about "VMware support" is exactly as the reader called it: marketing smoke. In the mainframe world, proprietary arrays are reduced to simple Direct Access Storage Devices (DASD) by de facto standards for attaching and configuring storage for use by the host computer OS. VMware, if you go the VMotion/VMFS route, does much the same in the virtualized x86 world. In both cases, LUNs are just LUNs, regardless of the brand name on the storage box.
You can guess the reasons why these data center-centric approaches are appealing today -- when money is tight and companies are trying to insulate themselves from forklift upgrades on the vendor's schedule. A DASD-like, direct-attached storage modality controlled by the operating system itself is simple and effective. It also guards you against the propensity of storage vendors to raise their array prices even though the core elements of their gear -- disks and memory -- keep increasing in capacity year after year while decreasing in price per gigabyte at a rate of 50 percent per year.
Dumbing down "smart storage" is, for many, the order of the day. Thin provisioning, de-duplication, compression, encryption, and other services touted by storage vendors as "value add" are increasingly viewed as "pricey options" you may not need to buy if you go to a simpler architectural model. An increasing number of the people I am talking to "don't want the undercoating" when they buy the car.
The writer also alluded to a potential foible of this approach: "migration" impediments. Replicating data is the only way to keep it safe from loss, and migration is a must-have when scaling, overhauling, or combining storage infrastructure. Mainframers have paid significant attention to improving the resiliency of the hosting platform; IBM notes, for example, that Parallel Sysplex architecture enables LPARs to remain stable even if a processor fails in the mainframe host -- a clear advantage over most x86 virtual machine hosting techniques, unless you double up on servers (so much for containing server sprawl) and cluster them together with ESX or some comparable wares. Even so, data movement is inherently constrained by a direct-attached storage topology.
SANs were supposed to provide the solution, but they didn't. Fibre Channel and its derivatives, including FCoE, are not network protocols and do not route data to peer endpoints. FC fabrics are DAS topologies -- just as bus and tag, ESCON, and FICON are DAS topologies. Regardless of whether they share the same switches, or even the same virtual ports on switches, FC fabrics are not networks in part because they do not route.
The Networked Storage Alternative
Now let's consider a different architecture for storage: networked storage. Despite years of talk about SANs since the concept was first introduced by Digital Equipment Corporation (later absorbed by Compaq) in 1997, we have yet to see a real Storage Area Network in the market (with the possible exception of Zetera's UDP/IP storage).
In a true networked storage architecture, storage products are also reduced to simple DASD products. They are managed in band via network protocols or Web services. Value-add features and functions are placed as services in the network, on switches or routers or appliances, so that data can be exposed to the proper set of provisioning and protective services by virtue of the route it takes to the target device.
Crossroads Systems gets this. F5 Networks gets this. Cisco used to get it -- before they started selling FC switches and publicly stating that Fibre Channel was a network "because we say it is."
Separating "value-add" functions from the array deconstructs the cost components of array products themselves. Placing these functions in a network and exposing them as services enables more data to be covered by appropriate services at a lower cost and with fewer proprietary hurdles than the current modus operandi. Hosts, whether physical or virtual, are simply peers in this configuration. What is important is the data, which can be transported to appropriate storage repositories, copied to multiple repositories at the same time, and encrypted as needed, and so on.
Assuming that all components are manageable in common, the result is superior to the data center-centric approach. We achieve what the data center jocks are after: simplification, consolidation, manageability, scalability, and ease of migration. But we don't have to sell our souls to Brand X mainframe provider or Brand Y server virtualization software provider in the process.
I happen to like mainframes better as server virtualization platforms than I do x86 boxes running hypervisors. Having cut my teeth in a mainframe shop early in my career, I know the advantages. That said, I can also remember how boorish my IBM account exec was and how he tried to force-march me to the next hardware/software refresh from Big Blue whether we needed it or not.
From where I'm sitting today, storage nirvana remains the undiscovered country of the true SAN. Consolidating I/O on a common cable -- whether via iSCSI, FCoE, or EIEIO -- might be a baby step in that direction, but don't mistake any of these topologies for anything more than what they truly are: the perpetuation of DAS attachment models.
Your views are welcome: jtoigo@toigopartners.com.