In-Depth
The End of the Decade in Storage
Few storage technology products invent a market. Rather, they respond to market requirements.
At this time of the year, we prepare our traditional review of the storage technology developments that made their way into the market during the preceding 12 months, spotlighting the good ideas and critiquing those that weren't so good. Given that the end of the year and the end of the first decade of the New Millennium coincide, we take the opportunity to give our tradition a slightly different twist. Here, then, are some of the best and worst storage trends of the decade of the 'Aughties.
Let's start with an overarching criterion that guides our assessment: "good" storage technology, in our book, is that which effectively addresses outstanding consumer needs or burgeoning consumer issues. Conversely, "bad" storage technology, well, doesn't.
In reality, very few storage technology products invent a market. Rather, they respond to market requirements -- whether these bubble up from consumers or capitalize on some evolutionary trend. When products appear in the market that are not based in one or the other of these requirements-based drivers, chances are good that they will create a lot of noise initially, but eventually will fizzle out. Unfortunately, a few really good ideas have appeared in the market, only to be driven out fairly rapidly because of a vendor's inability to communicate its value proposition effectively, or because of industry reluctance to embrace change, or for a combination of similar reasons. We have seen our share of each in storage this past decade.
SANs sans SAN
Few recall that the 'Aughties (2000 through 2009) actually began with a storage technology crash. The dot-com bubble and its progeny -- application service providers (ASPs) and storage service providers (SSPs) -- hit the skids within the first 14 months of the millennium, denting the prospects for storage industry revenue growth once believed unshakeable. By March 2001, the bottom had fallen out of the storage market, whose vendors had spent the previous five years pressing "storage-networks-that-are-not-technically-networks" (so-called SANs) into service.
SANs promised to provide a mechanism for pooling storage capacity without sacrificing high speed access. This was supposed to enable on-demand capacity provisioning and to simplify management of storage spindles: just what the doctor ordered to deal with all of those amassing bits.
In truth, Fibre Channel (FC) fabric "SANs" were not only the most expensive way ever conceived to host data, they also served to expand a proprietary and hard-to-manage disk array model beyond the confines of a single tin cabinet stovepipe. The inside game was to convince consumers that what was being delivered to market was, in fact, what the smart folks at Digital Equipment Corporation-qua-Compaq Computers had termed "ENSA" -- enterprise network storage architecture. Only, consumers weren't getting ENSA peer networks at all, but rather a channel fabric.
As such, SANs worked minimally well when all gear was purchased from a single vendor, creating a homogeneous infrastructure. However, mixing and matching rigs from different vendors -- creating a "heterogeneous SAN" -- had the effect of making pain the word of the day in most shops. The underlying problems with the FC SAN included the absence of a management protocol, common to real networks, within the FC interconnect itself. Try as they might, third-party management tools could not keep up with the proliferation of storage products seeking their place in the expensive fabric. A minor change in firmware on an array controller, a host bus adapter, or a switch caused management to fall apart and entire swaths of infrastructure to go dark.
Moreover, the dynamics of fabric switch port pricing did not parallel those of Ethernet switch ports, which tend to fall precipitously in a very short period of time. SAN CAPEX costs drove the overall cost of storage on Fibre Channel drives above $180 per GB at a time when the costs of disk drives themselves were falling at over 50 percent per year. The impact of OPEX costs on overall SAN total cost of ownership has never been effectively calculated, but given the huge administrative burden and frequent downtime associated with SANs, CAPEX-plus-OPEX may well have driven storage over the $200-per-GB mark at a time when per-GB costs on disk drives themselves were falling below the $30 level.
There were several industry responses to the not-so-secret SAN cost problem. First, of course, vendors sought to make a louder marketing noise to drown out consumer grumbling. SAN speeds-and-feeds improvements were touted about every six to nine months -- despite the fact that such improvements only made sense within the context of a balanced system.
Think about it -- if applications didn't require faster switch and interconnect speeds, why upgrade switches and HBAs at all? Similarly, if the SAN couldn't be managed and FC switch ports were already being utilized to less than 15 percent efficiency, where was the value? Moreover, if speedier SANs simply failed more often (ironically, they failed much more often than servers despite being offered as a solution to server failures and resulting inaccessibility of direct attached storage), why pay for the upgrade?
These questions, addressed by this column over the decade, are only now -- at the end of the decade -- being asked in earnest. This is testimony to both the prolonged success of vendor marketecture and to the current recession-driven cost sensitivities that have entered the thinking of the mostly technology incompetent purchasing agents who buy storage in most large enterprise companies today -- and usually without consulting their own IT experts.
SANs sans Management
If noisy marketecture was the first response to the intractable problems of SANs, there were also some architectural efforts. The Storage Networking Industry Association endeavored, in earnest at first, to create a management protocol for SANs: the Storage Management Initiative Specification, or SMI-S. The effort ran afoul, of course, of proprietary vendor interests in short order. (The best metaphor for what happened is probably found in the current health care insurance reform effort.)
We documented developments here in quite a few columns before giving up: it was clear that the vendors did not want a common management paradigm that would convey, correctly, the impression that a box of Seagate disks was simply a box of Seagate disks, regardless of whose logo was stamped on the cabinet. By the end of the Aughties, the common management elements of SMI-S were so gutted and implementation of the protocol on vendor products was so rare as to make the entire effort largely meaningless.
As we close out the decade, the best management approach is being proffered, interestingly, by one of the smallest players in the business: Xiotech Corporation. Leveraging open-standard Web Services REST protocols, they have demonstrated with their own products the real potential to wrangle storage infrastructure resources into a service-oriented management model. However, it remains to be seen whether the company has the ability (or the money) to build an adequate ecosystem of both vendors and consumers to promote a disciplined implementation of this model in a broad footprint -- especially with most of the gear vendors now endeavoring to build their own mini-me mainframe stacks by purchasing server, switch, and software vendors that will share a common brand with their proprietary storage rigs.
SAS and SATA
Another architectural innovation we saw in the Aughties was the introduction of Serial Attached SCSI and Serial ATA (SAS/SATA) technology-based arrays into the enterprise space. SATA came to market first, and vendors such as Nexsan worked diligently for years to surmount FUD (fear, uncertainty, and doubt) campaigns from the Fibre Channel folks, who characterized SATA as a desktop, not an enterprise, technology.
SATA drives, of course, benefited from the huge capacities enabled by Perpendicular Magnetic Recording (PMR) and a cost far below that of FC drives, and demonstrated sufficient resiliency to make their way into lower-cost systems leveraging commodity NAS heads and iSCSI interconnects. They found a home within the medium-sized enterprise first, where buyers were more technology-savvy and price conscious, and eventually made inroads into larger business environments.
Aiding the rise of SATA was the introduction in the middle of the decade of the long-awaited SAS protocol, which was downwardly compatible with SATA and could be used to create a respectable "fabric SAN" of its own (and at a reduced cost compared to FC, though this was not openly discussed early on). By the second half of the decade, SAS technology adherents sported an extensive list of hardware vendors, including many who already sold FC equipment.
Arguably, the success of SAS/SATA also reflected a growing realization by companies, not only of the fiscally insane economics of data growth in an FC world (at a couple of hundred dollars per GB, petabytes cost real money!), but also of the realities of data re-reference. Once written, most data was never accessed again.
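To put "real money" in concrete terms, here is a back-of-the-envelope sketch in Python. It assumes decimal units (1 PB = 1,000,000 GB) and uses only the round per-GB figures quoted in this column; the function name is ours, for illustration only.

```python
# Back-of-the-envelope cost of one petabyte at a given per-GB price.
# Assumptions: decimal units (1 PB = 1,000,000 GB); the prices are the
# round figures quoted in this column, not measured data.
GB_PER_PB = 1_000_000


def cost_per_petabyte(dollars_per_gb):
    """Return the cost in dollars of one petabyte at the given $/GB."""
    return dollars_per_gb * GB_PER_PB


fc_san = cost_per_petabyte(200)    # FC SAN at "a couple of hundred dollars per GB"
bare_disk = cost_per_petabyte(30)  # disk drives alone at the $30-per-GB level

print(f"FC SAN:    ${fc_san:,.0f} per PB")     # FC SAN:    $200,000,000 per PB
print(f"Bare disk: ${bare_disk:,.0f} per PB")  # Bare disk: $30,000,000 per PB
```

Under these assumptions, the gap between the two approaches is on the order of $170 million per petabyte, before anyone asks how often the data will ever be read again.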
A lot of SATA, and now SAS, storage was and is sold to support the only valid storage "tiering" meme in distributed computing: "capture storage," requiring fast speed/low capacity FC or SAS disk capable of capturing data at the speed that the most demanding application can write it, and "retention storage" featuring lower speed/high capacity SATA disk (plus maybe tape and optical). SAS/SATA, by definition, captured this idea elegantly.
Virtualizing SAN sans SANs
In parallel with the introduction of SAS/SATA was the re-introduction of storage virtualization. Pioneered by DataCore Software, the first outing of storage virtualization technology, which aimed at separating expensive "value-add" array management features from the box of spinning rust and placing them instead in an independent software layer, was summarily panned by the array makers in the late 1990s. The Aughties, which saw an explosion of hype around server virtualization, reinvigorated the storage virtualization story.
It helped that one-time storage virtualization adversaries EMC and IBM subsequently introduced their own hardware products into the market early in the decade, causing the companies to blunt their FUD campaigns around the technology. It also helped that European firms, late to the SAN table, saw the wisdom of virtualizing spindles sooner, in part to cope with OPEX costs that were already becoming an issue for SAN users in the States. DataCore stayed alive by becoming a prominent player in the European market, then reentered the U.S. market with great success.
With the growing adoption of I/O-brain-dead server virtualization products in the last couple of years, storage virtualization has increasingly become the prescription for presenting volumes to guest machines in a reliable way. DataCore, FalconStor Software, Exagrid, and a few other ISVs have been direct beneficiaries of the strategy, especially in shops that do not want to pay the prices charged by three-letter vendors for this functionality and that are seeking to reduce obscenely expensive software license renewals for on-array value-add software.
From Hardware to Software
The reinvigorated fortunes of software players in the storage virtualization space are echoed in other storage software categories. The economic downturn has helped many enterprise customers see the enterprise storage rig for what it is: a stovepipe conceived in the 1990s, when money was flowing and common sense was put on hold.
Management software has benefited from the economy. Tek-Tools, Virtual Instruments, and others have experienced growing popularity as companies seek to manage more capacity without adding staff. Some hardware vendors, again notably Xiotech, have seen the wisdom of gutting the feature/function set on their array controllers, leaving only the functionality that must be delivered at the box level, and instead turning to the ISV community to deliver an ecosystem of best-of-breed, off-box software to perform functions such as thin provisioning, data migration, and disaster recovery.
One of the smartest innovations in the software space was made by CA in its data protection and disaster recovery products. Near the end of the decade, CA added SRM and de-duplication to its backup products -- the former to refine backup job creation so that operational windows would not be compromised, and the latter to prove that you didn't need a special hardware stovepipe to de-duplicate datasets such as backups that you wanted to keep available on disk for fast restore of individual files. Symantec and others are catching on to this strategy, raising the question of whether EMC's purchase of Data Domain, an early de-duplication hardware vendor, for $2.1 billion made any sense at all.
Also of increasing interest is software to help sort out the storage junk drawer. Driven by the twin goals of cost savings (bending the storage-spending curve) and compliance (conforming data to rigorous audit requirements, retention/deletion schedules, etc.), data discovery and auto classification software has become a hot spot.
With "manage thy data better" becoming the eleventh commandment during the Aughties, it made sense for EMC to buy Kazeon Software (not to mention that the move helped sabotage the products of rival vendors using the software in their products). However, the payout to venture-capital backers of the Kazeon deal has not been missed by investors. It has encouraged the funding of many more start-ups with big ideas about how to automate the classification of files. Led by Digital Reef, expect this space to become more important.
Other Trends Worth Mentioning
A few other trends that emerged toward the end of the decade are worth mentioning here. The first has to do with hardware.
Much noise has been made over the past 12 months regarding Flash solid state disk (Flash SSD). The story goes that there are a lot of performance-starved applications (PSAs) that are in need of a solid state read/write target in the form of memory. Using conventional dynamic RAM chips (DRAM SSD) to create the targets is expensive, while Flash memory is considerably cheaper. Hence, we will shortly see Flash SSD supplant high-speed disk, or short-stroked arrays of high speed disk, according to advocates.
For all the talk, we have yet to see a realistic strategy for this architecture. Bottom line: Flash SSD is subject to memory wear (maximum writes to a cell location are 500K, then the cell burns out) and that is an important gating factor in the cost-efficiency of this approach, particularly with respect to transaction-oriented systems. "Wear leveling" has been the focus of considerable engineering by both chip makers and product developers, but the real solution has yet to be advanced. Outcomes over the next couple of years will help determine whether Flash SSD is everything the advocates say, or just a flash in the pan.
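The wear arithmetic can be sketched roughly as follows. This is an idealized model, not a description of any real product: it takes the 500K write limit quoted above at face value, and the write rate, device capacity, and daily write volume are hypothetical inputs chosen only for illustration. The "leveled" case assumes perfectly even wear, which is an upper bound that real devices should not be expected to reach.

```python
# Idealized flash wear-out estimates.
# Assumptions: a fixed per-cell write limit (the 500K figure quoted in
# this column) and hypothetical workload numbers; no real device data.
SECONDS_PER_DAY = 86_400


def days_until_wearout_hot_cell(max_writes_per_cell, writes_per_second_to_cell):
    """Days until one cell location burns out when the same location is
    rewritten at a steady rate, i.e. with no wear leveling at all."""
    return max_writes_per_cell / writes_per_second_to_cell / SECONDS_PER_DAY


def days_until_wearout_leveled(capacity_gb, max_writes_per_cell, write_gb_per_day):
    """Days until wear-out if writes were spread perfectly evenly across
    the whole device (an ideal upper bound)."""
    return capacity_gb * max_writes_per_cell / write_gb_per_day


# Hypothetical transaction log rewriting one location 10 times per second.
hot = days_until_wearout_hot_cell(500_000, 10)

# Hypothetical 64 GB device absorbing 1,000 GB of writes per day.
leveled = days_until_wearout_leveled(64, 500_000, 1_000)

print(f"Hot cell, no leveling: {hot:.2f} days")
print(f"Ideal wear leveling:   {leveled:,.0f} days")
```

The spread between the two results is the whole argument in miniature: the economics of Flash SSD for transaction-oriented systems depend on how close shipping products can get to the ideal case.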
On the other side of the equation, hardware technologies that only a few years ago were said to be on their last legs have come back into vogue. Tape technology is one example. Counted out as a flat line industry segment in large enterprises throughout most of the decade, big tape has seen a resurgence, with adoption trends in large shops looking promising. Spectra Logic may be the big beneficiary of this trend, especially since Sun Microsystems has been swallowed up (together with its STK tape portfolio) by Oracle.
Another technology that enjoyed a spurt of revival was the mainframe. The cost-efficiencies of mainframe computing are significantly better than those of distributed computing, but this point appeared lost on companies until the recession. Many companies that have deployed Big Iron have decided to postpone plans to decommission the technology and, instead, to migrate more workload to it.
The knee-jerk reaction of several "open systems" advocates, including HP, Cisco Systems, Oracle, EMC, and others, has been to mimic the mainframe by purchasing server, storage, and networking companies of their own so they can stand up "mainframe mini-me"s with all hardware and software coming from a single brand. It remains to be seen (1) whether consumers will lock themselves into such a hardware/software stack, with all of its potential for price gouging and inflexibility, and (2) whether such proprietary stacks will actually deliver the promised advantages of "mainframe-like" resource utilization efficiencies.
Another story still incomplete is the storage cloud. We will watch the skies for developments in this space over the coming year. Till then, thanks for reading. Your comments, as always, are welcome: [email protected].