The Holy Grail of Storage Efficiency (Part 2 in a Series)
When considering budgeting for storage, cost efficiency provides the best way to frame a discussion of storage efficiency.
In Part 1 of this series, I looked at the challenge of defining precisely what is meant by a term that has become part of most storage vendors' marketing literature in 2011: "storage efficiency." Various definitions of the term originate in engineering, operations, and finance. They illuminate different -- yet interrelated -- aspects of the same idea. My conclusion was that cost efficiency in storage is associated closely with both efficient storage infrastructure design and with efficient operational management. If you were presented with a Venn diagram depicting storage efficiency, you would need a magnifying glass to find much separation between these ideas.
Bottom line: the cost-efficiency perspective provides the best way to frame a discussion of storage efficiency. It offers a bridge to the front office, which is less likely to understand the nuances of storage architecture and performance metrics. Plus, management controls the purse strings. Without management's support, planners won't see their strategies funded.
In these times of constrained budgets, senior management's attention is rightfully being placed on all significant capital and operating outlays -- of which electronic data storage is one. When you look at the IT hardware budget in most organizations today, storage sticks out like a proverbial nail in search of a cost-cutting hammer. Storage hardware acquisitions account for between 33 and 70 cents of every IT hardware budget dollar, depending on which analyst you read.
Making storage operations more efficient -- that is, deriving more return on investment -- must begin with an examination of what is being purchased. Such an analysis begins with questions: How are we choosing infrastructure components? Are we fitting our selections to a set of strategic architectural criteria derived from a clearly defined vision of how our infrastructure should look and behave in the future?
Alternatively, are we taking a vendor's word for what constitutes "sensible architecture" -- essentially, hitching our wagon to a vendor's star? Or do we take our cues from a prominent industry analyst house? Given the current business models of the Gartners, IDCs, and their smaller kin, taking their word on what to buy may be much the same thing as simply adopting a vendor's recommendations without question.
Starting with a Strategic Vision
What IT planners should be doing, first and foremost, is constructing a strategic plan that articulates a vision of what they want to achieve with their storage investments.
Initially, this requires the planner to answer a simple architectural question: are value-added services such as thin provisioning, de-duplication, on-array tiering, etc. best provided on hardware (on an array controller) or in free-standing software (abstracted away from the hardware)? What are the benefits of buying storage packaged in the way that most of the industry wants to sell it (value-added features increase the cost of storage kits and drive vendor revenues) versus the alternative of buying reliable bare-bones kits that are supplemented by best-of-breed, free-standing software utilities?
You might want to start by considering storage infrastructure requirements (and cost) over time, rather than looking at which vendor has the "hot product" today. This requires you to consider how you will scale infrastructure over time.
How Will It Scale?
Take it on faith that storage capacities will most certainly grow over time in response to data growth. The current predilection of storage vendors to embed value-added functionality directly on their array controllers is, therefore, a double-edged sword.
On the one hand, integrated functionality is appealing for its ease of vendor management. If something breaks, there is no finger pointing between multiple vendors -- between those providing the hardware and those providing software-based, value-added features. This "one-stop shop" model delivers, in essence, a "one throat to choke" meme for addressing problems; this resonates with many IT managers who have better things to do with their time than troubleshoot the problems created by a disjointed gaggle of tech vendors.
Another plus of buying an "integrated storage system" might be that it spares the planner from having to discover and vet third-party software intended to provide the value-added functionality that he or she deems to be desirable for storage. Who among us hasn't groaned aloud at the thought of enduring a bunch of WebEx or in-office software demos delivered by earnest technical sales folk who are mostly clueless about our real day-to-day problems?
Then, of course, there is the issue of "solution management" over time. With storage gear being retained longer these days -- five to seven years, rather than three -- planners want to have confidence that their infrastructure, however they design it, will continue to perform well and receive good maintenance service. Rightly or wrongly, we tend to have more confidence that a big three-letter vendor will still be around a decade from now than we have in a venture capital-funded start-up with software that will "change the world" if enough people license it. Plus, we tend to take for granted that name brand vendors will ensure that migrations occur smoothly as upgrades are made to their hardware and software cobbles -- despite significant empirical evidence to the contrary. We are less certain that a group of software firms will be able to stay in lock-step with changes in hardware.
For all of the confidence that buying recognized brand names may provide, planners also need to look at the facts from a storage efficiency standpoint. A key one: many of the brand name kits, or at least their value-added functions, simply do not scale.
The problem is only now beginning to resonate with IT professionals. You run out of space on a thin-provisioning array, so you need to buy another -- only the thin provisioning engine on array #1 will not span to another stand of disk drives, so now you have two "islands" of thin provisioning disk -- or de-duplicated disk, or tiered disk. Over time, IT planners find that their storage infrastructure has become isolated islands of services, complicating management efforts. Storage efficiency, by any metric you wish to use, is diminished.
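The island effect is easy to show with a rough sketch. The capacities and the request size below are made-up numbers, and `fits_in_islands` and `fits_in_pool` are hypothetical helpers written for illustration, not a description of any vendor's product. The assumption is simply that a controller-bound service can only place a volume within its own array, while an abstracted service can draw on all free space.

```python
def fits_in_islands(free_tb, demand_tb):
    """Controller-bound services: a volume must fit inside a single array."""
    return any(free >= demand_tb for free in free_tb)


def fits_in_pool(free_tb, demand_tb):
    """Abstracted services: free space on every array is shared."""
    return sum(free_tb) >= demand_tb


# Assumed example: two arrays with 6 TB and 5 TB free, and a new 8 TB request.
free_per_array = [6, 5]
request = 8

islands_ok = fits_in_islands(free_per_array, request)
pool_ok = fits_in_pool(free_per_array, request)
print(islands_ok, pool_ok)
```

In this example there are 11 TB free in total, yet the request cannot be satisfied when each array is its own island. The stranded capacity is the efficiency loss.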
It should also be noted that the confidence that supports a one-stop-shop approach in hardware acquisition too often falls prey to the realities of technology trend lines akin to Moore's Law. For example, drive capacities double about every 18 months. When new drives are introduced, they often can't be retrofitted to existing chassis. In a year or so, new SATA drives with 40TB capacities in a 2.5" form factor will come to market that cannot be fitted to existing shelves of lower capacity 3.5" drives. Chances are good that a vendor's "migration path" will be to require customers to perform an upgrade of their kits to take advantage of the newer drives.
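A quick back-of-the-envelope calculation shows what an 18-month doubling period means over the life of an array. The sketch below assumes the doubling holds steady, which is an idealization, and `capacity_multiple` is a hypothetical helper name. The three-, five-, and seven-year horizons match the retention periods mentioned earlier.

```python
def capacity_multiple(months, doubling_months=18):
    """Growth factor in per-drive capacity after `months` of steady doubling."""
    return 2 ** (months / doubling_months)


three_year = capacity_multiple(36)  # two doublings
five_year = capacity_multiple(60)   # roughly 10x
seven_year = capacity_multiple(84)  # roughly 25x

for label, value in [("3 yr", three_year), ("5 yr", five_year), ("7 yr", seven_year)]:
    print(f"{label}: {value:.1f}x")
```

Under that assumption, drives on the market at the end of a seven-year retention period would hold roughly 25 times what the originally installed drives do, which is why the inability to retrofit matters.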
There is precedent: last year, one prominent vendor suggested that refitting arrays with high capacity drives was a "green solution" -- more capacity, same energy draw. The vendor neglected to mention that the firmware of high capacity drives wasn't compatible with its NAS operating system and that the drives themselves couldn't be connected physically to the backplane of the existing rig. Solution: buy a new rig.
Hardware technology change isn't the only constant. All-in-one storage system vendors themselves often find it difficult to keep pace with the real masters of the IT universe: the operating system and business application software houses. When Microsoft or Oracle or VMware decide to change how their OSes or apps interact with storage, it may take six months to a year before storage vendors catch up -- a problem that reflects, by the way, the inherent difficulties vendors have in managing changes made to complex code embedded on value-added array controllers.
Finally, everyone should know by now that vendors routinely attach "end-of-life" provisions to the kit they sell to customers within 18 months of bringing it to market. The responsibility for supporting "legacy" gear -- for providing tier one and tier two support on last year's products -- is increasingly handed off to surrogate maintenance and support organizations. This calls into question the value of long-term warranty and maintenance contracts that are usually purchased with the original equipment. Why pay a premium for vendor support that is provided through a third party after 18 months? More to the point, why renew a vendor-branded warranty and maintenance contract after 36 months (the typical duration of warranties) at the price that the vendor charges (often equal to the cost of the original kit!) when you can source maintenance from a third-party service provider for a fraction of that price?
A Better Strategy
Such questions begin to bridge the divide between the financial, operational, and architectural perspectives on storage efficiency. They suggest that buying one-stop-shop storage technology from a "trusted vendor" may not, in fact, be delivering the strategic value to the organization or return on investment that the vendor claims.
The alternative is to develop a strategic vision that does not key to a particular hardware vendor but instead to a company's own needs. Divide the concept of storage infrastructure into two distinct parts: hardware platform and services platform. Separating the two will expose opportunities to improve storage efficiency substantially and meaningfully.
What are the characteristics that should be sought from hardware platforms? Management is key, though it rarely appears on the top ten lists of consumers buying storage equipment. Performance, capacity, connectivity, on-array data protection, and serviceability are also important.
Readers of this column know well my preference for standards-based storage management, Web Services RESTful management in particular. Leading the way in this area is Xiotech, but many other vendors have indicated that they are also pursuing this direction. With RESTful management, vendors can build capacity and performance management utilities that run on any browser-based device, enabling the administrator to interact with storage using mobile clients such as iPads or smartphones. Moreover, since a growing number of business applications and operating systems already request resources such as storage via REST and HTTP, allocating capacity can become much more automated than it is today, with implications for labor cost efficiency in storage.
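To make the idea concrete, here is a minimal sketch of what a RESTful capacity request might look like from a script or application. Everything specific in it is an assumption for illustration: the host name, the `/volumes` path, and the JSON field names are hypothetical and do not represent Xiotech's or any other vendor's actual interface. The request is built but deliberately not sent.

```python
import json
import urllib.request


def build_provision_request(base_url, volume_name, size_gb):
    """Build (but do not send) an HTTP POST asking for a new volume.

    The /volumes path and the JSON field names are hypothetical.
    """
    body = json.dumps({"name": volume_name, "size_gb": size_gb}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/volumes",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical management endpoint and volume name.
req = build_provision_request("https://storage.example.com/api", "exchange-db01", 500)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Because the exchange is ordinary HTTP carrying a small text payload, the same request could come from a browser, a phone, or an application acting on its own behalf, which is where the labor savings would come from.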
Hardware can be viewed as foundational, providing a rock-solid location for storing bits: "a field of bricks" to use one vendor's description. Above this foundational layer are two layers of services. The first is a presentation layer.
Storage is essentially block in nature. It can be presented as block-based capacity to applications such as Exchange or many databases that use capacity in its raw form, or it can be surfaced as file system-based storage for apps that work with files. Neither of these two modes of presentation (which will shortly be joined by a third, "object storage") needs to be delivered by the base infrastructure itself. The presentation layer can be handled by a specialized server head (a controller) or by a generic server running file services. Separating the two -- storage hardware and the presentation service -- simplifies capacity scaling and cost and allows planners to pick the presentation products that make the most sense given their application set.
Above the presentation layer, or parallel to it, is another kind of services layer -- one that is focused on storage operations themselves. These services include disk pooling, disk caching, thin provisioning (automated allocation of capacity on demand to a virtual storage volume), de-duplication and compression, tiering and archiving, etc. Provided as abstracted software services, these features and functions are no longer restricted to a particular vendor's array controller, but are extensible to all storage infrastructure -- and, more important, selectable based on what the data being stored requires.
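As an illustration of one such service, the sketch below models thin provisioning as plain software sitting above the hardware. The `ThinPool` and `ThinVolume` classes and all of the sizes are hypothetical and greatly simplified; real products track allocation in small extents and handle exhaustion far more gracefully. The point is only that a volume advertises a logical size while physical space is consumed as data is written.

```python
class ThinPool:
    """Physical capacity shared by every thin volume carved from it."""

    def __init__(self, physical_gb):
        self.physical_gb = physical_gb
        self.used_gb = 0

    def allocate(self, gb):
        if self.used_gb + gb > self.physical_gb:
            raise RuntimeError("pool exhausted: add capacity")
        self.used_gb += gb


class ThinVolume:
    """Advertises a logical size; consumes pool space only when written to."""

    def __init__(self, pool, logical_gb):
        self.pool = pool
        self.logical_gb = logical_gb
        self.written_gb = 0

    def write(self, gb):
        if self.written_gb + gb > self.logical_gb:
            raise ValueError("write exceeds the volume's logical size")
        self.pool.allocate(gb)  # physical space is claimed on demand
        self.written_gb += gb


pool = ThinPool(physical_gb=1000)
vol_a = ThinVolume(pool, logical_gb=800)
vol_b = ThinVolume(pool, logical_gb=800)  # 1,600 GB promised on 1,000 GB of disk
vol_a.write(200)
vol_b.write(100)
print(pool.used_gb)
```

Nothing in the sketch depends on which array supplies the physical capacity, so the pool can grow by adding bricks from any source.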
If this sounds like storage virtualization, it is. One way to implement storage services outside of embedding them on a hardware array controller is to place them in a software-based virtualization layer that interacts with the backend hardware infrastructure, presents it in the appropriate access meme (block or file) to applications, then manages capacity allocation, utilization, and performance efficiency.
The Internal Storage Cloud
This vision of storage infrastructure as a field of manageable bricks, surfaced as block- and/or file-based capacity, and supplemented by externalized storage optimization services describes a viable alternative to the hardware silo-based approach that has been plied by the industry for decades. It is reminiscent, in many respects, of system-managed storage in the mainframe world -- where simple DASD was pooled, shared, and augmented by software functionality operated on the host, and not on the storage. This simple design choice made capacity allocation efficiency on the mainframe significantly better (at around 80 percent of optimal) than it is in distributed storage infrastructure today (at about 17 percent of optimal).
Some call this idea an internal storage cloud. Perhaps it is. More on that in the next installment.
Your comments are welcome: firstname.lastname@example.org