Getting Smarter About Storage (Part 2 of 3)
Choosing cloud-based vs. in-house storage solutions is like herding cats -- tricky.
In my previous column, I considered two trends in the storage industry that struck me as diametrically opposed. The first was one we have seen for several years: adding automated functionality to an already-bloated array controller in order to "smarten" storage for the non-storage geek.
The second trend was to deliver storage capacity as a service. Increasingly, "cloud storage" marketing is succeeding in convincing companies that storage can be better purchased "by the pound" like so much ground beef, alleviating the labor costs associated with building your own infrastructure.
How are these trends in opposition? Simply, "smart storage" equipment from brand name vendors costs significantly more than its simpler cousins -- both in terms of acquisition price and in cost of ownership. In a nutshell, vendors are using "value-add" functionality embedded directly on array controllers to jack up the price of their commodity components -- the underlying tin cabinets and spinning rust -- to obscene levels. Curiously, higher price seems to have become synonymous with "enterprise-class value" in many consumers' minds, despite the obvious single-vendor lock-in that is created by deploying this technology.
If cloud storage service providers are required by their customers to deploy the same brand name "smart storage" products (mainly, to ameliorate consumer concerns about resiliency and performance), they will not be able to survive in the market because they will not be able to realize the economies of scale from storage sharing that are key to sustaining their business models.
Whether deployed in the cloud or on your raised floor, so-called smart arrays tend to increase storage infrastructure and labor costs rather than reduce them. This is an inescapable, if counterintuitive, fact of life. To be sure, the purported cost savings of smarter array controllers, given that they automate such functions as capacity management through on-array tiering, thin provisioning and de-duplication, may be realized in a shop where there is only one storage array -- or in some instances when a multi-array environment is composed wholly of a single vendor's product. Letting the smart box, rather than a skilled storage administrator, do the work of forecasting capacity requirements, provisioning capacity to applications, or migrating data around on spindles, may, indeed, enable companies to downsize expensive IT staff as vendors claim.
Typically, this produces a short-term gain but with longer-term consequences. Over time, sustaining labor cost-savings requires businesses to deploy only one type of equipment from only one vendor, creating a dependency on that vendor. In turn, this requires that the business wed its storage infrastructure to the vendor's product road map, ignoring useful innovations that might occur outside the vendor's development shop and exposing the business to the demands of the vendor's design and delivery processes. IT veterans remember this from the mainframe experience of the 1980s, when IBM sought to force-march its customers to next generation products year after year whether the customer needed them or not.
In addition, vendor product developers are tasked to build equipment that ensures that it will be difficult and costly for customers to deploy a competitor's wares from the standpoint of capacity administration. With many of the smart storage products in the market today, just sharing capacity between heterogeneous arrays (e.g., those from different vendors or different classes of installations from the same vendor) provokes nightmares. The result is that smart arrays become stovepipes -- isolated islands for data that resist common coherent management.
If storage clouds are, as marketing suggests, your data center in the sky, then the same problems apply to them with respect to "smart storage" as to the traditional in-house IT shop. Although some of the early cloud storage providers have attempted to create their own storage infrastructure, designing a JBOD-based commodity infrastructure with a software NAS clustering front end, they don't seem to be getting the attention from the larger business clients they expected. Bigger firms seem to have been convinced by years of brand-name vendor marketing that "you get what you pay for," so the pressure is on many service providers to deploy stovepipe solutions from EMC, IBM, NetApp, and others in order to attract the more lucrative business accounts.
Storage virtualization, which abstracts value-add software functionality from the array controller and thereby reduces equipment to its basic JBOD or simple RAID form, has the advantage of leveling the field in the brand wars and exposing the simple truth that services are more efficiently provisioned off-box. Simplifying storage and delivering value-add functions from servers or routers reduces the cost of the crate of disks, reduces the tendency of the array to fail (software failures account for many more operational outages in storage than hardware faults), and enables value-add services to be available to a growing number of spindles rather than to a stovepiped few.
In addition, controller functionality abstraction enables users to administer value-add functions in a more consistent and less costly way. Think about the difference. If you buy three smart arrays from three brand-name sources, someone must be responsible for setting up on-array storage tiering policies for each array. In a virtual storage environment, the tiering service extends to all three platforms, enabling all data-tiering policies to be administered from a single screen by a single administrator.
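To make the contrast concrete, here is a minimal sketch of the off-box idea: one tiering policy, defined once, evaluated against volumes no matter whose array holds them. Everything in it is an assumption made for illustration, including the array labels, the tier names, the day thresholds, and the Volume and TieringPolicy types. It is not modeled on any particular virtualization product.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    name: str
    array: str              # hypothetical label for the box holding the volume
    days_since_access: int


@dataclass
class TieringPolicy:
    hot_max_days: int       # touched this recently -> "fast" tier
    warm_max_days: int      # touched this recently -> "capacity" tier

    def tier_for(self, volume: Volume) -> str:
        # The rule never consults volume.array: the same policy applies
        # to every vendor's hardware.
        if volume.days_since_access <= self.hot_max_days:
            return "fast"
        if volume.days_since_access <= self.warm_max_days:
            return "capacity"
        return "archive"


def plan_tiers(volumes, policy):
    """Return a {volume name: target tier} plan for all volumes."""
    return {v.name: policy.tier_for(v) for v in volumes}


# Illustrative inventory spread across three different (made-up) arrays.
volumes = [
    Volume("erp-db", "vendor-a", 1),
    Volume("file-share", "vendor-b", 45),
    Volume("old-projects", "vendor-c", 400),
]
policy = TieringPolicy(hot_max_days=7, warm_max_days=90)
plan = plan_tiers(volumes, policy)
```

With three smart arrays, the equivalent of TieringPolicy would have to be set up three times, once in each vendor's own tool. Here it is defined and changed in one place.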
Some would argue that storage virtualization merely transfers the lock-in created by hardware-embedded value-add software to a software vendor instead. Such comments are frequent at recent storage conferences, and they contain a kernel of truth. In some cases, they are followed by a war story describing a bad experience with this or that virtualization product from Symantec, IBM, or some other vendor. To be sure, software controllers -- which are what software virtualization products are -- do have their foibles. Some are as proprietary and almost as expensive as the array controllers whose functionality they usurp.
Truth be told, a storage virtualization product is only one way to host storage value-add functions. It is a convenient approach in many cases, but it is by no means a necessary one. The idea of a storage resource management engine predates the concept of a "virtual storage controller" and can be implemented in a number of ways.
In the mainframe world, Systems Managed Storage (SMS) was traditionally a function of the operating system. That model is being challenged as IBM encourages companies to migrate non-mainframe workloads into mainframe logical partitions and has begun fielding storage platforms to support those workloads and their data, platforms that are definitely not managed with the SMS tools used for traditional DASD. Still, the idea of having the operating system host tiering logic, capacity allocation, and the like is as old as the proverbial hills.
Storage Resource Management (SRM) has also been sold as a free-standing utility software package for many years. SRM leverages a combination of information sources about storage, whether collected via Simple Network Management Protocol (SNMP) or through proprietary array-vendor application programming interfaces (APIs), to provide storage administrators visibility into infrastructure conditions and to report on status. Users must interpret this information and make changes to configurations of each piece of equipment manually using the specific vendor's own element management software package.
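The SRM pattern described above can be sketched roughly as follows: per-source collectors feed a common report format, and the tool flags conditions without changing anything on the arrays. This is only a sketch under stated assumptions. The collector classes are stand-ins that return canned numbers rather than polling real SNMP agents or vendor APIs, and every name, figure, and threshold is hypothetical.

```python
from typing import NamedTuple, Protocol


class CapacityReport(NamedTuple):
    array: str
    total_gb: int
    used_gb: int


class Collector(Protocol):
    def collect(self) -> CapacityReport: ...


class SnmpCollector:
    """Stand-in for a collector that would poll an array over SNMP."""

    def __init__(self, array: str, total_gb: int, used_gb: int):
        self._report = CapacityReport(array, total_gb, used_gb)

    def collect(self) -> CapacityReport:
        return self._report  # canned data; no network call is made


class VendorApiCollector:
    """Stand-in for a collector built on a proprietary vendor API."""

    def __init__(self, array: str, total_gb: int, used_gb: int):
        self._report = CapacityReport(array, total_gb, used_gb)

    def collect(self) -> CapacityReport:
        return self._report  # canned data; no vendor API is called


def arrays_over_threshold(collectors, threshold=0.8):
    """Report-only: name the arrays whose used fraction exceeds threshold."""
    reports = [c.collect() for c in collectors]
    return [r.array for r in reports if r.used_gb / r.total_gb > threshold]


collectors = [
    SnmpCollector("array-a", total_gb=1000, used_gb=900),
    VendorApiCollector("array-b", total_gb=5000, used_gb=2000),
]
flagged = arrays_over_threshold(collectors)
```

Note what the sketch does not do: it never reconfigures an array. Acting on a flagged array is still left to the administrator and the vendor's element manager, which is the limitation described above.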
Useful as SRM tools are, they largely failed to resonate with consumers over the years. One problem was the difficulty finding an SRM package that interfaced with all of the hardware brands fielded by consumers in their shops: SRM developers needed to make an annual pilgrimage to every hardware vendor in the market to beg or buy access to their APIs just to keep up. Another issue was the timeliness of the information collected by the tools themselves.
To its credit, the Storage Networking Industry Association tried to tackle these issues with a new provider technology that ultimately became the Storage Management Initiative Specification (SMI-S). Unfortunately, SMI-S was developed by many of the vendors who were bent on maintaining the "smart storage as stovepipes" model. SMI-S was challenging and costly for smaller vendors to build into their equipment, and the efforts of larger vendors to enable their platforms with the technology, which could enable vendor-X products to be deployed next to vendor-Y products without management disruption, seemed at best half-hearted.
In the end, many SRM vendors took the path of many of the storage virtualization vendors, preferring to manage storage at a more abstract level: working with storage volumes that servers could see rather than the physical gear itself. That increased, rather than decreased, the incentives of the array vendors to field value-add functionality on their array controllers directly, since neither SRM nor virtualization products provided the means to configure or otherwise manage the operation of underlying hardware.
In the final analysis, the idea of smarter storage is closely aligned with the concept of ease of storage management. None of the storage management paradigms proffered over the past few years has delivered a unified approach that addresses both the resource management component of storage management (the hardware configuration side) and the services management component of storage management (the software services side) in a satisfactory manner.
There is hope on the horizon. Part of the hope comes in the form of an ongoing initiative being pursued by one storage array vendor, Xiotech, to build a management paradigm from open-standards Web Services protocols. A few weeks ago, I was in Eden Prairie to see a demonstration of a REST-based storage management tool running on an iPad that effectively delivered full visibility and manageability of something like a petabyte of distributed storage infrastructure. It knocked my proverbial socks off, and we will examine it in greater detail in the next installment of this column.
Since then, I have also been talking to a number of start-ups in the cloud technology space, including SIOS Technology Group, Aprigo, and Crossroads Systems/Fujifilm, whose software stacks for internal and external clouds and for storage analysis services bode well for improved management of storage assets. I'll talk about what we learned in a future column.
For now, the smartest storage move is probably to eschew on-array smarts in favor of an off-box approach to managing data across hardware. Your comments are welcome. firstname.lastname@example.org