In-Depth
The Holy Grail of Storage Efficiency, Part 5: Rethinking Storage Management
In the concluding article in our series, we look at storage management, examine how it serves as a translator, and discuss the search for a useful automated storage management paradigm.
Storage Management versus Storage Resource Management
The typical response to a discussion of storage management is a yawn. The term is commonly used to describe a set of technologies for collecting status information about storage components and interconnects, then collating it into a set of reports or display screens that provide trends and other data useful to storage equipment optimization, troubleshooting, and planning. This view of storage management is more correctly termed "storage resource management" (SRM) and tends to reflect the needs of the storage engineer or technologist more than those of the operator/administrator or business manager.
SRM is a fairly well-defined idea. Collectors obtain information and measurements from storage devices via utility software often deployed on the device itself. This is usually done today via proprietary "element management" software supplied with each storage rig and accessed either via a Web browser or directly via an application programming interface (API) or command line interface (CLI). Such approaches become problematic as the number of devices increases, however, because someone must spend time "surfing" device after device to obtain its status information, then, after contacting the last device, start with the first one again.
A couple of decades ago, the polling process was refined using a protocol that became part of the IP standards suite, Simple Network Management Protocol (SNMP). In this strategy, information was collected into packages with a pre-defined structure or template that could be periodically collected by a software process and returned to a central monitor for correlation and analysis. SNMP was more or less passive, however, as it provided a collection mechanism but offered little in the way of interaction with the monitored device itself -- whether to change configurations or to resolve detected problems. To interact with devices, engineers and administrators needed to access the device directly via its element management utility.
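The collect-and-correlate pattern described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses no real SNMP library, and the `StatusRecord` template, its field names, and the `collect` and `correlate` functions are all hypothetical. The point it tries to show is that every device fills in the same pre-defined structure and that the monitor can read but not change anything.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class StatusRecord:
    """Hypothetical fixed template that every device fills in the same way."""
    device: str
    capacity_gb: int
    used_gb: int
    healthy: bool


def collect(agents: Dict[str, Callable[[], StatusRecord]]) -> List[StatusRecord]:
    """Visit each device agent once and return its record to the central monitor."""
    return [read() for read in agents.values()]


def correlate(records: List[StatusRecord], threshold: float = 0.8) -> List[str]:
    """Flag devices that are unhealthy or filled beyond the threshold."""
    return [
        r.device
        for r in records
        if not r.healthy or r.used_gb / r.capacity_gb > threshold
    ]


# Simulated agents standing in for real devices (all values are made up).
agents = {
    "array-a": lambda: StatusRecord("array-a", 1000, 900, True),
    "array-b": lambda: StatusRecord("array-b", 2000, 500, True),
    "array-c": lambda: StatusRecord("array-c", 1000, 100, False),
}

flagged = correlate(collect(agents))
print(flagged)
```

Note that nothing in this sketch lets the monitor act on a flagged device, which mirrors the passivity described above: fixing `array-c` would still mean going to that device's own element manager.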
Beginning with the introduction of Fibre Channel-connected storage fabrics in the late 1990s, which were devoid of any in-band management capabilities as part of the FC protocol, storage plumbing became more complex. At least three connections needed to be made to each and every array in a fabric: a minimum of two connections to provide a primary and secondary path for data traffic, plus a third IP connection for handling SNMP or management utility access. Efforts to add in-band management to the FC protocol basically died on the vine (one reason that FC cannot create a storage area network is that it has no management layer, which is part of just about any definition of a true network). However, groups such as the Storage Networking Industry Association endeavored to compensate with a new and improved management interface called the Storage Management Initiative - Specification, or SMI-S.
For a number of reasons, SMI-S never found widespread adoption within the industry. Engineers found it difficult and time-consuming to implement on storage arrays and, more to the point, vendors discovered that the effort to enable their products with SMI-S was rarely rewarded by increased consumer interest in purchasing their products. Moreover, a number of vendors did not see the value of universal SRM schemes such as SMI-S because they had little interest in enabling their customers to manage their gear in cooperation with the arrays of competitors.
Ultimately, "SMI-S ready" never became much of a product discriminator and many vendors who implemented early versions stopped supporting the technology. The development effort, which continues today in some quarters, did, however, give rise to a different conceptualization of storage management by defining storage resources as "services." That idea has been seized upon by advocates of Web services generally -- and Representational State Transfer (REST) in particular -- as a mechanism for infrastructure management.
RESTful management of storage uses easy-to-construct APIs to interface with the storage device and provides simple HTTP connectivity for users. It provides a simple set of verbs or actions that enable not only passive data collection and reporting, but actual interaction and configuration capabilities. The pioneer in this space is Xiotech with its Intelligent Storage Element (ISE) blade array, but other vendors are now pursuing RESTful management, including IBM with its Project Zero initiative.
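To make the "simple set of verbs" idea concrete, here is a minimal sketch using only the Python standard library. It is not Xiotech's or IBM's API: the `/volumes/<name>` path, the JSON fields, and the in-memory `VOLUMES` table are all assumptions made for illustration. GET plays the passive reporting role, while PUT changes configuration over the same HTTP connection.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory state standing in for a storage device.
VOLUMES = {"vol1": {"size_gb": 100, "replicated": False}}


class StorageHandler(BaseHTTPRequestHandler):
    def _send(self, code, payload):
        body = json.dumps(payload).encode()
        self.send_response(code)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def _volume_name(self):
        # Assumed path shape: /volumes/<name>
        parts = self.path.strip("/").split("/")
        return parts[1] if len(parts) == 2 and parts[0] == "volumes" else None

    def do_GET(self):  # passive: report status
        name = self._volume_name()
        if name in VOLUMES:
            self._send(200, VOLUMES[name])
        else:
            self._send(404, {"error": "not found"})

    def do_PUT(self):  # active: change configuration
        name = self._volume_name()
        if name is None:
            self._send(404, {"error": "not found"})
            return
        length = int(self.headers.get("Content-Length", 0))
        VOLUMES[name] = json.loads(self.rfile.read(length))
        self._send(200, VOLUMES[name])

    def log_message(self, *args):  # keep the demo quiet
        pass


# Bypass any system proxy so the loopback request goes straight to the demo server.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def call(url, method="GET", payload=None):
    data = json.dumps(payload).encode() if payload is not None else None
    req = urllib.request.Request(url, data=data, method=method)
    if data is not None:
        req.add_header("Content-Type", "application/json")
    with opener.open(req) as resp:
        return json.loads(resp.read())


server = HTTPServer(("127.0.0.1", 0), StorageHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
base = f"http://127.0.0.1:{server.server_port}"

before = call(f"{base}/volumes/vol1")
after = call(f"{base}/volumes/vol1", "PUT", {"size_gb": 200, "replicated": True})
server.shutdown()
server.server_close()
print(before, after)
```

Because the client side is nothing more than an HTTP request, the same calls could in principle be issued from a browser or any other HTTP-capable tool, which is the practical appeal of the approach.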
From the standpoint of traditional SRM, RESTful management eliminates the need for element managers. Xiotech is already working on enhancements to its RESTful management layer that will enable its products to "sense" each other, to automatically "friend" each other, and to share resources in a dynamic and intelligent way to support the capacity, performance, and data protection/replication requirements of the data that it hosts -- all easily viewed on any browser-equipped client from a laptop to an iPad to a smart phone.
From SRM to Storage Service Management
Open standards-based approaches to storage resource management, such as Web Services and REST, compete with more proprietary approaches currently being offered by brand name vendors under the rubric of "unified storage." NetApp, EMC, and others are working to unify the element managers that ship with their boxes so they can be aggregated into a common management console that only works if every storage array deployed has the particular vendor's logo on its bezel. This is part of the simplify-storage-through-single-sourcing meme addressed in a previous column in this series.
To the extent that the two -- open REST and proprietary unified management -- can be compared at all, it is at the level of numbers of components whose management functionality can be combined. It is an exercise in false equivalency, however, to make such comparisons when storage is heterogeneous. In such cases, open standards-based RESTful management has the clear edge. This is evidenced by the practice at Xiotech of publishing its REST management code for common use at Cortexdeveloper.com!
Perhaps a more significant aspect of efforts such as Web services standards-based management is that they change the long-standing definition of storage management from a narrow focus on the management of storage infrastructure components and plumbing to the management of storage services in relation to application requirements.
The diagram below has three overlapping circles arranged in a row, with the center circle overlapping the outer boundaries of the other two. The circle on the left represents service requirements -- what the business, the application, and the data require of storage. The circle on the far right represents storage hardware resources, including media, arrays, interconnects, and topologies. The right side circle, the hardware layer, is the domain of traditional SRM.
The center circle, however, is what we should be calling the domain of storage management. Storage management rationalizes the requests of the users of storage resources and provisions them with the right set of required storage services (hardware and processes). Storage management in this context includes not only the set of storage management activities for which storage administrators are typically responsible -- capacity management, performance management, and data protection (replication and backup) management -- but also includes currently less-well-defined activities, including data management, service-level management, and maintenance/architecture management.
Storage management, in this context, serves as a translator. It takes the "requirements" of the organization -- an amalgamation of data retention, archive, and security policies identified by corporate governance, risk and compliance managers, budgetary advice from corporate finance, plus the accessibility, performance, and capacity requirements dictated by users and applications -- and translates this into technological requests filled with specific hardware and software "services" that are presented by the resource circle. (In Web services speak, everything is a "service" -- from disk capacity, transports, switch ports, and other physical resources, to processes such as de-duplication, archive, and backup.)
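The translator role can be illustrated with a small Python sketch. Everything in it is hypothetical: the `Requirements` fields, the service names, and the thresholds (10,000 IOPS for flash, seven years for archive) are assumptions picked only to show business-side inputs being mapped onto a list of hardware and process "services."

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Requirements:
    """Hypothetical business-side inputs; field names are illustrative."""
    retention_years: int
    encrypt: bool
    min_iops: int
    capacity_gb: int


def translate(req: Requirements) -> List[str]:
    """Turn business requirements into a list of named storage services."""
    services = [f"capacity:{req.capacity_gb}GB"]
    # The performance requirement picks a media tier (threshold is an assumption).
    services.append("tier:flash" if req.min_iops >= 10_000 else "tier:disk")
    if req.encrypt:
        services.append("process:encryption")
    if req.retention_years > 0:
        services.append("process:backup")
    if req.retention_years >= 7:
        services.append("process:archive")
    return services


requested = translate(
    Requirements(retention_years=7, encrypt=True, min_iops=500, capacity_gb=250)
)
print(requested)
```

A real translation layer would draw its rules from governance, finance, and application owners rather than from hard-coded thresholds; the sketch only shows the shape of the mapping.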
Some work has already gone into the creation of such a translation function, of course. Numerous storage vendors have begun collecting information on how certain applications use storage during operation. HP and others use this information to create storage "profiles" that can be used to advise planners and administrators about the characteristics of new storage that needs to be provisioned to applications that are running out of space. Many vendors are following a similar path, and in the future smart storage management may automate the provisioning of both hardware and software resources dynamically based on historical application requirements.
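As a rough illustration of how a usage "profile" might feed a provisioning recommendation, consider the Python sketch below. It does not reflect how HP or any other vendor builds profiles; the function names, the simple average-growth model, and the 20 percent headroom are all assumptions.

```python
from typing import List


def build_profile(monthly_used_gb: List[float]) -> dict:
    """Summarize historical usage as an average monthly growth figure."""
    if len(monthly_used_gb) < 2:
        raise ValueError("need at least two samples")
    # Month-over-month changes in consumed capacity.
    deltas = [b - a for a, b in zip(monthly_used_gb, monthly_used_gb[1:])]
    return {
        "current_gb": monthly_used_gb[-1],
        "growth_gb_per_month": sum(deltas) / len(deltas),
    }


def recommend_capacity(profile: dict, months_ahead: int, headroom: float = 0.2) -> float:
    """Project usage forward linearly and add a safety margin."""
    projected = profile["current_gb"] + profile["growth_gb_per_month"] * months_ahead
    return projected * (1 + headroom)


profile = build_profile([100, 120, 140, 160])
recommended = recommend_capacity(profile, months_ahead=12)
print(profile, recommended)
```

A production profile would likely track more than capacity (I/O patterns, peak periods, read/write mix), but the same idea applies: history in, sizing advice out.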
Ultimately, because it is comparatively easy to equip products ranging from application software, hypervisors, and operating systems to switches, routers, and peripheral devices (including storage) with RESTful management APIs, the hope is that this approach will eventually lead to a more automated storage management paradigm in which an application signals a need for resources and the infrastructure responds to the request with the appropriate set of "services" with no operator intervention at all. In such a setting, provisioning data with the physical storage resources it requires, the optimal path to those storage resources, and the "handling services" needed by the data based on business policies (retention and migration, backup policies, archive policies, etc.) would be dramatically simplified.
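The request-and-response loop imagined here might look something like the following Python sketch. It is purely hypothetical: the `Pool` and `Request` types, the best-fit selection rule, and the sample inventory are assumptions, and a real implementation would also have to handle paths, policies, and failures.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Pool:
    name: str
    free_gb: int
    max_iops: int


@dataclass
class Request:
    app: str
    capacity_gb: int
    min_iops: int


def provision(request: Request, pools: List[Pool]) -> Optional[str]:
    """Pick the tightest-fitting pool, reserve the capacity, and return its name."""
    candidates = [
        p for p in pools
        if p.free_gb >= request.capacity_gb and p.max_iops >= request.min_iops
    ]
    if not candidates:
        return None  # nothing fits; a human would have to step in
    best = min(candidates, key=lambda p: p.free_gb)
    best.free_gb -= request.capacity_gb
    return best.name


# Made-up inventory for the demonstration.
pools = [
    Pool("flash-1", 500, 50_000),
    Pool("disk-1", 2000, 2_000),
    Pool("disk-2", 800, 2_000),
]

first = provision(Request("mail", 600, 1_000), pools)
second = provision(Request("oltp", 300, 20_000), pools)
third = provision(Request("archive", 5_000, 100), pools)
print(first, second, third)
```

The selection rule is deliberately naive. The interesting part is the shape of the exchange: the application states what it needs, and the infrastructure answers without an operator in the loop.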
Until real storage management happens, storage efficiency will remain a holy grail -- a goal that can only be attacked at the edges by smart product selection, smart storage virtualization, and even smarter human administrators.