The Real Storage Crisis
A patent infringement case may be just the impetus IT needs to initiate data management projects.
About two weeks ago, there was an announcement of tremendous potential importance to the storage industry. However, it received virtually no attention in the trade press or at the recently concluded Storage Networking World conference in Dallas, TX. The U.S. International Trade Commission (ITC) was about to consider a claim of patent violations involving just about every disk drive manufacturer in the world. The result could be an injunction by the ITC barring import of all hard disk drives into the U.S. until the matter is resolved.
In a statement issued on October 10, the ITC said an investigation has been launched in response to a complaint of violation of Section 337 of the Tariff Act of 1930 that seeks a ban on importation into the U.S. of products that allegedly infringe on U.S. patents. The complaint was made in September by Steven and Mary Reiber of Lincoln, California. They hold the patent for "dissipative ceramic bonding tips," a technology used to make electrical wire connections inside hard drives.
Calls to Western Digital and Seagate, two of the five companies named in the dispute (the others being array makers and resellers Toshiba America Information Systems, Inc., Hewlett-Packard Company, and Dell Inc.), remain unanswered. It is typical for companies to maintain silence while legal investigations are under way, but the bottom line is that the logical outcome of any injunction is a disk shortage, and short supplies provoke higher prices.
With industry analysts projecting an increase in disk storage capacity demand of 300 percent from 2007 through 2011, the question is: how will supplies keep pace in the event of an injunction? If the issue of patent infringement results in a reduction in supply, I'm hoping that it will be an incentive for companies to initiate data management projects.
There is an abundance of data showing how companies are misusing about 70 percent of every hard disk they own. Taking into consideration orphan data and non-business-related material, data that still has business value but is never accessed (and is therefore suitable for archive), and capacity that is being held in reserve by disk array vendors for their own software or by application software to ensure that elbow room is available for future needs, we are wasting most of the disk we have in service today.
Recently, I had the opportunity to chat with IT and storage administrators for very large financial and manufacturing companies and the story is always the same: “oversubscription with underutilization.” Companies are buying far more storage than they need and utilizing what they have very poorly.
In the case of one firm, the IT manager I spoke with at Sun FORUM said he had a Fibre Channel fabric with over 300 TB of capacity. A recent analysis of capacity allocation in the SAN revealed that there was only about 30 TB of useful, business-relevant data in the “SAN.” Despite this fact, the man said the company has been buying more storage capacity every year from its vendors. With budget cuts looming, he sounded desperate to find a way to slow his storage growth.
In another case, the IT manager for a large financial institution with one of its data centers in the New York metropolitan area complained that storage gear was now replacing servers as the biggest consumer of electrical power in his shop. He was constantly adding more capacity to his “SAN” but lacked visibility into his arrays to determine how space was actually being utilized and what data was parked on expensive Fibre Channel drives versus more capacious SATA disk. His vendor simply did not provide tools to assess how space was being used or the importance of data that was written to all of the spinning rust. A pressing problem confronting the man was that additional power for new equipment was increasingly difficult to acquire, yet his vendor was telling him that he would need to deploy additional arrays within the next three to six months.
In both of these cases, a comprehensive storage assessment should be the first order of business. In both cases, however, vendors have reassured the customer that cheap, plentiful disk, combined with new features such as thin provisioning, compression, and de-duplication, not to mention re-driving older arrays with newer high-capacity drives, was the preferred course of action.
Preferred to what? To data management: the analysis and classification of data and its hosting on media based on characteristics such as volatility, access demand and frequency, and business value. While data management might be the real road to storage capacity allocation and utilization efficiency, the short cut, according to vendors, is to keep bits anonymous and to play games with space instead.
Making Storage Management Work
Let’s face it, no one really likes the idea of having to conduct a search of stored bits of data. Even if the result is a more efficient (and in the case of tech infrastructure, more cost-effective) use of space, the thought of doing the heavy lifting of data management is anathema in most IT shops.
The reasons are simple. First and foremost, in most companies IT alone is not qualified to assess the business value of stored data. Making such an assessment work requires the involvement of many business stakeholders, including those who create the data (users), the legal department, senior management, compliance and audit, and many others. If IT still lives in a glass house, apart from the rest of the company, the thought of interacting with the business side of the house may be considered with trepidation and alarm.
Assuming that cultural barriers can be surmounted, the next issue is one of tools. What can we use to reveal the actual distribution of data and to measure its growth, access frequency, and change rates? There are tools, of course, but some represent a hefty investment, if not for the wares themselves, then for the training time and other resources that will be required to put them to work. Many of the companies I talk to regard lack of tools as the number-one impediment to data management and say that they are waiting for silver-bullet solutions such as auto-classification. My advice is always not to wait too long. Compliance, security, business continuity, and data center greening are all being held hostage to blind faith in the progress toward classification nirvana.
Truth be told, an effective job can be done using just about any backup software package (if you want to back into the data characteristics by looking at how often changed data needs to be backed up). Alternatively, through an interview process—such as those used in disaster recovery or security planning (when these activities are done correctly) to identify data assets that need to be protected and business processes that need to be recovered—rudimentary data classification can be accomplished using only a spreadsheet.
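To make the spreadsheet approach concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the column names, the sample rows, the class labels, and the thresholds (180 days without access, less than 1 percent weekly change) are hypothetical and would need to come from your own interviews and backup reports.

```python
import csv
import io

# Hypothetical spreadsheet export, one row per data set. The columns and the
# sample values are illustrative assumptions, not figures from any real shop.
SAMPLE = """name,owner,days_since_last_access,pct_changed_per_week,business_value
orders_db,finance,0,35,high
2005_invoices,finance,400,0,high
product_manuals,support,3,0,medium
old_mp3s,unknown,700,0,none
"""


def classify(row, dormant_days=180, static_pct=1.0):
    """Return a coarse class for one data set (thresholds are assumptions)."""
    # No owner or no business value: orphan or non-business material.
    if row["business_value"] == "none" or row["owner"] == "unknown":
        return "orphan/non-business"
    days = int(row["days_since_last_access"])
    change = float(row["pct_changed_per_week"])
    # Still has business value but is not being accessed.
    if days > dormant_days:
        return "archive"
    # Accessed recently but rarely changed.
    if change < static_pct:
        return "reference"
    # Accessed and changed regularly.
    return "active"


def classify_all(text):
    """Classify every row of a CSV export and return {name: class}."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["name"]: classify(row) for row in reader}


RESULT = classify_all(SAMPLE)
for name, label in RESULT.items():
    print(f"{name}: {label}")
```

The point of the sketch is only that a handful of columns and two or three thresholds are enough to start sorting data into coarse classes; the business-value column still has to be filled in by the stakeholders, not by IT alone.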
The rest of the solution is to come up with infrastructure that is appropriate to data behavior. If data is referenced but not changed, it can be hosted on lower-cost, higher-density media such as tape or optical. Give it a try: a nice product from Digistor Solutions called the Centurion DiscHub Optical Library provides a data hosting location in the form of 100 DVDs or CDs and costs about $400. You can stack these units to get to some amazing capacities on a budget: perfect for the office setting in your distributed environment.
For a more “enterprise-class” solution, you might want to re-visit automated tape libraries for less frequently accessed data. With capacities exceeding several thousand tape cartridges and access times measured in seconds rather than milliseconds, tape might just offer the right combination of price and performance for data that is rarely read. At pennies per GB, you can’t beat the price.
For some reason, despite the well-known failure rates of hard disk drives, there is a bias toward storing everything on disk that simply doesn’t jibe with real-world data use. Somewhere along the way, we started letting the disk array folks call the shots in storage. This bias needs to change, and fast, if we are to green IT, protect data appropriately, and right-size infrastructure to what we really need.
If the ITC moves forward to block importation of disk drives, maybe this will build the fire needed to burn away years of storage abuse and indifference and to get companies serious about data management. Your comments are welcome: firstname.lastname@example.org.