A Paler Shade of Green Storage

A greater use of the "delete" key may be the greenest data strategy of all.

Last week, I received the results of an independent survey of 225 IT managers commissioned by San Jose, CA-based Storewiz, which touts itself as "the only provider of primary storage data compression solutions." The survey covered questions I track, such as rates of data growth.

Of those interviewed, 74 percent said they were interested in reducing the size of their stored data and management time, and 71 percent professed interest in cutting backup time. Additionally, 82 percent said they had no budget for spending more on storage capacity in this quarter, with 39 percent reporting that they are managing more than 10 TB of data, and 7 percent managing over 100 TB.

Most respondents said their data was growing between 11 and 25 percent per year, while 12 percent said their data growth rate was between 26 and 50 percent per year; 4 percent said they were expecting more than 50 percent growth.

While I am dubious that most companies know how much data they have, let alone have any realistic notion of how fast that data is growing (they usually know how much storage capacity they are adding each year, which is another matter altogether), I found the sampling probative in at least one finding: 28 percent said that power consumption associated with data storage will rise by as much as 50 percent in the coming year.

Power concerns (where additional power is going to come from and how much it is going to cost) seem to be the real driver of Green IT in the US. Unlike those of our eco-conscious brethren in Europe, green initiatives in most businesses here seem solidly focused on cost savings in the area of utility bills.

Of course, green goes a lot deeper than power consumption. The past few weeks have seen announcements of numerous green initiatives aimed at the tech equipment manufacturing process and everyone who is anyone is going along for the ride. Chip manufacturers, LCD screen manufacturers, and many others are taking the green pledge, promising to clean up manufacturing where possible to eliminate or minimize nasty chemical compounds that play havoc with the Earth’s ecosystem. That’s good for everyone.

However, there is an undercurrent in the green movement, particularly among storage vendors whose equipment is rapidly becoming a dominant consumer of power and generator of heat in the contemporary data center, that is a paler shade of green. Much of the talk about greening storage emanating from the vendor community is contributing more to global greenhouse gases than addressing the root causes of storage-related power consumption and how to check it.

A look at recent articles and white papers by the storage vendor community will get you a whiff. "Getting to green storage" involves, depending on the vendor you consult, one or more of the following options:

Option 1: Change out your current disk drives for more capacious units that use the same amount of electricity. One leading enterprise storage vendor, EMC, has just announced its embrace of big SATA drives and its press releases are filled with information about how much more capacity a user will get from the same amount of power. In truth, they are re-treading the same turf already visited by a competitor, Network Appliance, earlier in 2007. Now, just about every vendor of RAID arrays seems to be toeing the line.

Option 2: Add "thin provisioning" (TP) to get more efficient use of the disk capacity you have. The technique was pioneered by DataCore Software back in the 1990s; the term "thin provisioning" was coined later by 3PAR and other vendors to describe a kind of shell game that on-array software plays with "reserved but not yet allocated" disk space. Thin provisioning puts reserved space into play to hold real data, providing the customer with the illusion of "more space" for storing data. TP, so the story goes, is green because it enables the customer to defer additional purchases of storage arrays.

Option 3: Deploy Massive Array of Idle Disks (MAID) systems. The primary evangelist and vendor of this play, COPAN Systems, offers storage arrays enabled with special software technology that allows you to power down disk drives when they are not in use. Doing so, says the vendor, yields reductions in overall energy consumption by hungry disk spindles.

Option 4: De-duplicate your data using products from companies such as Data Domain or Diligent Technologies. Approaches vary somewhat from one vendor to the next, but basically you compress data by substituting a stub for a predefined bit pattern, which, after several passes, makes big files smaller and frees space for reuse on your disk drive. Like thin provisioning, this technology is said to be green because it enables the purchase of additional arrays to be deferred. As an alternative, you can go with Storewiz, whose STN-6000 compression appliance claims to be able to deflate space-hungry data and to return "a minimum of 2x to 6x additional capacity to companies" that deploy its product.

Option 5: Use a combination of options 1 through 4 to see interesting capacity improvements that yield corresponding power demand reductions.
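
To make the de-duplication idea in Option 4 a little more concrete, here is a minimal sketch of block-level de-duplication in Python. It is not how Data Domain, Diligent, or any other vendor implements the feature. The function names (dedupe_blocks, rebuild), the fixed 4 KB block size, and the use of SHA-256 digests as the "stub" are all assumptions made for illustration. The only point shown is that each unique block is stored once and every repeat is replaced by a short reference.

```python
import hashlib


def dedupe_blocks(data: bytes, block_size: int = 4096):
    """Split data into fixed-size blocks and keep each unique block once.

    Hypothetical helper for illustration only; real products typically use
    more elaborate chunking and indexing schemes.
    """
    store = {}   # digest -> block bytes, each unique block stored once
    recipe = []  # ordered list of digests needed to rebuild the input
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # keep only the first copy
        recipe.append(digest)            # the "stub" that stands in for the block
    return store, recipe


def rebuild(store, recipe) -> bytes:
    """Reassemble the original data from the block store and the recipe."""
    return b"".join(store[digest] for digest in recipe)
```

For input made of three identical 4 KB blocks followed by one different block, the store ends up holding two blocks while the recipe lists four references. Nothing has been deleted in the process; the same data is simply packed tighter.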

Welcome to the wacky world of storage, where vendors attack all problems by throwing more technology at them.

Any or all of the above "strategies," while they might reduce power consumption rates in the short term, are not strategic at all. They do nothing to address the root cause of the "carbon footprint" problem in IT—the politically correct and "eco-conscious" way of saying increasing demand for electricity.

The root cause of accelerating power requirements for storage technology is the data itself—or, rather, it is unsorted and unmanaged "junk drawers" of data that we charmingly refer to as storage infrastructure.

That storage arrays are quickly filling to the brim is less a sign of a sudden increase in business-relevant data than a clear indication of mismanaged data. We have accepted the habit of storing all data, from the gems to the junk, in a huge undifferentiated mess that consumes storage capacity the way that acid reflux syndrome eats the esophageal lining.

To address the root cause of climbing power demands strategically, we first need to "green" our data—parse through the data we store, save what is worth saving, and delete the rest. As Mike Linett, CEO of Zerowait, a high-availability storage engineering firm, likes to say, "The delete key is the greenest key on the computer keyboard."

That’s part of it. Green data is realized, in part, by culling out duplicate files, junk data, spam, contraband files, and orphan data from your storage repository. A "deeper shade" of green is realized when you deploy an honest-to-goodness archive strategy at your enterprise, moving data that must be retained for business or regulatory reasons, but that has little chance of re-reference, onto green (low-power) media such as tape and optical.
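
As a starting point for the culling described above, the sketch below lists groups of byte-identical files under a directory so that someone can decide which copies to delete. It is a minimal illustration under stated assumptions, not a product recommendation: the function names (file_digest, find_duplicates) are hypothetical, SHA-256 is an arbitrary choice of hash, and the script only reports duplicates; it never deletes anything.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return a list of groups (sorted path lists) of identical files."""
    # Pass 1: group by size, since files of different sizes cannot match.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Pass 2: hash only the files that share a size with another file.
    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        for path in paths:
            by_hash[file_digest(path)].append(path)

    return [sorted(paths) for paths in by_hash.values() if len(paths) > 1]
```

Calling find_duplicates on a share returns one list per set of identical files. Deciding which copy is the keeper, and which files are junk or orphaned rather than merely duplicated, remains a policy question that no script can answer.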

Greening your data is strategic and capable of delivering long-term reductions in power demand. Greening your storage is tactical, and is ultimately about as beneficial as re-arranging deck chairs on the Titanic.

My fear is that the industry is spending a lot of money getting consumers to believe that its tactical solutions will actually deliver real results. Vendors are finding partners in the energy community. PG&E, the San Francisco-based utility company, recently announced an energy credit program for companies that deploy COPAN Systems arrays. However, you get no credits on your PG&E bill if you deploy tape or optical disc, or initiate a real archive strategy that will have far more impact on energy consumption going forward.

Bottom line: I find myself scratching my head when I read the storage industry response to greening. Is it about the green (carbon footprint reduction) or just about the green (money from selling more bogus green technology)? You be the judge, and feel free to let me know your thoughts. jtoigo@toigopartners.com.