Happy Green-er New Year
All of the Energy Stars in the world will not keep the lights from going out.
In one of the last columns I wrote in 2008, I managed to raise the hackles of Dr. Jonathan Koomey, a project scientist at Lawrence Berkeley National Laboratory (not Lawrence Livermore, as I mistakenly stated in the column), consulting professor at Stanford University, and author of Turning Numbers into Knowledge: Mastering the Art of Problem Solving, which is now in its second edition. Specifically, I referred to his study on server power consumption, which has become a seminal component of the Environmental Protection Agency's Energy Star program as it pertains to IT hardware. In a word, I called the study "flawed."
It was not my intention to discredit Koomey's work -- especially given my understanding of the challenges he faced in coming up with any sensible numbers whatsoever. However, I had to question the impression created by the study: that servers were the number one power pig in the corporate data center. In a growing number of the shops I visit, and in the interviews I conduct (and read) with data center managers, it seems that storage arrays will soon surpass servers, or have already done so, as the leading consumers of electricity in the data center.
As a follow-up to the piece, I contacted Koomey, who generously granted me some time for an interview. As expected, the bulk of the conversation was a defense of the findings of his research, which focused on historical trends using 2000 through 2005 data from analyst IDC's server database. That he needed to leverage the data of a commercial organization whose analytical purity is hardly above suspicion is a pity. Koomey said that the installed base data "didn't seem too far off the mark"; I will not argue the point.
AMD funded Koomey's efforts, which partly accounts for his focus on only server power consumption. In his defense, however, he included networking and storage electricity use in his most recent estimate of electricity used by data centers (http://stacks.iop.org/1748-9326/3/034008). He also took great pains to acknowledge the limits of the original server study -- specifically, the difficulties in characterizing power consumption by servers based on available data.
Said Koomey, "Coming up with a typical server power-consumption estimate given the differences in configuration and workload was a challenge. Any numbers you derive are likely to be wrong for some installations."
He had to sift through manufacturer data, which does not usually capture nuances of configurations -- how much memory is typically installed, how many feature cards, etc. -- let alone differences in workload. Clearly, a file server typically fields fewer requests and generates far fewer IOPS than, say, the same server equipment hosting a high-performance transaction-processing application.
Koomey supplemented manufacturer data with interviews with "key people" at Sun, HP, Dell, and IBM for better estimates. The resulting power consumption data, he said, gave "a reasonable aggregate picture of power use per server."
"The data is not a perfectly accurate picture of actual power consumption," he said, "but it is a good back-of-the-envelope estimate, informed by some of the most knowledgeable technical people in the industry."
Koomey stands by his numbers, and by the forward-looking estimates of server power consumption growth proffered by EPA as a follow-on to his historical research. He said that the LBL scientists used IDC forecasts of server installed base to develop their projections of future server power demand and had "factored in the impact of server consolidation on blades and through the use of server virtualization in its projections." He maintained that servers will continue to be the dominant power consumers in data centers in the future, with upwards of 60 percent of kilowatt hours consumed by the server farm, versus the 10 to 20 percent each that goes to networking gear and to storage gear.
He said that IDC did conduct a similar power-use study focused on storage, using methods similar to those in the EPA report to Congress and reaching comparable results, but he refrained from offering any additional commentary on that study except for the observation that "power use scales with spindles, not with terabytes." He said that LBL would like to do a formal peer-reviewed study of storage that is similar to the server research, but that the work needs a sponsor. Apparently, there is no AMD in the storage world that sees value in understanding storage power consumption for the industry as a whole.
Koomey and I ended the call as friends. We are both seeking to raise awareness about power inefficiency in IT gear so that something can be done about it, he acknowledged. That was a foundation for future conversations.
My parting thought to him was this: assessing energy consumption, as innocuous as it may seem (except for the power bill presented to corporate accounting and perhaps the long-term impact on climate change), has far-reaching implications for the nature of technology products and for the vendors who sell them. Rather than arguing over the question of which technology is the bigger pig -- servers or storage -- perhaps a more systemic view is required, with special attention paid to the impact of vendor competition.
Here's what I mean. Servers are used at less than 15 percent of their capacity in most shops, encouraging the adoption of technologies such as virtualization that are intended to increase their resource allocation efficiency. Can we combine workloads from multiple servers onto one using virtual machines to get our efficiency numbers up and reduce the number of physical frames that we are plugging into the wall? VMware, Virtual Iron, Citrix Systems, Microsoft, and most Linux vendors say yes. (The Linux guys correctly observe that, unlike Redmond, they use all of the cores in a multicore processor, which creates a more efficient utilization model from the get-go.)
I found a kindred soul in Koomey on this point when we shared our thoughts on the greater efficiency of mainframes over distributed servers generally. "It is interesting that we have come back to that point," Koomey said, his smile apparent even through the phone line.
However, a big difference between servers (and perhaps networking gear) on the one hand and storage gear on the other is that the former are largely commoditized and standards-based. A Dell server can be readily replaced by an HP server and an HP server by an IBM server. There is very little difference at the hardware level. The same holds true for most network switches, hubs, and routers despite the value-add features that Cisco Systems, Juniper Networks, or others add to the mix.
Storage, however, is a different animal. Despite reams of paper dedicated to standards concerning interconnects and disk drive components and interfaces, vendors continue to perpetuate the myth that storage products differ vastly from each other. In the case of storage, Fibre Channel switches do not interoperate, even if they conform at base to the same ANSI standards. At the frame level, vendors seem to go to extraordinary lengths to create stovepipes that will not work and play well with a competitor's stovepipes.
Part of the interoperability -- or more importantly, the inter-manageability -- dilemma is owed to a self-serving desire to make things much more difficult for the consumer who is thinking about replacing an EMC array with an HDS array or an IBM array. Proprietary data-pathing schemes, RAID schemes, value-add software schemes, and barriers to status and capacity monitoring via third-party tools all contribute to the pain of heterogeneous storage and its management. While folks such as CTO Ziya Aral at DataCore Software have made great strides to virtualize the mess that is storage in order to bring some coherency and manageability to both array and interconnect heterogeneity, many consumers are still dubious about the storage "V" word. Marketing campaigns by big vendors are aimed at infusing discussions of storage virtualization with fear, uncertainty, and doubt -- except, of course, for the storage vendor's own virtualization solution that works best with its own products.
From where I'm sitting, the perpetuation of the myth of storage product uniqueness by the three-letter vendors is a huge contributor to the growing power consumption of storage gear. One company that seems to get this is Xiotech, with its Intelligent Storage Element (ISE) platform, which it bought from Seagate in December 2007. ISE is a Lego-style building-block array, readily managed via any W3C standards-based Web services approach. I am probably viewed as a cheerleader for this product, having written about it several times in 2008, but I am willing to take the risk, not just because of my affection for the product itself, but because it points storage in a sensible direction.
Hardware is not the root cause of storage power consumption; data is. Piles of data and a lack of attention to its storage from the standpoint of both business value to the organization and frequency of reference in day-to-day business are making IDC's otherwise idiotic "storage gap" thesis (we aren't fielding enough disk to keep pace with rates of data growth) a self-fulfilling prophecy.
Supported by tons of marketing verbiage, disk array vendors are gaining mindshare for two ideas: 1) intelligent archiving is hard work -- too much for busy IT professionals, and 2) disk is easier and more effective as a storage medium for undifferentiated and rarely accessed data than either tape or optical. Their answer to the data burgeon: more disk, and maybe some compression or de-duplication smoke screening to help squeeze the data onto more and more spindles that are themselves ever growing in size.
Think that EMC's "shatter the platter" marketing campaign, intended to turn folks off to optical disc, had no impact? The failure of Plasmon, uber vendor for optical technology, in 2008 can be chalked up at least in part to such efforts. So can the general disenchantment I am sensing from other optical players in their own industry association, OSTA, which failed to counter EMC marketing in any sort of effective way.
You don't need to look very far to see the campaign of the disk proponents against tape technology. Backups, we are told, are now best made to disk, then de-duplicated using proprietary squeezing technology, then replicated over a wire to the same vendor's de-duplicated virtual tape subsystem. Data Domain continues to offer handouts at conferences and trade shows in the form of a red bumper sticker stating in bold black type, "Tape Sucks." I have yet to see one on an actual automobile bumper, but if I did, road rage might result.
Last year saw major improvements in tape technology, including massive capacity gains at the high end and products such as Crossroads Systems' Read Verify Appliance (RVA 3.0) that can help optimize tape operations and dramatically reduce costs. Spectra Logic added comprehensive automation in tape management in their gear, an advance that was hardly mentioned by the trade press.
While it is estimated that 70 percent of the world's data is stored on tape, and it is certainly the case that IT operations people understand the value proposition of tape, the front office cannot help but be influenced by the barrage of marketing and sales information that suggests that tape is a huge cost center best addressed by its replacement with spinning rust.
Against this backdrop, we have research into power consumption and green IT. Most of it focuses on electricity consumption by individual products, rather than looking at things from a systemic point of view: data lifecycle and resource management. Without raising our eyes from the weeds, all of the Energy Stars in the world will not keep the lights from going out.
Happy New Year.
Your comments are welcome. firstname.lastname@example.org.