The Myth of Storage Grids
Storage vendors use "grid" as a sexy, futuristic-sounding metaphor, but grid storage has nothing whatsoever to do with grid computing.
Here in Florida, where winter doesn’t last longer than a day or two, it’s nearly time to invite the extended family over for a pool party and cookout. One thing I will have to do this year before firing up the old grill is to replace the grid that holds the hamburgers and hot dogs away from the charcoal. The current one is weathered and rusty.
Come to think of it, you hear a lot of folks talking about grids today. IBM, long-time investor in the High Performance Computing Center at the University of New Mexico (where the idea of grid computing was mostly hatched and demonstrated), throws the word around at every opportunity in describing virtually anything the company is doing. Late last year, Network Appliance began following IBM’s lead and started applying the term, like generous dollops of BBQ sauce, over its vision for the Filer of the future. At least one start-up vendor has actually made the term part of its corporate moniker: ExaGrid Systems in Westborough, MA.
To be sure, grids are hip and cool in certain company: mostly among IT folk and undergrads in scientific research labs and universities -- organizations that often lack the coin to buy high-end hardware and instead use Linux-based commodity servers and grid clustering technology to build their own supercomputers. Hearing how storage vendors are abusing their grid terminology would probably steam these guys so much that the tape on the bridges of their horn-rimmed specs would unravel all at once.
The more savvy and seasoned among the grid computing crowd might simply sigh, realizing the inevitability that their favorite term would eventually be co-opted and rendered as meaningless as “SAN,” “information lifecycle management,” and “cluster” by that huge marketing machine in the storage industry. The storage guys use grid as a sexy, futuristic-sounding metaphor: something to suggest that they’ve got their game on, as my kids might say. They pay little attention to the underlying architectural model, which is something best left to engineers and academics.
The important thing for consumers is not to confuse the two. “Grid storage” has nothing to do with “grid computing architecture.” The latter can be used to aggregate compute resources to do useful things like simulating nuclear explosions, modeling and predicting weather, or searching the heavens for Near Earth Objects that might be on a collision course with the Big Blue Marble. “Grid storage” means selling you the same stuff we always sold you, but using a different name for it.
Maybe I am being a bit harsh, especially on newcomers like ExaGrid. Their product, which goes into general release sometime this year, is an interesting play in the disk-to-disk-to-tape (DDT or D2D2T) data protection market. The solution comprises an extensible network of NAS heads, called GRIDfilers, that store data on disk trays called, you guessed it, GRIDdisks. ExaGrid adds some secret-sauce software for replicating data from primary storage to these platforms, for keeping platforms in sync with one another (FTP-based peer-to-peer protocols), and for virtualizing the storage resources that are part of the package so the whole thing scales with data growth over time.
ExaGrid execs say their product solves the complexity issue in traditional backup, improves data protection reliability, and provides scalability, fail-over and policy-based data migration. In short, it is a “one-stop shop” rather than a distressing kludge of vendors A, B, and C’s hardware components with vendors X, Y and Z’s software components. “Consumers are pretty sick of serving as the point of integration,” ExaGrid pitchmen say.
ExaGrid’s pitch for “NAS as Tier 2 storage” predates Network Appliance’s NAS grid pitch by about a year, raising questions of possible message “copycatting.” However, while the start-up and the stalwart are basically chanting the same mantra -- proprietary hardware and software for file-only migration and replication services -- what they are selling has nothing whatsoever to do with “grid computing.”
Grid computing is a true utility infrastructure comprising commodity gear and mostly open source code. The components are not simply part of the same network; they are seamlessly connected via true, application-level clustering. Most grids are massively parallel by design and can allocate and de-allocate resources to the applications that use them. Sharing storage in such an environment is a technically challenging task, but it is an area of development with tremendous prospects for replacing the brain-dead SANs of today.
I, for one, am looking forward to true grid storage and I am watching the University of New Mexico, not Silicon Valley, to give it to me. Meanwhile, there is a cookout to plan.
Jon William Toigo is chairman of The Data Management Institute, the CEO of data management consulting and research firm Toigo Partners International, as well as a contributing editor to Enterprise Systems and its Storage Strategies columnist. Mr. Toigo is the author of 14 books, including Disaster Recovery Planning, 3rd Edition, and The Holy Grail of Network Storage Management, both from Prentice Hall.