In-Depth
Why All-in-One Platforms are a Data Warehouse Dinosaur
If a single, combined platform for transaction processing and analytics is a bad idea, isn't it likewise a bad idea to champion a single DW platform for the wild profusion of analytic workloads? Netezza, an IBM Company, thinks so.
- By Stephen Swoyer
- 09/20/2011
Don't ask Razi Raziuddin, director of product marketing for IBM Netezza, about the virtues of a combined platform for analytic and OLTP workloads.
Raziuddin says he doesn't want to hear it. The industry's spent the last two decades getting away from an all-in-one platform for data management (DM); during the last decade, Raziuddin argues, this transition actually accelerated.
The upshot, he maintains, is that all-in-one is regressive, not progressive.
"What customers are realizing is that while that [all-in-one] architecture might look great on paper, it's very difficult to achieve. With data growth [trends] and the different types [of] analytics that customers increasingly want to do, it's very difficult and almost cost-prohibitive for a single system to accommodate all of the [different kinds of] data and data types," Raziuddin contends. "That's where you pretty much have to use a distributed model where you have ... systems optimized for different workloads and different analytics working together."
To the extent that all-in-one has any currency, Raziuddin suggests, it's largely at the behest of -- or, more precisely, in the service of -- vendors that want to control the complete DM stack, such as IBM Corp.'s arch-rival, Oracle Corp.
"IBM has a very different perspective. We announced a strategy [viz., Smart Consolidation for Smarter Computing] that's really all about the shift in enterprise data warehouses and large data warehouse architectures from [a] very monolithic to a large distributed data warehouse architecture," he comments.
There's a sense, he suggests, in which DW professionals have labored to replicate in a data warehousing context that which they profess to eschew in a broader DM context: the all-in-one platform -- in this case, the enterprise data warehouse. If a single, combined platform for transaction processing and analytics is a bad idea, isn't it likewise a bad idea to champion a single DW platform for the extreme diversity -- the wild profusion -- of analytic workloads?
"The industry over the last decade or so has tried to build this single large EDW that'll meet the needs of every single user in the enterprise and every single line of business in the enterprise, using a monolithic system, very much propagated and promulgated by Teradata, and also by Oracle and IBM," Raziuddin explains.
There's another wrinkle here, too -- namely, that the EDW, which (in the abstract) promises a single, centrally managed version of the truth, almost invariably produces its opposite.
"In most environments [the needs of] the line of business … are not met in time, and so they go off and create all kinds of data marts, and it's very common that you get a scenario where you've got hundreds of spreadmarts scattered across the enterprise," he points out.
"What we're talking about ... is you can go in and consolidate these marts into appliances. We've done this over and over again [with] a pretty large number of our customers. You can get real tangible value in a short amount of time by consolidating these data marts and putting all of the analytics onto these Netezza systems."
This isn't a vision in which one all-in-one platform supplants another, however.
In a distributed analytic environment, Raziuddin explains, queries must be shunted or routed to the appropriate destinations -- i.e., to systems that have been optimized for their specific workload characteristics. Ditto for certain classes of users. In this respect, IBM's vision isn't unlike the one Teradata touts in tandem with its forthcoming Unity offering.
Raziuddin acknowledges as much. The industry is trending toward a distributed analytic architecture, he maintains. It's just that some vendors are more bullish about this change than others.
"If you have different systems optimized for different [analytic] workloads, you will also need [to be able to] transparently ... route queries," he explains. "You want transparent query rerouting technology, [which] we have ... in the form of IBM's Federation Engine, [which is] part of InfoSphere Federation Server. These are integrations that you'll hear more about in the not-so-distant future."