In-Depth

The Rising Tide in the Data Warehouse vs. Data Mart Debate

Although an enterprise scope is still seen as the Holy Grail of data warehousing, departmental and even enterprise data marts are now countenanced as well.

Is building an enterprise data warehouse (EDW) the best path to business intelligence (BI)? It's a perennially vexing question that -- thanks to a couple of recent trends in BI and data warehousing (DW) -- has taken on new life.

The value of the full-fledged EDW seems unassailable. Over the last half-decade, however, some of the biggest EDW champions have moderated their stances, such that they now countenance the existence of alternatives and, under certain very special conditions, are even willing to admit they're useful. The result is that although the EDW is still seen as the Holy Grail of data warehousing, departmental (and even enterprise) data marts are now countenanced as well.

Active EDW giant Teradata Inc. is the foremost case in point, but other players are staking out similar ground, including relative newcomer Hewlett-Packard Co. (HP), which entered the high-end DW segment through its acquisition of Knightsbridge Solutions and markets Neoview, a DW appliance-like offering. (In addition to Neoview, HP also partners with both Microsoft Corp. and Oracle Corp. to market appliances in the 1 to 32 TB range.)

It's a fine line. Teradata, for example, last year moderated its decades-long stance, effectively conceding that -- under certain conditions -- data mart deployments aren't just inevitable but viable, too (see http://www.tdwi.org/News/display.aspx?id=8968).

Teradata wasn't just paying lip service to a trend. In 2008, it introduced a data mart appliance of its own even as it refined its Active EDW messaging to accommodate data mart deployments. Officials have worked both to establish the validity of a model in which data marts and the EDW coexist and to champion the desirability of Teradata's Active EDW vision. In this respect, representatives argue, although data mart deployments may be inevitable (because business units will try to go around IT when they feel their needs or wants are given short shrift), this doesn't mean that they must remain standalone data marts.

"There are these splintered [groups inside of] organizations that will build a particular data mart … because it's their data … [so] they want to control it, or maybe for political reasons. Even for customers [who] are executing an enterprise data warehouse, there are data marts being built," acknowledged Randy Lea, vice president of product and services marketing with Teradata, in an interview last month. In the same discussion, however, he was also careful to stress that (per Teradata's vision) data marts must at some point be brought into an overarching Active EDW.

"We frequently get asked, 'Wait a minute, appliances are used for data marts. Does that mean you're no longer advocating your enterprise data warehouse -- your integrated data warehouse view?' No, it doesn't. We've always believed … that the value of integrated data drives differentiated answers and questions, differentiated actions that you can take against your competitors for competitive advantage," he said. "We are always going to position our enterprise data warehouse as the primary architecture from a business perspective."

Acquiescence

What's interesting is that Teradata, long a visible (and credible) champion of Big EDW, has company for a couple of reasons.

The first and most obvious reason is that the DW appliance model has indisputable validity. Whether that validity stems from the efforts (or disruptiveness) of the DW appliance vendors themselves; whether it's part of a bigger back-to-users trend; whether it's indicative of what cultural historians call a Tendenzwende (a shift in the Zeitgeist); or a combination of some or all of these factors is still anyone's guess.

The back-to-users trend seems most compelling. No fewer than three vendors (Lyza Inc., QlikTech, Jaspersoft Corp.) now market desktop-based analytic tools. (Microsoft will be the fourth vendor sometime next year.) These aren't the Excel-oriented offerings of old, either; they're full-fledged analytic workbenches, complete with local in-memory data stores. Industry watchers describe them as "Workgroup BI" toolsets, largely to distinguish them from the highly centralized, all-in-one BI suites marketed by Oracle, SAP AG, and other players (see http://www.tdwi.org/News/display.aspx?ID=9373).

Sound familiar? DW appliances bring data processing back to the workgroup, department, or business unit. The term "Band-Aid" has a pejorative connotation, but it's an apt description of how they function. They've been called Band-Aids precisely because they've been used as Band-Aids: organizations deploy them chiefly to mitigate (or eliminate) chronic business pain points, such as poor ad hoc query performance.

At some point over the last 24 months, the DW appliance turned a corner: appliance pioneer Netezza Inc. went from unprofitable to profitable in the space of about a year, and enterprises are aware of the value of the appliance model.

So, for that matter, is Teradata, and it isn't alone.

Recently, HP started making noise about its data warehousing ambitions. Like Teradata, HP wants to subordinate the data mart to its overarching EDW vision. HP doesn't have Teradata's pedigree in the high-end DW space; it effectively vaulted into DW contention two and a half years ago when it introduced Neoview and (about a month later) snapped up BI and DW consulting powerhouse Knightsbridge. Nevertheless, it emphasizes a centralized EDW vision, with its Neoview platform at the center. Neoview, in HP's vision, occupies much the same role as the Teradata Warehouse in Teradata's Active EDW scheme. What's more, HP officials likewise talk up the desirability of the no-holds-barred EDW.

"Wherever the data warehouse is centralized, not only is it cheaper, but you get more out of the warehouse in the long run," argues Vickie Farrell, who heads product management for Neoview. There are exceptions in Farrell's and HP's doctrinal view, however. She concedes that would-be EDW users can't just develop, test, and implement a data warehouse of enterprisewide breadth and scope overnight, so she touts a phased approach to EDWs. "What we're saying is, you centralize to begin with; you federate where you need to; you do your ETL, or your ELT, if that makes more sense."

Federation opens the door to a host of doctrinal exceptions, starting with (otherwise) siloed or application-specific data sources, and ultimately encompassing data marts. Although she's careful to delimit both the scope and the applicability of data marts, Farrell nonetheless concedes that they do have their place. She champions what might be called The Second Coming of Data Marts -- "disposable" data mart configurations that organizations can plug in and quickly populate, largely to address specific business problems.

"We're seeing the complete reinvention of data marts. Where once they were independent -- they were on expensive platforms that required a lot of tuning, they were not under the control or knowledge of the IT organization, … and they [posed] dreadful data synchronization problems. What we're seeing instead are these pop-up disposable data marts that are on very efficient platforms. They are under the control of the IT department, but they're much quicker [and] much easier to deploy." What Farrell describes sounds suspiciously like a DW appliance, of course, a point she concedes.

HP does not position Neoview as an appliance play. It can function as an appliance, Farrell and other officials stress, but it's happiest when yoked as part of an EDW platform. HP is happy to collaborate with Microsoft and Oracle; both companies offer pre-sized DW configurations, and HP offers optimized hardware configurations for both SQL Server and Oracle 11g. The point, Farrell maintains, is that EDW is a goal, not a forklift proposition: it's even conceivable that -- once an organization has centralized most of its data in an enterprise-wide data warehouse -- pockets of comparative isolation (exposed by data federation software or by means of other connective technologies) might persist.

HP's EDW vision is a flexible one, she maintains. "A lot of times, it's doing it on an application-by-application basis. As they're building [new] applications, they're building a better data foundation than what they had before -- [one] that will serve the new application and over time will serve other applications," she comments, citing not just Neoview but HP's Knightsbridge consulting expertise in this regard.

"We're going in and complementing and augmenting and extending what [customers have] already done. We're provisioning the data once so that it can be used by all of the analytic applications, rather than ETL-ing data over and over and over for every data warehouse and data mart they have -- kind of providing an operational analytic platform that complements what they've already done."
