In Praise of Operational Data Warehousing

Real-time or not real-time is no longer the question.

The debate over the need for real-time BI misses the point. A better way of looking at the issue, argues Philip Russom, senior manager with TDWI Research, is in the context of operational data warehousing -- OpDW. In this model, Russom explains, real-time isn't an end in itself. Instead, it's just one of several benefits that can be attributed to a robust OpDW architecture.

"Real-time interfaces and similar functions are key enablers for OpDW, but it's about much more than loading data faster into a data warehouse," Russom writes in a new study from TDWI Research, Operational Data Warehousing: The Integration of Operational Applications and Data Warehouses.

Russom explains that operational data warehousing "unifies operational and analytic processes, as well as their supporting technologies. OpDW transforms how the business runs (and how its IT systems interoperate) so that as many business operations and applications as possible are enlightened by the full informational view, historical context, and analytic power of the data warehouse and related business intelligence infrastructure."

Real Time, Right Time: No Time Like the Present

Data integration (DI) vendors have been speaking up about the importance of accessing operational data at real-time (or close to real-time) speeds for half a decade, so you could say it's been a much-hyped theme.

Where there's a surfeit of hype, BI users -- and IT pros, especially -- tend to detect a lack of substance. It's certainly true that brass-tacks adoption of real-time DI has (at least relative to its hype) been underwhelming.

According to a 2009 survey from TDWI Research, for example, just over one-sixth (17 percent) of shops had implemented real-time DI in one form or another.

At the same time, few in the industry would deny the importance -- to say nothing of the emerging criticality -- of real-time information access.

"I don't think there's any question that [real-time] is a priority for our customers," says Michael Corcoran, senior vice president of marketing with Information Builders Inc. (IBI). "A few years ago, I would've added a disclaimer, [something] like 'for some of our customers.' Now, the ability to get fresh -- or fresher -- access to data is something that everyone wants, with very few exceptions."

On the other hand, some data feeds are fresher than others, which is another way of saying that not everybody actually needs (or can afford) up-to-the-second access to operational data.

That's why most real-time discussions tend to have a "right-time" component. "Right-time," in this context, generally describes the acceptable level of latency -- a combination of what's desirable and affordable -- for a specific organization. Because the cost of accessing information increases as one gets closer to real-time, right-time represents an economic trade-off.
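One way to picture the right-time trade-off is as a simple selection problem: among the refresh options an organization could pay for, pick the cheapest one that is still fresh enough. The sketch below is purely illustrative; the option names, latencies, and costs are made-up assumptions, not figures from TDWI or any vendor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RefreshOption:
    name: str
    latency_seconds: float  # how stale the data may be when it is read
    monthly_cost: float     # hypothetical cost of delivering that freshness


def pick_right_time(options, max_latency_seconds, budget):
    """Return the cheapest option that is both fresh enough and affordable.

    Returns None when no option satisfies both constraints.
    """
    feasible = [
        o for o in options
        if o.latency_seconds <= max_latency_seconds and o.monthly_cost <= budget
    ]
    return min(feasible, key=lambda o: o.monthly_cost, default=None)


# Hypothetical menu: cost rises as latency approaches real-time.
OPTIONS = [
    RefreshOption("nightly batch", 86_400, 1_000),
    RefreshOption("hourly micro-batch", 3_600, 4_000),
    RefreshOption("streaming", 5, 20_000),
]

choice = pick_right_time(OPTIONS, max_latency_seconds=7_200, budget=5_000)
print(choice.name if choice else "no option fits")
```

With a two-hour latency tolerance and a budget of 5,000, the hourly option is the only one that qualifies; tightening the tolerance to seconds would require a budget that covers streaming.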

The question of how best to do right-time is as contentious as ever. Russom favors an OpDW-based approach.

"There are good reasons why data should go through the data warehouse," Russom writes, noting that "a lot of operational data is taken from a frozen moment in time, whereas many decisions need a broader view. The warehouse can draw from its historical record to create a long-term or seasonal context for operational data." That's precisely the approach advocated by a lot of BI players -- including, not surprisingly, almost all analytic database entrants.

"We tell [our customers that] the best way to get [to] real-time is to go through the [data] warehouse," says Tasso Argyros, CTO and co-founder of analytic database specialist Aster Data Systems Inc.

"This is where we excel," Argyros continues. "We're designed to address specifically this kind of Big Data [use case], where you're analyzing real-time and historical data in the same [context]. Ten years ago, you just couldn't do this -- the technology didn't exist. Now you can bring this real-time [data] into the warehouse and you [can] use MapReduce to analyze it [along with] historical [data]. You don't want [your managers] to be making decisions without seeing the data in context. The historical [data] gives you context."

An ability to enrich time-sensitive information with historical data is the killer app for real-time, argues IBI's Corcoran. For this reason, Corcoran -- like TDWI's Russom and Aster Data's Argyros -- champions an OpDW-centric approach to real-time. (In all fairness, IBI has been championing operational data integration for half a decade now.)
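As a rough illustration of what "enriching" a time-sensitive reading with history can mean, the sketch below attaches a historical baseline to a fresh value so the two can be read together. The function name, the sample figures, and the choice of a simple mean as the baseline are all assumptions made for illustration; a real warehouse would more likely use seasonal or like-for-like comparisons.

```python
from statistics import mean


def enrich(current_value, history):
    """Attach historical context (baseline and ratio) to a fresh reading."""
    if not history:
        # No history yet: the reading stands alone, without context.
        return {"value": current_value, "baseline": None, "ratio": None}
    baseline = mean(history)
    ratio = current_value / baseline if baseline else None
    return {"value": current_value, "baseline": baseline, "ratio": ratio}


# Hypothetical daily unit sales for one item in one store.
HISTORICAL_DAILY_SALES = [10, 12, 11]

result = enrich(30, HISTORICAL_DAILY_SALES)
print(result)
```

On its own, "30 units today" says little; alongside a baseline of 11, the same number shows sales running at well over twice the usual rate.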

"When you push these real-time business intelligence capabilities down to more and more users, you have a situation in which a manager in one department can see [for example] that maybe the same department in another store is selling lots of red gloves. So he calls them and asks, 'Why are you selling more of these gloves?' Maybe it's because [the manager in the other store] is displaying them with certain hats and these coats, so he can change his own display [accordingly]."

This is one of the biggest reasons why companies that haven't yet made the move to real-time plan to do so -- and soon, Russom observes.

In the same 2009 survey from TDWI Research, for example, an overwhelming majority of respondents -- 92 percent -- said that they expected to move to real-time by 2012. Respondents weren't asked to specify which form their real-time implementations would take, but Russom recommends an OpDW-based approach.

"[O]ne of the roles of the warehouse is to keep a historical record of corporate performance. Time-sensitive data tends to be a good measure of performance, so pushing it through the data warehouse enables the warehouse to record it, aggregate it, and provide an audit trail," he points out. "[T]he data warehouse in an OpDW implementation may execute much of the processing of operational data [via ELT or by rescoring analytic models], provide a needed data staging area of operational data in motion, and generate alerts, recommendations, and other events based on changing data values."