In-Depth

Data Management at a Crossroads

The database industry is at a crossroads, but few agree on where things are headed.

According to IBM Corp., Oracle Corp., and a number of experts, the database industry is at a crossroads. The problem is that no one agrees on just where the industry is headed.

Oracle, for its part, has come around to Big Blue's thinking -- circa 1968. The firm now trumpets a single-stack vision -- comprising Oracle database, middleware, and application software, along with Sun hardware -- that sounds like IBM's single-stack mainframe pitch. Oracle also markets the data management (DM) equivalent of a System z mainframe -- its Exadata version 2 (V2), which it positions as a one-stop shop for enterprise DM -- i.e., a single system, based on a single database configuration, that can support both OLTP and data warehousing workloads.

IBM, on the other hand, offers a DBMS "solution" for any conceivable workload: it markets System z "solution" packages that offer turnkey data warehousing and analytics; pre-configured data warehouse appliances (based on its System x platforms and DB2 database); pre-built analytics databases (its "Smart Analytics" offerings); and TPC-record-setting OLTP configurations (its System p "Deepest Blue" package).

IBM and Oracle aren't the only vendors in the game, of course. Microsoft Corp. has had an RDBMS contender since SQL Server 2000 (or, depending on your viewpoint, since SQL Server 2005); the company also promotes a line of canned data warehouse systems via its FastTrack SQL Server program. The upshot, industry watchers say, is that IBM, Microsoft, and Oracle are no longer "it" as far as DM technology is concerned.

Experts don't just mean Sybase or Teradata, either. They point to a line of scrappy upstarts -- starting with analytic database pioneer Netezza Inc. -- that market specialty databases (or, in Netezza's case, database appliances) designed to address specific workload requirements. Netezza, of course, started out in data warehousing; since then, other players have emerged to push specialty analytic databases (usually based on column-oriented implementations) as well as specialty OLTP databases.

Companies such as Netezza now claim that it's specifically because of the success they've had in Oracle's and IBM's bread-and-butter markets -- and particularly in the very large data warehouse (VLDW) segment -- that both vendors have, in turn, articulated ambitious DM strategies of their own.

Certainly, Oracle CEO Larry Ellison is aware of Netezza. He referenced the appliance pioneer (along with a few others) in several conference calls with financial analysts. There's also a clear sense that Oracle's Exadata V2 strategy (i.e., combined OLTP and DW) is a response both to Netezza (which Ellison lists as one of Oracle's three biggest competitors in the DW segment) and to a new crop of OLTP specialists (such as VoltDB, a new in-memory database start-up) that have emerged to target traditional Oracle database performance or manageability issues.

This is where things get interesting. Last month, IDC analyst Carl Olofson published a report in which he made two predictions: first, that column-oriented (or "columnar") database configurations will come to dominate the data warehousing world; second, that existing OLTP databases -- i.e., vanilla DB2, SQL Server, Oracle, and Sybase ASE -- will either be shifted to run entirely in-memory or will come to be "augmented" by in-memory database technologies.

Database players that use a columnar implementation -- including Sybase (with its venerable IQ) along with newcomers ParAccel and Vertica -- were undoubtedly pleased with Olofson's prediction. Database players that use row-based implementations (albeit in massively parallel processing configurations) naturally took issue with Olofson's prediction.

Take data warehouse guru Foster Hinshaw, for example. He's CEO of analytic database specialist Dataupia Inc., which -- along with competitors Greenplum, Kognitio, and the former DATAllegro -- was part of the first wave of post-Netezza specialty database systems. The database software that powers Dataupia's Satori server appliances uses a row-based implementation. Hinshaw, not surprisingly, doesn't see columnar as a DW silver bullet. Accordingly, Hinshaw -- who (prior to coming onboard at Dataupia) was in at the founding of Netezza (and who has thus been dubbed the "father" of the data warehouse appliance) -- endorses the viability of the specialty database model, but rejects Olofson's and IDC's prediction that columnar will come to dominate the DW side of that segment.

"Depending on the [kinds of queries] that you tend to know about ahead of time, that feeds well into the columnar or aggregation world. Columnar works great if you know the queries you're doing ahead of time. If you have a different kind of query, if you have a query that you can't anticipate -- so the kinds of ad hoc queries that you get all of the time [in the decision support realm] -- they don't work nearly so well. That's why you need both. The ad hoc capability [that you get from a row-based approach] as well as the kind of pre-determined capability, which is all you get from columnar, or which we can give you with materialized views."

On the other hand, Dave Menninger, who heads product marketing with Vertica, touts Olofson's report as a vindication of both the column-oriented and the specialty database models, which he contrasts with the row-based approaches of competitors such as Dataupia and the "One Database to Manage Them All" model outlined by Oracle with Exadata.

"Our sort of philosophy is that one size does not fit all. You can't use a single technology for every different type of application. You can't use a single technology for every single workload. Could you use the same hardware to do OLTP and DW, like what Oracle is claiming with Exadata? Sure. Are they really tightly integrated? I don't know about that. They're running on the same box, but they're not really tightly integrated," he argues.

"A better way of looking at it isn't 'Can I do this?' but 'Can I afford to do this?' The cost of building, managing, and growing [a single database for OLTP and DW] with something like Exadata is just much, much more than [the cost of] Vertica."

Phil Francisco, vice-president of product management and product marketing with Netezza, seems genuinely uninterested in IDC's conclusions. "I'm not familiar with the report, so I can't speak firsthand from it," he said when asked. Given a summary, Francisco, not surprisingly, disagrees. "I just don't think that's going to happen. Honestly, I don't. If you take one of the biggest real benefits of columnar [technology], [which is] compression -- you're able to get huge compression from columnar -- we can deliver similar [compression] capability with our compression engine," he continues.

"Column is great for certain kinds of queries, but for very busy systems, where you have a mix of ad hoc [and] very active processing, it's just hard for columnar to keep up."

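Francisco's point about compression can be illustrated with a minimal sketch. It assumes run-length encoding as the compression scheme, chosen here only because it is easy to read; it is not a description of how Netezza, Vertica, or any other vendor actually compresses data. The sample table and the function names are hypothetical.

```python
from itertools import groupby

# Hypothetical sample table of (region, status, amount) rows, sorted by region.
ROWS = [
    ("east", "open", 10),
    ("east", "open", 12),
    ("east", "closed", 7),
    ("west", "open", 3),
    ("west", "open", 9),
    ("west", "open", 4),
]


def to_columns(rows):
    """Pivot a list of row tuples into one list per column."""
    return [list(col) for col in zip(*rows)]


def run_length_encode(values):
    """Collapse each run of consecutive equal values into a (value, count) pair."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def run_length_decode(pairs):
    """Expand (value, count) pairs back into the original sequence."""
    return [value for value, count in pairs for _ in range(count)]


# Columnar layout: values of the same column sit next to each other,
# so a sorted, low-cardinality column forms long runs.
columns = to_columns(ROWS)
region_runs = run_length_encode(columns[0])

# Row layout: values are stored in row order, so neighboring values
# come from different columns and rarely repeat.
row_major = [value for row in ROWS for value in row]
row_runs = run_length_encode(row_major)

print(len(columns[0]), "region values stored as", len(region_runs), "runs")
print(len(row_major), "row-major values stored as", len(row_runs), "runs")
```

In this toy case the region column shrinks from six values to two runs, while the row-major layout gains nothing from the same encoding. Real systems, row-based ones included, use other techniques that this sketch does not model, which is the substance of Francisco's counterclaim.
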
Although Olofson's and IDC's prediction that column-oriented databases will come to dominate analytic workloads is contentious, to say the least, the IDC report's conclusion that specialty databases (or database technologies) will emerge to service both DW and OLTP applications seems less controversial. The question of what the database landscape of the future will look like is still anyone's guess, however. Oracle touts its Exadata vision; Big Blue, a diverse data management platform experience; specialty vendors, a workload- or application-specific technology prescription, usually with an idiosyncratic (or self-serving) wrinkle or two.

Data management consultant Mark Madsen, a principal with consultancy Third Nature, seems anything but enamored of Oracle's single-platform pitch, however.

"I think that Oracle … may be right about the bulk of customers moving to Exadata simply because Oracle is flogging it as the path, and there are lots of companies with smaller workloads where it's feasible," he comments. "Running all the OLTP and BI workload on the same box is iffy unless they can dynamically partition and isolate workloads to keep one part of the system from stepping on the other. There's still a good reason for separating analytic workloads and data from the operational workload and data. I don't see it going away. If [venture capital] investment in analytic databases is any sign, the future is separate data platforms for the two workloads."
