In-Depth

ParAccel's Coup

Columnar DW specialist ParAccel delivers new analytic database software and trumpets a venture capital cash infusion.

Columnar data warehouse (DW) specialist ParAccel Inc. this week announced both a new version 2.0 release of its Analytic Database software and -- of especial interest at a time when venture capital (VC) backers are becoming increasingly parsimonious (see http://www.tdwi.org/News/display.aspx?ID=9493) -- a fresh infusion of investment capital.

ParAccel's announcement comes at a particularly fraught time in the high-end DW segment. That is what makes its funding win ($22 million from a number of VC backers, including new investor Menlo Ventures) significant: it suggests that VC backers might now be moving away from the hardware-only DW appliance model first popularized by Netezza Corp. That company and relative newcomer Dataupia Corp. -- along with high-end DW champion Teradata Corp. -- remain the three primary purveyors of hardware-only DW appliance systems; another hardware-only player, the former DATAllegro Corp., was acquired almost one year ago by Microsoft Corp. Netezza, a publicly traded company, is profitable, as is Teradata, which was spun off from parent company NCR Corp. more than two years ago. Dataupia, on the other hand, has struggled to raise additional capital. In March, it promoted a new CEO (Cognos Inc. veteran Tony Sirianni) and, in May, laid off approximately two-thirds of its workforce.

ParAccel, by contrast, trumpets a hybrid deployment model: customers can buy its product as software or as preinstalled hardware. Its model is similar, in this respect, to offerings from other "third-wave" DW appliance players -- i.e., vendors such as Aster Data Systems Inc., Greenplum, Infobright Corp., and Vertica Corp. -- as well as seasoned veteran Kognitio (née Whitecross), which has long sold its DW software separately.

Unlike the hardware-only systems marketed by Netezza, Teradata, and Dataupia, the ParAccel Analytic Database can be purchased as software and deployed on a customer's own hardware. Alternatively, by working in tandem with ParAccel or its hardware partners, customers can purchase the ParAccel Analytic Database as a preinstalled option.

For a long time, it seemed as if both models could coexist. Hardware-only appliance players used to tout either the logic or the desirability of the pre-fab appliance model -- DW appliance visionary Foster Hinshaw, late of Dataupia, famously championed what he called a "Tivo Test" for DW appliances (see http://esj.com/articles/2008/01/30/qa-data-warehousing-and-the-appliance-model.aspx): customers wanted something that they could plug in and turn on, not something that they had to size, install, and configure on their own, Hinshaw argued.

It increasingly looks as if the VC community -- if not the market itself -- has decided to back the hybrid horse, industry watchers say. "You can't sustain a hardware-based database company," argues a prominent industry watcher with insight into Dataupia's travails who spoke on condition of anonymity. "Teradata and Netezza have [a] longer life, but in the end they'll need to change."

In Dataupia's case, this insider says, there are certainly other adverse circumstances: "[Oracle] wants to sell more RAC licenses, Dataupia reduces the number of licenses [that a customer needs]." Nonetheless, this insider contends, Dataupia's hardware-only model -- chiefly, its inability "to keep up with commodity hardware" -- constitutes a big problem in its own right.

In this respect, it's hard not to see ParAccel's $22 million financing round as a validation of the hybrid model.

New Checklist Features

In addition to its successful Series C financing, ParAccel also announced version 2.0 of its flagship Analytic Database. That release is similar to several recent DBMS deliverables (from competitors Greenplum and Vertica, among others) in that it both fleshes out ParAccel's SQL feature set -- the revamped ParAccel DBMS supports SQL 2003 amenities such as window aggregates and scalar user-defined functions (UDFs) -- and is said to boost performance, too. Kim Stanick, who manages marketing for ParAccel, says ParAccel 2.0 also ships with a significantly improved query optimizer facility.

In this regard, she argues, ParAccel continues to shed its PostgreSQL origins.

"We've invested heavily in a new advanced Query Optimizer for MPP queries that is MPP aware, columnar aware, and does sophisticated and very advanced column-pruning and also does advanced query rewrites. It also has an extended query decorrelation capability for these correlated subqueries," she says.

"We've decided that the best thing to do is to build the best brain to run the database. It's actually the last part of our database," Stanick continues. "People kind of accuse us of being a Postgres replica. We do have a Postgres origin, but we have two parts to our architecture. One is our compute nodes, which handle the networking; the other is the leader node [where] the parsing and the planning and the optimization happens."

ParAccel's compute nodes were "built from scratch" using a new (non-Postgres) database engine, according to Stanick; its leader nodes, on the other hand, retained a good chunk of Postgres code.

The new Query Optimizer replaces a Postgres-based optimizer that was designed chiefly for OLTP database platforms. "Quite frankly, the Postgres optimizer couldn't plan very well beyond a couple of dozen tables," she says.
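The "query decorrelation" Stanick mentions is a general optimizer technique, and a small sketch may help show why it matters. The Python below is purely illustrative and is not ParAccel's implementation: the `ORDERS` data and both function names are hypothetical. It mimics a correlated subquery (find orders larger than the average for the same customer) first by re-running the inner query for every outer row, and then in a decorrelated form that aggregates once per customer and joins the result back.

```python
from collections import defaultdict

# Hypothetical sample rows: (order_id, customer, amount)
ORDERS = [
    (1, "acme", 100.0),
    (2, "acme", 300.0),
    (3, "bolt", 50.0),
    (4, "bolt", 70.0),
    (5, "bolt", 90.0),
]


def correlated(orders):
    """Naive plan: re-evaluate the inner average for every outer row.

    Roughly: SELECT o.id FROM orders o
             WHERE o.amount > (SELECT AVG(i.amount) FROM orders i
                               WHERE i.customer = o.customer)
    """
    result = []
    for order_id, customer, amount in orders:
        inner = [a for _, c, a in orders if c == customer]  # full rescan
        if amount > sum(inner) / len(inner):
            result.append(order_id)
    return result


def decorrelated(orders):
    """Rewritten plan: aggregate once per customer, then join back."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for _, customer, amount in orders:  # single aggregation pass
        sums[customer] += amount
        counts[customer] += 1
    averages = {c: sums[c] / counts[c] for c in sums}
    # Second pass acts as the join against the per-customer averages.
    return [oid for oid, c, amount in orders if amount > averages[c]]


if __name__ == "__main__":
    print(correlated(ORDERS))
    print(decorrelated(ORDERS))
```

Both functions return the same rows, but the first rescans the table once per order while the second makes two passes in total. On an MPP system that difference would presumably be magnified, since each re-evaluation of the inner query could involve work across many nodes.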

ParAccel and other DW specialists -- including both Greenplum and Vertica -- have variously worked to either refine their open source software (OSS) innards or flesh out their capabilities, particularly with respect to SQL support (see http://www.tdwi.org/News/display.aspx?id=9450).
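For readers unfamiliar with the SQL 2003 window aggregates cited above, the following sketch shows what one computes. It is written in plain Python rather than SQL, the `SALES` rows and the `running_total` function are hypothetical, and it says nothing about how ParAccel or any other vendor evaluates such queries. It reproduces the effect of `SUM(amount) OVER (PARTITION BY region ORDER BY day)`: every input row is kept, and each gains a running total computed within its own region.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample rows: (region, day, amount)
SALES = [
    ("east", 1, 10),
    ("east", 2, 20),
    ("west", 1, 5),
    ("east", 3, 30),
    ("west", 2, 15),
]


def running_total(rows):
    """Emulate SUM(amount) OVER (PARTITION BY region ORDER BY day)."""
    out = []
    # Sort by partition key, then by the ordering column.
    ordered = sorted(rows, key=itemgetter(0, 1))
    for region, partition in groupby(ordered, key=itemgetter(0)):
        total = 0  # the running sum restarts for each partition
        for _, day, amount in partition:
            total += amount
            out.append((region, day, amount, total))
    return out


if __name__ == "__main__":
    for row in running_total(SALES):
        print(row)
```

Unlike a GROUP BY, which would collapse each region to a single row, the window form returns one output row per input row. That is what makes it useful for the running totals, rankings, and moving averages common in analytic workloads.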

ParAccel 2.0 boasts several other differentiators, according to Stanick. For one thing, she claims, it delivers improved support for Clariion storage from EMC Corp. "Our blended scan [capability] gives us the ability to work with EMC Clariion," she says, citing ParAccel's partnership with EMC to deliver a pre-fab analytic appliance.

"Blended scan basically allows you to leverage both the on-server disk that you would get with a normal appliance like us or Vertica or DATAllegro or Greenplum, and you can marry that with a SAN attachment and use all of those SAN disks too. We intelligently blend the data so that the data that's on the servers is a portion of all of the data."

The key benefit, according to Stanick, is that "you're able to leverage both sets of I/O and both bandwidths at the same time."
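The arithmetic behind that claim can be sketched in a few lines. ParAccel has not described how blended scan actually places data, so everything here is an assumption made for illustration: the function names, the proportional-split rule, and the bandwidth figures are all hypothetical. The idea is simply that if data is divided between server disks and SAN disks in proportion to each path's bandwidth, both scans finish at about the same time and their throughput adds up.

```python
def blended_split(total_gb, local_mb_per_s, san_mb_per_s):
    """Split data in proportion to each path's bandwidth (assumed rule).

    Returns (gb_on_local_disk, gb_on_san).
    """
    if local_mb_per_s <= 0 or san_mb_per_s <= 0:
        raise ValueError("bandwidths must be positive")
    local_share = local_mb_per_s / (local_mb_per_s + san_mb_per_s)
    local_gb = total_gb * local_share
    return local_gb, total_gb - local_gb


def scan_seconds(gb, mb_per_s):
    """Time to scan `gb` gigabytes at `mb_per_s` megabytes per second."""
    return gb * 1024 / mb_per_s


if __name__ == "__main__":
    # Hypothetical figures: 1,000 GB of data, 750 MB/s local, 250 MB/s SAN.
    local_gb, san_gb = blended_split(1000, 750, 250)
    blended = max(scan_seconds(local_gb, 750), scan_seconds(san_gb, 250))
    local_only = scan_seconds(1000, 750)
    print(f"local: {local_gb} GB, SAN: {san_gb} GB")
    print(f"blended scan: {blended:.0f} s, local only: {local_only:.0f} s")
```

Under these assumed numbers, scanning both paths in parallel takes about three-quarters of the time needed to read everything from the server disks alone. A real system would also have to account for contention, caching, and skew, none of which this sketch models.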

About the Author

Stephen Swoyer is a Nashville, TN-based freelance journalist who writes about technology.
