Analytic Database Players Keep Things Interesting

Vendors competing in the fractious analytic database segment sure know how to keep things interesting

Vendors in the ceaselessly striving analytic data warehousing (DW) segment certainly know how to keep things interesting.

Just last month, for example, columnar database specialist ParAccel Inc. announced a pair of Monty Hall-worthy deals: Cash for Clunkers (as in clunker DBMSes, not cars) and "Faster or Free," an initiative that effectively has the company putting its money where its marketing mouth is. Nor was that all: rivals Aster Data Systems Inc., Greenplum Software Inc., Kognitio, Netezza Inc., and Vertica Inc. -- all of which compete with ParAccel in the analytic database arena -- recently trumpeted announcements of their own.

ParAccel Preening

The first phase of ParAccel's Cash for Clunkers program targets Netezza -- and only Netezza -- systems. Customers can "trade in" any flavor of Netezza appliance for a deeply discounted ParAccel software subscription that entitles them to one free year of the ParAccel Analytic Database (PADB) and two additional years at what amounts to cut-rate ($15,000 per TB) pricing. ParAccel's list price is $100,000 per TB; its real-world pricing is said to be closer to $50,000 to $60,000 per TB.

To put this into perspective, ParAccel's $15,000-per-TB pricing both undercuts Netezza's nominal pricing (Netezza charges about $18,000 per TB) and is competitive with the $15,000-per-TB that at least one of its competitors (Teradata Corp.) charges for its entry-level appliance.
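The per-TB figures quoted above can be compared with a few lines of arithmetic. The following Python sketch uses only the prices cited in this article; the constant and function names are illustrative, and it compares per-TB rates only, because the article does not spell out billing terms.

```python
# Per-TB prices (USD) as quoted in the article.
PARACCEL_LIST = 100_000          # ParAccel list price
PARACCEL_STREET = (50_000, 60_000)  # reported real-world range
PARACCEL_PROMO = 15_000          # Cash for Clunkers rate
NETEZZA_NOMINAL = 18_000         # Netezza's approximate nominal price


def discount_pct(reference: float, price: float) -> float:
    """Percentage by which `price` undercuts `reference`, to one decimal."""
    return round(100 * (reference - price) / reference, 1)


if __name__ == "__main__":
    print(discount_pct(PARACCEL_LIST, PARACCEL_PROMO))        # 85.0
    print(discount_pct(PARACCEL_STREET[0], PARACCEL_PROMO))   # 70.0
    print(discount_pct(PARACCEL_STREET[1], PARACCEL_PROMO))   # 75.0
    print(discount_pct(NETEZZA_NOMINAL, PARACCEL_PROMO))      # 16.7
```

On these numbers, the promotional rate is 85 percent below ParAccel's list price, 70 to 75 percent below its reported street pricing, and roughly 17 percent below Netezza's nominal per-TB price.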

Oddly, ParAccel announced Cash for Clunkers about a week after Oracle unveiled Exadata v2. The timing seems odd because ParAccel passed up the chance to have some fun at the expense of its biggest rival: namely, by serving up an Oracle-themed Cash for Clunkers migration program for "legacy" Exadata version 1 customers.

That's an opportunity that company officials say they don't have any immediate plans to pursue -- although they don't rule it out, either. "Will we have fun with this and extend it beyond Netezza? We've started thinking through the second phase and the third phase and even the fourth phase, and I think we can have a lot of fun with it continuing. However, I can't talk about anything [concretely] in terms of that yet," says CEO David Ehrlich.

"Faster or Free" is a less lighthearted promotion. Officials like to claim that ParAccel hasn't lost any of the dozens of proof-of-concept (PoC) trials it's entered; competitors counter that if ParAccel hasn't lost any PoCs, it's because it proactively withdrawals if it looks like it's going to lose.

Ehrlich denies the allegation. "We just don't get beat in customer environments. Every customer environment we go into we end up winning on performance," he says. What of competitor claims that ParAccel tends to opt for tactical retreats when it's opposed by credible competition? Not true, Ehrlich contends. "I'll be frank," he says. "We haven't won every deal. Sometimes people make a decision on features or brand, for example, so we don't win [in those cases], but from a price-performance standpoint, we've never been beaten."

That's the impetus for Faster or Free, Ehrlich maintains; ParAccel positions it as a very-few-strings-attached proposition: would-be challengers have only to sign an evaluation agreement.

From there, they can download PADB and benchmark it against either their current or prospective DBMSes. If ParAccel isn't faster, he says, the PADB software license -- exclusive of maintenance costs -- is free.

Ehrlich says ParAccel plans to disclose the results of Faster or Free, too. "We hear that other companies are out there saying the same thing. That they don't get beat, either. We know that isn't true for every single one of our competitors, because we've beat[en] all of them," he explains. "So we finally said, 'We need to deliver some credibility around this. Why don't we put our money where our mouth is and just go out and make a bold statement?'"

Kognitio Highlights Promise, Perils of Big Data

Unlike ParAccel, analytic database pioneer Kognitio -- which formally incorporated whitebox database appliance specialist Whitecross Systems almost half a decade ago -- has been relatively quiet. In early August, however -- in time for the TDWI World Conference in San Diego -- Kognitio trumpeted the results of a survey that it says explores the intersection of Big Data data warehousing and personal privacy. It concluded that most respondents believe that the companies that collect their personal information for marketing purposes plan to use that data in unethical ways. Just over a quarter (28 percent) believe companies are using (or plan to use) this data "in an ethical manner." (Responses to Kognitio's survey were collected from a self-selecting sample that participated via Twitter.)

Why is Kognitio surveying users about data privacy? In part, officials concede, because the increasing prevalence of "Big Data" solutions such as Kognitio's own WX2 database (along with analytic entries from Netezza, ParAccel, Vertica, Greenplum, Aster Data, and other competitors) makes it possible for companies to collect and analyze customer data at a scale never before believed possible. In other words, says Sean Jackson, vice-president of marketing with Kognitio, the temptation to misuse such data is probably greater than ever.

"Responsible companies are making sure that they use this data in an ethical way, but customers seem to have concerns regardless," said Jackson, during an interview with BI This Week. He insists, however, that Kognitio doesn't have a brief here, at least when it comes to touting privacy-friendly or other privacy-enhancing data safeguards. WX2's native administration facility gives shops the tools that they need to safeguard data, control, and audit data access, and -- if necessary -- age data out of a system, pursuant (for example) to corporate or regulatory policies. Kognitio's competitors offer similar (if less intuitive, says Jackson) capabilities.

"It's just something that we wanted to raise awareness about. We don't really have any angle here. This is an issue that we in the industry -- as companies that offer these technologies [viz., analytic databases] that enable [customers] all that they need to store and analyze data at this unprecedented scale -- have a responsibility to draw attention to," he concludes.

Aster Trumpets New MPP Online Backup

Aster Data Systems, for its part, celebrated the one-year anniversary of its MapReduce co-coup -- last September, both Aster and Greenplum touted the near-simultaneous availability of DBMS-native MapReduce -- by announcing a new MPP backup feature, dubbed nCluster Online Backup. Officials talked up online MPP backup in a June interview (coincidental with Aster's release of a new nCluster-branded analytic data warehouse appliance), and they also discussed a similar online backup and recovery scenario in the January timeframe. At any rate, Aster formally introduced online MPP backup as part of its new nCluster and nCluster Cloud Edition releases, which it shipped a few weeks ago.

In the past, representatives have positioned MPP backup as consistent with Aster's "recovery-oriented computing" vision. "A customer can perform full or incremental backups [of an Aster MPP cluster] while nCluster is running. This is a production scenario we're talking about here, with users running queries against nCluster. They [customers] can even do backups while they're simultaneously loading data [into nCluster]," said Mohit Aron, a software architect with Aster Data, in a June interview.

Aron didn't position this capability as a forthcoming or a soon-to-be-introduced feature; more to the point, a description of Aster's delta-based backup and recovery facility -- which officials outlined to BI This Week in a January interview -- sounds very similar, too. "It winds up being on the order of seconds to minutes to copy back the data that's changed, as opposed to our competitors and some traditional approaches that do a brute force [recovery] of just copying all of the data," said Steve Kung, senior director of product marketing with Aster, at the time.

When one or more nodes in a cluster fail, Kung explained, "[you do have to] provision a new node. You pop [that node] into your nCluster, [and] in that scenario it will take time to recover, because you are migrating and recovering many, many terabytes of data. Naïve approaches by our competitors and others would do that in a way that requires an outage; our approach is that we do require the transfer -- there's no way to avoid that -- but in this scenario, we maintain availability for the system. As these many terabytes of data are actually copying, we do it in the background, and we allow queries and changes and other types of workloads to continue uninterrupted."
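The contrast Kung draws between copying back only the data that has changed and a brute-force copy of everything can be illustrated in a few lines of Python. This is a generic sketch of delta-based backup, not Aster's implementation, which the article does not describe in detail; the fixed-block model, the SHA-256 comparison, and every function name here are assumptions made for illustration.

```python
import hashlib


def block_digests(blocks):
    """Fingerprint each block so later versions can be compared cheaply."""
    return [hashlib.sha256(block).hexdigest() for block in blocks]


def compute_delta(previous_digests, current_blocks):
    """Return {index: block} for blocks that are new or changed since the
    last backup. Unchanged blocks are skipped, which is the whole point."""
    delta = {}
    for index, block in enumerate(current_blocks):
        digest = hashlib.sha256(block).hexdigest()
        if index >= len(previous_digests) or previous_digests[index] != digest:
            delta[index] = block
    return delta


def apply_delta(backup_blocks, delta):
    """Bring a backup copy up to date by writing only the delta blocks.
    (Hypothetical helper; truncation of the source is not handled.)"""
    restored = list(backup_blocks)
    for index in sorted(delta):
        if index < len(restored):
            restored[index] = delta[index]
        else:
            restored.append(delta[index])
    return restored


if __name__ == "__main__":
    last_backup = [b"orders-2008", b"orders-2009", b"customers"]
    live_data = [b"orders-2008", b"orders-2009-updated", b"customers", b"clicks"]

    delta = compute_delta(block_digests(last_backup), live_data)
    print(sorted(delta))  # [1, 3]
    print(apply_delta(last_backup, delta) == live_data)  # True
```

In this toy case only two of the four blocks are transferred. A brute-force approach would copy all four, and the gap widens as the unchanged portion of the data grows.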

What's new is Aster's description of a dedicated backup cluster (the aptly titled Backup Cluster), as opposed to backup within the cluster or to other non-dedicated nodes (which are presumably running online, analytic workloads). This Backup Cluster approach is consistent with disaster recovery (DR) or business continuity planning (BCP) best practices, inasmuch as it permits geographical separation between source and backup systems. What's more, Aster officials say, Backup Cluster even supports a cloud-based topology, so ambitious DR or BCP practitioners could experiment with online backup to either private or public cloud services, such as Amazon's Elastic Compute Cloud (EC2).

More News: Netezza, Vertica, Greenplum

Weeks before ParAccel's Cash for Clunkers and Faster or Free pitches, rival Netezza Inc. stepped up its own marketing efforts with three announcements. It unveiled a new analytic database appliance in tandem with DW software specialist Kalido Inc.; introduced a version of its Netezza Performance Systems (NPS) appliances optimized for Oracle's Business Intelligence Packaged Application (which comprises Oracle Business Intelligence Enterprise Edition and Oracle Business Analytics Warehouse, among other offerings); and delivered a vertical-specific effort -- in tandem with both Kalido and consultancy AMR Research -- to promote packaged analytics for the pharmaceutical industry.

The Oracle Business Intelligence announcement, which Netezza made early last month, seems particularly ironic in light of Oracle's own DW product blitz -- i.e., Exadata version 2 -- which it trumpeted just a couple of weeks later.

With Exadata v2 -- as was the case during its Exadata v1 launch a year ago -- Oracle took direct aim at Netezza, calling it out by name and claiming that Exadata was both faster and cheaper.

Vertica has been as busy as any of its competitors. Late this summer, for example, it announced a major new revision of its DW software -- Vertica 3.5, which boasts a new "optimized" columnar technology ("FlexStore") -- and unveiled a joint BI-and-DW-in-the-clouds offering with partners JasperSoft, Talend, and RightScale. That was in August. Then, last month, Vertica was one of a number of BI and DW players (such as data federation specialist Composite Software Inc.) that presented at VMworld. Vertica officials were on hand to discuss the challenges of virtualizing DBMSes -- and particularly analytic DBMSes -- as well as some of the best practices that Vertica (in tandem with virtualization heavyweight VMware Inc.) has developed.

Early last month, Greenplum, which (along with Netezza and Kognitio) is a stalwart of the analytic database segment, announced a new accord of its own: a partnership with customer experience analytics (CEA) specialist ClickFox. Under the terms of the agreement, ClickFox says it has "integrated" its CEA software with -- more precisely, ported its data architecture to run on top of -- Greenplum's analytic DBMS. Greenplum's DBMS, like Aster Data's nCluster, boasts a native implementation of the MapReduce API. This enables developers to write directly to MapReduce -- without having to take into account the complexity of the underlying MPP architecture.

The upshot, Greenplum officials claim, is that ClickFox -- which today processes about half a billion "multi-channel customer interactions" each month -- now believes it has enough analytic horsepower to accommodate the projected one billion monthly customer interactions it expects to process in 2010.
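For readers unfamiliar with the MapReduce programming model mentioned above, the following Python sketch shows its general shape: the developer supplies a map function and a reduce function, and a driver handles the grouping in between. It is a single-process toy, not Greenplum's or Aster's API; the record layout, the "channel" field, and all function names are hypothetical.

```python
from collections import defaultdict


def map_interaction(record):
    """Map step: emit a (key, value) pair for each customer interaction."""
    yield record["channel"], 1


def reduce_counts(channel, counts):
    """Reduce step: combine all values emitted for one key."""
    return channel, sum(counts)


def run_mapreduce(records, mapper, reducer):
    """Toy driver. A real MPP engine would partition the records across
    nodes and shuffle keys between them; here everything runs in-process."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in groups.items())


if __name__ == "__main__":
    interactions = [
        {"customer": 1, "channel": "web"},
        {"customer": 2, "channel": "phone"},
        {"customer": 1, "channel": "web"},
    ]
    print(run_mapreduce(interactions, map_interaction, reduce_counts))
    # {'web': 2, 'phone': 1}
```

The map and reduce functions never refer to nodes, partitions, or data placement. That separation is what allows an engine to parallelize the work without the developer accounting for the underlying architecture.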