
Kognitio Touts "Train-of-Thought" Analytics

Company says the OLAP status quo has got to go.

Analytic database (ADBMS) specialist Kognitio this month announced the general availability of version 1.0 of Pablo, its cube-less take on OLAP.

Kognitio positions Pablo as an "extension" to its flagship WX2 DBMS. It ships with a complementary offering that Kognitio calls "A La Carte" -- basically, a WX2-enabled plug-in for Microsoft Corp.'s ubiquitous Excel spreadsheet program.

Competitors like to talk up cutting-edge analytic technologies such as MapReduce, but Kognitio principals note that good old-fashioned OLAP isn't going anywhere anytime soon. What has got to go, argued Kognitio vice president of marketing Sean Jackson, is the OLAP status quo of building and maintaining physical cubes.

"One of the problems with a traditional OLAP approach is data latency. The problem with extracting data and transforming it is that it takes time to actually carry out that operation. In some cases, a cube can take hours to build," Jackson told BI This Week in an interview at TDWI's World Conference in Las Vegas.

Enter Pablo, which permits users to construct and query virtual OLAP cubes. Kognitio says the Pablo-fication of OLAP amounts to a kind of "extreme analytics" in its own right. After all, Jackson noted, cubes aren't just problematic to build -- they have to be managed on an ongoing basis as well.

Cubes add up, he argued, so much so that some shops maintain a repository of hundreds or even thousands of cubes.

"People are having real problems actually managing [all of these cubes] ... because when you try to change it, you have to do a complete rebuild -- there isn't a batch mechanism that allows you to do that," he maintained. "Besides, by extracting this data, you're duplicating data all over the place, so you run the real risk of introducing inconsistency into your data warehouse environment."

Pablo promises to eliminate this extract-and-rebuild cycle. Instead of extracting and staging data prior to transforming it into cubes, Pablo generates cubes on the fly. Users can choose to persist cubes -- or virtual cube views -- but they don't have to, Jackson explained. For this reason, he claimed, conventional DBMSes -- even those that (like SQL Server or Oracle 11g) boast built-in OLAP engines -- are at a multidimensional disadvantage relative to a massively parallel processing platform like WX2.

"With your standard relational database, you typically have to compromise as to what you include in the cube -- e.g., fewer dates or fewer dimensions than you might like because you can't actually build the cube if you include all of these things," Jackson observed.

According to Jackson, Pablo enables a new kind of analytic model -- that of "train-of-thought analytics." A less glamorous way of describing what Jackson means would be "attention-span analytics." The idea, Jackson explained, is that Pablo's speed lets a user chase down an insight before it's lost.

"This means sub-20-second consistent query times no matter what query you do. If it goes much longer than 20 seconds, you start losing that train of thought process and it kills what you do," he said.

Beneath the covers, Jackson continued, it's a matter of creating virtualized star schemas -- and of "translating" these star schema structures into cubes. "What we're doing is transformations on the fly through views. We're not not doing the transformations, we're just not doing them on physical disks. By doing it on the fly with views, we're [able to] keep everything in memory. We then instantiate those views, hardwire them into memory. That process, rather than taking minutes or hours -- which it normally takes -- will typically take 10 seconds."
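
To make the idea of a "virtual cube" concrete, here is a minimal sketch of the general technique Jackson describes: a star schema exposed through a view, with aggregates computed at query time rather than stored in a pre-built cube. It is not Kognitio's implementation. It uses Python's built-in SQLite module as a stand-in for an in-memory database, and every table, column, and function name (fact_sales, dim_date, dim_product, sales_star, build_demo, slice_cube) is a hypothetical chosen for illustration.

```python
import sqlite3


def build_demo():
    """Create a tiny in-memory star schema plus a view that joins it.

    All names and data here are illustrative assumptions, not Kognitio's.
    """
    conn = sqlite3.connect(":memory:")  # nothing is written to disk
    conn.executescript("""
        CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, quarter TEXT);
        CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
        CREATE TABLE fact_sales (date_id INTEGER, product_id INTEGER, amount REAL);

        INSERT INTO dim_date VALUES (1, 2010, 'Q1'), (2, 2010, 'Q2');
        INSERT INTO dim_product VALUES (10, 'Widgets'), (20, 'Gadgets');
        INSERT INTO fact_sales VALUES
            (1, 10, 100.0), (1, 20, 50.0), (2, 10, 70.0), (2, 20, 30.0);

        -- The "virtual cube": a view over the star schema, not a copy of the data.
        CREATE VIEW sales_star AS
        SELECT d.year, d.quarter, p.category, f.amount
        FROM fact_sales f
        JOIN dim_date d ON d.date_id = f.date_id
        JOIN dim_product p ON p.product_id = f.product_id;
    """)
    return conn


def slice_cube(conn, dimensions):
    """Aggregate sales by the requested dimensions at query time."""
    allowed = {"year", "quarter", "category"}
    if not dimensions or not set(dimensions) <= allowed:
        raise ValueError("dimensions must be a non-empty subset of %s" % sorted(allowed))
    cols = ", ".join(dimensions)  # safe: names are checked against the whitelist
    sql = f"SELECT {cols}, SUM(amount) FROM sales_star GROUP BY {cols} ORDER BY {cols}"
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    conn = build_demo()
    print(slice_cube(conn, ["category"]))
    print(slice_cube(conn, ["quarter", "category"]))
```

Because the view holds no data of its own, changing the underlying tables or asking for a different combination of dimensions requires no rebuild; the trade-off is that every query pays the cost of the join and aggregation, which is where a massively parallel in-memory platform would have to earn the speed Jackson claims.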
