In-Depth
BI Industry Outlook: Busy and Bright
After a tumultuous year of consolidation, the BI industry outlook seems as busy -- and as interesting -- as ever
At last week's TDWI World Conference in Las Vegas, it was clear: BI is where the buzz is.
A host of BI industry powers announced new products, touted enhancements to existing offerings, or -- in the case of SAS Institute Inc. and Teradata Corp. -- notched new partnerships. Elsewhere, data integration stalwart Informatica Corp. announced a new product and services bundle -- the Informatica Data Migration Suite -- which it says can provide a one-stop shop for enterprise data migration needs. (See http://www.tdwi.org/News/display.aspx?ID=8814)
It wasn't just familiar BI players making news. After a year of unprecedented consolidation in the BI segment, a surprising number of start-up or smaller vendors -- companies such as InfoBright Inc., Kognitio Inc. (the former WhiteCross, which makes its U.S. debut early this year), ParAccel, and Vertica, among others -- are sure to keep things interesting.
SAS Cozies up to Teradata
The SAS/Teradata partnership -- which caps months of increasingly cozy commingling between the two companies (which at one point even prompted an outright denial of an impending merger or acquisition from SAS) -- was one of the biggest events of the show. The two companies are collaborating to enhance the use of SAS' analytic offerings with Teradata Warehouse.
One obvious point of collaboration involves the joint use of Teradata's ADS Generator with SAS' Enterprise Miner. The partners described a scenario in which Enterprise Miner users prepare their data inside Teradata itself, then export the model in Predictive Model Markup Language (PMML), an XML-based format; Teradata consumes that XML and does the scoring in the database.
That capability is available today, according to Ken Hausman, product marketing manager for data integration products with SAS.
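To make the idea of in-database scoring concrete, here is a minimal sketch of the general pattern: read a model description from XML, turn it into a SQL expression, and let the database compute the scores where the data lives. It is not SAS or Teradata code. The XML is a heavily simplified, hypothetical PMML-style regression fragment (no namespace or data dictionary), SQLite stands in for the warehouse, and the table, column, and function names are assumptions made for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical, simplified PMML-style fragment describing a linear model.
PMML_TEXT = """
<PMML>
  <RegressionModel functionName="regression">
    <RegressionTable intercept="1.5">
      <NumericPredictor name="age" coefficient="0.5"/>
      <NumericPredictor name="income" coefficient="0.25"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""


def pmml_to_sql_expression(pmml_text):
    """Translate the regression table into a SQL arithmetic expression."""
    root = ET.fromstring(pmml_text.strip())
    table = root.find("./RegressionModel/RegressionTable")
    terms = [table.get("intercept")]
    for predictor in table.findall("NumericPredictor"):
        terms.append(f'{predictor.get("coefficient")} * "{predictor.get("name")}"')
    return " + ".join(terms)


def score_in_database(conn, table_name, expression):
    """Run the scoring expression inside the database; no rows are pulled out first."""
    query = f'SELECT id, {expression} AS score FROM "{table_name}" ORDER BY id'
    return conn.execute(query).fetchall()


# Stand-in "warehouse" with a small, made-up customer table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, age REAL, income REAL)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, 30, 1000), (2, 50, 2000)],
)

expr = pmml_to_sql_expression(PMML_TEXT)
scores = score_in_database(conn, "customers", expr)
print(expr)
print(scores)
```

The point of the design is that only the model travels to the database, not the data. The sketch interpolates names straight into SQL, which is acceptable only because the model text here is trusted; real PMML also covers far more than a single regression table.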
Other still-percolating initiatives include incorporating more SAS procs into the Teradata database -- both vendors demonstrated Proc-Free and Proc-Put capabilities at the show -- as well as the use of external stored procedures. The big takeaway, according to Hausman, is that Teradata and SAS are working to effectively move SAS analytics into the Teradata database.
Call it synergy. Call it symbiosis. Call it technological mutualism (of a sort). Regardless of what you call it, Hausman argues, integration of this kind benefits joint customers.
"What we're saying is, let's leverage the SAS analytics and put it inside of Teradata. It's a win-win. Customers will get the same broader, deeper analytics that they can get with SAS, and still have the enterprise-class, huge data warehouse that they can get with Teradata," he told BI This Week.
The SAS-Teradata integration project is an ongoing enterprise, Hausman concedes, so customers shouldn't expect either vendor to immediately deliver on the ambitious integration vision they've outlined.
"It's going to be over two years before we get most of that functionality in there. Starting today, we already have established a joint BI competency center together, so we're helping customers with things like [and identifying] best practices. What can you as a joint customer do today to make what you have work better?" he says.
"Within the first half of this year, we'll provide better capacity for loading Teradata, focusing more on our ELT capabilities as opposed to just ETL. We're looking toward converting more of our analytics into Teradata user-defined functions," Hausman continues. "Ultimately, the goal is to take some of our own workspace servers and actually put [them] inside of TD, have [them] run inside, so they can actually call SAS within TD."
Over time, Hausman notes, SAS hopes to initiate similar partnerships with other BI and DW vendors. Why did SAS decide to partner with Teradata first? For a couple of reasons, it seems. Hausman observes that Teradata has been open to partnering.
"Teradata historically does what it does through many partnerships and alliances with vendors, some of which we compete with, some of which we don't," he says. There's also a lot to be said for that most-belabored of industry buzzwords: synergy. In the case of SAS/Teradata, Hausman argues, there's a real opportunity for synergy, largely because there's so little overlap and so much best-of-breed excellence.
"When you peel back the layers of Teradata, they're very strong in everything that you would do relative to DW, all the pieces of it, but they partner for ETL and data integration," he indicates. "They don't really partner so much with the BI side, which we can provide that, and, as far as analytics goes, they have some analytics that are more industry-specific, which we can either complement, or in some cases they don't have the analytics that we can provide, so retail will do some complementary things [and] finance will do some complementary things."
Newcomers Make a Splash
You might expect an anti-climactic atmosphere to prevail at any BI industry trade show after last year's frantic acquisition spree. After all, the acquisitions of Business Objects SA (by SAP AG), Cognos Inc. (by IBM Corp.), and Hyperion Solutions Corp. (by Oracle Corp.) displaced the three leading independent BI and PM players.
All three players were present at this winter's TDWI show (Cognos and Business Objects as themselves; Hyperion as part of Oracle's presence), and their change in ownership was offset to a degree by a whirlwind of activity in the red-hot data warehouse appliance segment.
Appliance pioneer Netezza Inc., along with DATAllegro Corp. and Dataupia Corp., was joined by a host of scrappy newcomers (to the U.S. market, at least): InfoBright, Kognitio, ParAccel, and Vertica.
All four tout a rather un-appliance-like spin on the near-venerable data warehouse appliance: a software appliance without the accompanying hardware.
You can order the hardware from all four vendors, to be sure, but -- if you'd rather just deploy their software on top of your existing assets (or if you'd rather order your own data warehouse hardware from your preferred vendor) -- you can buy just the DW "special sauce" itself.
All four had large presences at the TDWI conference, and all four seemed to be doing a brisk business at their respective trade show booths. Vertica -- which lists both columnar database creator Michael Stonebraker and veteran database architect/Oracle executive Jerry Held on its management roster -- showed off the analytic performance of its columnar technology at its booth, with demos involving sub-second or single-second response times on 5- and 10-billion-row data sources.
How does Vertica hope to differentiate itself from the rest of the field in a teeming DW appliance segment? According to officials we spoke to, Vertica's columnar database technology -- which it has been testing for more than a year in customer accounts -- isn't just a rehashing of 20-year-old columnar or relational database techniques.
"When you look at some of the other companies that are here that provide high performance solutions for analytics, many of them have taken technologies from other places and adapted them or changed them to make them fit, whereas Mike and I started from scratch and architected something completely new from the ground up," says Vertica co-founder Andy Palmer.
"We've built a product that for the past year and a half has been deployed in production at a number of very large customers. We're very excited not only because the deployments have gone well, but because the pace of those deployments has gone well; they were able to get up and running quickly."
That's InfoBright's angle, too. Like Vertica, it markets a columnar database solution that ratchets up performance. InfoBright's BrightHouse analytic data warehouse generates metadata about the data it's storing to help speed both retrieval and analytics. It delivers what CEO Miriam Tuerk calls a "Google-like" approach to information retrieval.
"We're all about working smarter instead of harder. We actually use the intelligence of the data -- there's a lot of information in the data that can give you roadmaps and visibility in terms of how to do things," she comments, "so we just load the data in as it comes. We break the data into individual columns, then break it into data packs, which consist of [more than] 65,000 elements. Instead of just using a standard compression algorithm against the data, we look at the data as we're loading it into the data packs so we can optimize for individual data packs."
BrightHouse stores the information that it collects from its data packs in a so-called Knowledge Grid. This is its metadata layer. "The Knowledge Grid has three different layers of metadata about the data. It keeps information about every single data pack -- about the min/max, about the average range of data in every single data pack," she explains. "The next layer is metadata about the data inside the data pack. We would have a histogram of the data inside the data pack." There's a third layer, too: relationships among the data.
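The per-pack metadata Tuerk describes can be illustrated with a small sketch. This is not InfoBright's implementation, only an assumption-laden toy: a column is cut into fixed-size packs, and each pack gets a min, a max, and a coarse equal-width histogram. The names `PackMeta`, `split_into_packs`, and `describe_pack` are hypothetical, and the pack size is tiny so the example stays readable (the article cites more than 65,000 elements per pack).

```python
from dataclasses import dataclass
from typing import List

PACK_SIZE = 4  # illustrative only; far smaller than the pack size cited in the article


@dataclass
class PackMeta:
    minimum: float
    maximum: float
    histogram: List[int]  # counts per equal-width bucket between minimum and maximum


def split_into_packs(column, pack_size=PACK_SIZE):
    """Cut one column of values into consecutive fixed-size packs."""
    return [column[i:i + pack_size] for i in range(0, len(column), pack_size)]


def describe_pack(pack, buckets=4):
    """Collect min, max, and a coarse histogram for a single pack."""
    lo, hi = min(pack), max(pack)
    width = (hi - lo) / buckets or 1  # fall back to 1 when every value is identical
    counts = [0] * buckets
    for value in pack:
        index = min(int((value - lo) / width), buckets - 1)  # clamp the maximum value
        counts[index] += 1
    return PackMeta(lo, hi, counts)


column = [3, 7, 1, 9, 20, 22, 21, 25, 50]
packs = split_into_packs(column)
grid = [describe_pack(pack) for pack in packs]

for meta in grid:
    print(meta)
```

The metadata is a small fraction of the size of the data it summarizes, which is what makes it cheap to consult before touching the packs themselves.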
BrightHouse's Optimizer component uses this information whenever it runs a query, Tuerk explains. "The optimizer doesn't calculate the query path once: it calculates the best query path, runs it against the knowledge grid, then it looks to see how it can re-optimize," she says. "The way that I resolve the query isn't to access the data by using more MIPS processing, more CPU, and more memory; it's to figure out how to do the query without actually doing the calculation at all."
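One way to picture answering a query "without actually doing the calculation" is min/max pruning, sketched below under stated assumptions. This is a generic illustration rather than BrightHouse's optimizer: for a hypothetical query that counts values above a threshold, each pack's precomputed min and max decide whether the pack can be skipped, answered from metadata alone, or must actually be scanned. The function names are invented for the example.

```python
def build_stats(packs):
    """Precompute (min, max, row count) per pack, as would happen at load time."""
    return [(min(pack), max(pack), len(pack)) for pack in packs]


def count_greater_than(packs, stats, threshold):
    """Count values > threshold, opening a pack only when metadata cannot decide.

    Returns (matching_rows, packs_scanned).
    """
    total = 0
    scanned = 0
    for pack, (lo, hi, rows) in zip(packs, stats):
        if hi <= threshold:
            continue  # irrelevant pack: no value can qualify
        if lo > threshold:
            total += rows  # fully relevant pack: every value qualifies
        else:
            scanned += 1  # suspect pack: the only case that reads the data
            total += sum(1 for value in pack if value > threshold)
    return total, scanned


packs = [[3, 7, 1, 9], [20, 22, 21, 25], [50, 60, 55, 70]]
stats = build_stats(packs)

result = count_greater_than(packs, stats, 21)
print(result)
```

With a threshold of 21, only the middle pack straddles the boundary, so it is the only one whose values are read; the other two are resolved from their min and max alone.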
In a future issue we'll take a closer look at the range of data warehouse appliance competitors. It's a market segment teeming with options.