In-Depth

DataFlux, Data Integration, and the Data Management Platform

DataFlux's bid for data integration bragging rights isn't going to be easy, given the market. CEO Tony Fisher wouldn't have it any other way.

At the TDWI Winter World Conference in Las Vegas, data quality (DQ) vendor DataFlux, a subsidiary of SAS Institute Inc., had an industry coming-out party of sorts. The company announced the first fruit of a long-percolating technology effort: its new DataFlux Data Management (DM) Platform.

"There are a lot of things going on that led up to where we are today," DataFlux CEO Tony Fisher told BI This Week. "The primary thing is that there's a real movement in organizations to consolidate [their] data management activities, so fundamental to our decision to take SAS data management and merge it with DataFlux data management was a recognition that this made sense from [the perspectives of] our customers, with what they were doing internally."

The DataFlux Data Management Platform employs an array of formerly SAS-only data integration (DI) technologies. SAS plans to continue to co-develop some technologies (e.g., SAS Enterprise ETL) for an indefinite period, but DataFlux will increasingly become the public face of SAS DI.

It's an ambitious agenda, but it wasn't hastily conceived, Fisher stresses. Almost two years ago, in late 2008, SAS announced Project Unity, an effort whereby DataFlux was eventually to take over stewardship of SAS's DI toolset. The technology reshuffle made sense, SAS officials said at the time, in part because the requirements of business intelligence (BI) and data analysis had themselves changed.

DI was increasingly less concerned with traditional ETL (and its batch-centric world view). The DI of the present and future -- with its real-time or right-time connectivity requirements -- prescribed an array of complementary technologies: ETL (with an emphasis on faster information access), data quality, data monitoring, data profiling, event management, master data management, and other disciplines.

Fisher says that SAS and DataFlux tended to approach DM from two very different perspectives. "SAS had always had very, very strong expertise … in ETL, in ELT, [and] in the data warehousing-analytic side of data management," Fisher explained. "DataFlux, on the other hand, had always had very strong technology in data quality, in SOA, in transaction-oriented integration, and in Web-services-oriented integration."

The new DataFlux Data Management Platform combines both perspectives, comprising what Fisher bills as a one-stop shop for data management: data quality (DQ), data profiling, data monitoring, master data management (MDM), business rules management (BRM), event processing, data federation, and ETL.

Fisher invites comparison with other DI offerings (from IBM Corp., Informatica Corp., Oracle Corp., and others) but insists on a semantic distinction. "We're very careful about not calling it a stack. We're very careful about calling it a framework and a platform. Stack to us implies multiple levels of technology, everything from the operating system to the ERP system. When you think about the stack vendors, you think about SAP, you think about Oracle," he argues.

"As a platform, our advantage is that organizations usually have multiple stack environments. Nobody ever has a single stack," he continues. "If you are 90 percent SAP, you should buy SAP's data integration tools. They might not be as good, but ultimately they'll be a better fit for you. The environment that we thrive in is [one where you have] the integration of all of these different stacks."

Platform Neutrality

Since its acquisition by SAS Institute Inc. in 2001, DataFlux has tried to position itself as a comparatively autonomous DQ player. SAS officials used to likewise emphasize DataFlux's autonomy: if (for example) you asked SAS DI marketing manager Ken Hausman about real-time data integration -- a technology area that would at least seem to be the purview of SAS ETL -- he'd strongly demur, disclaiming "that's where we'd bring DataFlux in [to speak with a customer]: real-time, right-time, that's more their area of expertise."

In an enterprise application market that's been rocked by acquisition, neutrality sells. Consider the case of Informatica Corp., which presciently rebranded itself as an independent, DI-only player, a move that required it to effectively jettison its PowerAnalyzer analytic environment seven years ago. Informatica was able to capitalize on the rapid consolidation of the DI segment in 2005 and 2006. The result: its revenues, profits, and prominence spiked sharply.

With its new Data Management Platform, DataFlux hopes to lay a claim to neutrality, too. It's done so in part by "forking" (or by developing and maintaining two slightly different versions of) the former SAS DI technologies.

After all, the Achilles' heel of SAS DI (from a platform-neutrality perspective) was its reliance on SAS code. The traditional SAS ETL engine, for example, is a SAS language beast. The DataFlux Data Management Platform is a polyglot proposition, however. "The Data Management Platform is primarily a SQL code generator [and] an XML code generator. We also integrate very well with the [SAS] Data Integration Studio workflows, so you can manage them within the Data Management Platform as well," Fisher explained.

He positions the move away from a SAS language-only engine as dictated primarily by topological and not by political concerns. "One of the real reasons for moving away from the SAS language is the ability to do more and more things closer to the database, to do more ELT, to do things with XML, and to use XML workflows to do things that aren't easily managed within SAS."
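To make the "closer to the database" idea concrete, here is a minimal sketch of the ELT pattern Fisher describes: a tool generates SQL, and the database itself performs the cleanup instead of an external engine pulling rows out and transforming them. This is an illustration only, not DataFlux's implementation. The helper `generate_elt_sql`, the table names, and the cleansing rule are all hypothetical, and SQLite stands in for whatever database a real deployment would target.

```python
import sqlite3


def generate_elt_sql(source: str, target: str) -> str:
    """Hypothetical code generator: emit SQL that cleans rows inside the database."""
    return (
        f"CREATE TABLE {target} AS "
        f"SELECT id, UPPER(TRIM(name)) AS name "
        f"FROM {source} "
        f"WHERE name IS NOT NULL AND TRIM(name) <> ''"
    )


def run_demo() -> list:
    """Load raw rows first, then transform them in place with generated SQL (ELT)."""
    conn = sqlite3.connect(":memory:")  # in-memory stand-in for a real warehouse
    conn.execute("CREATE TABLE raw_customers (id INTEGER, name TEXT)")
    conn.executemany(
        "INSERT INTO raw_customers VALUES (?, ?)",
        [(1, "  acme "), (2, None), (3, "globex"), (4, "   ")],
    )
    # The transformation runs in the database engine; no rows leave it.
    conn.execute(generate_elt_sql("raw_customers", "clean_customers"))
    rows = conn.execute(
        "SELECT id, name FROM clean_customers ORDER BY id"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(run_demo())
```

In a traditional ETL flow, the trimming and filtering would happen in a separate engine before the load. Here the data is loaded raw and the generated statement does the work where the data already lives, which is the trade-off Fisher points to.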

Not surprisingly, Fisher pitches the first edition of the DataFlux DM Platform as a feature-complete release. He concedes, however, that Project Unity -- the feeder effort which helped produce the DataFlux Data Management Platform -- is in many respects an ongoing effort. For this reason, he says, DataFlux will continue to support several SAS-y attributes in its flagship (i.e., platform-neutral) DM offering. SAS, moreover, will likewise continue to develop and market a SAS-ified (i.e., optimized) version of its DI platform to existing customers.

"There are occasionally things that SAS just does really, really well, so if you look especially at some of the mining stuff and some of the predictive stuff, those technologies within SAS are the best in the world, so we want to be able to take advantage of them. Primarily we will maintain the SAS [lineage] for these things."
