DataFlux Puts a Data Quality-Centric Spin on Customer Data Integration

Data quality specialist sees CDI as an evolutionary extension of bread-and-butter data management.

And so it goes. In the last month, business intelligence (BI) players such as Business Objects SA and Informatica Corp. have taken the data quality plunge, but that doesn’t mean data quality pure plays (or quasi-pure plays) are sitting still. Some are hacking away at the next frontier in data management: customer data integration (CDI), with all of its attendant data quality and reliability issues.

Take SAS Institute Inc. subsidiary DataFlux, which this week announced a new customer data integration solution, dubbed, appropriately enough, DataFlux CDI. DataFlux execs bill the company’s first branded CDI deliverable as a software- and best practices-based bundle that’s designed to help companies get their own CDI practices up and running. To that end, DataFlux CDI brings a quality-focused approach to the synchronization, consolidation, and management of customer information. It’s based on service-oriented underpinnings and plugs right into DataFlux’ Data Quality Integration Platform.

DataFlux director of corporate communications Daniel Teachey says CDI is more or less an extension of DataFlux’ existing data quality toolset. And lest you question DataFlux’ own motives in the customer data integration market, Teachey says the company’s customers have already been tapping its data quality tools for use in their customer data integration efforts. “What we’ve sort of noticed in the last two to three years is that a lot of our customers have been trying to use our data quality and data profiling to create a single view of their customer information,” he comments. “What we’ve done in the last 18 to 24 months really is co-define a best practice and method, a data model, around a CDI solution, so that we can really shorten the time to implementation for these data quality projects. So it’s a rapid-time-to-implementation approach to CDI.”

Ron Agresta, solutions manager for DataFlux CDI, says his company has a different take on CDI and its attendant problems. “A lot of people have viewed customer data integration as a plumbing problem, about how to get data from Point A to Point B, Point B being a master hub. We think a lot of attention needs to be focused on the stuff in the pipes, and a lot of work needs to be done to make the data as consistent and reliable as possible.”

Enter DataFlux CDI, which ships with pre-defined business rules for managing customer data, along with canned best practices and a tunable data model. Customers can tap DataFlux CDI to support data quality and matching, augment customer data, enforce “householding” practices (grouping customers who share a household), build cross-references to source systems, and persist data in a master customer file. Because it’s based on the same service-oriented underpinnings as the rest of DataFlux’ Data Quality Integration Platform, CDI can also incorporate data from other service-enabled sources.
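To make the matching and householding ideas concrete, here is a minimal sketch in Python. It is not DataFlux code and does not reflect the product's actual rules or data model; the sample records, the abbreviation table, and every function name are assumptions made up for illustration. It standardizes names and addresses, groups records that appear to describe the same customer, and groups customers who share an address.

```python
import re
from collections import defaultdict

# Hypothetical source records; system names and fields are illustrative only.
RECORDS = [
    {"source": "crm", "id": "C-1", "name": "Robert  Smith", "address": "12 Oak St."},
    {"source": "billing", "id": "B-7", "name": "robert smith", "address": "12 oak street"},
    {"source": "crm", "id": "C-2", "name": "Ann Smith", "address": "12 Oak Street"},
    {"source": "billing", "id": "B-9", "name": "Lee Wong", "address": "4 Elm Ave"},
]

# A tiny, assumed abbreviation table; real cleansing rules are far richer.
ABBREVIATIONS = {"st": "street", "ave": "avenue", "rd": "road"}


def standardize(text):
    """Lowercase, drop punctuation, collapse whitespace, expand abbreviations."""
    tokens = re.sub(r"[^a-z0-9 ]", "", text.lower()).split()
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)


def match_key(record):
    """Records with the same standardized name and address count as one customer."""
    return (standardize(record["name"]), standardize(record["address"]))


def household_key(record):
    """Records with the same standardized address count as one household."""
    return standardize(record["address"])


def group_by(records, key_fn):
    """Map each key to the (source system, source ID) pairs that share it."""
    groups = defaultdict(list)
    for record in records:
        groups[key_fn(record)].append((record["source"], record["id"]))
    return dict(groups)


customers = group_by(RECORDS, match_key)
households = group_by(RECORDS, household_key)
```

Exact matching on a standardized key is the simplest possible rule. Commercial matching engines typically add fuzzy comparison and tunable thresholds, which this sketch deliberately leaves out.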

“[Customers can] implement the functionality for the Web services that we have to get at data in these [other] systems, and a lot of our customers have been doing this for a while anyway, with our [service-enabled Data Quality Integration] Platform,” Agresta says. “We’ve got [more than 200] jobs and services that are built inside this DataFlux world. We’ve pre-built those jobs, we have a generic data model that supports multi-party customer information, and then we have a best practices methodology and other associated deliverables, which allows people to implement a CDI solution quickly.”

CDI can be deployed as a mostly canned solution, Agresta says, but customers can also customize it, if need be. “They can change the model if they need to, change the data quality rules, the identity management rules.”

Nor must CDI be an all-encompassing practice, Agresta says. Companies can take baby steps. “We’re not saying you’ll want to store every little piece of a customer inside this master reference, but we are saying if someone wants to learn more about something, they can store some key attributes—name, address, that kind of thing—inside,” he comments. “At the end of the day, you’ll have a master repository that has the linkages and cross references to know where the other pieces are.”
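The master reference Agresta describes, a few key attributes plus pointers to where everything else lives, can be sketched as a small registry. This is an illustrative Python sketch under stated assumptions, not DataFlux's data model: the `MasterRecord` and `MasterReference` classes, their fields, and the sample IDs are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class MasterRecord:
    """Holds only key attributes; full detail stays in the source systems."""
    master_id: str
    name: str
    address: str
    # Cross-references: source system name -> record ID in that system.
    xrefs: dict = field(default_factory=dict)


class MasterReference:
    """Hypothetical registry of master records and their source linkages."""

    def __init__(self):
        self._records = {}    # master_id -> MasterRecord
        self._by_source = {}  # (system, source_id) -> master_id

    def add(self, record):
        self._records[record.master_id] = record
        for system, source_id in record.xrefs.items():
            self._by_source[(system, source_id)] = record.master_id

    def link(self, master_id, system, source_id):
        """Record that a source-system row belongs to this master customer."""
        self._records[master_id].xrefs[system] = source_id
        self._by_source[(system, source_id)] = master_id

    def locate(self, master_id):
        """Return where the other pieces of this customer are stored."""
        return dict(self._records[master_id].xrefs)

    def resolve(self, system, source_id):
        """Find the master record for a source-system ID, or None if unknown."""
        master_id = self._by_source.get((system, source_id))
        return self._records.get(master_id)


hub = MasterReference()
hub.add(MasterRecord("M-1", "Robert Smith", "12 Oak Street", {"crm": "C-1"}))
hub.link("M-1", "billing", "B-7")
```

For brevity the sketch allows one source ID per system for each customer. A real hub would need to handle several records per system, along with merges and splits.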

In many cases, Agresta maintains, customers have already been tapping DataFlux’ data quality expertise for just this purpose. “They’ve used a combination of technologies, and DataFlux in the past may have provided the matching mechanism, or the cleansing mechanism,” he argues. “So CDI for DataFlux is kind of a natural evolution for us. For us, it’s what our customers have been doing for many years, and it just made sense for us to kind of get them further down the road, give them more of our expertise.”

In light of the maneuverings of other traditional BI players into the data quality market, do DataFlux officials foresee a more visible role for their own technologies in the bread-and-butter SAS BI stack? Teachey and other DataFlux representatives are non-committal on this point. “SAS has taken our data quality solution and typically applied it within their solutions, such as their anti-money-laundering [offering]. At this point, we’re staying on the operational side of the world, so we’re trying to centralize data for operational purposes, or people who are actually at a console for ERP,” Teachey comments. “SAS works more on centralizing information for a data warehouse for decision support, so I would assume at some point there will be some bleed-over, but we mostly see it as sort of two markets.”

About the Author

Stephen Swoyer is a Nashville, TN-based freelance journalist who writes about technology.