IBM’s Data Integration Strategy Coming Into Focus

IBM tries to keep current Ascential customers happy while satisfying the needs of new adopters attracted by the promise of its all-in-one data integration move

When is a beta announcement newsworthy? When it has some bearing on the outcome of a billion-dollar business intelligence (BI) acquisition, of course.

That was the case last week, when IBM Corp. announced beta programs for the next versions of WebSphere Information Integrator (code-named "Serrano"), and the former Ascential Software Corp.’s next-generation data-integration suite, code-named "Hawk." Hawk, several years in the making, gets a WebSphere-branded makeover, but—according to IBM officials—should be autonomous enough to address the concerns of existing Ascential customers. In addition, Big Blue hopes to satisfy the needs of new adopters attracted by the promise of its all-in-one data integration play.

Post-acquisition, Ascential’s DataStage ETL product is now known as WebSphere DataStage. Ditto for Ascential’s data-quality offering, QualityStage. Ascential itself used the codename "Hawk" to describe its long-incubating, next-generation data-integration suite, a convention that IBM has retained.

According to Mark Register, vice-president of marketing for IBM’s information integration program, the Hawk release should be business as usual for existing Ascential customers. "Entitlements for our customers remain unchanged. They get access to the technology if they’re on maintenance—and we’ve got huge maintenance-retention rates—then those customers get entitled to upgrades," he comments. "This will be version 8 of our product line. If they have DataStage today, then they’ll get DataStage version 8."

IBM has already talked at length about Serrano, which, according to Nelson Mattos, distinguished engineer and vice-president of IBM’s Information Integration product group, will feature super-charged search capabilities and will extend Information Integrator’s reach into an even wider variety of structured, semi-structured, and unstructured data sources. Less has been said (on IBM’s part, at least) about Hawk, which was an umbrella term for Ascential’s ambitious plan to consolidate and reconcile its technology asset portfolio.

Some of Ascential’s technologies—like its parallel-processing and data-quality capabilities, for example—came to the company by way of acquisition. Competitors have charged that Ascential will have its work cut out for it as it tries to more tightly integrate these products; Ascential representatives, on the other hand, have said that the company’s products are already effectively integrated.

Last week, Register and other IBM officials didn’t have much to say on this front. Instead, they talked up a new Hawk-based product—WebSphere Information Analyzer—and also touted the enhanced synergy between IBM’s Information Integrator suite and the Ascential technologies.

"On the Hawk side, the focus is on the entire data integration suite—WebSphere DataStage, WebSphere QualityStage, and WebSphere DataStage TX," Register comments. "What’s new in this [beta release] is WebSphere Information Analyzer. It’s a data auditing, free-form analysis product [that helps business users] understand data and source systems, be able to actually see the content and quality of that data, and build definitions from that."

Like the other Hawk technologies, WebSphere Information Analyzer (which went by the internal code name "Sorcerer") shares a central repository with IBM WebSphere DataStage and IBM WebSphere QualityStage. It ships with what Register says is a substantially overhauled UI, designed to make it usable by both data integration professionals and business users. "We really started with a blank slate, built this entirely new type of interface, and threw all of the old rules out the window. So we’re not only simplifying information integration, but putting it closer to the hands of the business analyst," he comments.

On the synergy front, Register says Information Integrator and the former Ascential technologies share a common service deployment model and can also exchange metadata. "In addition to staying on track with those two releases, [we’re] focusing on additional points of integration. So we have metadata interchange across the traditional data integration products, and we’re also providing a common service deployment model," he says. "When you define data transformation rules, these can be published using a common service deployment model [so] you can deploy those things now into an SOA. It’s done very simply without having to have J2EE or Web services skills. Instead, it’s all done through a wizard."

Also last week, IBM added Rational Data Architect to its Information Integrator technology portfolio. Rational Data Architect is based on the open source Eclipse platform and helps data architects model, discover, map, and analyze data across multiple information sources.

But What’s in Store for DB2?

Less clear is whether IBM will drop a stripped-down version of Ascential’s ETL technology into DB2.

The case for such a move is compelling: Microsoft Corp. and Oracle Corp. both deliver integrated ETL capabilities as part of their flagship relational database offerings. DB2 does have a limited, SQL-based data-integration capability, but this is a far cry from what those two vendors bring to the table.

With a substantially revamped ETL capability on deck from Microsoft in SQL Server 2005 (the much-anticipated SQL Server Integration Services) and an overhauled Warehouse Builder rumored to be in the works from Oracle, such a move might seem like a no-brainer for IBM.

But Big Blue’s Register remains non-committal on this issue, saying it’s a question of tightly coupling the (once-autonomous) Ascential technologies with DB2. "Specifically on the DB2 question, there’s a whole lot of architectural teams working on the ideal strategy going forward, but we’re not optimizing our product for DB2, that’s one of the things I want to emphasize," he comments. "The strength of Information Integrator is the heterogeneity of the product set, the fact that we can work with DB2, Microsoft, Oracle, Sybase, and Teradata."

About the Author

Stephen Swoyer is a Nashville, TN-based freelance journalist who writes about technology.
