Q&A: Ascential Acquisition a Boon for Big Blue’s Customers

There’s a good chance many highly specialized capabilities will find their way into DB2, Information Integrator, and other IBM products

Jeff Jones, director of strategy for IBM Corp.’s data management portfolio, speaks in the lulling cadences of a late-night radio disc jockey. But don’t let Jones’ laid-back presentation fool you: he’s an astute observer of the relational database market in general, and of the data integration space in particular.

We spoke with Jones about Big Blue’s $1.1 billion acquisition of Ascential Software Corp. Ascential gives IBM much-needed extraction, transformation, and loading (ETL) capabilities, Jones says, but it also brings much more to the table, addressing the “PhD-level” problems associated with cleanly populating an enterprise data warehouse from heterogeneous data sources.

On top of this, of course, Ascential gives IBM a native mainframe ETL engine, along with metadata management capabilities. The upshot for Big Blue’s customers, Jones argues, is a complete data integration stack—with the likelihood that many highly specialized capabilities will find their way into DB2, Information Integrator, and other IBM “middleware” offerings.

In a certain sense, the Ascential acquisition is hardly surprising, given IBM’s acquisition of Informix. Can you talk about your relationship with Ascential since then, and maybe touch upon some of the areas in which you’ve partnered?

In July of 2001, starting then and probably even before then, we had a relationship at a partner level with [Ascential’s] integration software, so we’ve been working in a very friendly fashion with them for a long time. We resell them [their DataStage ETL tool], and we integrate with them at the product level. And because of that, this is a very simple, very friendly acquisition.

Ascential is probably Big Blue’s biggest acquisition since Rational, back in December of 2002. When you talk about an acquisition of this size, $1.1 billion, you almost always look for areas of overlap. In this case, though, it looks as if there are a lot of complementarities. Do you see any areas where you and Ascential are competing with one another?

We are both going after similar targets, but I think it’s very important that the technology that they bring is a wonderful complement to our own integration technology. They jump into a piece of our spectrum that could use some improvement --

You don’t currently offer an enterprise ETL tool, do you?

Well, we do. DB2 has had an ETL component. It’s SQL-based. It’s basic. It’s not going to challenge Ascential; it isn’t going to challenge too many people. But we recognize that to really capture hearts and minds, we needed to go further than we could go from inventing. Ascential gives us the ability to instantly upgrade to high speed, high volume, data cleansing, data profiling—far beyond what SQL is capable of doing.

That’s where you see the other aspects of complementarity, then, with Ascential’s data-cleansing and data-profiling capabilities?

Yes, and there’s also this whole focus on metadata management that Ascential brings, which also builds nicely on stuff that we started. We have the Information Catalogue that is a piece of the Data Warehouse Edition of DB2 that is a metadata repository … but Ascential and their metadata management goes quite a bit further. Metadata management is all about providing commonality, providing understanding. When a tool thinks it’s operating on a column and understands this data, it really does understand what it is doing and is able to make the right decisions.

Where do things like data cleansing and data profiling fit into the picture?

I look at those as incredibly sophisticated parts of the “T” in ETL. It’s all data transforms, but in this case it’s all PhD-level types of things: data scrubbing, data profiling, pattern recognition, it’s providing mining. This is stuff that we just didn’t have on our own, outside of DB2 Intelligent Miner, which does some of the stuff. [With DB2 Intelligent Miner] we have applied that technology to scoring services, which is kind of a fancy way to apply data mining.

But what Ascential has to offer takes it to a whole new level. What Ascential is doing is more broadly applicable in the process of cleanly populating a warehouse, so that and all of the other benefits are instant strengths that we really respect.

Earlier you mentioned the Data Warehouse Edition of DB2. Do you bundle ETL technology from Ascential or Informatica or another vendor to enable that, or is that something you’ve developed on your own? And what’s going to happen to that now that you’ve got the Ascential technologies?

The Data Warehouse Edition is all IBM. It offers basic SQL-oriented ETL, and it also offers hooks and ways to plug in other things, but we try to make it easier to plug in Ascential and other tools. We haven’t announced any plans or product sketches or any ideas of how that’s going to look, but you can deduce that the Data Warehouse Edition should benefit from Ascential’s ETL expertise.

One compelling upshot of the Ascential acquisition is that it gives you a native mainframe ETL tool, where before, you offered only mainframe data access, with WebSphere Classic Federation. What do you think Ascential brings to the table for existing mainframe customers, especially those who use DB2 or IMS, or other native Big Iron repositories?

The Classic Federation part of WebSphere Information Integrator is built on [technology we acquired from] CrossAccess, and Global Services has built some IDMS and other pre-relational hooks for Information Integrator. So information integration has pieces, many pieces. Heterogeneous access is the part that we do really well with WebSphere Information Integrator, and there may be ways that the Ascential integration on the mainframe can apply.

But the Ascential stuff really applies more directly and without as much overlap in the areas of data transformation and data movement, the whole ETL idea. I think for the ETL kinds of applications it makes sense to use straight Ascential, so we’ll be rationalizing it as we go along. Maybe it’s not sufficient for all applications to get data from wherever it might be and pull it in. Maybe it’s too slow. Maybe for certain applications you need to have a copy of it locally that’s automatically maintained, that involves some transformation. This is the whole reason for doing ETL.

Of course, once I find things and discover things, I need to blast out my discoveries to a large population of receivers, which is the idea of event publishing, which is really the third piece of the information integration puzzle that we address.

I know it’s still early, but can you tell us anything about how IBM will incorporate DataStage and other assets into its product portfolios?

Right now, we’re continuing as partners. We’ll be looking at all opportunities to weave what they can do into our information management portfolio, so you’ll see lots and lots of things happening with rebranding and reselling, but also departmental-level integration with DB2 and WebSphere, so you’ll see all of those things happening. They will be part of the IBM Information Management software division; they become Nelson Mattos’ [who’s second in command to Janet Perna in IBM’s Information Management group] responsibility.

About the Author

Stephen Swoyer is a Nashville, TN-based freelance journalist who writes about technology.