In-Depth

Enterprise Information Integration Enters the Spotlight

IBM, Oracle, BEA, and others elbow into burgeoning EII market, but no single vendor offers complete EII solution

The nascent market for enterprise information integration (EII) software got a visible boost in early February when IBM Corp. disclosed plans to ship a pair of EII tools for both relational and unstructured data.

Early last month, Big Blue announced beta versions of its DB2 Information Integrator v8.1 and DB2 Information Integrator for Content v8.2 products.

IBM’s move dramatizes both the promise and the relative immaturity of the EII tools market, which in 2002 accounted for less than $100 million in revenue, according to consultancy Aberdeen Group. Although the potential market for EII-related products and services is enormous—Aberdeen puts the figure at $7.3 billion—until recently, the market was populated primarily with niche products from specialty vendors.

The upshot, industry watchers suggest, is that when IBM formally ships its DB2 Information Integrator products for both relational and unstructured data sources later this year, it will instantly become a big player in a potentially lucrative space.

There’s good reason for this, suggests Mike Schiff, a senior analyst with consultancy Current Analysis Inc., who stresses that EII describes the abstraction of a variety of different data integration technologies, including ETL, EAI, workflow and collaboration. Comments Schiff: “While most companies don’t have the resources to [address all of these aspects] on their own, IBM does. IBM got the nickname ‘Big Blue’ for a reason. With its software and services, it has resources not available to small companies.”

That’s not to say that IBM will elbow existing EII players out of the way, stresses Wayne Kernochan, a managing vice president with Aberdeen Group. “IBM seems to be focusing on the high-end, naturally, and that leaves room for lower-end or point-specific solutions to thrive quite nicely. That said, it has managed to spread across both the relational and content-oriented with its two products.”

Besides, suggests Kernochan, IBM’s EII play is part of a larger trend among big players—some, like BEA Systems Inc., with no established pedigree in the BI space—to muscle into the market. This trend began last year, Kernochan suggests, when the EII market was populated almost exclusively by small start-ups with less than two years of product experience each.

In August, for example, BI powerhouse Sagent introduced an EII solution based on its Data Flow Server, an ETL tool that features data query and transformation capabilities. J2EE Web application server specialist BEA notched a deal in November 2002 with EII start-up Enosys Software, under the terms of which it will make Enosys’ XML and XQuery data integration technologies available as part of its WebLogic Enterprise Platform.

Traditional database giants such as Sybase Inc. and Oracle Corp. have gotten into the act as well. Sybase, for example, markets version 12.5 of its Enterprise Connect Data Access, a data access product that uses a technology called DirectConnect to reach a variety of different data sources. Oracle, for its part, claims that EII features have always been a part of its flagship 9i database. Says Bennie Souder, Oracle’s VP of distributed database development: “We’ve been doing what IBM’s saying that they’re going to do for years and years and years. We’ve provided data integration technology for the last decade. We started in Oracle 5.1.”

That may be so, agrees Aberdeen’s Kernochan, but Oracle didn’t actively tout 9i’s data integration technologies as tools for EII until late 2002—about the same time that many of its competitors shifted their attention to the space as well. “They had never given me any indication that they had any such capability. When they came around in November running through a whole litany of stuff in 9i, they happened to mention this feature. That’s the first that I heard about it.”

When EII isn’t quite EII

Analysts are skeptical of some of the data integration tools that are touted by their purveyors as suitable for performing EII.

Kernochan, for example, draws a distinction between an EII solution and a so-called operational data store, which typically flows data from a variety of different databases into one central location. Even in its simplest form, Kernochan explains, EII requires much more: an interface to facilitate a transparent view of heterogeneous data; a meta data repository to store information about each of the different data sources; adapters to ensure reliable front- and back-end communications; and APIs to which developers can program applications.
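To make Kernochan's four pieces concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of any vendor's product: the class names (`SqliteAdapter`, `CsvAdapter`, `Federation`), the sample tables, and the join logic are all hypothetical. Two adapters wrap a relational source and a flat file, a small catalog stands in for the meta data repository, and one `join` method plays the role of the interface and API.

```python
import csv
import io
import sqlite3


class SqliteAdapter:
    """Hypothetical adapter that exposes one relational table as dict rows."""

    def __init__(self, conn, table):
        self.conn = conn
        self.table = table

    def rows(self):
        cur = self.conn.execute(f"SELECT * FROM {self.table}")
        cols = [d[0] for d in cur.description]
        return [dict(zip(cols, r)) for r in cur]


class CsvAdapter:
    """Hypothetical adapter that exposes flat-file (CSV) text as dict rows."""

    def __init__(self, text):
        self.text = text

    def rows(self):
        return list(csv.DictReader(io.StringIO(self.text)))


class Federation:
    """Single interface over several sources, backed by a tiny catalog."""

    def __init__(self):
        # Stand-in for the meta data repository: logical name -> source info.
        self.catalog = {}

    def register(self, name, adapter, description):
        self.catalog[name] = {"adapter": adapter, "description": description}

    def describe(self):
        return {name: e["description"] for name, e in self.catalog.items()}

    def join(self, left, right, key):
        # Index the right-hand source by key; compare keys as strings because
        # the flat file yields text while the database yields integers.
        index = {}
        for row in self.catalog[right]["adapter"].rows():
            index.setdefault(str(row[key]), []).append(row)
        result = []
        for row in self.catalog[left]["adapter"].rows():
            for match in index.get(str(row[key]), []):
                result.append({**match, **row})
        return result


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")]
    )
    orders_csv = "customer_id,total\n1,250\n2,75\n1,40\n"

    fed = Federation()
    fed.register("customers", SqliteAdapter(conn, "customers"),
                 "relational customer master")
    fed.register("orders", CsvAdapter(orders_csv), "flat-file order export")
    return fed.join("orders", "customers", "customer_id")
```

Calling `demo()` returns each order row merged with its matching customer record, even though the two live in different kinds of stores. An operational data store would instead copy both sources into one central location before querying.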

In this respect, he argues, Sagent’s product is more of an operational data store than a proper EII solution. Sybase’s DirectConnect technology, he adds, essentially provides only an interface to heterogeneous data sources. “There’s no [meta data] repository, there’s no [provisions] for front- and back-end communications. It’s just the interface.”

Even Oracle’s EII features are less than full-fledged—in the out-of-the-box 9i database, at least. Oracle’s Souder says that 9i facilitates data integration with ODBC and JDBC data sources, but acknowledges that “the one piece that is extra cost is these branded things—we call them gateways, IBM calls them wrappers—that are tailored for specific data sources.” Oracle separately markets 9i gateways for a variety of different data sources, Souder confirms.

As a result, suggests Kernochan, out-of-the-box 9i functions as a bare-bones platform for EII. “At a basic level, Oracle [9i] provides the fundamentals of EII. You could do a query across multiple data stores if you took the time to set it up. But a full-fledged EII solution has this all automated.”

The fact is, analysts say, no single vendor fields a complete EII solution.

On paper, IBM’s DB2 Information Integrator product set looks the part: Both tools specify an interface, leverage a meta data repository, are expected to offer integrated adapters (“wrappers”) for front- and back-end communications, and provide an API. In addition, the DB2 Information Integrator v8.1 tool has the ability to update a single relational target, which is noteworthy if only because many EII solutions support query-only access to data sources.

At the same time, points out Current Analysis’ Schiff, both tools are still in their beta stages. This isn’t as big an issue as it might otherwise be, he allows, because they’re both based on existing IBM products, DataJoiner and Enterprise Information Portal, respectively. In this regard, he notes, Big Blue certainly isn’t marketing vaporware. A bigger problem, he continues, is the absence of data source wrappers for specific applications. “What IBM is trying to do is put a stake in the ground and tell the world: ‘We’re doing [EII].’ But without back end adapters, pulling production and transactional data out of applications will have to be accomplished programmatically.”

For his part, Jeff Jones, IBM’s director of strategy for data management, says that DB2 already provides canned support for the relational and non-relational data stores that are used by most enterprise applications. On the content integration side, he explains, IBM has forged partnerships with competing content providers, and will support the XQuery standard when it is approved. In addition, IBM’s WebSphere Business Integration suite provides a wide range of wrappers that expose esoteric application data sources specific to healthcare and other industries.

Even though its DB2 Information Integrator products are only available as betas, Jones says that some customers are using them to do EII in their environments today. Research powerhouse Indiana University (IU), for example, is rolling out DB2 Information Integrator in support of an initiative to provide researchers at its medical school with transparent access to both internal and external data. “The idea is that researchers can sit down at their computers and ask a research question and get all of the appropriate data, whether it comes from their own lab, or from another lab here, or from some other resource on the Web. Information Integrator is the centerpiece of our development effort to achieve this goal,” says Craig Stewart, director of research in academic computing with IU.

Stewart says that his researchers are mostly accessing information stored in Oracle databases and flat file data, but confirms that for a variety of reasons, IU didn’t opt to do EII with Oracle. What tipped the scales in Big Blue’s favor, he says, was DB2 Information Integrator’s already-extensive support for data source wrappers, many of which are specific to biomedical research. “I’m not aware of any other system that lets you initiate a very computationally intensive analysis such as a Blast search or an HMMer analysis as if it were a database query. You’re in fact launching a massive calculation with a SQL command.”
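The idea Stewart describes, a computation invoked as if it were a database query, can be illustrated with a small analogy in Python. This sketch does not show how DB2 Information Integrator wrappers work. It uses SQLite's user-defined functions from the Python standard library, and the `gc_content` function, the `samples` table, and the sequences are hypothetical stand-ins for a far heavier analysis.

```python
import sqlite3


def gc_content(sequence):
    """Toy stand-in for an expensive analysis: fraction of G and C bases."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    return sum(1 for base in seq if base in "GC") / len(seq)


def run_demo():
    conn = sqlite3.connect(":memory:")
    # Register the Python function so SQL statements can call it by name.
    conn.create_function("gc_content", 1, gc_content)
    conn.execute("CREATE TABLE samples (label TEXT, sequence TEXT)")
    conn.executemany(
        "INSERT INTO samples VALUES (?, ?)",
        [("a", "GGCC"), ("b", "ATGC"), ("c", "ATAT")],
    )
    # The calculation runs once per row, triggered by an ordinary SELECT.
    return conn.execute(
        "SELECT label, gc_content(sequence) FROM samples ORDER BY label"
    ).fetchall()
```

From the caller's side, the analysis looks like any other column in a query result, which is the convenience Stewart is pointing to.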
