Application Integration: Mix & Match

ETL emerges as a compelling alternative to traditional EAI tools.

Since 2000, Amsterdam-based global financial services giant ING Group N.V. has been on an international buying spree of sorts, acquiring nine companies, including U.S. insurance giant Aetna International. As a result, ING is now the 24th largest company in the world, with over $150 billion in assets and more than 100,000 employees in Europe, North and South America, and Asia.

As vertical markets go, the financial services industry tends to be overwhelmingly dependent on information technology, including decades-old proprietary legacy applications. It's tough enough when a single financial services company wants to integrate its disparate information systems. Imagine, then, the task of integrating the information systems of nine companies, scattered across the world, with those of your own.

Such is the unenviable lot of Babu Kuttala, ING's director of information management.

"We have over 200 legacy systems, and moving away from some or all of them is impossible," Kuttala explains. Most of these so-called legacy applications (some up to 40 years old) still support mission-critical operations. "There aren't any people who know these things. These systems run very well. You cannot easily move the systems to new technology, because there is a 30- or 40-year history in the system, so you have to keep them."

ING is currently in the midst of an enterprisewide application integration project that Kuttala estimates could take several years, but which, he allows, will probably never quite be finished. "It's an evolving process. By the time you're finished developing, things may have changed or consolidated, so you have to go back and do another thing."

To support its application integration efforts, ING plans to use a database extraction, transformation, and loading (ETL) technology.

ETL's emergence as a tool for enterprise application integration (EAI) is controversial, however. Proponents of ETL tout it as a solution for data-centric integration problems that aren't easily addressed by existing EAI tools. ETL naysayers claim that it's not really an EAI technology at all.

The Changing Role of EAI
For years, enterprise application integration was an issue primarily in mainframe environments, because client/server systems, where they existed, were deployed primarily to support end-user productivity applications or to facilitate file-and-print services. Mainframes, on the other hand, hosted the bread-and-butter inventory management, point-of-sale (POS), accounting, and human resources applications that ran the enterprise.

When customers determined that it would be advantageous to enable an inventory management system to exchange data with (or pull data from) a POS system, for example, IBM Corp. eventually responded by delivering a series of middleware tools: CICS and MQSeries, among others. Eureka! EAI was born—and in grand fashion.

"These are some of the oldest middleware tools out there, and they were designed for this sort of thing [EAI]," notes Rob Enderle, a former IBM-er and a research fellow with consultancy Giga Information Group. "In addition to its reliability and security, these [integration tools] help to explain why the mainframe has been such a viable platform for so long."

Today, says Stefan van Overtveldt, director of WebSphere program marketing with IBM, Big Blue is pushing its WebSphere application server as a next-generation EAI tool for customers on all platforms. At the same time, van Overtveldt allows, CICS and MQSeries are still powerful integration tools—and may be all that customers in some environments require. "Don't forget, [IBM] built MQSeries and CICS to do this [facilitate interoperability between applications]," he says. "Both [middleware tools] are quite successful on their own terms."

EAI Today
When mainframes ruled the roost, EAI was more of a known quantity. In most shops, there was generally only one mainframe, and it ran a single, known operating system and supported a single, known programming language.

As ING's Kuttala can attest, EAI today is considerably more complicated. Not only must IT organizations contend with a dizzying array of disparate platforms, but—in the event of a merger or acquisition—they must also be able to accommodate radical organizational differences and simultaneously assimilate a benumbing complexity of potentially contradictory business processes.

As a result, most of the big players in the EAI space—including webMethods Inc., Tibco Software Inc., and Vitria Technology Inc.—don't provide point solutions for application integration (such as CICS or MQSeries middleware applications). Instead, they market full-scale integration platforms.

The Emergence of ETL
In recent years, ETL has been touted as a potentially exciting application integration technology. This is due in no small part to the proliferation of ETL tools in the enterprise (they're usually deployed in conjunction with a data warehouse), which, coupled with ETL's proven ability to interface with, and facilitate data exchange between, a number of database environments, has occasioned talk of a "convergence" between conventional EAI tools and ETL.

Most of the big EAI vendors accept without argument the proposition that ETL is an important technology for application integration. Tibco, webMethods, and Vitria have partnerships with Informatica Corp., a purveyor of ETL tools, and all three are working to improve interoperability between their EAI platforms and Informatica's PowerCenter ETL suite. At the same time, however, EAI vendors frequently make the claim that ETL, while a legitimate application integration technology, is suitable only for specific markets or for specialized implementations.

"ETL tools have been around for a long time, and they're really considered point solutions, and typically point solutions are batch-oriented. We still see them as important tools, but as more of a niche market," asserts Vitria CTO and co-founder Dale Skeen.

More to the point, EAI vendors charge that ETL is an acceptable integration technology only for applications that rely on batch processing, such as a POS system that processes receipts overnight or during off-peak hours.

"ETL as a technology serves a very important purpose, but ETL applied to integrating systems together really doesn't make a lot of sense," claims Scott Opitz, senior vice president for strategic planning at webMethods. "[ETL is] batch-oriented in its approach; it frequently moves large batches of data around in a non-real-time sense. In batch mode it moves very large volumes of data, typically from point A to point B. In that regard, you could say that it offers a very rudimentary type of integration between two systems."

Opitz's criticism of ETL as an application integration technology is not a new one. After all, ETL tools first grew out of the business analytic space, where they were traditionally associated with data warehousing. Insofar as they are leveraged to facilitate "integration" between applications, ETL tools typically pull (extract) data from one database environment, transform it, and push (load) it into another database environment. Not surprisingly, many of the vendors that are today pushing ETL as a solution for application integration—Informatica, Ab Initio Software Corp., DataMirror Corp., and Ascential Software Corp.—got their start in the data-warehousing space.
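
To make the extract, transform, and load steps concrete, here is a minimal sketch in Python using two in-memory SQLite databases. It is an illustration only, not how Informatica or any other vendor's product works; the table names (pos_sales, sales_summary), the columns, and the sample rows are all assumptions invented for the example.

```python
import sqlite3


def extract(source):
    """Extract: pull raw rows out of the (hypothetical) POS database."""
    return source.execute(
        "SELECT sku, qty, price_cents FROM pos_sales"
    ).fetchall()


def transform(rows):
    """Transform: clean up SKUs and total the revenue per SKU in dollars."""
    totals = {}
    for sku, qty, price_cents in rows:
        key = sku.strip().upper()
        totals[key] = totals.get(key, 0) + qty * price_cents
    return [(sku, cents / 100) for sku, cents in sorted(totals.items())]


def load(target, rows):
    """Load: write the transformed rows into the target database."""
    with target:  # commits on success, rolls back on error
        target.executemany(
            "INSERT INTO sales_summary (sku, revenue) VALUES (?, ?)", rows
        )


def run_etl():
    # Source system: an invented POS table with inconsistently keyed SKUs.
    source = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE pos_sales (sku TEXT, qty INTEGER, price_cents INTEGER)"
    )
    source.executemany(
        "INSERT INTO pos_sales VALUES (?, ?, ?)",
        [(" ab-1", 2, 250), ("AB-1 ", 1, 250), ("cd-2", 4, 100)],
    )

    # Target system: a separate database with its own schema.
    target = sqlite3.connect(":memory:")
    target.execute(
        "CREATE TABLE sales_summary (sku TEXT PRIMARY KEY, revenue REAL)"
    )

    load(target, transform(extract(source)))
    return target.execute(
        "SELECT sku, revenue FROM sales_summary ORDER BY sku"
    ).fetchall()
```

Calling run_etl() moves the whole table in one pass, which is exactly the batch-oriented, point A to point B behavior the EAI vendors describe.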

That's the crux of the problem, EAI vendors charge: ETL is a great technology for use with data-intensive integration efforts, but when it comes to transaction-intensive applications that require real-time processing, such as online transaction processing systems, ETL simply doesn't cut it. In a world of 24x7 uptime requirements, they say, batch-processing windows are getting smaller and smaller, even as more and more IT organizations move to transaction-intensive application environments.

"[ETL is] definitely not suitable for real-time [applications], and by virtue of not being real-time, it really can't be considered for enterprise application integration," contends webMethods' Opitz. "Companies today say that latencies aren't acceptable. They're arguing about latencies of a few seconds, so if you start to talk about batch modes where you're only current within a couple of hours, they're not going to go for it."

WebSphere Application Server: One-Stop EAI

The emergence of the Web services integration stack (XML, SOAP, UDDI, WSDL), coupled with the cross-platform success of the Web application server, has engendered expectations of a one-size-fits-all solution for enterprise application integration (EAI).

If the forthcoming version 5.0 release of WebSphere Application Server (WAS) is any indication, these expectations could be met sooner rather than later—in mainframe environments, at least.

That's because WAS 5.0 is slated to ship with several tools that should make it easier to expose mainframe applications and information as Web services. As a result, says Stefan van Overtveldt, director of WebSphere program marketing with IBM Corp., WAS 5.0 will be a boon to EAI efforts in mainframe environments.

"It's a specific case of EAI for customers who are either building new apps that need some level of integration with existing systems, like mainframe legacy systems, or who want to expose those new apps they build very rapidly and want to use Web services to do so," van Overtveldt explains.

WAS 5.0 will ship with a native Java Message Service (JMS) protocol stack and service provider, which gives it a high-speed asynchronous messaging bus that's based on Big Blue's MQSeries technology.

But the big story in WAS 5.0 is its so-called "Workflow Manager," a graphical tool that makes it possible to establish processes between legacy applications, WAS and Web services. "It lets you define flows between different Enterprise Java Bean components, Web services and connectors into legacy applications," van Overtveldt says.

The new integration features in WAS 5.0 actually build upon capabilities that IBM originally incorporated into earlier versions of WAS. Beginning with WAS 4.1, for example, developers could build highly sophisticated adapters into Web applications that allowed them to define a "microflow," or sequence, into a back-end system. "That is particularly important if you're dealing with a CICS COBOL app," van Overtveldt observes. "To interact with this app, the user would traditionally go through multiple green screens, but now you can sequence these events into multiple input/output fields."

WAS 4.1 also made it possible for mainframe administrators to incorporate several microflows into a single "macroflow." WAS 5.0's Workflow Manager enhances this feature by enabling administrators to safely back out of a workflow. If any of the requests that you make against a back-end app fails, you can cleanly back out of the flow, according to van Overtveldt.
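
The idea of backing out of a flow can be sketched in a few lines of Python. This is a generic illustration of the pattern, not WebSphere's Workflow Manager or any IBM API: the Step class, the run_flow function, and the order-handling steps are hypothetical names made up for the example. Each step carries an undo action, and when one step fails, the steps that already completed are undone in reverse order.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, int]


@dataclass
class Step:
    """One request against a back-end system, plus how to reverse it."""
    name: str
    action: Callable[[State], None]
    undo: Callable[[State], None]


def run_flow(steps: List[Step], state: State) -> bool:
    """Run steps in order; on failure, undo completed steps in reverse."""
    completed: List[Step] = []
    for step in steps:
        try:
            step.action(state)
        except Exception:
            for done in reversed(completed):
                done.undo(state)
            return False
        completed.append(step)
    return True


# Hypothetical back-end requests for a two-step order flow.
def reserve_stock(state: State) -> None:
    state["stock"] -= 1


def release_stock(state: State) -> None:
    state["stock"] += 1


def charge_account(state: State) -> None:
    if state["balance"] < 100:
        raise RuntimeError("insufficient funds")
    state["balance"] -= 100


def refund_account(state: State) -> None:
    state["balance"] += 100


ORDER_FLOW = [
    Step("reserve", reserve_stock, release_stock),
    Step("charge", charge_account, refund_account),
]
```

If the charge step fails, the stock reservation made by the first step is released, so the back-end state ends up as it was before the flow started.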

Workflow Manager's neatest feature, however, is the facility that lets mainframe administrators establish workflows by drawing arrows between J2EE assets, connectors and Web services. "Typically, what you do is you have two objects; you need to define two linkages. The first arrow you draw is the action that you want to take; the second is the data that you want to pass along. Then the next thing that you do is define whether you want this linkage to be transactional or not," van Overtveldt explains, conceding that in spite of its point-and-click nature, effective use of Workflow Manager nonetheless presumes a familiarity with mainframe conventions and with applications such as CICS or IMS.

Add to all of this robust support for Web services and you've got a prescription for a one-stop shop for application integration on most platforms. "We really see Web services as a more standard interface to allow easy application integration," van Overtveldt says. "It's not just about building new apps, it's about leveraging existing mainframe apps, pre-packaged apps, existing middleware that you have in place."

In its earlier incarnations, WAS constituted a good general purpose EAI toolbox. With the incorporation of Workflow Manager into WAS 5.0, however, IBM is now treading—and perhaps heavily, at that—on the toes of traditional EAI vendors such as webMethods, Tibco and Vitria, which market EAI products for mainframe environments, among other platforms. These vendors say they plan to compete successfully against IBM by resolving the business process integration problems that, they claim, commonly overwhelm EAI efforts.

"We see players like IBM and the application server vendors as providing the platform for the technical integration, what we call the lower-level technical integration," says Vitria's Dale Skeen. "What that means in turn for vendors like ourselves to be successful is that we need to focus on things like business-level integration."

Scott Opitz, of webMethods, agrees, adding that technical integration between applications is but one, small piece of the overall EAI puzzle. "Web services don't give you an enterprise application integration environment, they address one aspect of it, the point at which you connect," he says. "The reason that it doesn't solve your integration problems [is] because it's only a very small percentage of what you do."


Defending ETL
As far as ETL advocates are concerned, however, a charge of this kind is disingenuous. Mike Lamble, a senior principal at consulting and systems integration firm Knightsbridge Solutions LLC, in Chicago, Ill., notes that ETL isn't a technology that's going to converge with, or otherwise displace, existing integration solutions from some of the entrenched EAI vendors. Rather, Lamble argues, ETL is a complementary technology that is best deployed in data-intensive environments in which conventional EAI tools prove inadequate.

"If we think about EAI, that's really going to come in one of three flavors: presentation-oriented implementations, data-oriented implementations, or transaction-oriented implementations," he observes. While ETL isn't suitable for the presentation-oriented and transaction-oriented flavors, he says, it's often the best fit for data-oriented implementations.

No one is suggesting that ETL is a one-size-fits-all fix for application integration woes of any and all kinds, Lamble continues, only for some applications. For a data-centric environment in which an organization wants to present a customer with a unified view of all of his or her service calls, outstanding orders or promotions, ETL may be a good choice.

"In this situation, we can create an interim data store between legacy apps, and we get stuff from each of those legacy apps and load them in the data store. Then we have, through that data store, a unified view of the customer [and] of the business," he says.
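
Lamble's interim data store can be sketched roughly as follows in Python with an in-memory SQLite database. Everything specific here is an assumption made for illustration: the two "legacy" feeds are plain lists standing in for real systems, and the field names (cust_no, customer), table name (customer_events), and sample records are invented.

```python
import sqlite3

# Hypothetical extracts from two legacy applications that identify
# the same customer with different field names and letter case.
SERVICE_CALLS = [
    {"cust_no": "C-100", "issue": "billing question"},
    {"cust_no": "C-200", "issue": "address change"},
]
OPEN_ORDERS = [
    {"customer": "c-100", "order_id": 7001},
    {"customer": "c-100", "order_id": 7002},
]


def build_store():
    """Load both feeds into one interim store under a common customer key."""
    store = sqlite3.connect(":memory:")
    store.execute(
        "CREATE TABLE customer_events "
        "(customer_id TEXT, kind TEXT, detail TEXT)"
    )
    rows = [
        (r["cust_no"].upper(), "service_call", r["issue"])
        for r in SERVICE_CALLS
    ]
    rows += [
        (r["customer"].upper(), "open_order", str(r["order_id"]))
        for r in OPEN_ORDERS
    ]
    with store:  # commit the load as a single transaction
        store.executemany(
            "INSERT INTO customer_events VALUES (?, ?, ?)", rows
        )
    return store


def customer_view(store, customer_id):
    """Return everything known about one customer, across both sources."""
    return store.execute(
        "SELECT kind, detail FROM customer_events "
        "WHERE customer_id = ? ORDER BY kind, detail",
        (customer_id,),
    ).fetchall()
```

A single query against the store, such as customer_view(build_store(), "C-100"), then returns that customer's open orders and service calls together, even though neither legacy application knows about the other.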

Moreover, says Dan Nieman, director of strategic development for Informatica, batch processing isn't ever going to go away completely. Most Fortune 500 companies still host batch-intensive applications on their z/OS, S/390 and S/360 mainframes, he notes, and ETL tools provide a sophisticated and highly flexible means of accessing this data and of making it available to other systems.

ETL vendors understand this, Nieman argues, and have done their best to accelerate batch operations to bring them ever closer to real-time speeds. He also observes that organizations are anxious to deploy BI technologies. ETL tools provide the integration they need and the business analytics they want.

"The drivers on the ETL side are, 'You guys are giving me excellent visibility into my business, as in, who are my top ten customers, who's my best supplier, what's my best time to delivery?' These are all things that we generate," Nieman says. "Increasingly, customers need this kind of information faster, as in [for example] the current purchase order that I have from a customer: how does that relate to historical purchase orders over time? These are the analytics that ETL brings to the table."

Putting it All Together
ETL's critics have charged that it is of limited applicability for many EAI scenarios. When all is said and done, says Mike Schiff, vice president of e-business and business intelligence with market research firm Current Analysis Inc., no single EAI tool is going to address all (or even most) of a large IT organization's application integration needs. For that matter, Schiff concedes, an assortment of EAI tools probably won't do the trick, either. Instead, it's likely that a company that relies entirely upon EAI technologies in the first place is making a big mistake.

That's because EAI is as much about business as it is about technology. In order to effectively integrate enterprise applications, an IT organization must understand the ebb and flow of its business processes. It's one thing to efficiently route an order from one application to another, but it's quite another to understand the business processes behind it. Should, for example, an order be routed to a second application as well? Should an order undergo business analytics? Should an order trigger a specific event of some kind?

"One thing that companies should do is create a model of their business processes," Schiff suggests. "If you don't thoroughly understand your flow and your business logic, you're not going to get any value from your integration."

To effectively integrate the data flows from dozens of ING's different legacy systems, Kuttala has deployed Informatica's PowerCenter 6 ETL tool. According to Kuttala, the latest version of PowerCenter delivers close-to-real-time performance, enabling ING to transition away from slower batch operations, and, at the same time, retrofit legacy systems.

"It's batch-oriented stuff, but we're going to real-time. The biggest challenge is to bring all of the information from these legacy systems to one environment," he says. "The good news is that Informatica just released a new version of PowerCenter that's real-time."

Because in this case he was dealing primarily with legacy, batch-oriented systems, Kuttala says going with an ETL tool was a no-brainer. At the same time, a single technology—such as ETL—shouldn't become the linchpin of an IT organization's EAI strategy.

"You need to mix and match, and there are a lot of challenges there as to how you're going to do this, what kind of database you're going to use, and how much volume of data you have," Kuttala notes. "One tool is not going to fix everything."