In-Depth
The EAI Challenge
The challenges facing businesses have evolved rapidly over the last few years as companies search for new ways to differentiate their products and services in an increasingly competitive environment. Web-based interfaces, customer self-service applications and the demand for conducting business over the Internet are driving an increasing rate of change and creating an unprecedented need for IT support organizations to respond more quickly than ever before.
The systems that will service the business needs of the coming decade are built on the inherited legacy of 30 years of software development and several discrete waves of computing paradigms – mainframe-centric, client/server, distributed computing, and now customer-facing and self-service systems. IT must address the challenge of delivering enterprisewide applications, combining components from each of these generations. This has created a new model for delivering strategic applications – the integration-centric approach.
In contrast to the development-centric approach of all prior generations of computing, developers now begin work by seeking to connect together existing systems, building new components (using traditional style development) only where absolutely necessary. This manner of delivering strategic systems has given birth to a new computing model – Enterprise Application Integration (EAI).
Global enterprises continue to store the bulk of their data and to process the majority of their transactions on IBM and plug-compatible mainframes (PCMs). The most commonly quoted statistics all assert that more than 70 percent of corporate data and transactions run on these platforms. Rather than shrink, shipments of mainframes and mainframe MIPS continue to increase. As a result, EAI for most organizations must start from this base of mainframe application packages, homegrown systems, and data stored in mainframe-specific files and databases.
Inherited mainframe systems present a particularly difficult set of challenges for EAI. Online interactive mainframe applications are typically built as independent systems using proprietary data communications protocols, talking to complex, heavily loaded character-based presentation interfaces. Units of work are defined by and within the specific OLTP environment hosting the application; transactional behavior is controlled by components that are part and parcel of that environment. Business logic, data access logic, presentation services and control flow are usually intermixed within the confines of individual programs. The well-layered application system is the occasional exception.
Mainframe-based data is stored perhaps in DB2, but more commonly in hierarchical databases, indexed or flat files or homegrown databases. Such data stores have in common a complete lack of any standard mechanisms for enforcement of data types, consistency or integrity. Data stored in these structures can only be reliably accessed by the applications that were built specifically for them. Although the Internet is an online medium, batch processing is central to many of the mainframe back-end systems. Finally, mainframe systems present the challenge of the EBCDIC character set and unique data types not found on non-mainframe platforms.
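To make the character set and data type challenge concrete, the following is a minimal sketch of decoding two typical mainframe fields. It assumes the text is encoded in EBCDIC code page 037 (Python's built-in `cp037` codec) and that numeric fields use the common packed-decimal (COMP-3) layout; the function names are illustrative, and real record layouts vary by application.

```python
from decimal import Decimal


def decode_ebcdic(raw: bytes, codec: str = "cp037") -> str:
    """Decode an EBCDIC text field and strip trailing pad spaces.

    Code page 037 is an assumption; other sites use other EBCDIC pages.
    """
    return raw.decode(codec).rstrip()


def unpack_comp3(raw: bytes, scale: int = 0) -> Decimal:
    """Decode a packed-decimal field.

    Each byte holds two decimal digits, except the last byte, which
    holds one digit followed by a sign nibble (0xD or 0xB = negative).
    `scale` is the number of implied decimal places.
    """
    digits = []
    for byte in raw[:-1]:
        digits.append(byte >> 4)
        digits.append(byte & 0x0F)
    digits.append(raw[-1] >> 4)
    sign_nibble = raw[-1] & 0x0F

    if any(d > 9 for d in digits):
        raise ValueError("invalid packed-decimal digit")

    value = int("".join(str(d) for d in digits))
    if sign_nibble in (0x0B, 0x0D):
        value = -value
    return Decimal(value).scaleb(-scale)


if __name__ == "__main__":
    print(decode_ebcdic(b"\xC8\xC5\xD3\xD3\xD6\x40\x40"))      # HELLO
    print(unpack_comp3(bytes([0x12, 0x34, 0x5C]), scale=2))    # 123.45
```

Neither field can be interpreted without knowing the layout in advance, which is why such data is usually only reliably read by the application that wrote it.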
Successfully integrating mainframe components into an enterprise environment requires an understanding of all these challenges. Solutions are required for:
• Transforming the data – data names, formats, data types, character sets.
• Connecting the applications – supplying multiple and flexible communication models (synchronous/asynchronous), store and forward, publish and subscribe, request/reply, fire/forget.
• Controlling transactional behavior – defining and managing the scope of transactions, processing multiple updates, maintaining data integrity.
• Externalizing process flow from application functions – reliable, fault-tolerant and scalable process management.
In addition to an integration infrastructure, an implementation environment is required that:
• Provides rapid implementation tools that build and integrate interfaces with multiple supported APIs.
• Supports a repeatable process within the context of team development.
• Minimizes code creation by providing a graphical environment in which interfaces and business processes are described and connected rather than created.
• Allows new units of functionality to be rapidly developed and deployed.
There are three prevalent options in use today for delivering on the integration challenge:
• Do It Yourself – point-to-point custom integration of discrete components, typically solving each challenge as it arises when adding a new component into an existing mix.
• Low-Level Middleware – communication-based solutions for tying together disparate applications with message-oriented middleware.
• Integration Infrastructure – leveraging pre-built infrastructure and a common approach to component integration to support the rapid assembly of existing assets into an integrated application.
Do It Yourself
The Do It Yourself approach is best understood as a collection of techniques for representing character-based screens, accessing a variety of data stores through SQL or connecting applications on a point-to-point basis. Due to the variety of tactical solutions, there is an absence of common infrastructure or universal backbone. Data types, formats and character sets are translated on an as-needed basis. Solutions are limited on the low end (screen representation) by functional limitations and operational considerations, and on the high end (point-to-point connections) by the requirement for extensive in-depth knowledge of program operation and sequencing.
The earliest approach to integrating legacy systems was screen-scraping based on IBM’s HLLAPI standard. With the advent of Web-based applications, mainframe 3270 data streams are often presented within browsers. Screen-scraping products provide built-in character set translation. Data type conversion is not a problem because presentation screens carry only character data. However, screen buffer size limitations preclude this approach from providing an effective mechanism for moving data between applications, or among data stores that participate in an enterprisewide application. It is also programming-intensive, creates an ongoing requirement for dual maintenance, offers limited performance and generally does not map well into object-oriented environments.
A second technique for integrating host data with distributed applications is through data gateways. Typically, these gateways are designed to access relational database management system (RDBMS) resources, such as DB2, and to provide mechanisms to access data in non-relational structures, such as IMS databases or VSAM files. Since most gateway technologies support industry-standard interfaces (ODBC, ANSI SQL, etc.), host data is accessible via a wide variety of tools. This approach can satisfy read-only requirements, but when legacy data is to be updated the only dependable point of integration is the legacy application itself.
A third element of the Do It Yourself approach highlights another integration dilemma: programs vs. applications. EAI requires integration at the business function level, whereas the Do It Yourself approach operates at the module level. Screen-scraping attempts to operate at the business function level. Data gateway usage attempts to standardize on SQL access. Each has operational and performance limitations. The perceived way to avoid the limitations of screen-scraping and gateways is to integrate at the program level with invasive techniques or extremely narrow solutions. This can result in an approach where each component is connected to every other component.
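The cost of connecting each component to every other component can be illustrated with a little arithmetic. The sketch below compares the number of links in a fully connected point-to-point design with the number needed when each component attaches once to a shared backbone; the function names are illustrative, and the backbone figure assumes exactly one adapter per component.

```python
def point_to_point_links(components: int) -> int:
    """Links needed when every component talks directly to every other."""
    return components * (components - 1) // 2


def backbone_links(components: int) -> int:
    """Links needed when each component attaches once to a shared backbone."""
    return components


if __name__ == "__main__":
    for n in (3, 10, 30):
        print(n, point_to_point_links(n), backbone_links(n))
```

With 10 components the point-to-point design needs 45 interfaces against 10 for a backbone, and the gap widens quadratically as components are added.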
Low-Level Middleware
Middleware solutions solve some EAI problems in a more cost-effective manner and with a quicker time-to-market than the Do It Yourself approach. By definition, such solutions supply a common messaging layer and support for one or more of the required communication models.
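As a rough illustration of what a common messaging layer provides, the sketch below models two of the communication models in a single in-memory class: store and forward through named queues, and publish and subscribe through topics. The `MessageBroker` class and its method names are hypothetical and do not correspond to any product's API; a real messaging layer would add persistence, networking and delivery guarantees.

```python
from collections import defaultdict, deque


class MessageBroker:
    """Toy in-memory messaging layer (illustrative only)."""

    def __init__(self):
        self._queues = defaultdict(deque)       # store and forward
        self._subscribers = defaultdict(list)   # publish and subscribe

    def send(self, queue_name, message):
        """Fire and forget: hold the message until a receiver asks for it."""
        self._queues[queue_name].append(message)

    def receive(self, queue_name):
        """Return the oldest waiting message, or None if the queue is empty."""
        queue = self._queues[queue_name]
        return queue.popleft() if queue else None

    def subscribe(self, topic, handler):
        """Register a callable to be invoked for each message on a topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        """Deliver a message to every subscriber; return how many received it."""
        handlers = self._subscribers[topic]
        for handler in handlers:
            handler(message)
        return len(handlers)


if __name__ == "__main__":
    broker = MessageBroker()
    broker.send("orders", {"id": 1})
    print(broker.receive("orders"))
    broker.subscribe("prices", print)
    broker.publish("prices", {"sku": "A1", "price": 9.5})
```

The sender in either model needs to know only a queue or topic name, not the identity or availability of the receiving application.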
Few middleware products offer the flexibility required by enterprise applications. MQSeries is designed solely for asynchronous communication. Tuxedo, another popular middleware product, additionally offers support for transaction coordination with its support of the XA two-phase commit protocol.
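The idea behind two-phase commit can be sketched in a few lines. The example below is a simplified in-memory model, not the XA interface itself: the `Participant` class and `two_phase_commit` function are hypothetical names, and recovery, timeouts and logging are deliberately left out.

```python
class Participant:
    """Stand-in for a resource manager such as a database or queue."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self) -> bool:
        """Phase 1: vote on whether the local work can be made durable."""
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        """Phase 2, success path: make the local work permanent."""
        self.state = "committed"

    def rollback(self):
        """Phase 2, failure path: undo the local work."""
        self.state = "rolled_back"


def two_phase_commit(participants) -> bool:
    """Commit all participants only if every one of them votes yes."""
    for participant in participants:
        if not participant.prepare():
            # A single "no" vote rolls back the whole unit of work.
            for other in participants:
                other.rollback()
            return False
    for participant in participants:
        participant.commit()
    return True


if __name__ == "__main__":
    ok = two_phase_commit([Participant("db2"), Participant("queue")])
    print(ok)
```

The point of the protocol is that updates spread across several resources either all take effect or none do.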
Message-oriented middleware can answer two of the challenges of EAI systems – connection and transaction management. But it lacks semantic translation and flexible process management. Without these facilities within the infrastructure, IT organizations must still develop complex integration functionality.
Regardless of the problems solved by middleware-based integration approaches, they are lacking in one area – rapid implementation tools. Middleware vendors publish the APIs required to manipulate queues and send or retrieve messages, but they do not supply tools for developing applications or robust testing and debugging tools that can be used on the delivered solution.
While a middleware-based strategy is an improvement over the complete Do It Yourself solution, it retains too many aspects of that approach. IT, or outsourcing integrators, must select an application integration environment and build a development infrastructure. Products additional to the middleware backbone – data transformation, process flow and possibly transaction management – must be evaluated, selected and integrated into the development infrastructure. A repeatable process must be created and implemented – this drives up cost and increases time to market.
Integration Infrastructure
A number of software vendors, systems integrators and standards bodies have set about to create an integration infrastructure. This approach combines architecture standards, integration tools and pre-built infrastructure to support the rapid assembly of existing legacy systems and packaged applications into larger information systems capable of supporting a set of streamlined business processes.
Integration infrastructures combine repository-based tools and run-time infrastructure for delivering enterprise applications using a component model. This approach supports standard interfaces (CORBA, Enterprise JavaBeans, DCOM) that are designed to loosely couple components to form complete business applications. A graphical development environment is required to support the task of developing and integrating components into applications. The use of standard interfaces, along with communications protocols and products, process management and common data transformation, delivers an integration interface that allows a wide variety of applications to be immediately integrated into a common infrastructure or application backbone.
The Object Management Group is adopting XML Metadata Interchange Format as a standard for exchanging and transforming data models.
XML is rapidly emerging as the standard mechanism for defining semantics for shared data structures. When XML data is carried over an HTTP transport layer, the problem of integrating the variety of modern interfaces is solved. XML is carried in a "stream" format, allowing it to be stored in a traditional file system or streamed across the Internet from a database or repository. More and more applications are being delivered with XML interfaces. Any such application using an agreed XML representation of shared data items can be immediately integrated into a larger application system.
Most mainframe applications will require a specific XML adapter to translate the input/output of mainframe legacy systems from their proprietary formats to a neutral representation as the basis for integration.
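A minimal sketch of such an adapter follows. It assumes a fixed-width, EBCDIC-encoded (code page 037) record containing only character fields; the `CUSTOMER_LAYOUT` table, the element names and the `record_to_xml` function are all hypothetical and stand in for whatever copybook and agreed XML vocabulary a real system would use.

```python
import xml.etree.ElementTree as ET

# Hypothetical record layout: (element name, offset, length) per field.
CUSTOMER_LAYOUT = [
    ("id", 0, 6),
    ("name", 6, 10),
    ("balance", 16, 8),
]


def record_to_xml(raw: bytes, layout=CUSTOMER_LAYOUT,
                  root_tag: str = "customer", codec: str = "cp037") -> str:
    """Translate one fixed-width mainframe record into an XML string."""
    # cp037 is a single-byte code page, so character offsets in the
    # decoded text match byte offsets in the original record.
    text = raw.decode(codec)
    root = ET.Element(root_tag)
    for name, offset, length in layout:
        field = ET.SubElement(root, name)
        field.text = text[offset:offset + length].strip()
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    record = "000042JANE DOE  00123.45".encode("cp037")
    print(record_to_xml(record))
```

Because the layout is held as data rather than code, the same function can serve other record types by passing a different layout table.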
EAI also requires robust and flexible communications that are not tightly coupled with components or applications. Multiple forms of messaging have to be supported external to the applications themselves.
The most difficult challenge to rapidly integrating mainframe components into a larger enterprise application is the coordination of loosely coupled applications and components into a value-added business process.
This challenge arises from the fact that mainframe applications combine control flow with business and other logic and were not designed to participate in the process flow of an enterprise application. The challenge is increased when the applications live within an OLTP environment that was not designed to participate in a larger business process.
The only way to solve this challenge is through a process manager that can coordinate the activities of the separate applications while delivering high performance, availability and reliability. Middleware solutions attempt to solve this problem by having each application explicitly send necessary information to the next application in a process. This results in process logic being implicitly encoded into each of the participating applications and in the delivery of a large-scale but fundamentally point-to-point solution.
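The difference between the two approaches can be shown with a small sketch of externalized process flow. Here the sequence of steps is held by a hypothetical `ProcessManager` rather than coded into the participating applications; the step functions, the "order" flow and the credit limit of 1000 are invented for illustration, and nothing in this toy version addresses performance, availability or reliability.

```python
from typing import Callable, Dict, List

Step = Callable[[Dict], Dict]


class ProcessManager:
    """Holds named process flows and runs their steps in order."""

    def __init__(self):
        self._flows: Dict[str, List[Step]] = {}

    def define(self, name: str, steps: List[Step]) -> None:
        """Register a flow as an ordered list of steps."""
        self._flows[name] = list(steps)

    def run(self, name: str, payload: Dict) -> Dict:
        """Pass a copy of the payload through each step of the named flow."""
        context = dict(payload)
        for step in self._flows[name]:
            context = step(context)
        return context


# Each step stands in for a call to a separate application. None of
# them knows which step runs before or after it.
def check_credit(ctx: Dict) -> Dict:
    return {**ctx, "credit_ok": ctx["amount"] <= 1000}


def reserve_stock(ctx: Dict) -> Dict:
    return {**ctx, "reserved": ctx["credit_ok"]}


def create_invoice(ctx: Dict) -> Dict:
    return {**ctx, "invoiced": ctx["reserved"]}


if __name__ == "__main__":
    manager = ProcessManager()
    manager.define("order", [check_credit, reserve_stock, create_invoice])
    print(manager.run("order", {"amount": 500}))
```

Changing the business process then means editing the flow definition in one place, rather than modifying each application that would otherwise forward work to the next.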
About the Author: Arthur E. Gould is Product Manager for Forte for OS/390, EAI Business Unit for Forte Software (Oakland, Calif.).