In-Depth
Enterprise Application Integration: Building an Infrastructure to Support Rapid Business Change
If the lifeblood of today’s corporations is information, then the arteries are the inter-application interfaces that facilitate the movement of data around the enterprise. For the typical organization, this network of applications has grown into a collection of ad hoc interfaces, added over long periods of time.
Current trends, such as data warehousing requirements, corporate mergers and acquisitions, electronic commerce and the influx of Enterprise Resource Planning (ERP) packaged applications have forced IS departments to constantly find new ways to make different systems talk to each other.
International Data Corporation (IDC) estimates that worldwide IT spending in 1997 was $751 billion, and Forrester Research estimates that 30 percent of that total, or $225 billion, was spent on the design, development and maintenance of integration solutions for enterprise applications. Most of these solutions were time-consuming, expensive, hand-coded point-to-point integration programs.
Untangling the interface chaos requires a formal architecture for integrating applications across the enterprise – from legacy systems to PCs. Some organizations are already attempting to address this issue, yet the questions every company must ask itself are:
• Why is application integration the only part of the corporate information environment where it is still acceptable to hand-craft custom code at enormous long-term cost to the corporation?
• Given the direct cause and effect relationship between a company’s information systems and its competitive agility, why are long-term enterprise application integration (EAI) infrastructure decisions being made at a technical project team level and based solely on individual project criteria?
• Why is the company’s strategically important EAI infrastructure permitted to continually evolve without a guiding technical architect or long-term plan?
What is an EAI Infrastructure?
The value of an EAI infrastructure is that it provides a centralized, manageable, scalable foundation for integrating all enterprise applications – independent of vendor-specific applications, operating systems and databases. This enables organizations to finally untangle complex computing environments, rapidly deploy enhanced enterprise software applications and reduce the total cost of ownership and maintenance of their enterprise IT infrastructure. The key is business agility – with a flexible EAI infrastructure in place, IT is empowered to keep pace with today’s rapidly changing business environment.
Take, for example, a data warehouse, which requires that large volumes of clean historical data be moved, on a regular basis, from operational systems into warehouses, data marts and OLAP products. The source data is usually structured for on-line transaction processing (OLTP), whereas the warehouse accommodates on-line analytical processing (OLAP). The source data must therefore undergo extensive aggregation and reformatting before it can be loaded into a warehouse. Anecdotal evidence suggests that some of the more complex aggregation processes can take as much as 50 hours of computer time to complete, which represents a significant recurring processing burden.
An EAI infrastructure can significantly reduce this time by providing a standardized system – complete with transformation power, scalability, and interface management – to integrate data from both operational and decision support (DSS) application systems with the same high performance.
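To make this OLTP-to-OLAP reshaping concrete, the following minimal sketch aggregates detailed order-line transactions into daily summary records of the kind a warehouse fact table would hold. The record layouts and field names are invented for illustration and are not drawn from any particular product.

```python
# A minimal sketch of OLTP-to-OLAP reshaping: row-per-event transaction
# records are reduced to one summary record per (day, product), the shape
# a warehouse fact table typically expects. All names are illustrative.
from collections import defaultdict
from datetime import date

# Source: detailed OLTP order lines (hypothetical layout).
oltp_order_lines = [
    {"order_date": date(1998, 3, 2), "product": "WIDGET", "qty": 10, "amount": 150.00},
    {"order_date": date(1998, 3, 2), "product": "WIDGET", "qty": 4,  "amount": 60.00},
    {"order_date": date(1998, 3, 3), "product": "GADGET", "qty": 2,  "amount": 90.00},
]

def aggregate_daily_sales(rows):
    """Reduce transaction-level rows to one summary row per (day, product)."""
    totals = defaultdict(lambda: {"qty": 0, "amount": 0.0})
    for row in rows:
        key = (row["order_date"], row["product"])
        totals[key]["qty"] += row["qty"]
        totals[key]["amount"] += row["amount"]
    # Target: summary records shaped for analytical (OLAP) queries.
    return [{"day": day, "product": product,
             "total_qty": t["qty"], "total_amount": t["amount"]}
            for (day, product), t in sorted(totals.items())]

for fact in aggregate_daily_sales(oltp_order_lines):
    print(fact)
```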
Can your organization benefit from a formal EAI infrastructure? It can, if any of the following scenarios are occurring:
• The rate of business change outstrips the rate at which your business applications can be updated.
• Your organization’s eagerness to employ the latest technology leaves many incumbent systems playing "catch up."
• The data in certain business applications has become so critical that those applications are overloaded with requests.
• Batch windows on some applications are just not long enough to extract or load the necessary data.
• Application maintenance efforts focus more on the interfaces with other applications than on providing new application functionality.
• The transfer of data files between systems is the primary interface mechanism.
Recognizing these issues is the first step towards building a strategic EAI architecture. The next step is to find the most appropriate product, or mix of products, to address these issues – as well as your organization’s specific business requirements – while providing the scalability to allow for future growth and the flexibility to respond to rapid business change.
Building an EAI Infrastructure
So now you’ve decided that a formal architecture for networking applications across the enterprise is a compelling strategy. But where do you go from here? There are a number of product categories that may support your objectives and requirements. In the sections that follow, we examine each of these categories in some depth to identify the best fit for the problem. In broad terms, this involves an analysis of product categories that specialize in data-level and business-model-level data movement and sharing.
Data-Level Products. A range of product categories has emerged to support the movement of data between applications. According to Gartner Group, there are six types of products that do this: file transfer, copy management, data propagation, schema-specific data synchronization, database replication and extraction/transformation. To reduce this to a more manageable set, we need only consider products that can get data directly into and/or out of an application’s data store and can also reformat the source data to fit the target.
This leaves only extraction/transformation products. Within this category there are fundamentally three types of tools: code generators, transformation engines, and data warehouse and data mart loaders.
Code Generators. The premise of the code generator is to assist with the manual coding of programs for extracting data from an application and transforming it for loading into another application. Though the tool itself may be independent of the source or target system, the resulting program is not. Consequently, any system that is interfaced to more than one other system requires a separate generated program for each connection, multiplying the code that must be run and maintained.
Another issue is that the generated program rarely has all the desired data movement functionality. So, modifications to the generated code are performed either by the user – if the user can work out what the code is doing – or through the consulting arm of the vendor. Also note that the language used for the generated program may differ from system to system – e.g., COBOL on MVS, but C++ on UNIX.
These generated programs represent point-to-point interfaces, so scalability is a concern. Modifying an application can require major regeneration of, and modification to, existing interfaces, so development staff need to be fluent in the language of the generated program in case things go wrong – although for modest application networks, this approach can work.
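The sketch below imitates the shape of such a generated point-to-point program: the source, target and field mapping are all fixed when the program is produced, so connecting the same source to a second target means generating, and then maintaining, a second separate program. All system and field names are invented.

```python
# A sketch of a generated point-to-point interface program in spirit:
# source, target and field mapping are baked in at generation time.
# All system and field names are invented for illustration.

def extract_from_orders_system():
    """Stand-in for generated extract code tied to one source schema."""
    return [{"CUST_NO": "0042", "ORD_AMT": "150.00"}]

def transform_for_billing_system(record):
    """Hard-wired mapping from the one source layout to one target layout."""
    return {"customer_id": int(record["CUST_NO"]),
            "amount": float(record["ORD_AMT"])}

def load_into_billing_system(record):
    """Stand-in for generated load code tied to one target schema."""
    print("loading into billing:", record)

# The whole pipeline is a single fixed path between two systems; a third
# system would need another program just like this one.
for rec in extract_from_orders_system():
    load_into_billing_system(transform_for_billing_system(rec))
```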
Data Warehouse and Data Mart Loaders. Data warehouse and data mart loaders can be found in either code generator or engine/hub forms. However, as expected, their focus is on transforming operational data into a form that can be loaded into a very specific type of store.
Warehouses and data marts are structured in only a handful of different ways – ROLAP/MOLAP formats, aggregated fact tables, and Star, Snowflake or HyperCube schemas. All of these structures require that the source data be aggregated, so that large volumes of operational data are reduced to a summary of historical records.
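As a toy illustration of the Star-schema shape just listed, the following sketch shows a central fact table of pre-aggregated measures keyed to small dimension tables. The tables, keys and values are all hypothetical.

```python
# A toy Star schema: a central fact table of aggregated measures keyed
# to dimension tables. All tables, keys and values are hypothetical.

# Dimension tables map surrogate keys to descriptive attributes.
dim_product = {1: "WIDGET", 2: "GADGET"}
dim_time = {19980302: "1998-03-02", 19980303: "1998-03-03"}

# The fact table holds only keys and pre-aggregated measures, which is
# why loaders must summarize operational detail before loading it.
fact_sales = [
    {"time_key": 19980302, "product_key": 1, "total_qty": 14, "total_amount": 210.00},
    {"time_key": 19980303, "product_key": 2, "total_qty": 2,  "total_amount": 90.00},
]

# An analytical query is then a join of facts to dimensions.
for fact in fact_sales:
    print(dim_time[fact["time_key"]], dim_product[fact["product_key"]],
          fact["total_qty"], fact["total_amount"])
```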
There is no question that aggregation is an essential part of transforming data in an EAI infrastructure. But even though load processes typically must complete within an overnight batch window, many tools in this class are unable to meet these time requirements on their own. Warehouse loaders generally lack the fault tolerance and performance characteristics that would make them viable for linking together a host of operational systems.
Transformation Engines/Hubs. Transformation engine/hub tools, like code generators, use application metadata to create export-transform-load programs. The main difference is that all the code is executed at a central location, independent of the source and target.
The transformation engine/hub works by getting the source data, then moving it to a separate place (usually a different machine) where the transformation can take place. Some tools perform the transformation entirely in memory, and consequently these tools are not scalable to large data volumes.
For very large volumes, some tools offer a transient data store option. The transient store is used for excess data that cannot be processed in memory alone, or when multiple data feeds must be processed. The central transformation engine/hub, with its transient data store, meets our criteria for a centralized hub and data staging area. The same development environment and tools can be used for all the application interfaces that are built; there is minimal impact on the source and target systems; and higher data volumes can be achieved than with any of the other data-level approaches discussed.
The transformation engine/hub style is recommended for data-level interfacing. This centralized approach to writing and managing interfaces supports scalability, rapid interface development and maintenance, and data staging.
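A minimal sketch of this hub pattern follows, assuming an invented memory budget and staging format: data is extracted from the source, staged centrally (spilling to a transient store when it will not fit in memory), transformed, and loaded into the target, leaving both endpoints untouched.

```python
# A minimal sketch of the transformation engine/hub pattern. The memory
# budget, staging format and endpoint functions are all invented.
import json
import tempfile

MEMORY_BUDGET_ROWS = 1000  # hypothetical in-memory limit

def stage(rows):
    """Stage rows in memory, or spill to a transient file when too large."""
    if len(rows) <= MEMORY_BUDGET_ROWS:
        return rows
    spill = tempfile.NamedTemporaryFile("w+", suffix=".jsonl", delete=False)
    for row in rows:
        spill.write(json.dumps(row) + "\n")
    spill.seek(0)
    return (json.loads(line) for line in spill)  # stream back from disk

def run_interface(extract, transform, load):
    """The hub owns the whole pipeline; source and target stay untouched."""
    for row in stage(list(extract())):
        load(transform(row))

# Example wiring with trivial stand-in endpoints.
run_interface(
    extract=lambda: [{"CUST_NO": "0042", "ORD_AMT": "150.00"}],
    transform=lambda r: {"customer_id": int(r["CUST_NO"]),
                         "amount": float(r["ORD_AMT"])},
    load=lambda r: print("target receives:", r),
)
```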
Business-Model-Level Products. Generally speaking, an application communicates through an external interface using one of two mechanisms – interface tables in a database, or a series of function calls. As dealing with interface tables is a data-level operation, many of the tools discussed in the previous section can be used for this. The focus here is on function calls.
According to Gartner, there is a variety of product types that can be used at the business-model level: synchronous Remote Procedure Calls (RPCs), stored procedure calls, Object Request Brokers (ORBs), Transaction Processing (TP) monitors, database triggers, message queuing, message brokers, asynchronous RPCs, and Publish and Subscribe. Unlike data-level products, it is not easy to discount products on a single generalization, as all of these product types can be used to link applications together and initiate communications.
The crucial test for the EAI infrastructure, though, is scalability and manageability. To put it another way, "With this product, how easy is it to add, remove or modify applications across the enterprise?" To answer this, it must be understood that, at the business-model level, two applications are interfaced through the use of function calls – one application sends data to the other by calling a function over the network.
This implies that one application knows what function to call in the other, and that the other knows how to process the request. For a legacy system, this will probably require some amount of retrofitting before the application can use the functions of another.
Of all the product types, the RPC mechanism is the most basic. The source application calls the function of another by specifically naming the target application and its function.
At the other extreme is the message broker, where an application simply calls a logical function that it wishes executed. The broker then maps this to a specific function in another application. Neither the source nor the target application knows in advance which other application is involved. This level of detachment makes the broker architecture the most scalable of the available options and the most appropriate for an EAI infrastructure.
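The following sketch contrasts the two extremes of coupling: a direct RPC-style call that hard-wires its target, and a broker that resolves a logical function name to whichever application is currently registered for it. The application names and the broker interface are hypothetical.

```python
# Contrasting the two extremes of coupling. With a direct RPC-style
# call the source names the target application and function; with a
# broker the source names only a logical function. Names are invented.

# --- Direct RPC style: the caller is bound to one target. ------------
def billing_create_invoice(customer_id, amount):
    print(f"billing: invoice for customer {customer_id}, amount {amount}")

def place_order_rpc(customer_id, amount):
    billing_create_invoice(customer_id, amount)  # target hard-wired

place_order_rpc(42, 150.00)

# --- Broker style: the caller knows only a logical function name. ----
class MessageBroker:
    def __init__(self):
        self._routes = {}

    def register(self, logical_name, handler):
        """An application advertises that it implements a logical function."""
        self._routes[logical_name] = handler

    def call(self, logical_name, **payload):
        """Route the request; the caller never names the target."""
        self._routes[logical_name](**payload)

broker = MessageBroker()
broker.register("CreateInvoice", billing_create_invoice)

# Replacing the billing system later means re-registering "CreateInvoice";
# no source application has to change.
broker.call("CreateInvoice", customer_id=42, amount=150.00)
```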
The Complete EAI Solution
The overall EAI solution consists of technology for supporting both data-level and business-model-level interfacing. The recommendation here is for a transformation engine/hub and a message broker or object request broker to be used together to support a large-scale EAI infrastructure. There is some crossover between these two technologies, even though the transformation engine is better suited to batch-oriented movement of large data volumes and the message broker best supports near-real-time movement of small data volumes. Because the boundary between the two is blurred, there will be many instances where either may suffice for a particular interface.
The other issue is when a ‘step up’ or ‘step down’ transformation between the data and business-model levels needs to occur. Though it was stated earlier that all applications have a data-level interface, it is often impractical to use it. Today, there is little if any off-the-shelf support for building ‘step up/down’ transformations; the transformation engine and message broker are flexible enough to support this need with some manual effort on the part of the end user, as sketched below. This will change, and ‘step up/down’ transformation will eventually be supported in both types of products. The final element to put in place is interfacing with existing infrastructure management solutions; Enterprise Management Framework products have yet to support inter-application data communications.
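To make the ‘step up/down’ idea concrete, the sketch referred to above raises data-level records into business-model-level function calls, and flattens such a call back into records. The record layouts, logical function name and callback interface are all invented for illustration.

```python
# A sketch of a hand-built 'step up' transformation: data-level records
# (as a transformation engine might deliver them) are raised into
# business-model-level function calls, and vice versa. All names,
# layouts and the callback interface are invented for illustration.

def step_up(records, call_logical_function):
    """Raise each data-level record into a logical function invocation."""
    for record in records:
        call_logical_function("CreateInvoice",
                              customer_id=int(record["CUST_NO"]),
                              amount=float(record["ORD_AMT"]))

def step_down(customer_id, amount):
    """Flatten a business-model-level call back into data-level records."""
    return [{"CUST_NO": f"{customer_id:04d}", "ORD_AMT": f"{amount:.2f}"}]

# Example: records from a data-level extract become logical calls that a
# broker could route to whichever application implements them.
step_up(
    [{"CUST_NO": "0042", "ORD_AMT": "150.00"}],
    lambda name, **payload: print(f"logical call {name}: {payload}"),
)
```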
With a strategic EAI infrastructure in place, organizations can finally get a comprehensive look at their corporate data – both operational and informational – across all applications, databases, and systems, from legacy to ERP. Unlike point-to-point solutions and manual coding alternatives, a hub-based, architectural approach to EAI offers the transformation power, interface management, and scalability required to end interface chaos and enable true business agility.
About the Author:
Brian P. Donnelly is founder and Chairman of the Board of Constellar Corporation (Redwood Shores, Calif.). He can be reached at bdonnelly@constellar.com.