In-Depth
Building Reusable Components from UNIX Applications
The promise of object-oriented programming is to promote the reuse of code. Unfortunately, in most object-oriented programming languages, software reuse occurs at the source or binary object level. That is, application programmers need access to source-level interfaces such as header files, and to object libraries that they can link into their programs.
Component programming is an extension of the concept of object-oriented programming that allows programmers to make use of reusable objects dynamically, using only binary representations of objects. Using components, a complex application can be composed from building blocks derived from many different sources, and these building blocks can be accessed directly using their binary representation.
Component technology allows you to integrate your existing applications with other applications in new ways, thus retaining the significant investment you have made in them. At the same time, componentizing your applications exposes their functionality to other programs. This article explores some of the perplexing architectural choices these technologies present us.
Why Components?
In a previous article ("Application Porting," Platform Decisions, August 1999), I introduced a number of issues associated with planning and executing a platform migration from UNIX to Microsoft Windows (including Windows NT and Windows 2000).
One of the most fundamental issues we may face revolves around enabling mission-critical application software on the target platform. The most obvious and straightforward approach to this issue is to port these applications as-is, minimizing source code changes. For example, we may choose to use a portability layer of some sort to implement the system behaviors on which the applications depend. Examples of such portability layers for Windows NT include the POSIX-conformant Interix subsystem, developed by Softway Systems and recently acquired by Microsoft (www.interix.com); Nutcracker from MKS/Datafocus (www.mks.com); and higher-level abstraction layers such as the Adaptive Communication Environment (www.riverace.com).
We refer to applications ported in this way as "literal" ports. Although many discrete changes may need to be made in order to make the application conform to the target environment, the architecture of the application, its inputs, outputs and processing remain essentially unchanged.
This is the most direct route to enabling the application on the new platform, and is often good enough for our purposes. In some cases, however, we find that porting applications in this way can produce unsatisfactory results. In particular, we may want the application to interact with users and other applications in novel ways (for example, through the Internet or an intranet).
Why should we want to do this? One reason is that candidate applications selected for porting often encapsulate some core functionality unique to the business environment in which they have been deployed. Such applications will be targeted for porting rather than replacement because they represent a significant investment of intellectual capital. They will often be based on more mature bodies of code, in many cases implemented in C, C++ or other programming languages, and their interfaces with other applications will often be obscure. Typically, these applications have been in existence for some time, and were not designed with features of the target platform in mind.
We enable these new kinds of interaction by wrapping up the application’s functionality in a binary object (or "component") that allows us to decouple that functionality from the user interface, databases, transaction monitors and other applications.
Two basic component models are of interest to us here: the Common Object Request Broker Architecture (CORBA), and Microsoft’s Component Object Model (COM). CORBA is a widely accepted specification for binary object programming developed by the Object Management Group (OMG), a consortium of technology vendors and end user corporations. The OMG currently includes more than 800 member organizations. A widely used Internet protocol called Internet Inter-ORB Protocol (IIOP), built on CORBA, is supported by a large number of technology vendors, including Netscape, Oracle, Sun and IBM.
COM is Microsoft’s mechanism for defining binary representations of objects. COM is fundamental to Microsoft Windows systems, and although its distributed version (DCOM) is conceptually similar to CORBA, the two architectures are not compatible at the binary level.
These competing architectures present us with some difficult design choices, which we will discuss in the next section. But first, let’s examine how these systems are similar. In a generic distributed component architecture, the distributed object technology decouples the client from the server in several ways.
The presentation logic on the client side may be composed of binary user interface components implementing forms, for example. These components communicate through several layers with application logic also encapsulated in components on the server side. These intermediate or "middleware" layers may implement transactional semantics, asynchronous communications via message queues and load balancing.
The problem we face is exactly how to unleash the core intellectual capital from our existing applications, while integrating with other applications, binary components and middleware as seamlessly as possible. Our design may be constrained by dependencies on particular database systems, message queuing APIs or other fixed criteria of the existing environment.
Technology Choices
Here, you may choose from a number of competing technologies attempting to occupy the same solution space. In particular, you may choose to deploy a solution conforming to Sun’s Enterprise Java Architecture, or Microsoft’s Windows DNA (Distributed Internet Architecture).
The architectural choices between these technologies can sometimes be perplexing. When is one more appropriate than the other? What are their advantages and disadvantages?
The key elements of Microsoft’s Windows DNA architecture are Active Server Pages (ASP), and the Component Object Model (COM). Transactional semantics may be provided by Microsoft Transaction Server (MTS) and a buffering mechanism might be provided by some queuing software, such as Microsoft Message Queue (MSMQ). All of these services are tightly integrated with the Microsoft Windows architecture.
The key feature of this architecture is its basis on the Windows platform for both client and server deployment. Interfaces to everything else, labeled "legacy" systems here, will be made through Microsoft’s Integration Server (the next release of SNA Server, code named "Babylon"), or through a data interchange server based on the Extensible Markup Language (XML), called "BizTalk."
The key difficulty is bridging the gap between the communication mechanisms available through COM and those presented by your existing application. Significant channels include TCP/IP sockets and IBM SNA protocols. The COMTI component of Integration Server provides the necessary bridging mechanism. Using COMTI, and some glue logic to massage your data formats into those required by the existing system, you can expose the functionality of that system to Web users.
These technologies are tightly coupled with other Microsoft technologies supporting the enterprise. Thus, Active Server Pages find their best support in Microsoft’s Web servers (IIS or Site Server Commerce Edition, or simply "Commerce Server").
An alternative approach to this same problem is based on the Enterprise Java Platform. In this model, components are implemented as JavaBeans on the client side, or Enterprise JavaBeans on the server side. Applications are isolated from specific transaction monitors through the Java Transaction Service API (based on CORBA interfaces) and vendor-independent IIOP. The Java Database Connectivity API (JDBC) provides database connectivity. An application based on these interfaces can make use of binary objects presented by any CORBA-conformant object request broker (ORB), and can connect to any database presenting a JDBC driver. JDBC drivers exist for most widely used databases, such as Oracle, Sybase, IBM DB2 or Microsoft SQL Server. Similarly, the Java Message Service (JMS) provides the capability for asynchronous communications in a vendor-independent manner. While commercial implementations are still lacking, JMS has been endorsed by a number of important vendors, notably IBM and BEA Systems.
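To make the vendor independence of JDBC concrete, here is a minimal sketch of database code written only against the standard java.sql interfaces. The table name "accounts" and the URL "jdbc:example://localhost/legacy" are hypothetical placeholders, not names from any real system. Under these assumptions, the connection URL is the only vendor-specific part; the rest of the code should not need to change when the database does.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

final class JdbcSketch {
    // Counts the rows of a hypothetical "accounts" table. Only the URL
    // identifies the database vendor; everything else is standard JDBC.
    static long countAccounts(String jdbcUrl) throws SQLException {
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM accounts")) {
            rs.next(); // COUNT(*) always yields exactly one row
            return rs.getLong(1);
        }
    }

    public static void main(String[] args) {
        // Placeholder URL: pass a real one, with its driver on the classpath.
        String url = args.length > 0 ? args[0] : "jdbc:example://localhost/legacy";
        try {
            System.out.println("accounts: " + countAccounts(url));
        } catch (SQLException e) {
            // Without a matching driver, DriverManager reports the failure here.
            System.out.println("Could not query " + url + ": " + e.getMessage());
        }
    }
}
```

Run without a driver on the classpath, the sketch simply reports that it could not connect; supplying a real URL and the matching driver is all that should be needed to point it at a different database.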
What does this mean for us? We must choose between these two architectures, and there is very little common ground between them. If we choose to integrate with the Windows DNA architecture, we might try to wrap up our UNIX functionality in a COM object, or a COM object with transactional semantics (a so-called "COM+" object). If we do this, we will tend to obtain better integration with Microsoft middleware, but at a cost of relatively greater platform dependence. Wrapping up our UNIX application in a COM component is not as difficult as it might otherwise be, due in part to strong development tools. For example, MKS/Datafocus allows you to use the ATL COM wizard to wrap UNIX code up into a COM object.
Conversely, if we choose a Java implementation, we may obtain greater vendor independence and better support in the UNIX world. We will still be able to deploy into a Windows environment, although not as easily as might be the case with a COM-based implementation. In this case, we will want to wrap up our legacy code into a shared library, and invoke it from Java wrapper objects using the Java Native Interface (JNI).
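The Java side of such a JNI wrapper might look like the following sketch. The library name "legacycalc" and the method computePremium are illustrative assumptions standing in for whatever routine your ported UNIX code exports; the C side, which would implement the exported function Java_LegacyCalc_computePremium, is not shown.

```java
// Hypothetical Java wrapper around a legacy routine built as a shared library.
// "legacycalc" and computePremium are made-up names used for illustration.
final class LegacyCalc {
    private static final boolean AVAILABLE = loadNativeLibrary();

    private static boolean loadNativeLibrary() {
        try {
            // Maps to liblegacycalc.so on UNIX and legacycalc.dll on Windows.
            System.loadLibrary("legacycalc");
            return true;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            return false; // library not installed or not permitted to load
        }
    }

    // Implemented in C; the JVM binds this declaration to the exported
    // symbol Java_LegacyCalc_computePremium in the shared library.
    private static native double computePremium(int age, double coverage);

    static boolean isAvailable() {
        return AVAILABLE;
    }

    // The method the rest of the Java application calls.
    static double premium(int age, double coverage) {
        if (!AVAILABLE) {
            throw new IllegalStateException("native library legacycalc is not installed");
        }
        return computePremium(age, coverage);
    }

    public static void main(String[] args) {
        if (isAvailable()) {
            System.out.println(premium(40, 100000.0));
        } else {
            System.out.println("legacycalc not found; build and install the shared library first");
        }
    }
}
```

Guarding the load in this way lets the wrapper report a missing library cleanly, instead of failing when the class is first used.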
In either case, we may run into difficulties due to incomplete implementations or dependencies on specific middleware products. A number of major vendors have announced support for the Enterprise Java Architecture, but compatible commercial products are lagging somewhat. Similarly, some key components of Microsoft’s Windows DNA Architecture are yet to be released, even in beta versions.
How Does It Work?
Let’s drill down and examine these basic approaches in a little more detail. In order to build a component, you must:
• Define an interface using an interface definition language
• Implement the interfaces
• Combine the interfaces with the implementations into an executable or shared library
• Register the component so consumers can find it
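The four steps above can be sketched in plain Java. This is a toy illustration under stated assumptions, not COM or CORBA code: a Java interface stands in for the interface definition language, and a small in-process map stands in for the system registry or naming service. All of the names (AccountLookup, LegacyAccountLookup, Registry, "Legacy.AccountLookup") are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

final class ComponentSteps {
    // Step 1: define the interface. In COM or CORBA this would be written
    // in IDL; a Java interface stands in for it here.
    interface AccountLookup {
        String ownerOf(String accountId);
    }

    // Step 2: implement the interface. A real implementation would call
    // into the ported UNIX code; this placeholder just formats a string.
    static final class LegacyAccountLookup implements AccountLookup {
        @Override
        public String ownerOf(String accountId) {
            return "owner-of-" + accountId;
        }
    }

    // Step 3 (packaging) corresponds to compiling these classes into a
    // single deployable unit, such as a jar file.

    // Step 4: register the component under a well-known name so that
    // consumers can find it without knowing the implementing class.
    static final class Registry {
        private static final Map<String, Supplier<?>> FACTORIES = new ConcurrentHashMap<>();

        static void register(String name, Supplier<?> factory) {
            FACTORIES.put(name, factory);
        }

        static <T> T create(String name, Class<T> iface) {
            Supplier<?> factory = FACTORIES.get(name);
            if (factory == null) {
                throw new IllegalArgumentException("no component registered as " + name);
            }
            return iface.cast(factory.get());
        }
    }

    public static void main(String[] args) {
        Registry.register("Legacy.AccountLookup", LegacyAccountLookup::new);
        AccountLookup lookup = Registry.create("Legacy.AccountLookup", AccountLookup.class);
        System.out.println(lookup.ownerOf("42")); // prints owner-of-42
    }
}
```

The consumer in main depends only on the interface and the registered name, which is the property that lets the implementation be replaced without rebuilding its clients.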
In order to use a component, we wrap it up in a container object. The container object acquires an instance from a server, which provides the object’s implementation. In COM, the intermediary between containers and servers is called the "class factory"; in CORBA, the "object request broker" (ORB) plays this role. The container then queries the object instance for the interfaces it requires, in order to invoke the object’s methods.
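The acquire-then-query sequence can be illustrated with a small Java sketch. It is only loosely analogous to what COM and CORBA actually do, and every name in it (Component, Quote, Audit, QuoteFactory) is hypothetical: the factory plays the part of the intermediary that hands out instances, and query plays the part of asking an instance which interfaces it supports.

```java
import java.util.Optional;

final class ContainerSketch {
    // Two interfaces a component might expose.
    interface Quote {
        double priceFor(String symbol);
    }

    interface Audit {
        String lastCaller();
    }

    // Every component can be asked whether it supports a given interface.
    interface Component {
        <T> Optional<T> query(Class<T> iface);
    }

    // A component that supports Quote but not Audit.
    static final class QuoteComponent implements Component, Quote {
        @Override
        public double priceFor(String symbol) {
            return symbol.length() * 10.0; // placeholder for real logic
        }

        @Override
        public <T> Optional<T> query(Class<T> iface) {
            return iface.isInstance(this) ? Optional.of(iface.cast(this)) : Optional.empty();
        }
    }

    // The server side: hands out instances without exposing their class.
    static final class QuoteFactory {
        Component createInstance() {
            return new QuoteComponent();
        }
    }

    public static void main(String[] args) {
        // The container acquires an instance, then asks for the interfaces it needs.
        Component c = new QuoteFactory().createInstance();
        c.query(Quote.class).ifPresent(q -> System.out.println("IBM -> " + q.priceFor("IBM")));
        System.out.println("Audit supported: " + c.query(Audit.class).isPresent());
    }
}
```

Because the container discovers interfaces at run time, it can degrade gracefully when a component does not support an optional interface, as with Audit in this sketch.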
The key concept here is that the container and the server may be implemented in any supported programming language, and may reside on different hosts, independent of operating system or hardware architecture.
Details of programming COM and CORBA are described elsewhere, but here it is important to note that although they are conceptually similar, the programming models are quite different.
Java provides a potential solution to this problem, since it provides portable interfaces to either CORBA or COM. Java and CORBA complement one another as a portable implementation mechanism and a platform-independent distribution mechanism respectively (visit http://java.sun.com/j2ee/corba and www.omg.org/library/wpjava.html).
At the same time, Microsoft’s virtual machine allows Java and COM objects to interoperate. Thus, a Java program can access COM objects, and Java objects can present COM interfaces. The gotcha here is that the Java program must be compiled using the Microsoft Java compiler, using some Microsoft-specific compiler directives.
What Can We Do with It?
Once we have an application wrapped up in a component, what can we do with it?
• Build a command line executable, daemon or installable system service
• Build a GUI, using platform native components
• Integrate with other components in novel ways (for example, you may be able to activate your component in a container document or spreadsheet)
• Build a control embeddable in a Web page
• Integrate with a Web server to enable Web pages to deliver dynamic content
The jury is still out on these baffling technology choices. While implementations of some key Windows DNA components are still pending, broad industry acceptance of the Enterprise Java Architecture is increasing, particularly in the UNIX community. For example, a beta implementation of JavaServer Pages and Java Servlets, integrated with the widely used Apache Web server, is currently available (visit http://jakarta.apache.org).
On the other hand, COM enjoys enormously wide acceptance as a fundamental component of thousands of widely used Microsoft Windows applications, deployed on literally millions of systems around the world. If you are primarily targeting the Microsoft Windows platform for deployment, you will tend to achieve better performance, a lighter weight deployment and tighter integration with other Windows applications if you base your design around COM and the Windows DNA Architecture.
Our recommendation is to carefully examine your environmental requirements, ensuring that those requirements are met by supported versions of any third-party software on which your implementation will depend.
About the Author: Andrew Lowe is Chief Technologist for PSW Technologies Inc., and author of Porting UNIX Applications to Windows NT (Macmillan Technical Publishing). He can be reached at [email protected].