In-Depth

XML: From Chaos to Cooperation

Technological infrastructures within companies and supply chains today resemble the bar scene in "Star Wars." XML's promise: To turn that chaos into universal cooperation, thus enabling Web services. Here's how the XML revolution is affecting your company.

Listen to the XML buzz, and it's easy to hear echoes of the Internet in the 1990s. Vendors everywhere are incorporating eXtensible Markup Language (XML) into products and press releases. The media claims XML will make businesses more efficient and supply chains more effective. Numerous committees and organizations are hard at work trying to nail down details that can turn the promises into reality.

Is XML truly a game-changer? Although XML is a headline star, understanding is sometimes marred by what IT executives want it to do instead of what it can do. XML isn't a product or a platform, nor is it a replacement for any object-oriented (OO) language like C++. Neither will it completely replace CORBA, COM and other middleware, which will continue to be useful for connectivity that doesn't involve the Internet. Nor is it HTML on steroids, although it also uses tags.

Instead, XML is an enabling technology, comparable to IP. It leverages Internet ubiquity to enable intra- and inter-organizational integration. XML is particularly appealing because it can accommodate the rigid data structures of C, COBOL and other traditional languages without requiring programming that can cascade into unforeseen events. And unlike middleware, the current workhorse of application integration, XML provides guidance about the content of the data being transferred.
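To make the contrast with rigid record structures concrete, here is a minimal Python sketch. It assumes a hypothetical fixed-width customer record of the kind a COBOL program might emit; the field names, offsets and lengths are invented for illustration. Each field is wrapped in a tag that names its content, so the receiver no longer has to know the byte layout.

```python
import xml.etree.ElementTree as ET

# Hypothetical fixed-width layout in the style of a COBOL record:
# (field name, start offset, length). These values are assumptions.
LAYOUT = [
    ("customer_id", 0, 6),
    ("name", 6, 20),
    ("balance_cents", 26, 9),
]

def record_to_xml(record: str) -> str:
    """Wrap each fixed-width field in a tag that names its content."""
    root = ET.Element("customer")
    for name, start, length in LAYOUT:
        field = ET.SubElement(root, name)
        field.text = record[start:start + length].strip()
    return ET.tostring(root, encoding="unicode")

# Build a 35-character sample record and convert it.
record = "000042" + "Jane Doe".ljust(20) + "000012500"
xml_text = record_to_xml(record)
print(xml_text)
```

The legacy program keeps writing the same record; only a thin translation layer changes, which is the appeal described above.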

Initial interest revolves around uniting different operating systems, languages, distributed systems and databases. However, XML's real potential comes as a stepping stone away from monolithic, one-size-fits-all applications. This vision involves Web services and similar technologies with scripted components that companies can dynamically orchestrate for specific functionality.

Although advanced XML standards are still being finalized, XML is already being used for external integration in the Province of Manitoba [See "XML Moves Province from Last to First"—Ed.] and b-to-b exchanges like Escrow.com. Others, like healthcare giant Bayer, are using XML for internal integration. California State University also has found that XML can substantially speed mainframe data access.

XML Moves Province from Last to First

In the Canadian province of Manitoba, XML helped transform the provincial government's dowdy computing structure into an integrated infrastructure that both eliminated manual processes and opened the door to eventual Web services. An initial XML-based project was so successful that it took the province from worst to first in terms of integration, according to a supplier that works with Canada's governmental entities.

Like many organizations, the Province of Manitoba had grown its technology infrastructure according to departmental directives instead of organizational imperatives. The result: 26 departments, each with its own isolated applications. Information wasn't shared, so a citizen who moved would have to tell multiple departments about the new address. Recognizing the problem, provincial executives began a "Better Systems Initiative" five years ago to cooperatively integrate applications and information exchange.

The first step was an end-to-end architectural blueprint that could guide purchases and development for the next decade. Core technologies included Java, IBM WebSphere Advanced Edition and DB2.

The first integration challenges involved the personal property registry, which registers liens against property used to secure loans. The existing system involved manual data entry of more than 1,500 transactions daily by provincial personnel. The transactions were sent by a Vancouver-based clearing house called Canadian Securities Registration Systems.

After rejecting proprietary, queue-based solutions via a Virtual Private Network (VPN), the Province of Manitoba looked at XML. Although XML was just emerging as a business data integration technology two years ago, it had the backing of the World Wide Web Consortium (W3C), and had stemmed from Standard Generalized Markup Language (SGML), a long-time standard. Additionally, it was platform-neutral, compressible, extensible and encryptable.

"XML fit in well with our efforts to develop a general solution that we could use for [Canadian Securities Registration Systems] or any other business partner in the future," says Greg Boettcher, chief software architect.

Because the province was an early adopter, it had to develop its own Document Type Definitions (DTDs). DTDs, which have since been superseded by schemas, use plain-text, ASCII-like tags to describe the formatting and content of a valid XML document. Similar to HTML tags, XML tags identify data so that it can be appropriately mapped within an application. Now that task is much easier, since industry consortia and other groups have developed freely available DTDs and schemas.
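As a rough illustration of how tags let an application map incoming data, the Python sketch below parses a hypothetical lien document that carries a small internal DTD. The element names are invented and are not the province's actual DTD. Note that Python's standard-library parser reads the DTD declarations but does not validate the document against them; a validating parser would be needed for that.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from decimal import Decimal

# Hypothetical lien document. The internal DTD subset declares the
# allowed elements; the stdlib parser reads it but does not validate.
LIEN_XML = """<?xml version="1.0"?>
<!DOCTYPE lien [
  <!ELEMENT lien (debtor, collateral, amount)>
  <!ELEMENT debtor (#PCDATA)>
  <!ELEMENT collateral (#PCDATA)>
  <!ELEMENT amount (#PCDATA)>
]>
<lien>
  <debtor>Jane Doe</debtor>
  <collateral>2001 pickup truck</collateral>
  <amount>12500.00</amount>
</lien>"""

@dataclass
class Lien:
    debtor: str
    collateral: str
    amount: Decimal

def parse_lien(xml_text: str) -> Lien:
    """Map each tagged value onto the matching application field."""
    root = ET.fromstring(xml_text)
    return Lien(
        debtor=root.findtext("debtor"),
        collateral=root.findtext("collateral"),
        amount=Decimal(root.findtext("amount")),
    )

lien = parse_lien(LIEN_XML)
print(lien)
```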

Launched in September 2000, the solution included replacing the mainframe-based personal property registry with a Java-based system that uses servlets and Enterprise JavaBeans (EJBs) running on WebSphere Advanced on an AIX-based IBM S/80. Data updates involved an XML document that was "wrappered" inside another "dispatcher" XML document. The dispatcher could dynamically determine the appropriate application and create the necessary link. Security was handled via SSL.
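The wrapper-and-dispatcher pattern can be sketched in a few lines of Python. This is not the province's implementation, which ran on Java and WebSphere; the envelope layout, tag names and handler functions below are all assumptions made for illustration. The idea is simply that the outer document is unwrapped and the payload's root tag selects the application that should receive it.

```python
import xml.etree.ElementTree as ET

# Hypothetical handlers standing in for back-end applications.
def register_lien(payload: ET.Element) -> str:
    return "lien registered for " + payload.findtext("debtor")

def change_address(payload: ET.Element) -> str:
    return "address updated to " + payload.findtext("street")

# Routing table: payload root tag -> application entry point.
HANDLERS = {"lien": register_lien, "addressChange": change_address}

def dispatch(envelope_xml: str) -> str:
    """Unwrap the dispatcher document and route its payload by tag name."""
    envelope = ET.fromstring(envelope_xml)
    payload = envelope.find("payload")[0]  # first child element of <payload>
    handler = HANDLERS.get(payload.tag)
    if handler is None:
        raise ValueError("no application registered for <%s>" % payload.tag)
    return handler(payload)

message = """<dispatcher>
  <payload>
    <lien><debtor>Jane Doe</debtor></lien>
  </payload>
</dispatcher>"""
result = dispatch(message)
print(result)
```

Adding a new business partner or document type then means registering one more handler rather than building a new point-to-point link.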

The XML-based system produced numerous benefits. Previously, up to 20 clerks were involved in manually transferring lien information. With the elimination of manual data entry, department size has been reduced by six clerks. Only three clerks are needed to run reports and monitor transactions. Turnaround time has been slashed from weeks to hours. The benefits prompted Canadian Securities Registration Systems executives to observe that the province "had leapfrogged from the last jurisdiction to automate, to the one with the most elegant solution."

—N.W.

Such examples are growing more common as XML knowledge grows, standards coalesce, tools multiply and Internet-enabled collaboration increases. Interest is high in XML because it's a "meta-language" that represents an agreed-upon transfer protocol at the application level or layer, whether the transfer is between applications or hardware. One benefit of such integrations is that business logic can be written once and used many times, reducing application development time by perhaps a third.

XML can also enable supply chain connectivity. This connectivity occurs almost at once if multiple companies adopt the same XML "rules." By contrast, the granddaddy of business integration, EDI, requires integration of one company at a time. [See "Whither EDI?"—Ed.] XML has other advantages over EDI. While EDI costs $25 to $50 per transaction, XML costs about $5. As a result, Zona Research predicts that business-to-business transactions using XML will rise 40 percent by 2003. In a Forrester survey of 51 global companies, 71 percent were focused on using XML.

Multiple endorsements have made the World Wide Web Consortium's (W3C) XML standards an ecumenical common ground for IBM, Microsoft, Sun, Oracle and other companies that have butted heads in the past. The significance of such agreements increases when you consider that no major vendor is producing Web pages with W3C standards-compliant code.

Standards Drive Acceptance
Such standardization, plus a reference tool for expanded usage, is driving XML acceptance. Engineering the drive toward standardization are Electronic Business XML (ebXML), sponsored by the United Nations and the Organization for the Advancement of Structured Information Standards (OASIS), and the W3C. ebXML has incorporated Simple Object Access Protocol (SOAP), an XML-based messaging protocol used by Microsoft's BizTalk and other applications. SOAP is also a building block for Web Services Description Language (WSDL), which forms the basis for creating and using Web services. Although originally developed by Microsoft, IBM and other vendors, SOAP standards development has been taken over by the W3C, which recently published working drafts of SOAP version 1.2.
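Because a SOAP message is itself XML, it can be assembled with ordinary XML tooling. The Python sketch below wraps a business document in a SOAP 1.1 envelope, with an optional header block ahead of the body. The namespace URI is the one defined for SOAP 1.1; the `order` payload and the `routeTo` header are hypothetical and do not come from any standard.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 namespace
ET.register_namespace("soap", SOAP_NS)

def wrap_in_soap(payload: ET.Element, headers=()) -> ET.Element:
    """Place the payload in a SOAP Body, with optional Header entries."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    if headers:
        # When present, the Header must come before the Body.
        header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
        header.extend(headers)
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    body.append(payload)
    return envelope

# Hypothetical business payload and routing header.
order = ET.Element("order")
ET.SubElement(order, "item").text = "widget"
routing = ET.Element("routeTo")
routing.text = "warehouse-7"

envelope = wrap_in_soap(order, headers=[routing])
print(ET.tostring(envelope, encoding="unicode"))
```

The payload travels untouched inside the body, while the header carries whatever extra handling instructions the parties have agreed on.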

"SOAP is like an envelope for the XML data inside. It's extensible for additional functionality by adding headers to support security, reliability and routing. These are all areas that will be standardized for Web services," says Bob Sutor, director of e-business standards strategy for IBM.

The W3C announced the XML Schema Recommendation in May. The schema specification, which is written in XML itself, is more suitable for e-business than the older Document Type Definition (DTD) specification. DTDs lacked a methodology for describing data format such as a date. Schemas provide the same functionality as DTDs, but also describe formats in more detail, enabling validation. In other words, while DTDs could indicate a place for a number, schemas can define a currency, complete with the number of decimal places. For some data integration, neither DTDs nor schemas are required.
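The currency example can be sketched in Python. The schema fragment below is ordinary XML, so the same parser that reads business documents can read it. The type name `Currency` is illustrative, and the checking function is a hand-rolled stand-in for a single facet (`xs:fractionDigits`, the maximum number of decimal places), not a real schema validator, which would enforce far more.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal, InvalidOperation

XS = "{http://www.w3.org/2001/XMLSchema}"

# A schema fragment is itself XML. The type name "Currency" is an assumption.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="Currency">
    <xs:restriction base="xs:decimal">
      <xs:fractionDigits value="2"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>"""

def max_fraction_digits(schema_xml: str, type_name: str) -> int:
    """Read the fractionDigits facet declared for a named simple type."""
    root = ET.fromstring(schema_xml)
    for simple_type in root.iter(XS + "simpleType"):
        if simple_type.get("name") == type_name:
            facet = simple_type.find(f"{XS}restriction/{XS}fractionDigits")
            return int(facet.get("value"))
    raise KeyError(type_name)

def is_valid_currency(text: str, limit: int) -> bool:
    """Simplified check: a finite decimal with at most `limit` decimal places."""
    try:
        value = Decimal(text)
    except InvalidOperation:
        return False
    if not value.is_finite():
        return False
    return -value.as_tuple().exponent <= limit

limit = max_fraction_digits(SCHEMA, "Currency")
for candidate in ("12.50", "12.505", "twelve"):
    print(candidate, is_valid_currency(candidate, limit))
```

A DTD could say only that an amount element contains text; the schema facet lets the receiver reject "12.505" before it reaches the application.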

More than 2,000 companies have signed up for the global Universal Description, Discovery and Integration (UDDI) registries, the equivalent of XML-centric Yellow Pages. These registries can expand XML usage by enabling companies to outline their capabilities to conduct e-business and search for potential trading partners. UDDI usage will undoubtedly grow since Microsoft's .NET initiative for Web services is based on UDDI as well as SOAP.

Because of XML's ability to simplify inter-organizational data transfers, public and private exchanges have been one of the most enthusiastic adopters of XML. For example, Escrow.com helps b-to-b exchanges ensure that fulfillment and settlement terms get executed to both parties' satisfaction. Initially, Escrow.com had to go through the process of developing schemas and ensuring proper mapping.

"Initially, there was a lot of learning between us and the exchanges," says Paul Hodgetts, director of product development at Escrow.com in Santa Ana, Calif. "We had to work closely with them to understand what kind of information they had, learn how it was described and then to bridge the gap between that and what we needed. Much of this discovery revolved around business process issues. Despite the simplicity of XML, there were still challenges to ensure appropriate formatting without omitting required elements."

To ensure transactions ran smoothly, Escrow.com even arranged for XML training for business partners. Enabling XML didn't require any major changes to its Sun Web or application servers or its J2EE platforms. XML translation tools are being used to ensure acceptance of almost any XML-based document and conversion into the format that Escrow.com requires.

XML is also useful for internal integration. Bayer, whose North American headquarters are in Pittsburgh, Pa., has more than 10,000 healthcare, chemical and animal health products. Like other companies, Bayer has discovered that SAP's power is locked within proprietary formats. This complicates integrating SAP data with Web applications.

Running on an IBM AS/400, Bayer's portal provided veterinarians with animal health information. "We needed to make the information on the Web portal accessible to SAP without doing a lot of infrastructural coding," says Sujan Thanjavuru, technical project manager for Bayer.

The proof-of-concept, using WebMethods and IBM MQSeries, eliminated the worry that XML might interfere with processing. SAP is now able to easily access the AS/400 data. The primary obstacles were educational, not technical. "There's not a lot of understanding of XML and its potential," Thanjavuru says, noting that Bayer is also using XML with two b-to-b exchanges. "It's still confused with HTML."

XML is also useful for presentation issues, which are growing more important as the world turns toward Web-centricity. IT personnel at the California State University in Hayward could view the results of mainframe-based data mart extracts and other user-run information via a browser. However, pages loaded slowly and there were minor problems with conversions from EBCDIC to ASCII. Using WebFOCUS from Information Builders, the data was bound to XML, and HTML code was bound to XML fields. That meant the data, not the Web page, got refreshed. "It's pretty instantaneous, although we're having some minor issues with Netscape browsers," says Bob Hughes, lead database architect.

Whither EDI?

Is the white knight of XML about to unseat the aging dark lord of b-to-b integration, EDI (Electronic Data Interchange)? XML has a lot in its favor. It's a child of the Internet, while EDI primarily relies on expensive value-added networks (VANs). XML is relatively easy to implement, since it's optimized for ease-of-programming and Web server operations. In contrast, EDI is machine-readable only, the message format can take months to master and an EDI server can cost up to $100,000.

But rather than being adversaries, experts say, XML and EDI can coexist quite happily within enterprise networks. In the long term, EDI's role as kingpin of business data exchange may be diminished, but it's a long way from being a figurehead.

In fact, a recent report by AMR Research indicates that about 80 percent of b-to-b data exchanges still use EDI. XML usage will continue to grow, the report predicts, until it represents about half of all business data exchange in 2003. Those figures echo findings from Peregrine Systems Inc., a major EDI vendor. Analyzing customer usage, Peregrine now finds about 85 percent of data sent via Web forms uses EDI, and estimates that EDI will still represent about 40 percent of such transfers in five years. "That's not because of the decline of EDI but because of the growth of XML," says Steve Gaylor, vice president of product marketing in Peregrine's Emarkets group.

One reason, of course, is that EDI is a long-established incumbent. Although difficult and expensive to implement, EDI networks are working well now. Knowing well the perils of new systems, network managers are loath to fix what ain't broken, especially when the pressure's on to maximize ROI from existing investments.

Another reason is that the bloom has faded from XML to some extent. Although XML has a lot of advantages, there are still issues with mapping, trading partner integration and messaging. According to Amy Hedrick, senior analyst at AMR Research: "Despite the hype, you can't call XML implementation a walk in the park." While EDI boasts rock-solid levels of security, sequencing and acknowledgement, questions still linger about emerging Internet standards in these areas.

Finally, integration efforts have reduced the need for an either-or solution. XML standards efforts such as Electronic Business XML (ebXML) and the Organization for the Advancement of Structured Information Standards (OASIS) have incorporated full support for legacy systems, facilitating the growth of EDI/XML transformation tools. Additionally, XML documents can be wrappered within EDI networks.

Long-term, EDI may become the COBOL of data communications, vital and useful but less and less common. As supply chains become more complex and collaborative, XML will likely become the technology of choice because it's better suited for many-to-many connectivity and business process integration.

"People assume that XML is the holy grail, but there is still a lot of work to be done to match EDI in some areas," says Hedrick. "As a result, look for parallel EDI and XML systems for a number of years."

One sign of EDI's staying power: Several years ago, Peregrine removed EDI mentions from its marketing material. It's recently added them back.

—N.W.

Concerns About Side Effects
As useful as XML is, it's important to remember that even miracle drugs have side effects. The tag-embedded information creates a lot of overhead. "The tags can expand the data by as much as 10 to 15 times. Management of that overhead in memory presents numerous challenges." The same expansion also consumes network bandwidth, "although that's not as big an issue as it used to be," says Paul Roth, CTO at CommerceQuest, a solutions provider in Tampa, Fla. Some of the issues, he adds, can be alleviated with compression.
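Tag overhead is easy to measure. The Python sketch below serializes the same made-up order lines once as comma-separated text and once as XML, then compresses both with zlib. The data and tag names are invented, and the ratios depend heavily on tag length and record shape, so the printed numbers illustrate the effect rather than benchmark it.

```python
import csv
import io
import zlib
import xml.etree.ElementTree as ET

# Invented sample data: 200 short order lines.
ROWS = [{"sku": f"A{i:04d}", "qty": str(i % 7), "price": "9.99"} for i in range(200)]

def as_csv(rows) -> bytes:
    """Serialize rows as comma-separated text with one header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["sku", "qty", "price"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue().encode("utf-8")

def as_xml(rows) -> bytes:
    """Serialize rows as XML, with a named tag around every value."""
    root = ET.Element("orders")
    for row in rows:
        line = ET.SubElement(root, "orderLine")
        for key, value in row.items():
            ET.SubElement(line, key).text = value
    return ET.tostring(root, encoding="utf-8")

csv_bytes = as_csv(ROWS)
xml_bytes = as_xml(ROWS)
xml_packed = zlib.compress(xml_bytes)

print("CSV bytes:           ", len(csv_bytes))
print("XML bytes:           ", len(xml_bytes))
print("XML/CSV expansion:   ", round(len(xml_bytes) / len(csv_bytes), 1))
print("XML compressed bytes:", len(xml_packed))
```

Because the same tag names repeat on every record, XML tends to compress well, which is why compression takes much of the sting out of the bandwidth cost.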

Some fear that XML will degenerate into a variety of incompatible dialects. Part of XML's strength comes from its standardization. While such standardization cannot meet the specific requirements of every industry, the proliferation of industry-specific XML standards raises the specter that companies in one industry may not be able to communicate with other industries. A similar objection comes from the Semantic Web movement spearheaded by Web legend Tim Berners-Lee. Semantic Web adherents believe that their vision of an agent-based Web will be undone by all the flavors of XML, making it harder to deliver specific answers to questions.

Despite these issues, a larger perspective is needed, argues Clive Finkelstein, author of numerous books on information engineering, including his most recent, "Building Corporate Portals With XML."

Finkelstein points out that corporate computing benefits whenever key technologies involving communications, platforms or data are freed from proprietary shackles. In the early 1990s, the Internet opened up communications. In the mid-1990s, Java was a portable language that could be executed on any platform. XML represents the ability to make data portable.

On one level, portable data eliminates costs associated with redundant data entry or complex integration to ensure organizational synchronization. Other savings result from lower business process costs. For example, computer manufacturer Dell estimated that its largest accounts can save up to $4 million annually with XML-based procurement systems integrated with Dell's systems. But more valuable, according to Finkelstein, is the ability of XML to unlock business logic and functionality that has been trapped in platform- or OS-specific databases or applications.

"Billions of dollars of code in proprietary platforms can be made platform-agnostic. Code modules and functionality can be invoked using XML messaging wherever they reside around the world," says Finkelstein.

The Marriage of XML and EDI

Electronic Data Interchange (EDI) has for many years offered the promise of connecting corporate systems internally and externally to share business data. With EDI, business can tap directly into vendor and customer inventory and ordering systems, automating and lubricating business processes.

The hype around Web services suggests XML schema will supplant EDI as the standard for business-to-business automation, but businesses with expensive EDI systems in place are hardly going to rip out a proven technology simply on the basis of Microsoft and others' prognostications. EDI will stick around for years to come.

Smaller businesses are drawn to commodity software like Microsoft's BizTalk Server, which enables business integration through XML. Given that, and with so many large enterprises with EDI systems in place, a market clearly exists for EDI-to-XML connectors that will allow EDI systems to talk to XML systems.

One company, Global eXchange Services, a subsidiary of General Electric Corp., offers a standalone product for getting EDI and XML to talk.

The GXS message broker sits between a legacy system and an XML system and converts the EDI fields to XML schema. It has adapters to talk to common business platforms such as SAP. General Electric offered traditional EDI services for enterprises, then spun off GXS as a separate unit to tackle the XML market.
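The core of any such conversion is mechanical: split the EDI message into segments and elements, then give each piece an XML tag. The Python sketch below shows the idea on a simplified, EDIFACT-style sample. It is not how the GXS product works, and it ignores release characters, component separators and service segments that a production translator must handle. The output tag names are invented.

```python
import xml.etree.ElementTree as ET

def edi_to_xml(message: str, seg_term: str = "'", elem_sep: str = "+") -> ET.Element:
    """Turn delimiter-separated EDI segments into nested XML elements."""
    root = ET.Element("ediMessage")
    for raw in message.split(seg_term):
        raw = raw.strip()
        if not raw:
            continue  # skip the empty piece after the final terminator
        tag, *elements = raw.split(elem_sep)
        segment = ET.SubElement(root, "segment", id=tag)
        for position, value in enumerate(elements, start=1):
            ET.SubElement(segment, "element", pos=str(position)).text = value
    return root

# Simplified EDIFACT-style sample: message header, date and quantity segments.
sample = "BGM+220+PO12345'DTM+137:20010615:102'QTY+21:48'"
root = edi_to_xml(sample)
print(ET.tostring(root, encoding="unicode"))
```

Real connectors go a step further and map positional elements onto meaningful names such as a purchase order number, which is where adapters and mapping tools earn their keep.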

Some software vendors offer XML transaction servers that enable administrators to convert EDI data to XML. CommerceSight from Enigma sits as an application integration layer between systems, converting data to XML. Enigma focuses on end-to-end XML integration and provides EDI functionality to that end.

Other companies that offer EDI functionality with XML transaction servers include Autonomy Corporation plc; Extricity, which was bought by Peregrine Systems Inc.; Software AG; and VelociGen Inc.

Business needs seem to find open source projects as much as commercial products these days, and EDI-to-XML translation is no exception. The XML-Edifact project offers a Perl module for translating well-formed EDI documents into XML and vice versa. Unlike the commercial products, it's freely available.

—Christopher McConnell

Web Services Build on XML
In essence, this is the Web services vision that builds on XML capabilities. Companies will be able to buy or sell functionality on a per-use basis. They then can orchestrate that functionality to achieve a specific business goal, without worrying about integration or other complexities. Harbingers of that approach can be seen in Microsoft's efforts to sell Office XP through subscription licensing.

"The IT industry today resembles the auto industry early this century. Back then, cars were individually handcrafted; today, applications are essentially handcrafted. In 10 years, university graduates will look with amazement at how we wrote incompatible programs for different platforms. IT functionality will result from a series of Lego-like building blocks, paid for as necessary, and assembled in different ways by different enterprises to achieve required results," believes Finkelstein.

Technological infrastructures within companies and supply chains today resemble the "Star Wars" bar scene. XML promises to turn that chaos into one big, happy family. The industry is currently cooperating well to make XML a universal peacemaker, and notable successes have been scored. However, it's still an open question whether a vision of universal computing brotherhood can be achieved while the standards are still evolving and each industry insists on special variations. Yet XML is vital because it's a potential key to unlocking the functionality and information hidden within applications, thus opening the door to powerful new applications and collaboration. No one knows the outcome of the XML revolution, but no one can afford to sit on the sidelines.
