HTML and XML: The Internet Team

About a year ago in this column, I considered the current state of eXtensible Markup Language (XML) and advised readers to "keep an eye on it," but not to plan for immediate implementations. A year is a long time for the Internet, and a reader recently asked me to reconsider.

All too often XML is simply viewed as HTML on steroids: a markup language that succeeds HTML and surpasses it in flexibility and strength. In fact, XML is aimed at an entirely different set of problems. HTML is a universal language for displaying data and, with its companion Cascading Style Sheets, is likely to remain the way we display data for the foreseeable future.

While HTML is a reasonable way to ensure vendor-independent presentation, it is a total failure at supporting applications. When you send a query to a remote database, the server must still return an HTML page. The data, and any accompanying formatting information, reside in the HTML page. The HTML page is particularly dumb: It knows nothing about the embedded information and can’t take action on the data. Instead, HTML tags can only be used to describe how the data should eventually be displayed.

Unlike HTML, XML describes information, not the user interface. XML allows any author to create his or her own tags that describe the data. The standard allows you to transfer structured information from server to client across the Internet. This means application designers can use XML to send data over the Internet to other applications without losing the meaning of the data. The benefit works in the other direction, too. Clients, instead of using today’s fairly cumbersome methods of sending information to servers, can now structure the data using XML and be assured that a server will be able to parse and act on the data.
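As a minimal sketch of that idea, the Python example below parses a small XML document and acts on the data it carries. The tag names (`order`, `customer`, `item`, `unit_price`) are invented for illustration and come from no standard, and the use of Python's standard-library `xml.etree.ElementTree` parser is an assumption of this sketch, not a tool discussed in this column.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: every tag and attribute name here is invented
# for this example. The tags describe what the data is, not how it looks.
ORDER_XML = """
<order id="1001">
  <customer>Acme Corp</customer>
  <item sku="A-17" quantity="3">
    <description>Widget</description>
    <unit_price currency="USD">4.50</unit_price>
  </item>
  <item sku="B-02" quantity="1">
    <description>Gadget</description>
    <unit_price currency="USD">12.00</unit_price>
  </item>
</order>
"""


def order_total(xml_text):
    """Return (customer name, order total) computed from the XML text."""
    root = ET.fromstring(xml_text.strip())
    customer = root.findtext("customer")
    total = 0.0
    for item in root.findall("item"):
        quantity = int(item.get("quantity"))
        price = float(item.findtext("unit_price"))
        total += quantity * price
    return customer, total


if __name__ == "__main__":
    customer, total = order_total(ORDER_XML)
    print(f"{customer}: {total:.2f}")
```

Because the receiving program can ask for "the unit price of each item" by name, it can compute with the data rather than merely display it, which is the distinction drawn above between XML and HTML.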

Although developers can use any tag they like, there is an increasing consensus that a common set of XML definitions will emerge for specific applications. In fact, some industries have already developed their own XML standards.

XML promises to propel Internet applications out of the current document publishing model and into a new era of data-aware, client/server computing. It may have a dramatic effect on the way all application developers work with data. Still, it’s very important to understand that XML complements HTML; it doesn’t replace it. XML defines no default presentation for any tag, so you cannot use XML alone to specify how data is to be rendered at the user’s workstation. One way to handle rendering is simply to combine XML with HTML: HTML formats the contents of the document, and XML defines the structure of the data.

A better approach is to use style sheets to define how each piece of the XML data should be formatted. While Cascading Style Sheets (CSS) can be used to do this, a new technology, called eXtensible Stylesheet Language (XSL), has been designed exactly for this purpose.
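The separation that a style sheet provides can be sketched in a few lines of Python. This is neither CSS nor XSL: the `STYLE_RULES` table, the `render` function, and the sample tags are all hypothetical, and the sketch only illustrates the principle that presentation rules live apart from the data they format.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical presentation rules, kept separate from the data:
# XML tag name -> HTML element used to display it.
STYLE_RULES = {
    "title": "h1",
    "author": "em",
    "summary": "p",
}

# Hypothetical data document; it says nothing about how it should look.
ARTICLE_XML = (
    "<article>"
    "<title>HTML &amp; XML</title>"
    "<author>A. Writer</author>"
    "<stock_code>X-42</stock_code>"
    "</article>"
)


def render(xml_text, rules):
    """Apply the presentation rules to each child of the root element."""
    root = ET.fromstring(xml_text)
    parts = []
    for child in root:
        html_tag = rules.get(child.tag)
        if html_tag is None:
            continue  # no rule for this tag, so it is not displayed
        text = escape(child.text or "")
        parts.append(f"<{html_tag}>{text}</{html_tag}>")
    return "\n".join(parts)


if __name__ == "__main__":
    print(render(ARTICLE_XML, STYLE_RULES))
```

Changing the table changes the rendering without touching the XML, and the same XML can be handed to a program that ignores presentation entirely.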

For XML to work in early deployments, the desktop, specifically the browser, must be able to make sense of XML. The embedded engine for understanding XML is called a parser. To date, only a few parsers have been developed. Microsoft’s current parser, called MSXML, ships as an ActiveX control in Internet Explorer version 4. Netscape has yet to incorporate a parser into version 4 of Navigator, but in the next release of Navigator, Netscape plans to include XP -- a parser written in C that is faster than its competitors, which are commonly written in Java.

In practical terms, this means that XML applications can -- so far -- only be written for Internet Explorer. Worse still, in my opinion, the application development environments that take advantage of XML are few and expensive, so early XML deployments are likely to be costly.

Things are sure to change. Microsoft Office 2000 will allow users to save documents in the traditional proprietary binary formats (.DOC, .PPT and .XLS) along with a new XML/HTML companion format. New versions of browsers are set to appear this winter that are certain to have ever-improving support for XML.

Until then, I still believe that XML is one of the most important standards for corporations to watch. Is it time to jump on the bandwagon? For now, I say the answer is no. I would wait until the tools -- both for developers and users -- become more widely available.

--Mark McFadden is a consultant and is communications director for the Commercial Internet eXchange (Washington). Contact him at