XML: The Web is Just the Beginning

How many different ways are there to represent stored data today? Off the top of my head, I can think of a whole bunch: relational tables, ISAM and VSAM files, ordinary flat files containing plain ASCII text and many more. Add to this the multiplicity of ways that data is represented when it’s transferred between systems -- such as HTML and the binary formats used to define packets for various protocols -- and you have the root cause of many sleepless nights for applications integrators.

All of these different ways of representing data were created to meet specific requirements, requirements that were usually driven by the need to make intelligent use of CPU cycles, disk space, or network bandwidth. But does this need still exist? The answer is yes, at least in some circumstances. But in a world awash in cheap MIPS, gigabytes of storage and multimegabit networks, it’s become much harder to justify the diverse data formats that exist today. And the hardware restrictions that motivated efficient but idiosyncratic data formats are rapidly eroding. Having one common way to describe all kinds of data, whether it’s stored in a database, sent across a network or used in some other way, would be a step forward.

We’re almost there. XML, the universal mechanism for describing data, has arrived. The first clear proof of this is in Web browsers, where XML support has become a critical feature.

But I think people place too much emphasis on XML’s use on the Web. It’s true that data described using XML will make up a larger and larger percentage of the bytes transferred between Web servers and browsers. It’s also true that bytes moved across the Web comprise an ever-larger proportion of data transfer in general. Yet innovative use of XML in data storage is every bit as important.

Consider, for example, the XML support in Oracle 8i’s Internet File System (iFS). iFS essentially provides a file system on top of a database. Clients can view an Oracle 8i database as a file server, with data appearing as ordinary folders and files on a Windows desktop. Since XML documents are ordinary text, iFS can store and retrieve them like any other file. But iFS can also read an XML Document Type Definition (DTD) that defines what tags are allowed in a specific class of documents. iFS can then automatically convert XML documents into relational records stored in an Oracle database.

How the various kinds of information in a document are stored in tables is controlled by a user-defined mapping between the XML DTD and the relational database. For example, an XML-defined order might include tags such as <customer>, <item> and <price>. Oracle’s iFS could use these tags to copy the data in each order into a single record in a table, with the various values in the order -- customer name, item, price and so on -- stored in individual fields in that record. Alternatively, the text for several tags might be stored in a single field, or the entire order might be stored in its XML form as a single item in the database. However it’s stored, the information can now be accessed using ordinary SQL queries. When a user retrieves one or more of these orders via iFS, the result will automatically be converted back into its XML form for processing by other software. Depending on how it’s accessed, the same information can be viewed as relational data or as an XML document. XML as a lingua franca for information storage and exchange is a powerful idea, and technologies like this one lead the way.
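To make the round trip concrete, here is a minimal sketch of the idea in Python. It is not iFS and uses no Oracle API: it stands in Python's standard xml.etree.ElementTree and sqlite3 modules, and the table name orders, the TAG_TO_COLUMN mapping and both helper functions are hypothetical names invented for illustration. It shows only the first storage option described above, one order per record with one tag per field.

```python
import sqlite3
import xml.etree.ElementTree as ET

# An order document using the tags mentioned in the text.
ORDER_XML = (
    "<order><customer>Acme Corp</customer>"
    "<item>Widget</item><price>19.95</price></order>"
)

# Hypothetical user-defined mapping: XML tag -> table column.
TAG_TO_COLUMN = {"customer": "customer", "item": "item", "price": "price"}


def store_order(conn, xml_text):
    """Parse one order document and insert it as a single record."""
    root = ET.fromstring(xml_text)
    row = {col: root.findtext(tag) for tag, col in TAG_TO_COLUMN.items()}
    conn.execute(
        "INSERT INTO orders (customer, item, price) "
        "VALUES (:customer, :item, :price)",
        row,
    )


def load_order_as_xml(conn, customer):
    """Fetch one order with ordinary SQL and rebuild its XML form."""
    record = conn.execute(
        "SELECT customer, item, price FROM orders WHERE customer = ?",
        (customer,),
    ).fetchone()
    order = ET.Element("order")
    for tag, value in zip(("customer", "item", "price"), record):
        ET.SubElement(order, tag).text = str(value)
    return ET.tostring(order, encoding="unicode")


# An in-memory database stands in for the real relational store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, item TEXT, price REAL)")
store_order(conn, ORDER_XML)

# The same data, seen first as a relational row and then as XML again.
sql_view = conn.execute(
    "SELECT item FROM orders WHERE price > 10"
).fetchall()
xml_view = load_order_as_xml(conn, "Acme Corp")
```

Here sql_view holds the result of an ordinary SQL query against the stored order, while xml_view is the same order rebuilt as an XML document. A real product would drive the mapping from the DTD rather than from a hand-written dictionary.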

We are approaching a day when a majority of new data formats, whatever their use, will be defined using XML. Whether the problem is to define the layout of information in a database, specify header formats for a new protocol or represent information exchanged between applications, some XML schema will be used to describe the data’s format.

The next time you find yourself defining a new format for storing or transferring information, I encourage you to begin by asking yourself a simple question: Can I use XML for this? More often than you may think, the answer will be yes. And choosing this soon-to-be-ubiquitous standard whenever possible puts you on the road to the right future.

--David Chappell is principal of Chappell & Associates (Minneapolis), an education and consulting firm. Contact him at david@chappellassoc.com.
