XQuery Use Grows Despite Lack of Finalized Standard

After over five years in the making, the XML Query language standard is still missing in action

With Oracle Corp. making noises about dropping a native XQuery implementation into an upcoming release of its 10g database, and Microsoft Corp. talking up native XQuery support in both its SQL Server 2005 database engine and .NET framework, the XML Query (XQuery) language is back in the spotlight, even if it’s still a not-quite-ready-for-primetime player.

XQuery is the unified standard for querying structured and unstructured data that (as of this writing) has been more than five years in the making. Delivery dates for a finalized XQuery 1.0 standard have come and gone. The XQuery specification itself has been through more than half a dozen W3C Working Drafts and was revised three times last year, when it was finally elevated to “Last Call” status.

A Last Call announcement is generally among the final stages before a Working Draft can be approved for W3C Candidate Recommendation, but the XQuery Working Draft has also been updated once this year, according to the W3C, to incorporate feedback from comments received during the Last Call period.

When will the finalized XQuery standard appear? A representative from the W3C XQuery Working Group did not respond to repeated requests for comment, but to some extent, it’s a moot question. The lack of a finalized standard hasn’t deterred many organizations from experimenting with XQuery and, in some cases, building production solutions on top of it. Enterprises have been ably assisted by Oracle, Microsoft, and other database vendors.

Oracle has provided an XQuery prototype for its flagship database for several years now, and Microsoft, Software AG, and other database vendors have also delivered XQuery prototypes.

Earlier this year, Oracle upped the ante, announcing plans to drop an XQuery facility into the second release of its 10g data store. Microsoft, for its part, has built an XQuery engine into its SQL Server 2005 database server and into the .NET framework’s Common Language Runtime (CLR) environment.

What’s more, users such as Simon Kelly, a former Web developer with the Karlsruhe Tritium Neutrino Experiment (KATRIN)—a scientific project based in Germany that’s sponsored by several European and U.S. academic and research institutions—are turning to the not-quite-ready-for-primetime XQuery specification to solve vexing data interchange and analysis problems.

Kelly, for example, tapped XQuery to help with the problem of gathering, storing, analyzing, and presenting very large data sets ranging from four to 150 GB.

If anything, he says, the lack of a finalized XQuery standard was the least of his concerns. Instead, the relative immaturity of the available XML databases was the biggest problem. “There are no XML databases that are stable enough to carry out” the gathering, storing, and analyzing of data with zero loss, he says. “So all the Control Data was being stored in an Oracle 9i relational database.”

The good news, Kelly says, is that Oracle’s XQuery deliverable worked as advertised—with one major caveat, of course: “The XQuery method of creating output XML was a great success. However, there is one major problem with the Oracle prototype, and others later used as replacements, in that it is very CPU- and memory-intensive,” he comments. “When searching the data, we are sometimes required to look at up to a day’s worth of signals at a time.”

KATRIN researchers often work with more than 17,000 samples, each clocking in at between four and 150 GB of data. “As you can well imagine, this creates very large XML documents even when each detector is accessed separately and only the critical signal times are taken,” Kelly says. “Coupled with the required detector data and schema format, Oracle’s XQuery prototype choked when the XML element count approached 100,000.”

Kelly eventually had to abandon XQuery in favor of another Oracle solution, called XSU. He stresses that his experience with XQuery should by no means deter users who expect to use the technology with smaller data sets, however.

“I would have rather used the XQuery implementation, as it is far simpler and cuts down on the amount of subsidiary work required to get the end result,” he comments. “And I would highly recommend it to anyone who will not have to exceed the 100,000 element threshold.”

Of course, other customers have had more luck with the still-to-be-finalized XQuery standard. Swiss ERP vendor Pro-Concept, for example, tapped XQuery to produce, transform, and consume XML documents that represent complex and hierarchical data into ERP categories (e.g., “Customers,” “Items,” “Bill of Materials,” and so on). Although there has always been the possibility that the W3C could make important changes between the XQuery draft and finalization stages, Pro-Concept was undeterred by the absence of a bona fide standard.

“I think that even if the standard is not yet adopted, the drafts and the use cases won't change a lot. [The W3C] may enlarge the standard … [but they will not change] the base structure of XQuery,” said Pro-Concept developer James Somers, in an interview last year.

Why are developers turning to XQuery? In part, they say, because XML has emerged as a de facto standard for exchanging data between and among heterogeneous applications. Simply put: if applications are exchanging information in XML, doesn’t it make sense to pose queries against them in XML, too? As a result, some adopters see XQuery as a technology standard fraught with possibility.

“The possibilities are really stunning,” Pro-Concept’s Somers said. “If you try to work with XML documents the way we did with relational databases, the answer is XQuery.”

About the Author

Stephen Swoyer is a Nashville, TN-based freelance journalist who writes about technology.