Microsoft’s Open Data Mining Specification

Major database vendors are adding data mining to their products. Oracle Corp. is enhancing its database to enable customers to reap the rewards of data mining tools. IBM Corp. is working to add the capability to DB2 so users won’t need third-party products. Informix Corp. is planning to ship a data-mining engine by the end of the year.

Microsoft Corp. is taking a different tack: The company is working on an initiative to extend the OLE DB data access interfaces, with the aim of arming ISVs with an open interface for integrating data mining tools and applications. The OLE DB for data mining interface is a joint effort between the Microsoft SQL Server development group and Microsoft Research.

OLE DB for data mining builds on Microsoft's Universal Data Access strategy, which includes the OLE DB for OLAP interface for multidimensional database access.

"The whole concept is to allow for a more flexible mode of communication between databases," says Tom Kreyche, a SQL Server product manager at Microsoft.

The OLE DB for data mining interface will enable diverse data mining products to more easily exchange data and results. Microsoft claims this will provide customers with transparent access to solutions such as fraud detection, credit-risk analysis, marketing campaign management, one-to-one marketing and adaptive Web content from existing line-of-business applications.

The company also explains that the specification is being created because the current data mining market is fragmented, making it difficult for ISVs to integrate various tools.

Data mining tools require data formats different from those used by tools for relational or multidimensional databases. The data mining extensions to OLE DB will provide a format common to existing tools and applications, such as statistical analysis, pattern recognition and visualization products, as well as data prediction and segmentation methods.

"A common interface for data mining will enable developers to embed data mining capabilities into their existing applications," said Tod Nielsen, vice president of marketing at Microsoft’s developer division, in a statement.

Microsoft clearly has the market presence to drive a specification to a de facto standard. But the company has been criticized in the past, primarily by competitors, for writing standards that are less robust than a combination of competitors’ tools. The OLAP Council (www.olapcouncil.org), for instance, backed by a number of vendors, had the MDAPI specification before Microsoft pushed its OLE DB for OLAP specification, version 2.0 of which is currently being reviewed.

Dwight Davis, an analyst with market research firm Summit Strategies (www.summitstrat.com), says Microsoft’s OLE DB for OLAP specification basically pushed the MDAPI to the back burner, much to the chagrin of the OLAP Council and some of its more prominent backers, namely Oracle. The most common criticisms of the Microsoft specification are directed at its robustness and performance.

"Administrators can always do a little more, a little faster, with the various interfaces written specifically for certain products. But with a standard interface, they don’t have to juggle all those different interfaces," Davis says.

Davis points out that there is a downside to a Microsoft OLE DB for data mining standard. "Whether there can be a way to access databases robustly with this kind of a generic standard is always a challenge," he says.

The data mining extensions to OLE DB are being examined by a group of data mining software vendors as part of Microsoft's review process. Final publication of the specification is expected in the second half of 1999.