Mining Specification Promises Rich Returns

Finally, data mining -- that sophisticated and esoteric process that only statisticians could love -- is coming to the masses.

Microsoft Corp. (www.microsoft.com) released into beta its OLE DB for Data Mining specification, an API for integrating data mining tools and capabilities into applications.

The data mining API -- based on the SQL query language -- will provide capabilities similar to those of the ODBC API, says Amir Netz, architect and development manager of SQL Server 2000 Analysis Services at Microsoft. "Client applications can be written to work against a generic OLE DB. Developers no longer need to do custom coding for specific proprietary interfaces. The uniform concepts and data representations of the API will help drive data mining into the mainstream and expose a much larger developer community to data mining technology."

Dave King, CTO of Comshare Inc. (www.comshare.com), agrees. "Just having a single vendor-proposed standard is going to go at least a ways toward promoting wider spread use," he says. "So far, almost all data mining solutions have been proprietary."

Data mining has been hyped for several years, but its practice has been limited mostly to large-scale Unix sites with trained statisticians. "You had to be pretty much a hard-core numbers cruncher to deal with data mining tools," says Clay Young, vice president of marketing at Knosys Inc. (www.knosysinc.com). "With this specification, we can now bring data mining down to the average decision-maker's desktop."

Netz foresees data mining running in conjunction with Web sites to pull customer data for optimum ad placement, cross selling, personalization, fraud detection, and pricing opportunities. "The scenario is especially lucrative as the customer is interacting with a computer which can apply predictions in real-time," he says.

The data mining API is not related to the OLAP Services component of SQL Server 7.0, but "the Analysis Server in SQL Server 2000 integrates both OLAP and data mining technologies," Netz notes.

"At the end of the day, look at what ships on SQL Server -- virtually every line of business app in the world," Young says. "Having a data mining engine, an OLAP engine, and an English language query engine underneath these line of business apps means technology is going to rapidly become pervasive."

While mining unstructured data is still a challenge, Microsoft's data mining provides a way to mine structured OLAP data, Young says. "With structured data, an average decision maker can walk through a wizard, and look for relationships between customer groups and sales," he explains.

As business intelligence vendors build data mining into their tools through the specification, end users may not be aware they are using data mining. "There's probably some places where it will be transparent to end users that they're doing data mining," King says. For example, end users will not have to enter complicated formulas or statistical queries to link data, but will be able to use simple English language queries, he states. "You will almost be able to ask, 'tell me what's important here,' rather than asking for a 'predicted value,'" he notes.

About a dozen business intelligence vendors have indicated they will employ the new protocol, including ANGOSS Software Corp., Appsource Corp., Comshare Inc., DB Miner Technology Inc., Knosys Inc., Magnify Inc., Megaputer Intelligence Inc., Maximal Innovative Intelligence Ltd., NCR Corp., PolyVista Inc., and SPSS Inc.
