Case in Point: Democratizing Data Mining

AxCell puts Intelligent Miner 8 to work so users can analyze data without having to know programming or math.

AxCell Biosciences, a Cytogen Corp. subsidiary, has discovered a winning formula in IBM Corp.’s Intelligent Miner data mining engine and DB2 Information Integrator middleware. Not only have Big Blue’s technologies saved AxCell time and money, they’ve also served as a catalyst for the firm’s life-saving research into human cellular biology.

According to Lubing Lian, bioinformatics manager, AxCell had two requirements. First, it needed a technology that would enable users to query, access, and compare multiple data sources from a single, consistent interface. Second, it needed a GUI-based data mining tool that would be intelligible to users who don’t have backgrounds in programming, mathematics, or statistics. To remedy its data integration woes, AxCell tapped DB2 Information Integrator, a new middleware product that IBM Corp. touts as a solution for enterprise information integration (EII). On the data mining side, AxCell opted for Big Blue’s Intelligent Miner, a GUI-based tool that boasts integration with IBM’s EII product.

Although it’s a biomedical research firm, AxCell’s case is by no means unique. In fact, says Lian, the problems his company was wrestling with affect, to one degree or another, all enterprise IT organizations. And even though Lian is using Big Blue’s DB2 Information Integrator in a limited fashion (AxCell is almost entirely an Oracle shop, so data integration was a pretty straightforward affair), he says that IBM’s EII entry should more than deliver the goods in large enterprise IT organizations with a heterogeneous mix of data sources.

“This is definitely advantageous for a bigger company, especially one that has a bunch of different data sources,” he comments. “We have data sitting in an Oracle database across different systems, so the data variance problem isn’t too bad for us. But for a bigger company that wants to present all of this [data from different sources] in one consistent layer, it is very advantageous.”

IBM promises that DB2 Information Integrator facilitates access to data from relational database management systems (RDBMS), flat files, and unstructured sources. From the user perspective, Lian says, DB2 Information Integrator presents a single, virtualized view of all of the data in an enterprise. “From this layer, you see all of the data from one layer, so you don’t have to worry about where you’re running your query, because all of those databases have their proprietary features,” he comments.

Big Blue’s EII product is an important tool, Lian allows, but it was Intelligent Miner that sealed the deal. This data mining tool gave AxCell a GUI-based environment in which even non-technical users could feel at home. As a bonus, its integration with DB2 Information Integrator meant that users could transparently run data mining algorithms against all of the information in AxCell’s environment. “The difficult part for data mining, first of all, is to get data ready into the right format,” explains Lian. “[DB2 Information Integrator] helps with that because it takes a variety of different data sources and almost lets you see a virtualized database, and from there you can mine it and see the results.”

Here again, Lian says, Intelligent Miner distinguished itself, especially when IBM released a version 8 upgrade last summer. The revamped product shipped with new GUI tools that made it more accessible for non-technical users. “You have to be able to see the data before you can mine it, and certainly when you’re trying to understand the results,” he comments. “With the Intelligent Miner, the main advantage is that it’s graphical-user-based, so the user doesn’t have to have any programming background, or any professional statistics background or mathematics background to use it.”

In particular, Lian says that Big Blue improved the way data mining results are presented in Intelligent Miner 8. “The new version came out with some big old graphical tools to help users interpret the mining results, and that helps a lot in the interpretation process.”

For AxCell’s purposes, Intelligent Miner 8’s biggest selling point, by far, is a GUI-based environment that lets users initiate data mining operations without coding complex algorithms.

“This sort of thing requires a programming background, to have a pretty in-depth statistics background, and although we do have people here who can do it, you can imagine that with a graphical interface, more people without the really extensive programming or mathematics background can do it,” Lian points out, noting that Intelligent Miner ships with canned data mining algorithms that are designed for the life sciences, along with other vertical industries.

Prior to its combined Intelligent Miner/DB2 Information Integrator solution, AxCell researchers had to program custom, in-house data mining algorithms in a variety of different programming languages. Because it has reduced, and in some cases eliminated, this need, Lian says, Intelligent Miner 8 has democratized the data mining process at AxCell. “The biologists in the lab…can use it without having to know the programming or mathematics,” he concludes.

Although AxCell has been live with its implementation for just a few months, Lian confirms that the solution has already borne fruit. “As far as bioinformatics is concerned, we can now use mining to do electronic prediction about what’s worked in the lab, which has helped increase the success rate in the lab.”

About the Author

Stephen Swoyer is a Nashville, TN-based freelance journalist who writes about technology.