Teradata Revamps Data Mining Toolset
Warehouse Miner 5 looks like Teradata’s most ambitious release to date.
Users of Teradata’s data mining tools got an early Christmas present of sorts: last month, just ahead of the 25th, the company announced general availability of version 5 of its Teradata Warehouse Miner. With a bevy of improvements, including enhancements to its data profiling, dataset generation, model management, and model import capabilities, Warehouse Miner 5 looks like Teradata’s most ambitious release to date.
That raises an interesting question, at least in view of Teradata’s partnerships with data mining powerhouses SAS Institute Inc. and SPSS Inc.: if Teradata wants to promote Warehouse Miner 5 as a best-of-breed data mining toolset, doesn’t it risk upsetting its vendor partners?
Not necessarily, say Teradata officials, who stress that when used in tandem with third-party data mining tools, Warehouse Miner 5 and its Teradata Warehouse underpinnings come into their own.
“The combination of the Teradata in-database performance with other vendors’ data-mining tools can speed model development as much as two to three times faster and runtime of data-mining models up to 25 times faster,” said Randy Lea, vice president for Teradata products and services, in a statement. “This in-database performance is extremely valuable to our customers and partners because they use it to gain more insight, through more analytic models, executed against more customers’ data, across the organization in real time.”
There’s a lot for data mining aficionados to like in Warehouse Miner 5, says James Kobielus, a principal analyst for data management with consultancy Current Analysis. For one thing, Kobielus notes, Teradata has enhanced its Profiler tool with drill-down functionality, which enables users to retrieve detailed data for activities such as record-level analysis, and also facilitates deep investigation and resolution of data-quality issues.
Elsewhere, Kobielus notes, Teradata has tweaked the data pre-processing capabilities of its Analytic Data Set (ADS) Generator, adding new variable creation, variable dimensioning, and ADS-building features. Finally, Teradata Model Manager received a thorough revamping of its own: it features model scoring, model import, and model management improvements.
Teradata officials might make much of Warehouse Miner 5’s complementary nature, at least vis-à-vis third-party (and best-of-breed) data mining tools, but Kobielus says it’s also an important competitive deliverable that helps Teradata better measure up to SAS, SPSS, and others.
“Teradata’s move was a necessity for the vendor to compete with SAS and SPSS in providing tools for life-cycle management, scoring, and optimization of predictive analytics and data mining models,” Kobielus comments. For example, he points out, Warehouse Miner 5 now supports a “full model-governance life cycle,” which lets users perform rich analytic data profiling and exploration; dataset development, preprocessing, loading, and linkage; and model development, scoring, publishing, scheduling, promotion, and version control.
It also boasts a substantially revamped model management tool, which Kobielus says can help customers manage multi-vendor modeling environments that consume models created in third-party data-mining tools from vendors such as SAS, SPSS, Fair Isaac, and KXEN. Add to that a dataset generation tool that helps business users cut down on analytic preprocessing times and an improved data profiling tool (which helps developers ensure that they’re linking to high-quality datasets), and you’ve got an Rx for a data mining competitor.
That’s the good. The bad, or the not so good, Kobielus says, is that Warehouse Miner 5 slightly complicates Teradata’s competitive outlook. “Teradata’s model management tool competes against those of two of its partners [SAS and SPSS], both of which are best-of-breed in data mining and have recently enhanced their respective model-management offerings,” he points out.
Veteran industry watcher Mike Schiff, a principal with data warehousing consultancy MAS Strategies, says that whatever the potential for competitive fractiousness (SAS, he notes, aggressively pushes its own model management tool), Teradata’s ‘leave-your-mining-to-Teradata-Warehouse’ strategy makes a lot of sense. “They want to sell the Teradata data warehouse, which in theory means they will sell more hardware. So what they’ve said is, it’s not a question of just picking our tool, but that they have some data profiling and some analysis preparation [technology] that’ll work with anybody’s stuff. So you can use Warehouse Miner to do ADS generation, which is basically to create the analytic data set,” Schiff comments. “The idea is, if they’re already a Teradata user, Teradata is like, hey, another reason to use the Teradata Warehouse!”
Schiff, for his part, says this development is especially credible coming from a vendor such as Teradata: “They were one of the first database vendors to take data mining seriously. Back in the days when Oracle was like, it’s a niche thing, we don’t care about it, [Teradata] emphasized just how important they thought [data mining] was.”
About the Author
Stephen Swoyer is a Nashville, TN-based freelance journalist who writes about technology.