Drill Down: From the Experts...
AutoZone is a leading auto parts retailer with around 2,700 stores in 39 states. The company is also a pioneer in using data warehouse technology to improve its inventory management and business decision making. Four years ago, the company moved to an IBM RS/6000 SP platform environment to host its corporate database and data warehouse. This year, the company decided to re-architect both the software and the hardware for its database and data warehouse. To that end, it engaged in an extensive research and development process with its consultant, Richard Winter of Winter Corporation, at the IBM Teraplex Integration Center in Poughkeepsie, N.Y.
I recently talked to Doyle Sanders, Director of Data Services and Customer Satisfaction; Michael Embry, lead analyst for data warehousing; and David Whittenberg, lead analyst for decision support services at AutoZone, about their experiences.
According to Sanders, AutoZone undertook the project for several reasons. First, he noted that the company was using technology that was four years old. "We wanted to upgrade for performance and capability," he said. Second, the company planned to dramatically increase the size of the data warehouse from around 300 gigabytes to three or four terabytes. And while the company had already committed to using SP 2 hardware, Sanders wanted to make sure that the upgrade would be conducted correctly and would fully exploit the new hardware.
AutoZone intended to use IBM DB2 Universal Database 5.2, and Sanders and his team were particularly interested in a new Automated Summary Table (AST) feature in the database. The feature enables frequently used query elements to be pre-calculated and stored by the database engine, under the control of the database administrator. DB2 Universal Database then uses the pre-calculated results automatically wherever they will accelerate the processing of queries. The feature is supposed to shield the end user from the entire process. Users need not know where queries are directed. All they should see is an improved response time.
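The idea behind a summary table can be sketched in a few lines. The example below is only an illustration, not AutoZone's design or DB2's implementation: it uses Python's built-in sqlite3 module, and the table names (sales, sales_by_store_month), the sample rows and the helper functions are all hypothetical. SQLite has no automatic query rerouting, so the sketch queries the summary table explicitly, whereas the AST feature described above is meant to make that choice inside the optimizer.

```python
import sqlite3


def build_demo():
    """Create an in-memory database with a tiny, made-up sales detail table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (store_id INTEGER, sale_month TEXT, amount REAL)"
    )
    rows = [
        (1, "1999-01", 100.0),
        (1, "1999-01", 50.0),
        (1, "1999-02", 75.0),
        (2, "1999-01", 20.0),
        (2, "1999-02", 30.0),
    ]
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    return conn


def refresh_summary(conn):
    """Rebuild the pre-calculated totals (a full refresh, run by the DBA)."""
    conn.execute("DROP TABLE IF EXISTS sales_by_store_month")
    conn.execute(
        "CREATE TABLE sales_by_store_month AS "
        "SELECT store_id, sale_month, SUM(amount) AS total, COUNT(*) AS n "
        "FROM sales GROUP BY store_id, sale_month"
    )


def monthly_total_from_detail(conn, month):
    """Answer the question by scanning every detail row."""
    cur = conn.execute(
        "SELECT SUM(amount) FROM sales WHERE sale_month = ?", (month,)
    )
    return cur.fetchone()[0]


def monthly_total_from_summary(conn, month):
    """Answer the same question from the much smaller summary table."""
    cur = conn.execute(
        "SELECT SUM(total) FROM sales_by_store_month WHERE sale_month = ?",
        (month,),
    )
    return cur.fetchone()[0]


if __name__ == "__main__":
    conn = build_demo()
    refresh_summary(conn)
    print(monthly_total_from_detail(conn, "1999-01"))
    print(monthly_total_from_summary(conn, "1999-01"))
```

Both functions return the same total, but the second reads one pre-aggregated row per store and month instead of every sales record. On a warehouse with billions of detail rows, that difference is where the time savings would come from, provided the summary can be refreshed within the available window.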
Although the AST feature looked very attractive, the AutoZone team had several questions. Was it appropriate for AutoZone’s purposes? Could it refresh in the available time window? Would it scale adequately? Moreover, the AutoZone team wanted to understand how to design appropriate ASTs.
To answer those questions, Embry, Whittenberg, their senior database administrator and consultants from Winter Corp. spent three weeks on-site at the Teraplex Integration Center. There, they worked with IBM consultants, as well as researchers at the Almaden Research Laboratory in California and the DB2 development team in Toronto, to conduct a proof of concept for their approach. At the center, they had access to a 76-node RS/6000 SP and 7 terabytes of storage.
The testing ultimately paid off handsomely. At the Teraplex Center, the AutoZone team ran a test involving 3.5 years of sales data, or 1.5 billion records. A query that once took several hours to complete was finished in minutes.
The time savings, however, were not the most impressive part of the technology, Whittenberg observed. "I was impressed that the optimizer is intelligent enough to reroute the query to the AST without me doing it." Moreover, he can change ASTs without having to rewrite the query.
And the entire process truly is transparent to the user. At AutoZone, end users generally create their queries using tools from either Cognos or SAS.
Since the proof of concept, AutoZone has begun to implement the new data warehouse. The team is designing ASTs and, as it builds confidence in the data warehouse architecture, hopes to add different kinds of data that will be used by different users.
In fact, from Doyle Sanders’ perspective, usage will increase only if users see added value in the system. With a long history of data warehousing, AutoZone has a small cadre of experienced users and many legacy queries already developed. If the new system does not help users to perform their jobs better, they will continue to rely on their current methods, Sanders noted.
In addition to assistance in implementation, IBM has also invested heavily in basic research in data warehousing and data analysis technology. Earlier this summer, it launched the Deep Computing Institute, a $29 million collaborative research initiative designed to bring together IBM researchers, academics and people in industry to study problems that require massive computation and sophisticated software algorithms.
A typical example in data mining and data warehousing, according to institute Director Dr. William Pulleyblank, would be to combine weather forecasts with information about the power grid to allow utilities to forecast the next day’s energy demands. With the deregulation of utilities, the information would guide decisions about how much energy to generate and whether to participate in the spot energy market. "It is hard enough now just to get the data in," Pulleyblank observed. To be able to combine huge data pools for real-time decision support would represent a major leap forward.
About the Author: Elliot King is an Associate Professor of Communications at Loyola College in Maryland. He can be reached at (410) 356-3943, or by e-mail at eking@loyolanet.campuscwix.net.