Want a Data Warehouse? Test, Test, Test, Experts Advise

Data warehouses combine server and storage hardware, database software and analysis tools to function as central collection points for information about a company's customers, products and transactions. The demand for data warehousing is being driven by marketers, salespeople, financial analysts and customer service managers as they assess the value gained from analysis, or "mining," performed on the data in the warehouse.

These results have come to be known as Business Intelligence: understanding what your customers want, what's happening in your markets, how you're making or losing money and, most importantly, which customers, products and channels are profitable.

More than one-third of the data warehouses created before the end of the year will grow to exceed one trillion bytes (one terabyte) of data, the fastest-growing part of the market. Such data warehouses require large amounts of capital for hardware, software and services.

Richard Winter, president of Winter Corporation Consulting Company (Waltham, Mass.), recommends that those businesses interested in building a data warehouse first conduct large-scale tests on a simulated system using real data. This approach yields three main benefits:

  • Reduce technical and financial risk
  • Recognize business results sooner through faster implementation of large-scale business intelligence solutions
  • Gain valuable business intelligence skills and knowledge

Says Winter, "The AS/400 is rising in popularity as a data warehouse server; there are many areas where the AS/400 is the principal repository of the critical business information. Through advances in the AS/400 product line, it has become a credible platform for terabyte-scale data warehousing."

Many industries with large transaction volumes, such as chain retail, retail financial services, telecommunications, travel, hospitality, transportation, distribution and healthcare, already have, or are currently implementing, data warehouses. Winter believes that the next wave will be government and manufacturing.

"Since data volumes have grown at an extraordinary rate over the last several years, when companies start to implement a data warehouse solution, they often find that the volume of stored data they're dealing with is much larger than anything they've worked with before. The challenge of data warehousing is not just to store the data, but to be able to extract business value from it," commented Winter. "If you don't test first, there's a danger of making a big investment in hardware, software and building the interface that may result in poor performance and not justify the time and money involved."

In addition, "It's important to consider the end users when implementing a data warehouse," adds Mike Schiff, director of data warehousing strategies at Current Analysis Inc. (Sterling, Va.). "Without consulting those who will actually use the data, you may deliver a solution that you think they need, while missing what they actually do need."

Winter summarized problems that may occur as a result of not testing before implementation:

  • Queries that should take a few minutes may take days or weeks
  • Data needs to be summarized each night, but the summarization takes a week
  • The database cannot be built on your planned platform because it would take too long to implement
  • With your given workload, the system doesn't give you the means to manage it effectively
  • If large jobs are running, small jobs won't get done

"The most important thing to remember is to make large-scale, realistic tests of the performance of your data warehouse early in the project," says Winter. "It's difficult for companies to appreciate this because performance problems surface later in the implementation. When you're in the early stages of your project, you can see lots of things that require attention right now and testing is not high on the list of priorities. However, the most common technical reasons for failure are problems with performance and scalability."

"Correctly estimating the amount of data is also critical," Schiff continued. "A month's worth of data is not going to be one-twelfth of the initial load; it should be greater. You have to plan for over-success: as people start using the data warehouse, it will gain credibility and additional data."
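Schiff's point about estimating data volume can be illustrated with some simple arithmetic. The Python sketch below is only an illustration under stated assumptions: the function name, the compound monthly growth model and the sample figures are hypothetical and do not come from Schiff or Winter. It compares a naive plan, in which each new month adds exactly the historical monthly average, with a plan in which each new month's load is somewhat larger than the last.

```python
def project_warehouse_size(initial_load_gb, history_months,
                           monthly_growth_rate, horizon_months):
    """Return the projected total size in GB at the end of each future month.

    Assumption (hypothetical model): the initial load covers `history_months`
    of data, and each new month's load is `monthly_growth_rate` larger than
    the previous month's, starting from the historical monthly average.
    """
    # Naive baseline: the average month contained in the initial load.
    monthly_gb = initial_load_gb / history_months
    total_gb = initial_load_gb
    sizes = []
    for _ in range(horizon_months):
        # Each new month is larger than the one before it.
        monthly_gb *= 1 + monthly_growth_rate
        total_gb += monthly_gb
        sizes.append(total_gb)
    return sizes


if __name__ == "__main__":
    # Illustrative figures only: a 1,200 GB initial load covering 12 months,
    # 5% month-over-month growth, planned over a three-year horizon.
    initial_gb, months_loaded, horizon = 1200.0, 12, 36
    projected = project_warehouse_size(initial_gb, months_loaded, 0.05, horizon)
    naive = initial_gb + horizon * (initial_gb / months_loaded)
    print(f"Naive three-year estimate:  {naive:,.0f} GB")
    print(f"Growth-adjusted estimate:   {projected[-1]:,.0f} GB")
```

Even a modest growth rate makes the growth-adjusted figure diverge sharply from the naive one over a multi-year horizon, which is the kind of gap a platform choice has to absorb.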

Winter Corporation, founded in 1992 by Richard Winter, a specialist in database technology, focuses on database scalability.

Recommended Steps Before Implementing a Data Warehouse

1. Carefully weigh all factors before selecting a platform (combination of hardware, software and database). You have to look several years ahead, because you can't afford to be in a position where you have outgrown your choice.

2. Select an architecture that can grow over a reasonable amount of time, such as three years. This should be reviewed annually to ensure it still meets critical business requirements.

3. Test your selection with a realistic set of data. Make sure your data simulates not only today's business scenarios, but also includes future needs and requirements. Some queries are very complex and present a performance challenge, which can be very significant if not tested during the initial stages of implementation.

4. Ensure that your system can meet your performance, availability and scalability requirements.
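The testing described in step 3 can be sketched as a small timing harness. The Python example below is a minimal sketch, not a recommended benchmark: it uses the standard-library sqlite3 module with an in-memory database purely as a stand-in for the real platform, and the table layout, function names, row counts and time budget are all hypothetical. A real test would run the actual queries against the candidate platform at realistic data volumes.

```python
import random
import sqlite3
import time


def build_test_db(num_rows, seed=0):
    """Create an in-memory stand-in database filled with synthetic sales rows."""
    rng = random.Random(seed)
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (store_id INTEGER, product_id INTEGER, amount REAL)"
    )
    rows = (
        (rng.randrange(100), rng.randrange(1000), rng.uniform(1, 500))
        for _ in range(num_rows)
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def time_query(conn, sql):
    """Run one query and return (elapsed seconds, result rows)."""
    start = time.perf_counter()
    rows = conn.execute(sql).fetchall()
    return time.perf_counter() - start, rows


def run_scaling_test(row_counts, sql, budget_seconds):
    """Time the same query at increasing data volumes against a time budget."""
    results = []
    for num_rows in row_counts:
        conn = build_test_db(num_rows)
        elapsed, _ = time_query(conn, sql)
        conn.close()
        # Record whether the query finished within the allowed time.
        results.append((num_rows, elapsed, elapsed <= budget_seconds))
    return results


if __name__ == "__main__":
    summary_sql = "SELECT store_id, SUM(amount) FROM sales GROUP BY store_id"
    for num_rows, elapsed, ok in run_scaling_test(
        [10_000, 100_000, 1_000_000], summary_sql, budget_seconds=1.0
    ):
        status = "within budget" if ok else "OVER BUDGET"
        print(f"{num_rows:>9,} rows: {elapsed:.3f} s ({status})")
```

Running the same query at several volumes shows how elapsed time grows with the data, so a query that will miss its window at production scale can be spotted before the platform is purchased.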
