Data Warehouse Appliances: Cost-Effective and Growing

The appliance market is maturing, and as more companies explore this relatively new tool, the storage capacity of appliance solutions keeps growing.

While general-purpose relational vendors continue to highlight their capabilities as suitable for both operational and analytical environments, vendors like Netezza and DATAllegro are concentrating on the latter by offering specialized data warehouse appliances. These appliances consist of pre-integrated hardware, database software, and storage, and are optimized for very rapid query and retrieval.

When first introduced, these appliances were positioned for use as low-cost, quick-to-implement departmental and/or functional data marts that dramatically reduced the response time (sometimes from days to minutes) for traditionally long-running queries. However, as their products have evolved, the data warehouse appliance vendors are now targeting enterprise data warehouses as well.

For example, Netezza and DATAllegro initially marketed their appliances as low-cost line-of-business or departmental solutions but have now expanded the storage capacity of their appliances (and their marketing messages) to position themselves as suitable for enterprise data warehouses. Teradata, a division of NCR, at first refused to acknowledge data warehouse appliance vendors as its competitors. Teradata positions its high-performance Teradata Warehouse (at least when hosted on proprietary NCR hardware) as a vehicle for consolidating multiple, and perhaps uncoordinated, data marts.

Ironically, as Teradata was one of the few specialized database machine vendors to achieve commercial success, its very existence lends credence to the data warehouse appliance concept. And while Teradata has proven itself in a multitude of very high-end implementations, it is certainly not inexpensive. Furthermore, while data warehouse appliances were first positioned by the major database vendors as a niche market offering, IBM’s DB2-based Data Warehousing Balanced Configuration Units, an integrated bundling of IBM hardware and software technology, are, for all practical purposes, data warehouse appliances.

Netezza and DATAllegro are competing at much lower price points as they continue to increase the storage capacity of their appliances, which is now in the 100-terabyte range. When comparing their products to data warehouse implementations that use traditional vendors such as IBM, Oracle, and Teradata, the appliance vendors generally speak about orders-of-magnitude performance improvements at greatly reduced costs. Both Netezza and DATAllegro can link multiple appliances together and thus achieve even greater scalability. These claims are especially compelling as Netezza now has many marquee customers willing to serve as references, while DATAllegro is finally starting to publicize some of its reference accounts as well.

In addition to their impressive performance stories and price/performance ratios, data warehouse appliances bring other advantages: quick implementation, ease of use, and reduced DBA support requirements. Some of this advantage can be chalked up to the historic use of appliances for specific purposes such as analyzing massive volumes of similar data, including call detail records, customer purchase data, and click-stream logs. As most data warehouse implementation teams know, determining the data needed to analyze a single subject area or a small set of subject areas is much easier than designing a massive multi-subject data warehouse that meets the needs of an entire enterprise (sometimes referred to as an attempt to “boil the ocean”).

Although data warehouse appliances were once criticized for their lack of data integration and business intelligence software, Netezza, for example, has partnerships with most of the leading players, the majority of whom have certified or validated their products for use with its appliances. Appliances were also criticized for their proprietary architectures and “single purpose” nature. These arguments are something of a red herring, because although data warehouse appliances will likely never be used for OLTP systems, neither will Teradata or Sybase IQ.

Organizations seeking a powerful analysis platform should not let these criticisms deter them, especially as the cost of entry for a data warehouse appliance is relatively low. That said, organizations evaluating data warehouse appliances should check customer references and ask whether these references have acquired additional appliances and/or expanded their initial configurations. A degree of skepticism is healthy, especially when dealing with new entrants seeking to emulate Netezza’s success; make sure vendor references have purchased their appliances and are not just deploying them in a proof-of-concept trial. In general, the credibility of appliances would be further enhanced if one of the vendors would submit a TPC-H benchmark.

The bottom line is that data warehouse appliances have gone from a novelty item to a proven offering. And while it may be premature to consider them for use as a company’s all-encompassing enterprise data warehouse, they are certainly suitable for analyzing massive data volumes across a well-defined subject area. Given their economics, certainly consider using them as part of your organization’s overall data warehouse architecture.

About the Author

Michael A. Schiff is a principal consultant for MAS Strategies.
