In-Depth

Solid State Disk: A Business Case Develops in Data Warehousing

Mention solid state disk - disk drives composed of memory chips rather than mechanical platters and read/write heads - and many information technology practitioners will look at you as though you've just awakened from a 20-year cryogenic freeze. To those who remember them, solid state disks (SSDs) are often regarded as a piece of computing memorabilia - like the dial telephone, the modem coupler or the Intel 8086 processor.

In the early '80s, "virtual disks" (RAM disks) could be configured on desktop PCs using utilities supplied with the Microsoft and IBM PC operating systems. By allocating a portion of system RAM to perform hard disk-like functions, users gained tangible input/output speed improvements on their doggedly slow 4.77 MHz PCs. In some respects, these virtual disks were the forerunners of modern SSDs.

Discussions of virtual disk technology largely disappeared over the years as the processor and backplane speeds of PCs and servers improved, hard disk capacities grew and costs dropped, and the market became more comfortable with conventional hard disks. SSDs continued to find a niche, however, in telecommunications switching systems, where their speed could be leveraged to store call accounting data in real time, until it could be written to permanent storage on hard disk.

Today, there are signs that SSD technology is on the verge of a renaissance, according to Dennis Waid, President of Peripheral Research (Santa Barbara, Calif.). He notes that companies are increasing storage in their enterprise systems at a rate of 50 percent or more annually as large-scale, multi-terabyte databases, providing online access to stored data, become increasingly commonplace. In addition, the largest databases of them all - those comprising the corporate data warehouse - are evolving into real-time decision support platforms.

As storage-dependent applications proliferate, says Waid, companies are beginning to encounter performance limitations in mechanical disk-based storage. Given the falling cost of memory and the capability of SSD to speed access to data, Waid argues, the business case for SSD is becoming more compelling: "The price of memory for solid state disk and caching applications was about $200 per megabyte in the early 1980s. Today, it is about $16 to $18 per megabyte, dropping from $50 per megabyte in just the last year. While that is substantially more expensive than the pennies per megabyte for hard disk, companies are beginning to find two important applications for SSD technology that offset the cost."
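The cost gap Waid describes can be made concrete with a quick back-of-the-envelope calculation. The SSD figure below comes from his quote; the hard disk figure is an assumed value for "pennies per megabyte," included only for illustration:

```python
# Back-of-the-envelope media cost comparison using the price points quoted above.
SSD_PRICE_PER_MB = 17.0    # midpoint of the quoted $16-$18 per megabyte
HDD_PRICE_PER_MB = 0.05    # assumption: "pennies per megabyte" ~ 5 cents

def storage_cost(megabytes, price_per_mb):
    """Raw media cost for a given capacity."""
    return megabytes * price_per_mb

capacity_mb = 1024  # one gigabyte
ssd_cost = storage_cost(capacity_mb, SSD_PRICE_PER_MB)
hdd_cost = storage_cost(capacity_mb, HDD_PRICE_PER_MB)

print(f"1 GB on SSD: ${ssd_cost:,.0f}")        # roughly $17,408
print(f"1 GB on hard disk: ${hdd_cost:,.2f}")  # roughly $51.20
print(f"SSD premium: {ssd_cost / hdd_cost:.0f}x")
```

The arithmetic underlines Waid's point: at hundreds of times the per-megabyte price, SSD only pays off when a small, carefully chosen subset of data is placed on it.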

Waid says SSDs can be used to speed up disk subsystems by providing a staging or caching area for data prior to its permanent write to mechanical disk: "That is a very popular application. Another application is in databases and data warehousing. A database that is designed to utilize SSD could use solid state disk to store indexes and other frequently accessed files. Storing indexes on the memory-based disk can speed the processes for locating data that is physically stored on the mechanical disks. That can improve overall performance."
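The logic behind storing indexes on SSD can be sketched with a simple latency model. All figures below are illustrative assumptions, not measurements from the article: a typical lookup costs one index access plus one data access, so moving only the small index to fast storage removes roughly half the seek time without buying SSD capacity for the bulk data.

```python
# Illustrative latency model for index-on-SSD placement.
# Assumed access times: ~10 ms per mechanical disk seek,
# ~0.1 ms per SSD access (order-of-magnitude assumptions).
HDD_ACCESS_MS = 10.0
SSD_ACCESS_MS = 0.1

def query_latency_ms(index_access_ms, data_access_ms, index_lookups=1):
    """One query = index lookup(s) to locate the row, then one data read."""
    return index_lookups * index_access_ms + data_access_ms

all_on_hdd = query_latency_ms(HDD_ACCESS_MS, HDD_ACCESS_MS)
index_on_ssd = query_latency_ms(SSD_ACCESS_MS, HDD_ACCESS_MS)

print(f"index + data on hard disk: {all_on_hdd:.1f} ms per lookup")
print(f"index on SSD, data on hard disk: {index_on_ssd:.1f} ms per lookup")
```

Under these assumptions, relocating only the index nearly halves lookup latency, which is the "overall performance" improvement Waid describes.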

Debate Lingers

Peripheral Research's projections are music to the ears of Quantum Corporation (Milpitas, Calif.) SSD Product Line Manager, Tom Fisher and Database Excelleration Systems (DES, Santa Clara, Calif.) President and CEO, Gene Bowles. Together with Imperial Technology, Inc. (El Segundo, Calif.), Quantum and DES products account for the majority of the estimated 4,100 SSD units shipped in 1997, totaling approximately $61.6 million in revenues for the companies. The three companies are also expected to divide the lion's share of the $119.4 million in revenues from shipments of 28,400 units anticipated by Waid for the year 2001.

While vendors are encouraged by analyst projections, most concede that substantial work will need to be done to educate consumers before anticipated market growth will be realized. They insist that opinions hostile to SSD, such as those from Microsoft NT Server Product Manager Tom Kreyche, are based on outdated and erroneous information. For SSD technology to succeed, "common wisdom" criticisms of the technology - that it is overly expensive and contributes little to improved application performance - will need to be offset by a fact-based business case analysis.

In a recent e-mail exchange, Microsoft's Kreyche argued that, "Solid state disks are of little or no value to virtually all database applications. The obvious reason is that SSD is expensive, while hard drives and array controllers are cheap and plentiful (not to mention the huge amount of system memory that you can cram on today's servers for data caching). The significant reduction in latency from using SSD is of little value in terms of database performance with the exception of the log devices. And you can get the same reductions in latency of the log drive by simply using a battery backed up write caching controller. We've worked with several of them here in our labs on occasion and there is no issue with them working with Windows NT as they are just seen by the OS as a standard disk volume. Despite our experimentation with them here, we found little use for SSD from a performance standpoint over the years."

Kreyche goes on to cite a several-year-old test performed by Compaq Computer Corporation in which the company tried and failed "to find measurable performance gains in a variety of database applications using SSD [products from DES] on Compaq servers operating the NT Server OS. In the end, they found them to be of little or no value (and it was not for a lack of looking)."

Quantum's Fisher bristles at Kreyche's comments: "I don't want to get into a shouting contest with Microsoft, but I think that his facts are dated." Fisher is quick to point out that testing performed using SSDs a year or more ago does not reflect the advances that have been made in the drives themselves, which now fully support Fast-Wide and Ultra SCSI interfaces and faster I/O performance.

He further explains that SSD and memory caches are not the same, though many are confused on this point: "SSD and caching memory use the same components, but SSD is a memory peripheral that resides on the I/O bus of a system. This distinction is important in the context of a distributed operating environment because an SSD is treated as a disk drive and can be shared among servers. You can't do that with memory."

Fisher goes on to suggest that the full potential of SSDs may be best exploited by applications running in a UNIX environment: "The majority of large-scale multi-user distributed database systems are currently hosted on UNIX platforms. By contrast, a small, single-threading, NT Server configuration might not show the full benefits of SSD."

DES CEO Gene Bowles agrees with this assessment. "It doesn't surprise me that the early thinking of NT Server product managers about SSDs is that they do not improve performance," Bowles says. "NT Servers are not usually tasked with challenging applications that can fully utilize SSD functionality. As the applications installed on NT scale upwards, they will begin to encounter I/O bottlenecks, just like what happened with high-end UNIX applications. SSDs provide an answer."

Bowles states that SSDs can be put to several uses within large-scale distributed environments. In the context of the database, they can be used to store "hot files," which he defines as "a small minority of database files that are accessed more frequently than other elements."

"There are three types of hot files," Bowles explains, "including temporary spaces for queries and sorts, transaction logs and index files." He is quick to add, "Some databases lend themselves more easily to SSD than others."
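Bowles's taxonomy implies a simple placement policy. The sketch below is hypothetical (the function, file-type names and the fit check are not from the article); it only illustrates the idea that hot files go to the small, expensive SSD while bulk data stays on mechanical disk:

```python
# Hypothetical placement policy based on the three hot-file types
# Bowles identifies: temp/sort space, transaction logs, and index files.
HOT_FILE_TYPES = {"temp", "log", "index"}

def place_file(file_type, size_mb, ssd_free_mb):
    """Return 'ssd' for hot files that fit in remaining SSD space, else 'hdd'."""
    if file_type in HOT_FILE_TYPES and size_mb <= ssd_free_mb:
        return "ssd"
    return "hdd"

print(place_file("index", 500, 980))   # hot file that fits -> 'ssd'
print(place_file("table", 5000, 980))  # bulk data stays on 'hdd'
```

The capacity check matters because, as Bowles notes, hot files are "a small minority of database files" - the policy only works when that minority actually fits on the SSD.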

NCR Corporation Chief Architect for Storage Mark Morris agrees and suggests that SSDs offer little value to a database that is not designed specifically to use them: "A few of our customers have tried SSD to resolve specific problems. It is still relatively expensive, plus the data volumes are so large that Winchester disk drives seem like the only solution right now. Even the indexes are large. The primary index structure in our Teradata RDBMS-based data warehouse is stored logically, not physically. Typically, index files are dispersed across the data warehouse storage platform. You can't simply decide to store the index on an SSD, so the benefits of SSD just are not there for us."

A Case Study

Fortunately for Debra Chrapaty, Senior Vice President and Chief Information Officer with E*TRADE Technologies (Palo Alto, Calif.), an Internet-based investment firm, the architecture of E*TRADE's database is amenable to SSD performance improvements. E*TRADE processes "tens of thousands of trades submitted by more than ten thousand clients daily."

Chrapaty explains the need for speed at E*TRADE: "Often 20,000 to 30,000 log items (trade orders), submitted overnight, need to be traded at market open. With online trading, execution time is everything. It impacts customer pricing and represents the difference between profit and loss. Trade price quotes are guaranteed to the customer, so trades must execute within a couple of seconds to ensure that the quoted price is used. Solid state disks are used to give us a performance edge, and have helped us move from seventh to second in the Gomez Advisor Ratings (www.gomezadvisors.com, Boston) of online investment firms."

Chrapaty states that E*TRADE's system platform has been architected to provide redundancy and 75 percent excess capacity as a safeguard against sudden surges in trade orders, "We use 15 Sun Solaris E/4000 servers on the front end of our system, [tasked] as Web servers and to provide session layer management. The back-end system is based on DEC Alpha servers and handles equity, option and margin databases [that record and process trade orders and customer accounts]. The back end system is mirrored between two locations, each consisting of nine Alphas. [BEA] Tuxedo middleware connects the front end with back end platforms."

Chrapaty emphasizes that the back end systems are "high-volume transaction systems with high disk I/O." With the assistance of integrator Headlands Associates (San Francisco), E*TRADE is constantly deploying new technology to improve the performance of its massive database storage subsystem.

According to Headlands Associates Director, Carl Wolfston, "The back-end systems originally utilized multiple RAID 5 disk arrays from Digital's StorageWorks product family, totaling more than a terabyte of online storage. Each Alpha server was equipped with three or four ranks of five disks each. At first, E*TRADE had been relying on the RAID controllers to speed look-ups in their database, but the controllers were beginning to bottleneck. We recommended using SSDs to hold RDB/RMS database indexes and to alleviate I/O bottlenecks. They have since deployed in excess of 30 Digital EZ69 SSDs, which are actually 980MB Quantum Fast Wide SCSI units. The SSDs can do four to seven thousand I/Os per second."
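Wolfston's figures can be sanity-checked with quick arithmetic. The unit count, capacity and SSD I/O rates come from his quote; the per-spindle rate for a mechanical drive is an assumed period-typical value added for comparison:

```python
# Aggregate capacity and throughput of the deployment described above.
SSD_UNITS = 30
SSD_CAPACITY_MB = 980
SSD_IOPS_LOW, SSD_IOPS_HIGH = 4000, 7000
HDD_IOPS_PER_SPINDLE = 100   # assumption: typical random-I/O rate per mechanical disk

total_capacity_gb = SSD_UNITS * SSD_CAPACITY_MB / 1024
print(f"Total SSD capacity: ~{total_capacity_gb:.0f} GB for indexes")

# How many mechanical spindles would one SSD's random-I/O rate replace?
print(f"One SSD matches roughly "
      f"{SSD_IOPS_LOW // HDD_IOPS_PER_SPINDLE}-"
      f"{SSD_IOPS_HIGH // HDD_IOPS_PER_SPINDLE} disk spindles")
```

Under these assumptions, roughly 29 GB of index storage delivers the random-I/O rate of dozens of mechanical spindles per unit, which is why the SSDs relieved the RAID controller bottleneck.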

"It was like pushing storage away from the server," Chrapaty recalls, "plus, the SSDs let us maximize our total cost of ownership for the existing platform by reducing the need to deploy extra servers to improve the performance of more and more storage."

To eke out even more performance from the E*TRADE systems, says Chrapaty, the company is currently migrating from "a classic" Alpha RMS/RDB database to a new multi-tier Oracle database. She reports that the new database deployment is being designed specifically to capitalize on the performance-enhancing capabilities of SSD.

SSD: Not for Everyone

Gene Bowles sees applications such as E*TRADE's investment database as ideal opportunities for SSD. He believes that the technology will also experience broad penetration into the data warehousing market as data warehouses evolve from their current state into real-time, business critical, decision support systems: "Right now, people doing data warehousing may not see difficulties in the I/O processing of their systems. You put in a query and pick up the answer tomorrow. But the technology is transitioning from ivory tower to real-time decision support tool. When that happens, they will need to identify and resolve all bottlenecks in their systems to obtain maximum performance."

Bowles adds that not all database performance issues are I/O-based. Poor performance may be linked to network issues or to server processor and backplane architecture.

The Bottom Line

Quantum's Fisher emphasizes that SSD is not a replacement for conventional magnetic hard disk technology, but a complement to it under certain circumstances: "SSD is a technology that can provide substantial performance improvements when other technologies reach their limits. For example, there are practical limits to disk caching, whether you use system memory or special cache memory."

If analysts are correct, the market for SSD technology will grow by several orders of magnitude over the next five years. In the meantime, it will be up to SSD vendors and their integrators to identify how the solid state disk - an archaic technology in the minds of many IT professionals - offers value within the context of modern enterprise computing. Failing to do so may result in SSD remaining what DES President Bowles calls "the best kept secret in Silicon Valley."


ABOUT THE AUTHOR:
Jon William Toigo is an independent writer and consultant specializing in business automation solutions. He is the author of eight books and more than 500 articles on data processing and internetworking. He can be reached at (813) 736-5367, or via the Internet at jtoigo@intnet.net.