Storage 2008: Flashy Storage

Learning the difference between single-level cell (SLC) and multi-level cell (MLC) Flash memory technologies opens the door to a better understanding of solid state drives.

The big news in late 2007 was that solid state drives (SSDs), which emulate rotating disk but are built from memory chips, would be usurping hard disks by the end of this year. Despite announcements from a few hardware vendors that they would be integrating memory-based storage in future products, all of the flashy talk produced little substance.

The combination of Flash memory and hard disks was driven initially by complaints about the tendency of the Microsoft Vista OS to drain laptop batteries with periodic logging and posting, even when the system was hibernating. Several drive manufacturers introduced hybrid drives, but there were comparatively few takers. The real culprit, to hear the drive manufacturers tell it (mostly off the record to avoid earning the ire of Redmond), was Microsoft's failure to deliver a software methodology for driving its log data to the memory components of the hybrid drive. Just having a layer of memory on the drive wasn't sufficient if data write operations weren't smart enough to take advantage of it.

The SSD phenomenon also appears to have been stalled by infighting between the proponents of Flash memory-based drives and those of more volatile and more expensive (but less wear-prone) DRAM-based units. Confusion was also generated by even the most rudimentary discussions of SSD. Flash memory called to mind small USB keys and tiny memory cards for devices such as digital cameras and MP3 players: certainly not an enterprise-capable technology, unless you were sticking them into routers that could be filled with data once and then just read over and over. A typical Flash memory chip could handle only about 100,000 writes before wearing out. That seemed to give the edge to DRAM-based SSD, which could support a much more robust write workload.

While this debate flourished, would-be users had to learn the difference between single-level cell (SLC) and multi-level cell (MLC) Flash memory technologies. SLC chips were subject to the memory wear limitations noted by the DRAM SSD contingent, but advances in MLC technology were enabling a new order of write endurance in Flash memory. At last report, MLC chips were capable of 500,000 writes. Moreover, according to marketers, when a Flash memory cell location quit accepting writes, the data already written to it was permanently recorded. That no data was lost was emphasized as a way to turn the main foible of Flash into a feature.

Although 500,000 writes may still seem paltry, MLC Flash drive advocates argued that, with some added RAID-like functionality, a modular replacement method, and some decent write operation monitoring, SSDs could be delivered to market at a low price compared to both DRAM-based SSD and "enterprise class" hard disk drives (aka dual-ported Fibre Channel, parallel SCSI, and SAS disks). The latter idea -- that of replacing expensive high-end disk with memory-based drives -- is where folks such as Hubbert Smith, director of enterprise storage marketing for Samsung Electronics, want to drive the discussion.

Smith wants to move the debate away from chip types and toward the more practical issues of storage cost, performance, and power consumption. He makes a good case, saying that IT managers should be considering the relative cost of disk and SSD based on both acquisition price and watts consumed -- which he views as a gauge of longer-term costs of ownership.

Smith began by pointing out that a significant gap exists in the area of metrics to aid storage consumers in making intelligent product choices or architectural decisions. He correctly observed that about the only measure presented to consumers by vendors of storage products is a basic I/O measurement covering read, write, and random accesses using large and small block data. These are somewhat useful at the drive level and less so at the array level, where system components tend to be "black boxed" to give the finished goods vendor a good show for his wares.

Efforts such as the Storage Performance Council lack universal participation among storage array vendors. In addition, vendors inhibit the free discussion of the costs and performance of storage solutions deployed by their customers by placing veritable gag orders into their warranty or maintenance agreements. They do so, they say, to prevent customers with special workload issues from circulating skewed performance reports.

Smith said that everyone already knows that memory-based storage blows the proverbial socks off of spinning rust from an I/O operations-per-second standpoint. He quoted numbers that rated SSD performance at about 700 I/Os per second versus the fastest 15K SAS drive at about 300 I/Os per second less. That's the old news.

What was new in Smith's analysis were four additional metrics: GB per dollar (capacity cost) and IOPS per dollar (performance cost), as well as GB per watt and IOPS per watt (cost of ownership). His analysis showed that Samsung's SSD delivered about 1,800 IOPS per watt while the closest SAS hard disk delivered less than 400. From a capacity perspective, the SSD provided about 51 GB per watt versus less than 10 GB per watt for the SAS drive. Outstripping all competitors in this analysis on a GB per watt basis was a terabyte-size SATA drive that provided 125 GB of capacity per watt.
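
The four ratios Smith describes are simple to compute once capacity, sustained IOPS, power draw, and acquisition price are known for a drive or array. A minimal sketch in Python follows. The `StorageConfig` class and its field names are illustrative, not anything Samsung published, and the sample device is a hypothetical placeholder chosen only to exercise the arithmetic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageConfig:
    """One drive or array. Class and field names are illustrative assumptions."""
    name: str
    capacity_gb: float  # usable capacity, in GB
    iops: float         # sustained I/O operations per second
    watts: float        # average power draw, in watts
    price_usd: float    # acquisition price, in dollars

    def gb_per_dollar(self) -> float:
        """Capacity cost: how much capacity each dollar buys."""
        return self.capacity_gb / self.price_usd

    def iops_per_dollar(self) -> float:
        """Performance cost: how much performance each dollar buys."""
        return self.iops / self.price_usd

    def gb_per_watt(self) -> float:
        """Ownership cost: capacity delivered per watt consumed."""
        return self.capacity_gb / self.watts

    def iops_per_watt(self) -> float:
        """Ownership cost: performance delivered per watt consumed."""
        return self.iops / self.watts


if __name__ == "__main__":
    # Hypothetical device; these are not measurements from the article.
    sample = StorageConfig("example drive", capacity_gb=100.0, iops=1000.0,
                           watts=2.0, price_usd=500.0)
    print(f"{sample.name}:")
    print(f"  GB per dollar:   {sample.gb_per_dollar():.2f}")
    print(f"  IOPS per dollar: {sample.iops_per_dollar():.2f}")
    print(f"  GB per watt:     {sample.gb_per_watt():.1f}")
    print(f"  IOPS per watt:   {sample.iops_per_watt():.1f}")
```

Keeping the four ratios as methods on a single record makes it easy to line up several candidate configurations side by side, which is the comparison Smith is arguing for.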

What these measurements reveal, in Smith's view, is that simple costs per GB and cost per IOPS are inadequate for building storage to meet company needs. He provided a comparison from an Internet software company to make his case clearly. Serving between 3000 and 5000 concurrent users of a database-driven application used for medical office scheduling, billing, and insurance claims filing, the customer wanted to analyze the value of an SSD-based approach. A leading vendor had proposed a storage infrastructure providing 8.6 TB of capacity in the form of 60 15K RPM Fibre Channel drives. Smith suggested an infrastructure providing the same capacity, but leveraging SSD technology exclusively (a "performance optimized" configuration), or in conjunction with 1TB SATA disks in a 10/90 split configuration (10 percent SSD, 90 percent SATA, a "capacity-optimized" configuration).

The disk-only configuration proposed by the original vendor provided 15.1 IOPS per watt, 0.29 IOPS per dollar, 8.7 GB per watt, and 0.16 GB per dollar. The acquisition price of $52,257, combined with an annualized power cost of nearly $4,900, made the solution costly to deploy and operate.

Smith's "performance-optimized" all-SSD solution provided 376 IOPS per watt, 1 IOPS per dollar, 38.5 GB per watt, and 0.1 GB per dollar. The nearly $84,000 acquisition price was offset only slightly by an annualized power cost of about $1,100, which was less than a quarter of the power costs of the FC disk array.

The "capacity-optimized" configuration, mixing SSD and disk, provided an interesting alternative. With 10 percent of the 8.6 TB total capacity provided by Flash SSD and the balance provided by 1 TB SATA disk, IOPS per watt was nearly twice that of the FC solution, at 28.4, and IOPS per dollar was only slightly higher than the FC drive solution, at 0.31. GB per watt was nearly 10 times better than the FC storage, at 83.5, and the GB per dollar measurement of the configuration was eight times better than FC disk, at 0.91.

Overall pricing of the "capacity-optimized" configuration was an eye opener: at $38,824, the SSD-SATA configuration was more than $13K cheaper than the FC disk solution. At $2,081 annually, the power cost of the hybrid solution was double that of the SSD-only solution but $1,592 cheaper than the FC disk storage configuration.
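
To see how acquisition price and power cost trade off over time, the three quotes can be run through a simple cumulative-cost calculation. The sketch below uses the prices given above, treating the approximate figures ("nearly $4,900", "nearly $84,000", "about $1,100") as round numbers. It assumes a flat power price and no drive replacements or maintenance fees, which is a simplification; the `Quote` class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Quote:
    """A storage proposal reduced to two cost figures (illustrative names)."""
    name: str
    acquisition_usd: float
    annual_power_usd: float


def total_cost(q: Quote, years: float) -> float:
    """Acquisition price plus power cost over `years`, at a flat power price."""
    return q.acquisition_usd + q.annual_power_usd * years


def break_even_years(baseline: Quote, alternative: Quote) -> Optional[float]:
    """Years until `alternative` becomes cheaper overall than `baseline`.

    Returns None if the alternative does not save power each year,
    and 0.0 if it is cheaper from the first day.
    """
    annual_saving = baseline.annual_power_usd - alternative.annual_power_usd
    if annual_saving <= 0:
        return None
    extra_upfront = alternative.acquisition_usd - baseline.acquisition_usd
    return max(0.0, extra_upfront / annual_saving)


# Figures from the article; rounded values are approximations.
FC_DISK = Quote("FC disk", 52_257, 4_900)
ALL_SSD = Quote("All SSD", 84_000, 1_100)
SSD_SATA = Quote("SSD + SATA", 38_824, 2_081)

if __name__ == "__main__":
    for q in (FC_DISK, ALL_SSD, SSD_SATA):
        print(f"{q.name:<12} 5-year cost: ${total_cost(q, 5):,.0f}")
    print("All SSD vs FC break-even (years):",
          break_even_years(FC_DISK, ALL_SSD))
    print("SSD + SATA vs FC break-even (years):",
          break_even_years(FC_DISK, SSD_SATA))
```

On these inputs, power savings alone take a little over eight years to recover the higher purchase price of the all-SSD configuration, while the SSD-SATA mix is cheaper than the FC array from day one.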

In truth, Smith's metrics provided the first sensible basis for configuration comparison that I have seen in a long time. Certainly, it places the real world value proposition of SSD into sharper focus.

What is missing from the analysis is, first, a "skew factor" to capture SSD drive replacement frequency, since Flash SSD does eventually succumb to memory wear. How quickly the application writing to the SSD components exhausts their maximum write counts will determine the rate of replacement of those components. That could add up to a significant cost accelerator for the solution, possibly suggesting cases where DRAM SSD would be preferred to MLC Flash SSD. One could postulate that high-read/low-write applications would be the best candidates for Smith's capacity-optimized configuration (think NAS and file server workloads). In any case, this skew factor is not currently captured by Smith's preliminary metrics.
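
One rough way to approximate such a skew factor is to estimate how long a Flash device lasts under a given write load and spread its price over that lifetime. The sketch below assumes ideal wear leveling (writes spread evenly across every cell) and ignores write amplification, so it is an optimistic upper bound rather than a vendor formula. The function and parameter names are hypothetical, and the sample workload is invented for illustration; the 100,000 and 500,000 write counts are the endurance figures quoted earlier in this column.

```python
def years_until_wearout(capacity_gb: float,
                        write_cycles: int,
                        gb_written_per_day: float) -> float:
    """Estimated device lifetime in years.

    Assumes ideal wear leveling: the device can absorb
    capacity_gb * write_cycles GB of writes in total before wearing out.
    """
    if gb_written_per_day <= 0:
        raise ValueError("gb_written_per_day must be positive")
    total_writable_gb = capacity_gb * write_cycles
    return total_writable_gb / (gb_written_per_day * 365)


def annual_replacement_cost(price_usd: float, lifetime_years: float) -> float:
    """Device price spread evenly over its estimated lifetime."""
    if lifetime_years <= 0:
        raise ValueError("lifetime_years must be positive")
    return price_usd / lifetime_years


if __name__ == "__main__":
    # Hypothetical 64 GB device absorbing 2,000 GB of writes per day.
    for cycles in (100_000, 500_000):
        life = years_until_wearout(64, cycles, 2_000)
        print(f"{cycles:>7,} writes per cell: about {life:.1f} years")
```

Adding the annual replacement figure to the annualized power cost would give a per-year ownership number that reflects write-heavy workloads, which the per-watt metrics alone do not.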

The other issue that needs to be addressed is the collateral cost of each configuration. What will be the comparative expense of RAID, management, and other "finished frame" feature/function attributes of each product? If frequent SSD component replacement is required, it would be useful to understand the maintenance requirements of each configuration both in terms of maintenance contracts and on-staff labor costs.

Still, Smith has done us a favor by adding a third dimension to the statistical analysis of storage technology, which typically dwells simply in the X-Y axis of cost and capacity/performance. He has introduced a significant cost component associated with ownership over time -- namely, electrical power. In the past two years, power costs have risen 23.2 percent across the USA. More hikes are projected.

In the contemporary data center, storage is the big power pig and capacity demand shows little sign of decreasing. Perhaps by exposing the "per-watt" costs of storage, in addition to the "per-dollar" acquisition price, Samsung can compel more vendors in the industry to apply these metrics to their own wares, giving us all some additional data to help guide our decisions. That could lead to better storage architecture combining various flavors of SSD and disk with green media such as tape and optical that will both "right-size" storage infrastructure and save us all a lot of money in the process.
