The Truth About Tape Backup
Tape … disk … tape … disk! Round and round we go.
Frankly, I am astounded and perturbed by the campaign of fear, uncertainty, and doubt (FUD) being waged against venerable tape technology by the purveyors of cheap disk arrays. It bothers me when an IT person quotes Gartner's 1-in-10 tape failure rate statistic, a number that has no correlation with any hard statistical evidence gathered on planet Earth.
Recently, an IT manager told me he was looking for a disk-based backup solution because he was afraid of the potential for losing a lot of data, given the capacities of tape technology today. If a single tape went bad, he stood to lose a much bigger chunk of data than five or ten years ago, given the relative capacities of media then and now. And now that Gartner was saying that 1-in-10 tapes fail … well, you do the math.
When I asked him if he was experiencing anything like the Gartner failure rate, he responded with a confident “No.” If even one tape in a thousand went bad in his 11,000-tape media program, it would be a significant event.
Fear, uncertainty, and doubt are emotional responses, not the foundation for a technical or business case for tape replacement, I offered. He shrugged. Everybody seemed to be moving from tape to disk, so he was going to follow the pack.
I am seeing a lot of this behavior, and, frankly, I wonder if disk to disk is truly a trend or just a bubble. I’m not alone in wondering about this.
A vendor insider recently told me that Vint Cerf, one of the founders of the Internet, was invited to see the company’s disk-to-disk-to-tape solution. Cerf watched the presentation, I’m told, and shook his head in disbelief. He uttered a comment like the following: “So, I have to do six hops to move data to tape, then to restore it from tape and return it to the application server. Seems like a lot of hops, a lot of data movement, for such a simple task.”
When disk-to-disk was revitalized as an architecture a few years ago, I was all for it. I thought there was a lot you could do on that second tier of disk, from cleaning out the duplicates and contraband data from your backup set, to screening it for hygiene with anti-virus and malware detection software, to compressing it down to take less space (and less time to back up to tape), to encrypting data without introducing latency into primary I/O operations, and maybe even to implementing a data classification scheme. I never bought the idea that disk would replace tape. I saw Tier-2 storage as a place to optimize the use of tape by ensuring that the data it would store would be the most distilled and pristine data possible.
I also never agreed with the idea that the appropriate role of disk was as a tape surrogate, that the data directed to Tier 2 would be a backup stream—a bucket of anonymous data. To my way of thinking, disk should be treated as disk, not as a virtual tape drive.
That thinking ran afoul of many tape and disk folk who were seeking to introduce disk-to-disk-to-tape (DDT) into a market already dominated by disk-to-tape. They argued, first, that disk as tape would speed up backups: you could field Tier-2 disk as a virtual tape library with many virtual tape drives, so that streaming data could be transported more efficiently. Second, you could leverage the read/write speeds of disk to record data faster than you could with the traditional tape of the time. Both were compelling arguments, of course, to IT folk who were having difficulty completing backups within the window their operations schedule allowed for such work.
Further bolstering the case for DDT was the notion that disk was becoming cost-competitive with tape, especially with the introduction of cheap SATA drives. For between $1.80 and $3.00 per GB, not a lot more than tape, you could have the reliability of spindles. This may have made sense, too, until you looked into the cost of ownership.
Now that we have a few years of DDT propaganda behind us, I believe we can begin a sober discussion of the strategy’s reality. Here are a few important facts to consider.
1. The need for virtual tape is by no means universal.
Truth be told, most organizations are well served by the current tape components they have. They can get the backups done in a reasonable period of time. If data volume has grown beyond the time frame available for doing backups, maybe what you need to do is to look at the data itself.
Files are being replicated 8 to 10 times in most companies. Culling out the duplicates alone can shrink the volume of data that needs to be backed up.
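To make the duplicate-culling idea concrete, here is a minimal sketch in Python of how a backup preprocessor might find redundant copies by content hash and estimate the space reclaimed by backing up only one copy of each. The function names and the single-directory scope are my own illustration, not any particular product's design:

```python
import hashlib
import os
from collections import defaultdict

def find_duplicates(root):
    """Group files under `root` by SHA-256 of their contents."""
    groups = defaultdict(list)
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            digest = hashlib.sha256()
            with open(path, "rb") as f:
                for chunk in iter(lambda: f.read(65536), b""):
                    digest.update(chunk)
            groups[digest.hexdigest()].append(path)
    return groups

def reclaimable_bytes(groups):
    """Bytes saved if only one copy of each duplicate set goes to tape."""
    saved = 0
    for paths in groups.values():
        if len(paths) > 1:
            saved += os.path.getsize(paths[0]) * (len(paths) - 1)
    return saved
```

If files really are replicated 8 to 10 times, this kind of pass shrinks the backup set by nearly an order of magnitude before a single byte reaches a drive.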
Applying even greater discipline to data, such as a data-naming scheme, can help to further segregate what needs to be backed up from what doesn’t. That can dramatically reduce the burden on tape resources, since only the data required for regulatory compliance or disaster recovery of mission-critical operations would need to be backed up. Data management could also help identify stale data that could be deleted altogether.
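A data-naming scheme like the one described above can be enforced mechanically. The sketch below assumes a hypothetical convention in which a filename prefix marks its retention class ("REG_" for regulatory data, "DR_" for disaster-recovery data); the prefixes and function names are illustrative only:

```python
# Hypothetical naming convention: a prefix tags each file's retention class.
BACKUP_PREFIXES = ("REG_", "DR_")   # regulatory and disaster-recovery classes

def needs_backup(filename: str) -> bool:
    """True if the file's class tag marks it for tape backup."""
    return filename.startswith(BACKUP_PREFIXES)

def partition(filenames):
    """Split a file listing into (backup, skip) lists by naming class."""
    backup = [f for f in filenames if needs_backup(f)]
    skip = [f for f in filenames if not needs_backup(f)]
    return backup, skip
```

Only the `backup` list would be handed to the tape subsystem; everything else stays off the media entirely.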
Compressing data before it is sent to tape is another way to reduce the load and improve the efficiency of tape backup processes. Look at what Avamar or Rocksoft or any of the other compression companies can do with high-performance algorithms, and you get an idea of what 18-to-1 compression could mean to your tape solution.
Bottom line: there is a lot we can do to give tape backup greater efficacy, even in 24x7x365 environments and even in the face of data growth. You don’t necessarily need to front end your tape with disk.
2. Information on the cost of DDT is unreliable and subject to considerable hyperbole.
Manufacturers may get their disk for a buck-eighty a gig, but that’s not what the consumer pays. After the manufacturer adds software and includes his profit margin of, conservatively speaking, 20 percent, the SATA array costs about $5,700 for 3.2 TB. Next, the distributor tacks on a profit margin and handling fees. Then the reseller adds a profit margin and handling fees. Finally, that array costs somewhere around $20,000, or $6.25 per GB, not including the price of array management, components, and networks or fabrics to connect to it, backup software, and soft costs such as labor.
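To make the markup chain concrete, here is the arithmetic from the paragraph above; the dollar figures are the column's own estimates, not vendor list prices:

```python
capacity_gb = 3200      # a 3.2 TB SATA array
oem_price = 5700        # manufacturer price after software and ~20% margin
street_price = 20000    # approximate price after distributor and reseller markups

oem_per_gb = oem_price / capacity_gb          # ~ $1.78/GB leaving the factory
street_per_gb = street_price / capacity_gb    # $6.25/GB by the time you buy it

print(f"OEM:    ${oem_per_gb:.2f}/GB")
print(f"Street: ${street_per_gb:.2f}/GB")
```

The spread between those two per-gigabyte numbers, more than a factor of three, is the gap between the "disk is as cheap as tape" pitch and the invoice.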
Tape solution acquisition costs are much lower, even when you factor in high-end autoloaders and robots. That said, tape’s cost of ownership has a potentially large labor cost component, primarily for tape-media management. Like most labor cost centers, however, this one can be adjusted very effectively with the right software and services.
Instead of having folks pull tape cartridges for distribution to off-site storage, why not write your second tape copy directly to a remote vault across a network? Services exist, and have been discussed in this column, to facilitate that modus operandi. They will even send the tape electronically to your recovery center, if you prefer.
You may also want to consider that SATA II is not the same beast as SATA. Unlike the original “cheap and plentiful” disk technology that was the star of the storage industry in 2004, SATA II is becoming a political nightmare. Vendors are selectively implementing the standard and calling their products SATAextended or ultrawideSATA, and there is no guarantee that these products are going to plug and play with one another.
I’m afraid that we are heading for another litany of plug-fests and interoperability demos resembling those we have seen with Fibre Channel. Such things always happen when standards become balkanized. (My test labs are setting up conformance testing regimes for both SATA II and SAS. With or without vendor permission, it is my intention to test every product I can and publish the results so consumers will know which products conform to the standards and which don’t. Stay tuned.)
3. Gartner’s scare statistics notwithstanding, there is no evidence that tape is more prone to failure today than in the past.
I disagree on this point with a new friend in Toulouse, France: Fernando Moreira, president and CEO of Hi-Stor Technologies, whose company does a lot of technology development for brand-name storage vendors. Moreira says his field engineers are seeing more defects in tape storage systems today than in the past, a situation he attributes to two pressures: tape technology manufacturers' need to keep up with disk capacities and performance by introducing new drives every 18 months or so, and the inability of drives to stream because backup infrastructures are inadequately dimensioned and tuned. Vendors are pushing their products out earlier and earlier in their test cycles, he says, and that has led to some quality issues in the early shipping phase.
I would never argue with Moreira, who has spilled a lot more blood in this arena than I have, but I like his idea for coping with the quality issue. This month, his company releases version 2 of its StorSentry product, a unique tool set for monitoring the performance and quality of tape components and backup operations that I would encourage all data protection folks to look at.
Unlike products such as Tek-Tools and Bocada, which plug into backup software and collect report data on backup completion and performance from that source, StorSentry monitors hardware operations on the tape side, collecting data on the number of cycles and functioning hours of drives and media, as well as on the absolute values of Bit Error Rates and transfer rates manifested by systems during backup and restore operations. The product then analyzes this data to produce guidance on potential problems that can be addressed by tuning or component replacement.
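StorSentry's internals are not public, but the general technique of threshold-based health flagging on drive telemetry can be sketched as follows. Every field name and threshold here is hypothetical; real products derive their limits from vendor specifications and observed history:

```python
from dataclasses import dataclass

@dataclass
class DriveStats:
    drive_id: str
    operating_hours: float
    bit_error_rate: float      # errors per bit transferred
    transfer_rate_mbs: float   # observed MB/s during the last backup

# Hypothetical thresholds for illustration only.
MAX_BER = 1e-17
MIN_TRANSFER_MBS = 40.0
MAX_HOURS = 20000

def health_flags(stats: DriveStats) -> list:
    """Return warnings for any telemetry value that crosses a threshold."""
    flags = []
    if stats.bit_error_rate > MAX_BER:
        flags.append("bit error rate above threshold: check media or heads")
    if stats.transfer_rate_mbs < MIN_TRANSFER_MBS:
        flags.append("drive not streaming: check infrastructure tuning")
    if stats.operating_hours > MAX_HOURS:
        flags.append("drive nearing end of service life")
    return flags
```

Run periodically against every drive and cartridge in the library, this sort of check turns "the tape went bad" surprises into scheduled tuning or component replacement.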
Moreira’s ultimate goal is to have this technology play a broader role in infrastructure fault prediction and self-healing. StorSentry is an interesting product that could help drive even more cost out of tape operations.
In the final analysis, the storage industry is at odds with itself with respect to data protection, and tape backup is caught in the middle. Half the world wants you to buy more disk, the other half wants you to manage data better. Both are counting on you to suspend reasoning and just buy into whatever marketing spin is handed to you. Let common sense be your guide.
Comments can be directed, as always, to firstname.lastname@example.org.
Jon William Toigo is chairman of The Data Management Institute, the CEO of data management consulting and research firm Toigo Partners International, as well as a contributing editor to Enterprise Systems and its Storage Strategies columnist. Mr. Toigo is the author of 14 books, including Disaster Recovery Planning, 3rd Edition, and The Holy Grail of Network Storage Management, both from Prentice Hall.