Storage Half-Truths and Deceptions

The Wild West adage “Never trust nobody” should be a storage manager’s creed.

The storage industry sometimes strikes me as the last bastion of the Wild West, that semi-mythical place where the operative counsel was “Never trust nobody.” In what other area of information technology is terminology as slippery as it is in storage?

SANs aren’t really networks: Fibre Channel is a channel protocol. Networked storage isn’t really networked: even NAS is technically just a thin operating system server bolted to an array—direct-attached storage, any way you cut it. And that’s just for starters.

Two new areas of slippery storage marketing are regulatory compliance and data security. It seems that when you put storage marketecture together with either one, you begin to see the makings of a perfect storm.

Case in point: Some vendors are calling their storage products “compliance certified,” which you could readily interpret to mean that some regulatory agency has approved the product as a repository for data that fulfills the obligations imposed under a specified law or regulation. The problem is that there aren’t any certifying authorities to approve (or disapprove) any product as a suitable platform for anything.

Content addressable storage (CAS) platforms, which combine an engine to index files with a proprietary controller and commodity disk (enabling a huge mark-up in solution cost), are steadily making inroads into the marketplace. At least two vendors identify their CAS products as “certified” for storing data in a regulation-compliant manner. If you are a health care organization that needs to comply with HIPAA regulations on long-term preservation of patient health care records, parking your data on such a platform seems like a no-brainer.

The problem with the claim is that, as far as I can tell, no certifying authority for data retention or deletion has been established by HIPAA, the SEC, or any of the other regulatory bodies. One industry insider told me that the way a vendor makes such a claim is simple: it sends a letter describing its product and methodology to the agency or department involved, requesting certification. If it receives no objection within a reasonable time (say, 30 days), it assumes that the product is approved and claims that it is compliance certified. The catch is that there isn’t anyone to approve or disapprove the request for product certification, so the vendor community is assured of supposed certification approval by the old adage, “Silence is assent.”

My advice: the next time a vendor tries to sell you a “compliance certified” solution, ask him whether that means that his company will stand between you and the regulator if and when you are audited or handed a subpoena. It’s fun to watch sales people squirm.

Data Security

The same basic guidance applies when a vendor seeks to get your purchase order for a product that makes data “secure”—usually by making it unreadable on disk or tape. Given the frequency of accidental data disclosures of late, companies are rightfully concerned about whether enough is being done to secure data—especially data in flight: traversing the wilds in laptop or PDA hard disks or traveling the highways and byways on tapes loaded onto off-site storage trucks. Storage encryption products are popping up everywhere to meet the increasing demand.

Many vendors are selling “encryption appliances,” which are mostly aggrandized PCs that sit in the Fibre Channel fabric somewhere between the switch and the storage device, whether a disk array or tape library. Sounds like a simple way to encode data with a cipher as it heads to its target media.

As with most storage gear, hidden costs abound in these solutions. All these appliances generate considerable heat as their overclocked processors struggle to encrypt data at line speed before it reaches the media—raising the temperature and the HVAC costs in your data center. Additionally, you typically need a lot of them to get the port counts required to match your storage devices. You also need spares you can install off-site with your backup media to read the data encrypted by the original appliances should you ever need to recover from your backups.

Costs aside, the problems with these storage-encryption solutions tend to fall into the categories of “too much” and “not enough.” Vendors try to outdo each other by selling consumers more encryption than their competitors, on the premise that the more bits in the cipher, the more difficult (in theory) it is to break and the more secure the data. Some vendors are now saying that anything less than 1000-bit encryption can be broken, supporting this claim with a reference to the successful breaking of 512- and 660-bit schemes at the University of Bonn last year. What they leave out is that the RSA factoring test (the contest that pays monetary rewards to individuals or organizations that successfully crack encryption schemes) required a distributed attack by many computers working in carefully organized concert.

Paying more for more encryption bits falls into the category of overkill. In the case of the U.S. government, which seeks to encrypt “top secret” data, the National Institute of Standards and Technology insists a 256-bit encryption key is enough. They estimate that it would take over 100 years to break the key, and that’s after factoring in rates of processor improvement and Moore’s Law.
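To put the overkill argument in rough numbers, here is a minimal sketch of the arithmetic behind an exhaustive key search. The guess rate is a made-up figure chosen purely for illustration, and the function name is hypothetical. The sketch covers only brute-force search of a symmetric keyspace; it says nothing about factoring attacks of the kind used in the RSA contest, whose bit lengths are not directly comparable.

```python
# Back-of-the-envelope: time to try every key in a symmetric keyspace.
# ASSUMPTION: the guess rate used below is hypothetical, not a measured figure.

SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def years_to_exhaust(key_bits: int, guesses_per_second: float) -> float:
    """Years needed to try all 2**key_bits keys at a fixed guess rate."""
    return (2 ** key_bits) / guesses_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    rate = 1e12  # hypothetical attacker: one trillion guesses per second
    for bits in (56, 128, 256):
        print(f"{bits}-bit key: about {years_to_exhaust(bits, rate):.3g} years")
```

Each added bit doubles the search time, which is why key lengths far beyond what the threat requires buy little in practice.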

“Not enough” refers to another problem with storage security. Encryption is often implemented inconsistently and is therefore jeopardized by a lack of follow-through or oversight. Hand a road warrior an encrypted laptop and he will probably shut off the encryption in short order because of the difficulties it presents in daily work (e.g., decrypting files for e-mail attachments, longer waits on file loading, etc.).

User awareness of the need for encryption is usually inversely proportional to the amount of time elapsed since the last time that company data was exposed. That needs to change.

On the other hand, users are not entirely witless about security. They may know something that the storage guys don’t. For one thing, encryption products aren’t foolproof, as Kevin Mitnick, a convicted hacker, observed at a data security conference in Lisbon last year. The self-styled guru of security breaking applied an interim update of the software he uses to encrypt his own laptop, PGP, and found that his decrypt key no longer worked. He scrambled to deliver his presentation at the conference in Lisbon on a backup laptop.

Deleting Data

Closely related to the marketecture around compliance and data security is the marketecture spinning up around the newly perceived need to delete data once its usefulness or required retention period is over. A number of vendors are jumping on this organized data-deletion bandwagon, hoping to collect money from the same companies to which they sold compliant and secure data-retention systems last year. As it turns out, “certified compliant” data retention solutions did only half the job: they helped companies retain data—for whatever amount of time was required to meet their Sarbanes-Oxley, HIPAA and SEC data management requirements. Now everyone is looking for a foolproof way to delete data completely—every copy of the data, including the files on production disk, in archives, and in backups.

Currently, just finding every instance of data is a daunting challenge. Completely deleting the data once it is found is doubly difficult because, simply put, there is no effective way to do it. This was the finding of the National Bureau of Economic Research in a paper written in 2003 and updated several times since.

According to the paper, overwriting data with new data (the preferred approach, since a normal delete command merely invalidates the name of the file, but leaves its blocks intact and available for retrieval) doesn’t work very well. When a one bit is written over a zero bit, the “actual effect is closer to obtaining a .95 [erasure] when a zero is overwritten with a one, and a 1.05 [erasure] when a one is overwritten with a zero. … Given that a read head is 20 times as sensitive as the one [used to write on] the drive, and given the pattern of overwrite bits, one could recover the under-data.”
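For readers who want to see what the overwrite approach amounts to in practice, here is a minimal sketch. The function names, pass count, and chunk size are illustrative assumptions, not any vendor's method. As the paper's finding suggests, this does not guarantee that the data is unrecoverable, and on journaling file systems or flash media the new bytes may not even land on the same physical blocks.

```python
import os


def overwrite_in_place(path: str, passes: int = 3, chunk_size: int = 1 << 20) -> None:
    """Overwrite a file's existing bytes with random data, leaving its size unchanged."""
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                f.write(os.urandom(n))
                remaining -= n
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push this pass out to the device


def overwrite_and_delete(path: str, passes: int = 3) -> None:
    """Overwrite first, then remove the name; a plain delete alone leaves the blocks intact."""
    overwrite_in_place(path, passes)
    os.remove(path)
```

A call such as overwrite_and_delete("old_records.dat") covers one copy of one file; copies in archives and backups are untouched.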

Acknowledging the risk that deleted data might be recovered using “under-data,” the U.S. Department of Defense has a project running with Georgia Tech Research Center to perfect a technique for absolutely ensuring data erasure from a hard disk in less than 5 seconds. Apparently, software-based “data shredders” such as Norton WipeInfo don’t do an adequate job. Bad sectors of a hard disk that have been marked by the disk electronics for exclusion from new data writes are ignored by the erasure process too. Since some valuable information might persist in these sectors, another approach, dubbed “Guard Dog” by developers, is being tried that leverages a 125-pound magnet and a hand crank to completely obliterate disk data in all sectors.

Commercialization of this technology by L3 Communications, and of other “crypto shredding techniques,” is pending. Until products appear in the market, it might be very difficult to demonstrate a routine program of data deletion that would hold up in court.

There are a couple of exceptions to the above generalizations about storage security wares, of course. DISUK, for example, provides an encryption appliance that isn’t based on a PC chassis at all. Its brain is a piece of specialty silicon engineered just to do the job of encryption. Our labs at TPI Technologies will be testing it against the NeoScale and Decru appliances shortly.

Also on the horizon is a specialized encryption router from Crossroads Systems. Based on core storage-routing technology pioneered by Crossroads and additional intellectual property acquired with the recent purchase of Tape Laboratories (TapeLabs), the product will be worth a look.

Also, lend an eye to BitArmor, a start-up with big ideas and good IP from Pittsburgh, PA. They propose to encrypt in software running on servers. Moreover, they add a tick box under Windows, alongside the other attribute markers (HIDDEN, ARCHIVE, etc.) already provided by Microsoft, that will enable administrators and end users to identify which files need to be encrypted. Files so marked are encrypted before they are saved to disk.

In the final analysis, storage remains the Wild West, with vendor claims either exaggerating the benefits of many of their technologies or understating their foibles. The rule, as in the old Western movies, is: Never trust nobody.

Your thoughts are welcome.