Tape's Diamond Jubilee Part 2: The Power behind the Throne
IBM's Linear Tape File System (LTFS) is touted by enthusiasts as a "game changer" for the fortunes of tape technology in contemporary IT. LTFS is generating the kind of buzz that one normally finds around iPhones and other "cool" consumer tech.
The reason is simple. If upwards of 55 percent of data generated today are files (rather than block output from databases), and if the preponderance of this data -- as much as 70 percent -- is rarely if ever re-referenced, developing a way to move it off disk and onto a still-accessible tape-based solution makes a lot of sense, from both a cost containment and an energy efficiency standpoint. Having the means to do so could blunt the spiking disk capacity demand curve, pegged by IDC and Gartner respectively at 300 and 650 percent by 2014 in virtual server environments.
Rather than trying to develop a coping strategy for out-of-control disk storage consumption, TapeNAS holds out the real possibility of creating and sustaining a storage reclamation strategy. That's great news for tape media and automation vendors; not so good for disk array vendors with no tape in their product families. This is perhaps what has motivated EMC to embrace tape at long last -- albeit from surrogate provider Spectra Logic.
A Level Set
Let's be clear: LTFS is not the first file system ever proposed for tape, nor the first format for tape cartridges. It just happened to hit the marketplace at roughly the same time that the first "partitioned" LTO standard cartridge (LTO 5) was announced. This was important because tape media must have at least two partitions in order to record indices containing file metadata and file start markers alongside the traditional partition where file data itself is stored.
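To make the two-partition idea concrete, here is a minimal sketch in Python. It is a toy model only, not the real LTFS on-tape format: the class names, the field names, and the in-memory "tape" are all assumptions made for illustration. It shows why a separate index partition matters: the file listing can be answered from metadata and start markers alone, without touching the file data.

```python
from dataclasses import dataclass, field


@dataclass
class IndexEntry:
    """Metadata kept in the index partition for one file (illustrative fields)."""
    name: str
    start: int   # start marker: offset of the file's first byte in the data partition
    length: int  # file size in bytes


@dataclass
class ToyTape:
    """Toy two-partition cartridge; hypothetical, not the actual LTFS layout."""
    index_partition: dict = field(default_factory=dict)       # name -> IndexEntry
    data_partition: bytearray = field(default_factory=bytearray)

    def write_file(self, name: str, payload: bytes) -> None:
        # File data is appended sequentially, as on a streaming medium.
        start = len(self.data_partition)
        self.data_partition.extend(payload)
        # The index records where the file starts so it can be located later.
        self.index_partition[name] = IndexEntry(name, start, len(payload))

    def list_files(self) -> list:
        # Listing consults only the index partition; no file data is read.
        return sorted(self.index_partition)

    def read_file(self, name: str) -> bytes:
        entry = self.index_partition[name]
        return bytes(self.data_partition[entry.start:entry.start + entry.length])
```

In this sketch, calling write_file() twice and then list_files() returns both names straight from the index, while read_file() uses the recorded start marker to pull the right bytes out of the data partition.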
The availability of LTFS with the arrival of a partitioned LTO cartridge provided the underlayment for TapeNAS. Taken together they enable mounting an LTFS tape "as though it were a USB key." Since USB keys are not considered robust enterprise storage solutions, LTFS is often described -- to borrow from Spectra Logic CTO Matt Starr -- as "an enabling technology, rather than a product."
Starr's observation makes sense. At best, a basic LTFS driver can show, one tape at a time, the contents of the tape in a fairly traditional (though not very user friendly) file tree. What makes the file listing less than friendly is that its contents are displayed in a file folder that uses the cryptic bar code of the cartridge that is being read as a name. Clicking on that folder name provides a more conventional display of all of the files contained on the tape.
To create an industrial strength "NAS on Steroids" solution with LTFS, additional elements must be bolted on to the basic technology. For one thing, to make this kind of storage truly usable, LTFS must be used in combination with data access software such as a media asset management system, a hierarchical storage management (HSM)/archive system, or a file system name space capable of storing and indexing file system metadata from all cartridges in the library. Since the resulting storage usually needs to be shared with users, LTFS and its user-friendly access software need to be staged on a server that delivers the storage kit as a network share, providing connectivity to users via the Network File System (NFS), CIFS/SMB, HTTP, or something similar.
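The "name space" piece can be pictured as a catalog that merges the per-cartridge file listings into one searchable view. The Python sketch below is hypothetical: LibraryCatalog, its methods, and the bar codes used with it are invented for illustration and do not correspond to any vendor's product or to the LTFS driver itself. It also simplifies by assuming a path lives on only one cartridge.

```python
from collections import defaultdict


class LibraryCatalog:
    """Hypothetical library-wide name space built from per-cartridge indices."""

    def __init__(self):
        self._by_path = {}                     # user-visible path -> cartridge bar code
        self._by_barcode = defaultdict(list)   # cartridge bar code -> paths on that tape

    def ingest_index(self, barcode, file_names):
        # Record which cartridge holds each file, so users never browse by bar code.
        for name in file_names:
            self._by_path[name] = barcode
            self._by_barcode[barcode].append(name)

    def locate(self, path):
        # Returns the bar code of the cartridge to mount, or None if unknown.
        return self._by_path.get(path)

    def contents(self, barcode):
        return sorted(self._by_barcode.get(barcode, []))
```

A server front end built along these lines would answer directory listings from the catalog and ask the library to mount a cartridge only when a user actually opens a file.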
Moreover, the business requirements for data access may or may not be satisfied by the server alone. Time to first byte of tape storage in LTFS ranges from 20 seconds to two minutes, which is approximately the same delay incurred when accessing a file across the World Wide Web. In some cases, this may seem too slow, creating a need to build a caching capability into the LTFS host platform, using either disk or memory, that will buffer the data on the tapes and shorten fetch times. This bit of spoofing is already done today on disk platforms, including NAS storage, to speed both ingestion and retrieval. The good news is that, for long block files such as video, once the starting point of the file has been reached, the streaming rate of tape far exceeds that of disk, so playback efficiency is better overall.
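The caching idea can be sketched in a few lines of Python. This is an assumption-laden illustration, not how any shipping TapeNAS head works: TapeReadCache is a made-up name, fetch_from_tape stands in for whatever slow path actually mounts and reads a cartridge, and the least-recently-used eviction policy is simply one reasonable choice.

```python
from collections import OrderedDict


class TapeReadCache:
    """Hypothetical read cache in front of tape, with least-recently-used eviction."""

    def __init__(self, fetch_from_tape, capacity=2):
        self._fetch = fetch_from_tape   # slow path: callable(path) -> bytes
        self._capacity = capacity       # maximum number of files held in the cache
        self._entries = OrderedDict()   # path -> data, oldest entry first
        self.hits = 0
        self.misses = 0

    def read(self, path):
        if path in self._entries:
            # Cache hit: served from disk or memory, no tape mount needed.
            self._entries.move_to_end(path)
            self.hits += 1
            return self._entries[path]
        # Cache miss: pay the tape delay once, then keep a copy.
        self.misses += 1
        data = self._fetch(path)
        self._entries[path] = data
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # evict the least recently used file
        return data
```

Only the first read of a file pays the 20-second-to-two-minute penalty; repeat reads are served from the buffer until the file is evicted.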
The bottom line is that IBM may have understated the issues in some of their public talking points around LTFS: "Just download LTFS for free from the IBM Web site or the site of the LTO Consortium and install it on a Red Hat server and you're in business." Actually, a deeper discussion with IBM tape mavens reveals that more work needs to be done. Building out an LTFS server front-end for a tape library on a do-it-yourself basis can seem more like a science fair project than an enterprise-ready storage solution. The good news is that the cobble that is LTFS has opened the door to vendors to build pre-integrated TapeNAS "heads" that can be used to front-end LTO 5 (and beyond) libraries. The clear leader in this space is Crossroads Systems with its StrongBox appliance.
Some frustration has issued from trade press writers and storage bloggers regarding the ongoing limitations of basic LTFS technology. IBM first sought to address these by announcing LTFS LE (Library Edition) for which it charges money. LTFS LE still lacks the asset manager, archive, or user-friendly file system that people seem to want, but it can index all of the tapes in a library up to a set TB size. Delays in getting the whole enchilada available and drool-proof to install, however, threaten to dampen the enthusiasm that LTFS is creating in tape -- the likes of which haven't been seen in a decade.
The big advantage of LTFS on LTO tape, according to Starr, is that it provides an "exchangeable format" that enables LTFS-formatted LTO tapes to be read on any LTFS-enabled LTO library, regardless of the vendor brand on the outside of the kit. This builds on the value case for LTO itself, which was originally contextualized as a "universal tape cartridge."
Getting to this goal of format exchangeability, however, doesn't happen overnight. (Witness how long it took for physical LTO cartridges to work interchangeably in tape libraries from the three members of the LTO Consortium: throating specifications -- the angle of tape insertion by robots -- produced biased tape writes, which caused problems when a tape recorded with vendor A's drives in one LTO library was read in another LTO library that used vendor B's drives.)
From our video interviews with IBM (at http://youtu.be/eCi_b5-n1e4 and http://youtu.be/LKnpl7aGz4g), it was clear that Big Blue took the initiative and deserves credit for the development of LTFS, but it was equally evident that they were treading cautiously. For one thing, there is a misperception that LTFS was "donated" to the LTO Consortium (it wasn't) and that it is already a standard (it isn't). Moreover, proprietary drivers such as IBM's FUSE are used to actually write LTFS format to a tape. Unfortunately, other vendors have their own drivers and may not use the FUSE drivers from IBM, likely creating new obstacles to perfect format exchangeability.
Finally, sensing a threat to their leadership mantle, IBM has plans to develop LTFS as a formal standard -- using the Storage Networking Industry Association (SNIA), a standards body only by the most elastic interpretation of the definition, as the vehicle for thrashing out the standard language. Some worry that SNIA's deliberations about standard LTFS will produce unpredictable results. Recalling how the original definitions for the Storage Management Initiative Specification (SMI-S) fell prey to inter-vendor politics within SNIA, and how it was radically watered down by the big vendors in the association, doubters may have a point.
In any case, given the agenda laid out by IBM, first to get the LTFS technology ratified as "the one true LTFS," then to create some mechanism for certifying third-party products for their adherence to the standard, it could be years before a pre-cobbled product such as Crossroads' StrongBox reaches the market from IBM or other tape system vendors. Indeed, early developers, such as Crossroads and some of the media asset manager software vendors, are seeking to tack around the headwinds of uncertainty by enabling TapeNAS solutions that are tailored to the requirements of very specific industry verticals, rather than fielding solutions for general-purpose mass file storage requirements.
The key concern is whether the heart of a disk array vendor is really in TapeNAS. During our time at IBM Edge 2012, off-line conversations with IBM resellers and direct sales operatives revealed the central problem. Questions were already being raised about what a highly scalable TapeNAS solution might do to profit margins realized from the sale of disk arrays featuring functions such as on-array tiering, compression, and de-duplication. Doubtless some research will need to be conducted by IBM and others to project the impact that a robust TapeNAS solution might have on the profit garnered from the sale of disk storage systems. Similar to the effort to create a standard, this is also likely to take time.
Another question is whether IBM wants a future LTFS product to support all tape libraries or to work only with its own kit. Some competitors openly speculate that IBM has targeted enterprise-class LTFS TapeNAS for use with its proprietary Jaguar tape formats and products. To avoid lock-ins to any particular hardware vendor, companies such as archive software maker QStar Technologies have created their own device driver wares instead of using IBM's FUSE driver. A proliferation of formatting drivers could well Balkanize and potentially derail the progress toward an open TapeNAS platform.
The Bottom Line
What intrigues us about the promise of TapeNAS is its low cost per TB and its extraordinary capacity-per-watt metrics. We are talking today about a 35TB storage solution for roughly $26,000. With tape capacity improvements owed to FujiFilm's Barium Ferrite coatings, this could easily scale to more than a petabyte in the same footprint within a year or two -- using soon-to-market 32TB capacity cartridges (without compression) for about the same money. We dare you to show us a disk rig with those numbers.
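For readers who want to run the numbers themselves, the arithmetic behind those figures is simple. The Python sketch below uses only the figures quoted above; the function names are ours, and treating a petabyte as 1,000TB and holding the price flat at $26,000 are assumptions made for illustration.

```python
import math

SYSTEM_PRICE_USD = 26_000    # quoted price of today's solution
SYSTEM_CAPACITY_TB = 35      # quoted capacity of today's solution
FUTURE_CARTRIDGE_TB = 32     # quoted native capacity of soon-to-market cartridges
PETABYTE_TB = 1_000          # assumption: decimal petabyte


def cost_per_tb(price_usd, capacity_tb):
    """Dollars per terabyte for a given price and capacity."""
    return price_usd / capacity_tb


def cartridges_needed(target_tb, cartridge_tb):
    """Whole cartridges required to reach a target capacity."""
    return math.ceil(target_tb / cartridge_tb)


today = cost_per_tb(SYSTEM_PRICE_USD, SYSTEM_CAPACITY_TB)
future = cost_per_tb(SYSTEM_PRICE_USD, PETABYTE_TB)   # assumes the same price
slots = cartridges_needed(PETABYTE_TB, FUTURE_CARTRIDGE_TB)
print(f"today: ${today:.2f}/TB, at a petabyte: ${future:.2f}/TB, cartridges: {slots}")
```

On those assumptions, today's configuration works out to roughly $743 per TB, a petabyte for the same money would be about $26 per TB, and reaching it would take 32 of the 32TB cartridges.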
Your comments are welcome: email@example.com.
Jon Toigo is a 30-year veteran of IT, and the Managing Partner of Toigo Partners International, an IT industry watchdog and consumer advocacy. He is also the chairman of the Data Management Institute, which focuses on the development of data management as a professional discipline. Toigo has written 15 books on business and IT and published more than 3,000 articles in the technology trade press. He is currently working on several book projects, including The Infrastruggle (for which this blog is named) which he is developing as a blook.