Of File Systems and Databases, Part 2: Clustered File Systems

Clustered file system advocates have positioned their solutions as alternatives to the monolithic file systems (such as WAFL).

Last week I discussed the acquisition of Spinnaker Networks by Network Appliance, a move that seems counterintuitive if the Sunnyvale, CA NAS leader (Network Appliance) wishes to integrate Spinnaker’s technology with its own Write Anywhere File Layout (WAFL) technology—a very different, and in many ways, contrary approach to scaling.

Clustered, extensible file systems have garnered many adherents of late among both start-ups and stalwarts in the industry. Rather than embracing the vertical scaling approach inherent in the Network Appliance offering, advocates of clustered file systems seek to grow storage horizontally, using many NAS heads organized under a clustered operating system to provide huge improvements in file system extensibility while presenting a single image of the file system itself to applications and end users.
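To make the "single image" idea concrete, here is a minimal sketch in Python. It assumes a toy placement rule (hashing the path to pick a NAS head) and uses in-memory dictionaries in place of real storage. The class name ClusteredNamespace and the head names are hypothetical, and no vendor mentioned in this column is claimed to work this way; real products rely on far more elaborate metadata and locking services.

```python
import hashlib


class ClusteredNamespace:
    """Toy single-image namespace spread across several NAS heads.

    Each head is modeled as a plain dict of path -> bytes (an assumption
    made purely for illustration).
    """

    def __init__(self, head_names):
        self.heads = {name: {} for name in head_names}
        self._order = sorted(self.heads)

    def _head_for(self, path):
        # Hash the path so the same file always lands on the same head.
        digest = hashlib.sha256(path.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._order)
        return self._order[index]

    def write(self, path, data):
        self.heads[self._head_for(path)][path] = data

    def read(self, path):
        # Callers never name a head; they see one file system.
        return self.heads[self._head_for(path)][path]

    def listing(self):
        # Merge every head's files into one sorted view.
        return sorted(p for files in self.heads.values() for p in files)
```

An application would call write("/seismic/run1.dat", b"...") and later read the same path without ever knowing which head holds the file. A simple modulo rule like this one would force most files to move whenever a head is added, which is one reason real clustered designs are considerably more involved.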

Who needs it? Advocates such as Yotta Yotta, Spinnaker Networks (pre-acquisition, of course), Silicon Graphics, Panasas, and a few others, have long positioned their solutions as alternatives to monolithic file systems that reveal their inherent performance and capacity bottlenecks as file systems begin to scale. This is another way of saying that such solutions are needed by anyone with a lot of files under management.

Some have jumped on the bandwagon around Information Lifecycle Management and regulatory compliance to suggest that a well-managed and horizontally scalable file system storage solution enables files that must be retained on line for a protracted period to be migrated to less-expensive hardware over time without the loss of accessibility that comes with migrating data off to archival tape.

The earliest adopters are the obvious ones: from oil and gas companies that keep enormous quantities of engineering files and exploration and monitoring data on line at all times to grid computing and supercomputing cluster advocates at national research laboratories that seek ready access to millions of files to facilitate nuclear detonation simulations, weather analysis, and other scientific applications.

Clustered NAS also provides a method for enabling the use of SANs as file system repositories: it is one interpretation of the airy architectural concept “NAS-SAN hybrids.” Indeed, the use of NAS heads as the access points for data (files) stored in SANs (which has been discussed in this column in the past) has merit, as this strategy may surmount the many limitations in SAN management that plague Fibre Channel fabrics today. However, judging by the interpretation of this strategy offered by vendors of NAS clustering systems, and given their lack of block storage protocol support, it seems that they are really describing a “super NAS” rather than a hybrid platform.

In any case, the Spinnaker acquisition gives Network Appliance a stake in the ILM marketecture game. But how they will combine WAFL with Spinnaker’s clustered, extensible file systems remains difficult to discern.

Why they would want to combine the two technologies is yet another boggle. While there is certainly a niche for high-end, massively scaling NAS, there seems to be quite a crowd in this market already competing for comparatively few (though potentially lucrative) opportunities. What’s more, if Microsoft, Oracle, and IBM have their way, file systems are about to go the way of the dinosaurs—replaced by databases in which data is stored as binary large objects. Adding a row atop the objects enables a common method to support data self-description, something that is required for real ILM and difficult to implement in a universal and non-disruptive way in a heterogeneous file system world.
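The files-as-binary-objects idea can be illustrated with a small Python sketch using the standard sqlite3 module. The table layout, the column names (owner, retain_until), and the helper functions are all assumptions invented for this example; they do not represent the schema of any Microsoft, Oracle, or IBM product. The point is only that once a file's contents sit in a row, descriptive fields can sit beside them and be queried uniformly.

```python
import sqlite3


def open_store():
    """Create an in-memory database with one table of self-describing files."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE documents (
               id INTEGER PRIMARY KEY,
               name TEXT NOT NULL,
               owner TEXT NOT NULL,          -- hypothetical descriptive field
               retain_until TEXT NOT NULL,   -- ISO date, hypothetical ILM field
               body BLOB NOT NULL            -- the file contents themselves
           )"""
    )
    return conn


def store(conn, name, owner, retain_until, body):
    """Insert one file along with the row of metadata that describes it."""
    cur = conn.execute(
        "INSERT INTO documents (name, owner, retain_until, body) "
        "VALUES (?, ?, ?, ?)",
        (name, owner, retain_until, body),
    )
    return cur.lastrowid


def past_retention(conn, today):
    """Return names of files whose retention date is before `today`."""
    rows = conn.execute(
        "SELECT name FROM documents WHERE retain_until < ? ORDER BY name",
        (today,),
    )
    return [row[0] for row in rows]
```

A policy engine could then ask one question of every stored object, for example past_retention(conn, "2010-01-01"), regardless of which application wrote the data. Doing the same across heterogeneous file systems would require each one to expose comparable metadata.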

That’s why the smart money is on adding block storage protocols like iSCSI or iFCP to NAS heads. The vendor that does this will be able to sell product into both legacy file-system-based data storage environments and the newbie database-with-files-as-binary-objects environments that will soon present themselves in the market if Microsoft’s Longhorn OS succeeds.

Those are my thoughts. What do you think? Write me at jtoigo@intnet.net with your observations and your experiences with clustered file system products.

About the Author

Jon William Toigo is chairman of The Data Management Institute, the CEO of data management consulting and research firm Toigo Partners International, as well as a contributing editor to Enterprise Systems and its Storage Strategies columnist. Mr. Toigo is the author of 14 books, including Disaster Recovery Planning, 3rd Edition, and The Holy Grail of Network Storage Management, both from Prentice Hall.