Spoofing Finds A New Angle
Network-based file caching may provide a means for real-time data sharing and disaster recovery in file systems, according to storage start-up Tacit Networks.
In one of my earliest writing projects for the IT trade press, I investigated the many techniques of mainframe channel extension for a magazine called IBM Mainframe Journal. After many changes of ownership, the magazine eventually became Enterprise Systems and spawned an on-line journal and this column, which have outlived the print book. That article on channel extension was published nearly 15 years ago, but its relevance is resurfacing today.
Mainframe channel extension was (and is) a set of techniques for extending the connection between processor and peripheral device across great distances. Originally, it was conceived as a set of solutions for placing printers and direct access storage devices (DASD) farther away from the mainframe backplane than IBM said was possible. Big Blue had warned that placing a peripheral more than 10 meters away from the CPU would cause it to fall off the edge of the world, like Christopher Columbus’ little-known fourth ship.
Of course, IBM was proven wrong. Several vendors, many of whom became the SAN switch vendors of today, came up with schemes to enable the extension of the mainframe channel across great distances. Some used spoofing to get the job done: that is, tricking the remote peripheral into believing that it was locally attached and tricking the mainframe into believing that it was communicating with a box on the same raised floor.
The technique had the merit of surmounting the distance-induced latency associated with WAN communications. Well, sort of. By combining local caching and a bit of emulation intelligence in a “channel extender” unit at each communicating end point, vendors made speed-of-light delays operationally irrelevant.
It was a good approach that continues to serve the mainframe world well. However, in the distributed computing world, another problem persists. While we may be able to use iSCSI or Fibre Channel for shared block-level storage access across distances, using the same basic approach as mainframe channel extension (mainframe operating systems treat all data as block data, after all), this approach does not extend into the domain of files.
For years, we have tried to extend file systems over networks, with Network File System (NFS) being the dominant method. However, over an extended distance, NFS is far too chatty a protocol to be useful. The alternative, one that has been used in many collaboration schemes advanced via the Web, is to copy and download a “shared” file to local disk, a laboriously slow process and one that lends itself to the vicissitudes of unversioned copies.
Basically, most remote “file sharing” approaches have actually been file replication schemes, and all of them have proven terribly slow, even as the bandwidth of communications pipes has increased. Moreover, that nagging old Einstein thing gets in the way: the issue in data sharing over distance is not the size of the pipe but the speed of light.
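To put rough numbers on the pipe-versus-light point, here is a back-of-the-envelope sketch in Python. Everything in it is an illustrative assumption rather than a measurement: light in optical fiber is taken to travel at roughly 200,000 km/s, the 4,000 km link, the 1,000 protocol round trips, and the 10 MB file are made-up figures, and the function names are hypothetical.

```python
# Rough model: total time = (round trips x round-trip delay) + time to push the bytes.
# Assumption: light in optical fiber covers roughly 200,000 km per second.
FIBER_KM_PER_S = 200_000.0


def round_trip_s(distance_km: float) -> float:
    """Propagation delay for one request/response pair, ignoring equipment delays."""
    return 2.0 * distance_km / FIBER_KM_PER_S


def transfer_time_s(distance_km: float, round_trips: int,
                    payload_bytes: int, bandwidth_bps: float) -> float:
    """Estimated time to move a file when the protocol needs many round trips."""
    latency = round_trips * round_trip_s(distance_km)
    serialization = payload_bytes * 8 / bandwidth_bps
    return latency + serialization


if __name__ == "__main__":
    # Hypothetical scenario: 10 MB file, 1,000 round trips, 4,000 km apart.
    for bandwidth_bps in (100e6, 1e9):
        seconds = transfer_time_s(4_000, 1_000, 10_000_000, bandwidth_bps)
        print(f"{bandwidth_bps / 1e6:.0f} Mbit/s link: about {seconds:.2f} s")
```

Under these assumptions, a tenfold increase in bandwidth trims well under a second from a transfer of roughly 40 seconds, because almost all of the time is spent waiting on round trips. Cutting the number of round trips is what moves the needle.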
Enter Tacit Networks, a new firm helmed by Tim Williams (formerly of CrosStor Software). Williams, a longtime storage innovator and one of the first champions of NAS/SAN hybrids, sold CrosStor to EMC in 2000 for $300 million. There, however, his innovative ideas were sucked into the notorious black hole reported to exist around Hopkinton, MA. It wasn’t long before he was ready to try something else.
Linking up with fellow smart guy Shirish Phatak, a Ph.D. researcher with the Dataman Mobile Computing Research Labs within the Department of Computer Science at Rutgers University (and one of the folks behind the groundbreaking InterMezzo file system project), Williams has set out to spoof file systems through the use of a proprietary protocol and caching scheme.
In effect, Tacit Networks is developing file caching appliances, interconnected by the company’s own secret-sauce protocol, that enable files to be shared “by proxy” across multiple, distance-insensitive locations. It is as though all users are located in a small LAN, according to Williams, who adds that his new technology provides the means to conceal the complexity of write-back caching and other voodoo from the users.
If a user seeks to access a file that is not in the current appliance cache, which boasts about half a terabyte of RAID 5 storage, Tacit’s solution will make it available within 15 seconds. The shared file is completely version-controlled, so two or more persons working on the same file will not overwrite each other’s changes.
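Tacit's protocol is proprietary, so the following Python sketch is not the company's design. It is only a toy illustration of the general idea described above: a cache at each site that fetches a file on a miss and uses version numbers so that a write based on a stale copy is rejected rather than silently overwriting someone else's change. All class and method names are hypothetical, and for simplicity the sketch writes through to the origin immediately instead of using the write-back caching Williams mentions.

```python
class ConflictError(Exception):
    """Raised when a write is based on an out-of-date version of a file."""


class Origin:
    """Hypothetical authoritative store: file name -> (version, data)."""

    def __init__(self):
        self.files = {}

    def read(self, name):
        return self.files[name]

    def write(self, name, base_version, data):
        current_version, _ = self.files[name]
        if base_version != current_version:
            # Someone else committed a newer version first; refuse to clobber it.
            raise ConflictError(f"{name}: have v{base_version}, origin is v{current_version}")
        self.files[name] = (current_version + 1, data)
        return current_version + 1


class CacheAppliance:
    """Hypothetical per-site cache that serves local reads and forwards writes."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}
        self.misses = 0

    def read(self, name):
        if name not in self.cache:
            # Cache miss: this is the only read that crosses the WAN.
            self.misses += 1
            self.cache[name] = self.origin.read(name)
        return self.cache[name]

    def write(self, name, data):
        base_version, _ = self.read(name)
        new_version = self.origin.write(name, base_version, data)
        self.cache[name] = (new_version, data)
        return new_version


if __name__ == "__main__":
    origin = Origin()
    origin.files["plan.dwg"] = (1, b"original")
    site_a, site_b = CacheAppliance(origin), CacheAppliance(origin)
    site_a.read("plan.dwg")
    site_b.read("plan.dwg")
    site_a.write("plan.dwg", b"edit from site A")
    try:
        site_b.write("plan.dwg", b"edit from site B")
    except ConflictError as err:
        print("rejected stale write:", err)
```

In this toy version the second site's write is refused because its cached copy is one version behind. A real product would also need cache invalidation, locking or merging, and a write-back path, none of which is modeled here.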
Williams and his angel-round investors see demand for the solution among organizations using applications with big sharing and coherency requirements. Think entertainment, media, CAD/CAM, medical imaging, and similar file-intensive applications.
He says that set-up is a breeze, involving the assignment of several IP addresses to identify one node to another and to its LAN-based users. The whole solution can be monitored via Simple Network Management Protocol.
Five end-user pilots are ongoing and should conclude in January. As the concept proves its value, Williams anticipates becoming the “Brocade or McData of storage caching.” We’ll keep an eye on his progress.