In-Depth

Herding CA(T)S: Caringo Launches Software-Only CAS Solution

The fur will fly as Centera flaws are exposed

One feature at the upcoming Disaster Recovery and Data Protection Summit in Tampa (May 31 through June 1) will be the unveiling of a seemingly innocuous piece of software for content addressable storage (CAS). Paul Carpentier, Jonathan Ring, and Mark Goros, whose surnames form the Austin, TX-based company’s name, “CARINGO,” are behind the announcement, which, if I were EMC, would have me quaking in my boots.

About three and a half years ago, EMC released Centera, a platform for storing data in an indexed way for a really long time. The original theory behind the product was that companies (health care organizations in particular, by virtue of HIPAA, mainly) were looking for a way to store patient data for several decades. Given the annual failure rates of disk and continuous improvement in disk technologies, it was likely that these businesses would seek to migrate their records from one array to another over time. The prevailing concern among IT folk in medical service shops was that just keeping track of musty old bits when all of these migrations were occurring would result in a huge headache.

So EMC, with the help of software from the Belgian company FilePool that they acquired the previous year, sought to apply an indexing scheme to the data. Content addressing would help you track bits as they migrated across frames.
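For readers who have not worked with content addressing, a minimal sketch of the general idea follows. It is an illustration only, not FilePool's, EMC's, or Caringo's code: the ContentStore class and the migrate helper are hypothetical names, the store is an in-memory dictionary, and SHA-256 is an assumed choice of hash. The point is that the address is derived from the content itself, so a record keeps the same address no matter which array it lands on.

```python
import hashlib


class ContentStore:
    """Toy in-memory content-addressed store (hypothetical, for illustration)."""

    def __init__(self):
        self._blobs = {}  # maps content address -> stored bytes

    def put(self, data: bytes) -> str:
        # The address is a digest of the content, not a path or a volume ID.
        # SHA-256 is an assumption made for this sketch.
        address = hashlib.sha256(data).hexdigest()
        self._blobs[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blobs[address]
        # Re-hash on read so silent corruption is detected, not returned.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("stored content does not match its address")
        return data


def migrate(source: ContentStore, target: ContentStore, address: str) -> str:
    """Copy one object to another store; the returned address is unchanged."""
    return target.put(source.get(address))


old_array = ContentStore()
new_array = ContentStore()
addr = old_array.put(b"patient record")
same_addr = migrate(old_array, new_array, addr)
```

Because the address depends only on the bytes stored, an application that recorded addr decades ago can still ask the new array for the same object after any number of migrations.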

It might have been a good plan, if Hopkinton had not decided to join the FilePool content addressable storage (CAS) software to a hardware controller on a low-end EMC array. Doing so created one of the biggest lock-in strategies seen in the industry. Having encoded your data into a Centera array, you could buy additional indexed storage only from Hopkinton going forward. If you were in neonatal health care and had to hold on to data for as long as the patient walked the earth, you were basically tied to EMC for the next 70 or 80 years.

I took exception to the strategy when I first learned about it, and raised alarms in this column and elsewhere when EMC’s considerable success with this product spawned similar products from other vendors who unabashedly referred to their wares as “sticky technology.” I ranted about it, among other things, on stage at a conference in Frankfurt, Germany last fall. Afterwards, while taking in the autumnal air outside the venue, I struck up a conversation with a man standing next to me, only to learn that his name was Paul Carpentier, the inventor of CAS.

Carpentier had been the CTO at FilePool. In the two years following the acquisition of his company by EMC, he had kept a low profile. But now he seemed energized, bursting to talk about CAS, Centera, and the future of indexed storage. During our informal chat, two things became immediately clear: 1) he didn’t much care for the implementation of CAS technology by hardware vendors generally, and 2) he worried that hardware CAS itself was placing data assets at high risk.

Carpentier, whom I hold in high esteem, is not your typical vendor scorpion. He has considerable Old World charm and the demeanor of someone whose hobby might well be collecting butterflies. His partners, Ring and Goros, are also nice men who seem to be motivated as much by the physician’s rule, Do No Harm, as they are by business goals and objectives. This makes for a great combination and offers a stark contrast to a lot of the people I encounter in the storage trade these days.

Carpentier said that he had isolated several key flaws in his original software concept that might limit its scalability and effectiveness in tracking data over time so it could be located and retrieved quickly if needed. He observed that additional problems had been introduced by implementation of MD5 hashing in hardware CAS solutions. He was deeply concerned about the resulting “solutions.”

He had worked since the sale of FilePool to resolve the problems in his original software concept, but was rebuffed when he approached EMC to discuss potential problems and the fixes. From what I understand, EMC turned him away—Centera was selling well, regardless of any problems that might exist in the header hashing scheme, and there was no interest in messing with a good thing.

Centera had hit the streets just in time to solve a bigger problem than health-care records retention. Regulations such as Sarbanes-Oxley and new SEC rules (among others) had contributed to making organized records retention (and deletion) a hot button for the Front Office in corporate America. EMC was quick to recontextualize Centera as a drop-in fix for long-term data retention and quick retrieval requirements, especially those imposed by regulators and legal departments.

This marketing message has been perpetuated by a slick marketing campaign representing Centera as “certified” by regulatory authorities as a “compliant” storage platform. (We have done some checking, and there appears to be no certification authority which judges storage products as compliant or not. Vendors taking this marketing tack simply send a letter of inquiry to an agency or department, describing their product and asking whether it provides an adequate solution for hosting data, and when they receive no response within 30 days, the lack of disagreement is used to connote approval.)

Speakers from Hopkinton have, at various conference venues, suggested that Centera has garnered between $200 and $460 million in sales revenue since its introduction—money desperately needed by a company best known for selling Big-Iron wares into an increasingly flat-lining market. Bottom line: if customers didn’t know that Centera was broken, nobody was going to fix it.

That seems to have triggered Carpentier, Ring, and Goros to get religion and go native once again, forming Caringo. Their product, CAStor, will debut next week; it is a software-only CAS play that works with any hardware you want to use. I’ve seen it, and CAStor is an elegant bit of code. My hope is that Microsoft will buy it and build it into Vista, Redmond’s next-gen operating system. One thing is for sure, however: it is going to spoil Hopkinton’s day.

I have a suspicion that the fur will fly once Carpentier begins enumerating the flaws with Centera, a topic which he can address with absolute unassailability as the father of CAS. Moreover, he knows exactly where the bodies are buried in the EMC wares, and taking him on would not be a good idea. It should make for some entertaining diversions during the Summer of 2006.

If you want to learn more about CAStor and Caringo, or to learn more about the latest techniques and technologies for data protection and business continuity, you are invited to meet Carpentier and company at the free Disaster Recovery and Data Protection Summit in Tampa, FL. Details and registration are at http://summit.datainstitute.org, and seats are still available.

Comments are welcomed at [email protected].

About the Author

Jon William Toigo is chairman of The Data Management Institute, the CEO of data management consulting and research firm Toigo Partners International, as well as a contributing editor to Enterprise Systems and its Storage Strategies columnist. Mr. Toigo is the author of 14 books, including Disaster Recovery Planning, 3rd Edition, and The Holy Grail of Network Storage Management, both from Prentice Hall.
