The Case for Rationalizing Your Data Integration Toolset

Should organizations step up their efforts to rationalize their collection of DI tools? Yes, Gartner researchers say, but not for the reasons you might think.

New research from Gartner Inc. concludes that by implementing a "substantial data integration architecture," organizations can cleanse their portfolios of often redundant DI tools and (over time) move to a shared services model.

The incentive for doing so goes right to the bottom line. According to Gartner, organizations that rationalize their DI toolsets can save more than $500,000 annually.

Rationalization describes the process of cataloging one's disparate DI assets; identifying areas of overlap or shortcomings with respect to functionality; reconciling overlap by eliminating superfluous tooling; redressing shortcomings by deploying new (or enhanced) tools; and standardizing (where possible) on specific tool or suite offerings. On paper, DI rationalization sounds like a no-brainer. What's not to like?

Gartner, for its part, goes so far as to assign a cost to DI heterogeneity: a redundant DI toolset can cost upwards of $250,000 per tool annually, analysts say, citing software licensing, maintenance, and skill costs.
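To make the cataloging and overlap steps concrete, here is a minimal Python sketch of how a shop might inventory its tools, flag the ones that add nothing new, and put a rough number on the redundancy. The tool names, capability labels, and the "keep the broadest tool first" rule are all hypothetical; only the $250,000 per-tool figure comes from Gartner, and treating it as a floor is an assumption based on the "upwards of" wording.

```python
# Hypothetical inventory: tool name -> DI capabilities it provides.
CATALOG = {
    "etl_suite_a": {"etl", "data_profiling", "metadata"},
    "etl_tool_b": {"etl"},
    "profiler_c": {"data_profiling"},
    "quality_d": {"data_quality"},
}

# Hypothetical list of capabilities the organization actually needs.
REQUIRED = {"etl", "data_profiling", "metadata", "data_quality", "replication"}

# Gartner's figure: a redundant tool can cost upwards of $250,000 a year.
# Assumption: we treat it as a per-tool floor.
COST_PER_REDUNDANT_TOOL = 250_000


def rationalize(catalog, required):
    """Single greedy pass: keep a tool only if it adds an uncovered capability."""
    kept, covered = [], set()
    # Consider the broadest tools first; break ties by name for determinism.
    order = sorted(catalog, key=lambda n: (-len(catalog[n] & required), n))
    for name in order:
        new_capabilities = (catalog[name] & required) - covered
        if new_capabilities:
            kept.append(name)
            covered |= new_capabilities
    redundant = sorted(set(catalog) - set(kept))  # candidates to retire
    gaps = sorted(required - covered)             # shortcomings to redress
    return kept, redundant, gaps


def redundancy_cost_floor(redundant, per_tool=COST_PER_REDUNDANT_TOOL):
    """Rough annual cost of carrying the redundant tools."""
    return len(redundant) * per_tool


kept, redundant, gaps = rationalize(CATALOG, REQUIRED)
print("keep:", kept)
print("retire:", redundant)
print("gaps to fill:", gaps)
print("annual redundancy cost (floor):", redundancy_cost_floor(redundant))
```

In this made-up inventory, two tools turn out to be redundant, so the floor works out to $500,000 a year. A shop with a single overlapping tool would see half that, which is one reason a headline number may not travel well between organizations.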

"Organizations often purchase and implement new data integration tools in a fragmented way without considering extending investments already made in other parts of the business, resulting in multiple tools from various vendors," said Ted Friedman, vice president and distinguished analyst at Gartner, in a statement.

"The first step is for IT teams focused on data integration to save money by rationalizing tools. Further, there is a greater longer-term opportunity to substantially reduce costs and increase efficiency and quality by moving to a shared-services model for the associated skills and computing infrastructure."

The Hard Bone of Contention

Quite aside from questions about its ROI claims -- data management and text analytic guru Seth Grimes points out that Gartner's estimate of more than $500,000 in annual cost savings seems extremely vague -- the report raises larger questions. For example, what costs and benefits must shops balance as they rationalize their toolset? What does Gartner mean by a "substantial" DI architecture? Moreover -- and perhaps most pointedly -- is DI rationalization itself worth it?

The answer to the last question is a qualified yes -- if you have an architecture. In other words, if you have a DI architecture, you have a framework for rationalization; change or disruption can be mediated (assimilated) through the DI framework. If you don't have an architecture -- if all you have is a collection of disparate tools, processes, and ad hoc procedures -- you don't have a starting point for rationalization. Almost everyone -- from Gartner to Grimes to TDWI senior research manager Philip Russom -- seems to agree on that point.

The answer is a qualified "yes" because this virtual consensus disintegrates once you get down to brass tacks. For example, what constitutes a "substantial" DI architecture? Is an out-of-the-box "architecture" -- such as those touted by data integration platforms from IBM Corp., Informatica Corp., and others -- sufficient? Must a "substantial" architecture involve more in the way of complementarity between tools and interoperability among technology, people, and processes -- or (conversely) can a "substantial" architecture comprise approaches (such as scripting and programmatic SQL) that unify resources, connect people with data, and provide a framework for managing change?

There's also the question of the purpose of rationalization. Some experts, such as Grimes, champion a pragmatic (not a dogmatic) approach to rationalization, stressing that rationalization isn't so much a function of product consolidation or standardization -- that is, of reducing functionality overlap or of downsizing one's DI portfolio -- as it is an issue of ensuring reliable, available, and functional access to data.

"You start by understanding the task and the tools you're using to make it happen. Create an architecture, but do be pragmatic about it rather than doctrinaire," Grimes urges.

For example, you could standardize on a single, relatively simple DI "architecture" -- such as programmatic SQL -- to stitch your environment together. This architecture could, in turn, function as a focal point for rationalization: because you're using programmatic SQL, you can cut out your disparate (and frequently overlapping) ETL, data profiling, metadata management, and data quality tools -- along with other extraneous DI middleware.
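To give a sense of what "programmatic SQL" standing in for separate profiling, ETL, and data quality tools might look like, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for whatever database a shop actually runs; the table names, columns, sample rows, and cleansing rules are all hypothetical.

```python
import sqlite3


def setup_demo(conn):
    """Create a hypothetical source table and an empty target table."""
    conn.executescript("""
        CREATE TABLE src_customers (id INTEGER, email TEXT, region TEXT);
        CREATE TABLE dw_customers (
            id INTEGER PRIMARY KEY, email TEXT NOT NULL, region TEXT
        );
    """)
    conn.executemany(
        "INSERT INTO src_customers VALUES (?, ?, ?)",
        [
            (1, " Ann@Example.com ", "east"),
            (1, "ann@example.com", "east"),   # duplicate id
            (2, None, "west"),                # missing email
            (3, "bob@example.com", None),     # missing region
        ],
    )


def profile(conn):
    """Data profiling in plain SQL: row count, null emails, distinct ids."""
    rows, missing_email, distinct_ids = conn.execute("""
        SELECT COUNT(*), SUM(email IS NULL), COUNT(DISTINCT id)
        FROM src_customers
    """).fetchone()
    return {"rows": rows, "missing_email": missing_email,
            "distinct_ids": distinct_ids}


def load(conn):
    """Transform and load in one statement: cleanse, de-duplicate, default."""
    conn.execute("""
        INSERT INTO dw_customers (id, email, region)
        SELECT id,
               MIN(LOWER(TRIM(email))),          -- normalize email
               COALESCE(MAX(region), 'unknown')  -- default missing region
        FROM src_customers
        WHERE email IS NOT NULL                  -- simple quality rule
        GROUP BY id                              -- one row per customer
    """)
    conn.commit()
    return conn.execute(
        "SELECT id, email, region FROM dw_customers ORDER BY id"
    ).fetchall()


conn = sqlite3.connect(":memory:")
setup_demo(conn)
print(profile(conn))
print(load(conn))
```

Everything here, from profiling to cleansing to loading, is hand-written SQL with no scheduling, lineage, or metadata layer around it.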

Such an approach leaves a lot to be desired, particularly with respect to automation, ongoing administration, information lifecycle management, adaptability, and future flexibility (to say nothing of its inability to address complementary services like event processing and application messaging). It is, however, indisputably the product of a rationalization process and it does deliver demonstrable reductions in both tool heterogeneity and software licensing costs.

It's one example of what Grimes calls the "doctrinaire" approach to DI, which ascribes undue -- perhaps even unseemly -- importance to rationalization as a means of consolidating software tooling and reducing licensing and skills costs. There's an undeniable attraction to big-ticket cost cutting -- Gartner's $500,000 figure is a case in point.

Slashing costs shouldn't be the sole (or even a primary) reason for DI rationalization, experts argue. Putative benefits, after all, are vague. As Grimes points out, $500,000 in savings (or $250,000 in annual savings per rationalized tool) might make sense for a large multinational enterprise, but should a small regional company expect to save as much?

Gartner's research identifies some excellent (if obvious) rationalization opportunities. For example, it recommends that, when possible, enterprises should tap native (in-database) DI tools; eliminate (costly) redundant ETL or data integration tools; and identify DI skills and consolidate DI expertise in a shared-services team model.

Even so, Gartner seems to prioritize cost cutting over other drivers, which makes seasoned data management pros wince.

"Rationalization should be about making sure you're getting all the data you need for operations and analyses from the originating systems to the target systems in a timely, completely, clean, and functional manner," Grimes concludes. "When you do look for tool consolidation possibilities, don't set out to hit an arbitrary target [cost-wise]. Simply set out to do the work effectively."