Survey Reveals the Staying Power of Homegrown Data Integration Tools

Studies have consistently highlighted the persistence of hand-coded DI. Why won't homegrown DI go away?

Microsoft Corp.'s SQL Server-based data integration (DI) toolset leads the market in terms of penetration, according to a new survey sponsored by DI specialist SyncSort Inc., which found that Microsoft's DI tools are used by almost three-fifths of respondents. The runner-up, Oracle Corp., has a DI portfolio that includes Oracle Warehouse Builder (OWB) and Oracle GoldenGate, along with Oracle's data quality (DQ) and data profiling (DP) tools; its tools are used by over one-third (38 percent) of shops.

What's perhaps most remarkable is the persistence of hand-coded DI tools, which are still used in almost one-third (29 percent) of shops, good enough for a third-place showing. Although that may be remarkable, it isn't surprising.

Studies have consistently highlighted the persistence and popularity of hand-coded DI. In a recent survey, for example, TDWI Research -- the research arm of The Data Warehousing Institute (TDWI) -- found that just under one-fifth (18 percent) of shops still use hand-coded DI technologies.

Were it not for the popularity of free DI tools, hand-coded DI would probably be used even more extensively.

It's easy to see why Microsoft and Oracle finished first and second in SyncSort's survey. Microsoft has bundled a DI tool with SQL Server for almost 15 years; Oracle bundles OWB with its flagship database and (more recently) integrated OWB with Oracle Data Integrator (ODI).

In a growing number of enterprises, this makes for an inescapably simple calculus: if your DBMS comes with a built-in ETL tool, why wouldn't you use it? By the same token, if your DBMS comes with built-in ETL technology, why are you still using a hand-coded tool? Those, experts say, are the key questions.

To put things into perspective, hand-coded DI tools are more popular than are DI offerings from SAP AG -- which fields a full-blown enterprise information manager (EIM) thanks to its acquisition of the former Business Objects SA. (That's to say nothing of SAP's acquisition of the former Sybase Inc., which gave it best-in-class replication technology, along with creditable ETL and data federation assets.) Hand-coded DI is likewise more pervasive than are offerings from Informatica Corp., SAS Institute Inc., Pervasive Software Inc., Ab Initio, iWay Software, Talend, Evolutionary Technology Inc. (ETI), and SyncSort.

Finally, hand-coded DI is also more pervasive than any single data integration offering served up by IBM Corp. Big Blue -- which acquired many DI technologies over the last half-decade -- is actually counted twice in the SyncSort survey: once for its bread-and-butter DataStage DI business (which derives from its acquisition of former ETL vendor Ascential Software Corp. and which accounts for 14.8 percent of the market); and once for its group of complementary DI technologies (including technology it acquired from the former DataMirror Corp.).

Combined, IBM tools are found in just under 30 percent of shops.

Several ETL vendors, including SAS and Pervasive, get lumped into the "Other" category, along with other prominent (or once-prominent) names. All told, "Other" DI technologies account for almost one-tenth of the tally, a not-insignificant chunk when you consider that DI vendor Informatica is present in only about 15 percent of shops. On the other hand, half a decade ago, SAS sat broadly athwart the enterprise ETL segment, ranked just behind or even slightly in front of market-leading rivals Informatica and Ascential.

The "Other" grouping also includes traditional DI vendors such as iWay and rising stars such as Talend, WhereScape Inc., and Expressor.

Hand-Coded Staying Power

DI pure-plays tend to deflect questions about the presence or staying power of hand-coded data integration technologies. If you believe the DI vendors, the vast majority of shops transitioned away from hand-coded DI a long time ago. There's doubtless some truth to this, inasmuch as enterprises have worked to replace legacy or hand-coded DI functions with third-party tools.

At issue is the fact that legacy tools aren't just hanging on but (paradoxically) enjoy a sizeable position in most organizations. One way of interpreting this is to argue that the use of hand-coded DI tools persists because no third-party tooling, used either singly or in combination, can duplicate in a cost-effective manner the functionality they provide.

As one industry veteran who spoke on condition of anonymity put it, "Why break what works? I know [of] one company that has a SQL Server 2000 server still running, [with] T-SQL chugging away. Nobody wants to touch it until the system is decommissioned. There's a lot of this 'nouvelle legacy' out there on outdated Windows and Linux boxes, to go with the old AS/400, pSeries, and mainframe applications," this observer says.

This sentiment is echoed by Philip Russom, research director for data management with TDWI Research. "[S]urvey data shows that migrating from hand-coding to using a vendor DI tool is one of the strongest trends, as organizations move into the next generation," he explains in a recent TDWI Best Practices report, Next Generation Data Integration.

In most cases, Russom says, shops are mixing and matching hand-coded and packaged DI, but are inescapably weaning themselves off hand-coded tools. "A common best practice is to use a DI tool for most solutions, but augment it with hand coding for functions missing from the tool."

Shops should look to pre-packaged DI to eliminate hand-coded tools, Russom advises. "Users want to reduce the amount of hand coding," he notes. "Only 18 percent of respondents report depending mostly on hand coding for DI." Russom concedes that this tally "seems low, compared to other surveys TDWI has run." Nevertheless, most shops expect to eliminate their dependence on hand-coded DI at some point in the near future.

"With this survey population, hand coding will drop down to a miniscule 1 percent. Migrating from hand coding to tool use as the primary development medium is, indeed, a prominent generational change for DI," he writes.

There's another wrinkle here, however: what exactly constitutes "hand-coding," anyway? "Using SSIS is a lot like hand-coding," observes the industry veteran we spoke with. "Unless you try to use the tools, you don't see it. Lots of people use SSIS. Lots of people write or insert copied-from-a-Web-site code snippets. How is using an ETL framework that makes you script some parts of the work different from hand coding?"

SyncSort's survey collected responses from 359 participants, running the gamut from small shops of up to 250 employees (collectively, more than one-third of participants) to large shops of more than 10,000 employees (just over 22 percent of participants); mid-market shops accounted for 42 percent of the sample. The largest share of participants hold IT staff positions (46 percent), with IT management positions -- including IT executives or mid-level IT managers -- accounting for almost 40 percent of responses.