In-Depth

When Does Enterprise ETL Make Sense?

Many customers are dong just fine with scripting or programmatic SQL. So when is an enterprise ETL tool the right choice for you?

With IBM Corp.’s $1 billion purchase of Ascential Software Corp., the once sleepy data integration space is suddenly the hottest ticket in town.

What’s surprising, however, is that many customers still aren’t using enterprise-class ETL tools of an Ascential or Informatica-type complexity. In fact, some are tapping scripting, programmatic SQL, or other comparatively low-tech approaches to accomplish nuts-and-bolts ETL processing. Others are using application- or database-specific ETL productss that, in many cases, are bundled free of charge or available at steep discounts from their software vendors.

Consider retailer BJ’s Wholesale Club, which recently implemented a new financial intelligence system based on the Essbase OLAP server from Hyperion Solutions Corp. For some of its nuts-and-bolts data integration, BJ’s tapped Hyperion’s Essbase Integration Services (EIS), a suite of graphical tools and data integration components that link applications written to exploit Hyperion’s BI platform with data stored in relational databases.

But how is BJ’s getting data from its mainframe systems into Essbase? You guessed it: Scripted file transfers via FTP. For BJ’s, the ability to load mainframe data directly into Essbase, even in the context of a once-daily batch process, was nevertheless a significant improvement over the status quo.

“We were really trying to have one place to go to so that everyone is getting the same information, because currently our financial systems are mainframe systems, and we have to write programs to retrieve data from the mainframe and then do some kind of file transfer or downloading into Excel, so that people can do their typical analysis,” said Jill Reid, manager of accounting systems with BJ’s Wholesale, in an interview last year.

When Enterprise ETL Makes Sense

For customers in richly heterogeneous environments, especially those who are consolidating information from dozens (or even hundreds) of different data sources or applications into an operational data store or data warehouse, an ETL tool from an Ascential, Informatica, or another enterprise ETL vendor makes a great deal of sense. Data integration pure-plays such as Ascential and Informatica, especially, provide canned support for all major relational databases, along with pre-built adapters for many different enterprise application suites (such as Oracle, SAP, and Siebel), and connectivity to a variety of different non-relational or semi-structured repositories.

Why, then, aren’t enterprise ETL tools used in some organizations? The biggest issue, analysts say, is price: All of this connectivity, coupled with (in the case of Ascential and Informatica) parallel-processing capabilities, metadata management, the ability to perform sophisticated transformations, and the availability and reliability safeguards associated with the extraction, transformation, and loading of data, can cost a customer beaucoup bucks.

There’s another consideration: data integration sophistication necessarily entails some degree of complexity with respect to the process of tying everything together. This often translates into a requirement for specific human expertise to design, implement, and oversee an enterprise ETL deployment. Contrast this with a script-driven ETL solution, which in most cases can take advantage of capabilities (e.g., Perl scripting) and resources (e.g., FTP connectivity) that are already present in most IT organizations.

More E and L Than T

The simple reality, of course, is that for many data-integration scenarios, the most important pieces of the ETL puzzle are in fact the E and the L—that is, extracting data from one standards-compliant data source and copying it, often with only simple transformations (or without any transformations at all), into another standards-compliant database. If you've a customer billing system that supports ODBC, for example, and if you want to load data from it into a SQL Server or DB2 data warehouse, your “ETL” process can consist of simply reading data from the source system and loading it into the relational database. That’s it.

In cases where source data must be reformatted, many such transformations are relatively simple—e.g., converting text to numeric, text to date, or merging two or more fields and concatenating names—and can be accomplished by means of scripting. For more complicated transformations (e.g., looking up data in another table and replacing it, splitting one field into many), many DBAs also opt to use scripting or programmatic SQL.

This is a point even many ETL vendors are willing to concede.

“People want to take advantage of what they've got, and most of them have some [form of ETL] already. It's more often not an ETL tool; it's hand coded,” says Darren Cunningham, product marketing manager with Business Objects SA who promotes the company’s Data Integrator ETL tool. “We hear from customers all the time who don't want to throw away those scripts they've already got. They're asking us to [support] these with our tool. They want to go along with what we're doing, but they don't want to rip out what they're doing.”

You may recall that Cognos Inc. acquired its own ETL tool, from the former DecisionStream, several years ago. Since then, of course, the BI giant has done little, if anything, to promote that offering. According to Karen Williams, vice-president of marketing for enterprise BI with Cognos, there’s a reason for that: Cognos positions DecisionStream as a value-add to its core BI offerings. “We’re not out there promoting DecisionStream as an ETL tool, but we are using it in a lot of accounts where they don’t already have preferred providers,” she says, noting that DecisionStream features connectors into SAP R/3 and other packaged ERP applications.

Even then, Williams acknowledges, DecisionStream is used primarily in conjunction with the Cognos BI platform. “If they don’t already have a preferred [ETL] provider, they probably have some custom [ETL] solution, and, yes, part of that is probably using scripting or some other approach.”

Next week, we’ll take a look at several reasons why the market may be coming to enterprise ETL and data integration.

About the Author

Stephen Swoyer is a Nashville, TN-based freelance journalist who writes about technology.

Must Read Articles