In-Depth

Talend's Data Integration Power Play

Open source data integration specialist Talend markets an intriguing alternative to proprietary solutions, and for some uses, it's free.

Although plenty of data integration vendors have been subsumed by larger vendors, that doesn't spell the end of the small DI vendor. In fact, many upstart companies are working hard to keep things interesting.

One such vendor is open source data integration specialist Talend, which -- with more than 250 adapters into relational databases, enterprise applications, legacy systems, and other data sources -- claims to match or exceed the breadth of connectivity touted by market leaders Informatica Corp., SAS Institute Inc., and IBM Corp. Because Talend is all open source, it easily beats the best-of-breeds on pricing, officials claim.

"As far as connectivity, we have 250 native connectors, which means we will connect to pretty much all databases natively. [This gives us] connectivity to all of the big proprietary [databases] as well as the smaller ones, [along with] connectivity into all major ERP and CRM systems," says Yves de Montcheuil, vice-president of marketing with Talend. That's not counting the breadth of adapter support in the open source community, which Talend's DI platform can also consume -- with a little customization, of course.

"The fact that we're open source means that we can leverage what's already out there. In the Perl community alone there are 38,000 connectors. These [connectors] will have to be repurposed slightly to be used with Talend -- they will need to be repackaged to [use] the APIs that we provide -- but if you need it, you can get it to work."

Talend comes in two flavors: Talend Open Studio, which it distributes as a free, single-user DI environment; and Talend Enterprise Suite, a pay-for-use version of the Open Studio product. Open Studio, obviously, is limited to single-user applications, even though it bundles all of the connectivity features that ship with the brawnier (with no user restriction) Enterprise Suite. Like a competitive offering from Evolutionary Technology International, or ETI (see http://www.tdwi.org/News/display.aspx?ID=8958), the Talend DI platform produces extracted, cleansed, and transformed data in the form of a Java binary. In this respect, one of ETI's biggest value-adds -- the fact that its DI executables boast an extra level of data security (by virtue of their integrated checksums) -- is also one of Talend's strongest selling points.

"That's one of the big advantages of a [code] generation program: you've just got a job file and a batch execution [process]. You can run it from the UI if you want, and you can embed the scheduling and all of that stuff in it." It also makes it easier for Talend to fit into SOA environments, Montcheuil argues. "We can design data integration processes and then expose them as services, [so they're] executing on the fly whenever you want to get the data."

Its open source underpinnings are a boon to Talend, particularly with respect to pricing, but pricing alone rarely determines when Talend wins deals, Montcheuil maintains. "There are always many reasons why we win a deal. Price is always a factor, but people also consider our connectivity, our ease of use, our flexibility. The fact that we're open source doesn't mean that we're free; it just means that we're less expensive. With us, you know what you're going to get in the long run. Down the road you're not going to have to buy additional per-CPU licenses from us, [nor will you have] to buy additional connectors from us."

IBM, Informatica, Oracle, SAS, and other players are proud of the collaborative capabilities they're building into their DI platforms. Collaboration is key, these vendors argue, because business analysts and other power users are demanding to have a voice in the data integration process. Talend delivers the goods in this regard, too, claims Montcheuil, thanks to a graphical modeling environment (namely, Talend Business Modeler) that's designed for business analysts, in addition to vanilla DI facilities for integration architects.

"Actually, people tell us they don't find that kind of [collaborative] stuff in the proprietary tools. With [Business Modeler], we make it so that business analysts can design the workflows without delving into the technical details. The end result is [Java] code that runs on any platform."

Montcheuil himself is a Sunopsis veteran, so he knows a thing or two about ELT -- or extract, load, and transform -- which Sunopsis arguably did better than any other vendor. In Montcheuil's admittedly biased opinion, Talend's ELT capability -- which customers can use interchangeably with its vanilla ETL facility -- compares favorably to the Sunopsis technology.

"Talend has built-in ETL and ELT, so you can mix and match [either technology] within a process," he comments. There are scenarios for which each technology is optimal, he notes. "If you need to get data from flat files you want to do ETL; if you need to do a massive update in a data warehouse with lookups on all dimensions, then obviously ETL is going to be overkill. ELT would be the better choice."

Talend is just one of a bevy of interesting new (and not-so-new) data integration or data warehousing players. Others include Expressor Software, Illuminate, Software Labs, and DataWatch (steward of the venerable Monarch DataPump). We'll take a look at each of these vendors in future updates.

About the Author

Stephen Swoyer is a Nashville, TN-based freelance journalist who writes about technology.
