In-Depth

Advanced Analytics on a Budget, R-Style

The draw of the open source R statistics environment, according to vendors, is cost -- it's simply cheaper than premium offerings.

In statistical circles, "R" is the name of an open source programming language for statistical analysis. These days, it might also be shorthand for "rock star."

Several established business intelligence (BI) and data warehousing (DW) powers say they've adopted R as an increasingly compelling way to perform advanced analytics at lower cost. A number of upstarts field R-based advanced analytic products, too.

The draw, according to vendors, is cost: the open source R environment is less expensive than premium offerings from IBM Corp. (which purchased the former SPSS Inc. last July) or SAS Institute Inc.

However, even its proponents concede that R isn't nearly as polished. "It's still evolving. We have to be cognizant of that, and we can't try to tackle too much at once," says Jeff Erhardt, chief operating officer with analytic newcomer Revolution Analytics, which -- until May of this year -- called itself Revolution Computing.

From a technological perspective, Erhardt argues, R offers a full-fledged environment for statistical analysis. It's right up there with SAS and SPSS, he maintains. In terms of polish -- particularly with respect to usability, self-service, collaboration, connectivity, and other enterprise-oriented features -- Erhardt concedes that Revolution Analytics has had to work to smooth out R's rough edges.

He cites his company's ongoing effort (which he positions as part of a phased road map) to replace its existing, Visual Studio-oriented R GUI with an improved, thin-client user interface. "The idea is to bring a modern GUI to R that's not static or not fixed. The key … is to bridge the gap between R's power and its attractiveness to these expert users and the need for these expert users to deploy their models or their packages to non-expert audiences," Erhardt says.

The GUI is Key

Over the last decade, analytic giants SAS and SPSS labored to retrofit their existing offerings -- which used to be associated more with statistics junkies than with iPhone-toting sales or marketing managers -- for use by "lower-level" users. This wasn't an overnight effort, however: with each subsequent revision of their analytic offerings, both companies would tout additional usability features (including a bigger focus on self-service) or usability-oriented refinements.

It perhaps isn't surprising, then, that one of the more polished R GUIs is marketed by an existing BI vendor: Information Builders Inc. (IBI), which integrates R with its WebFOCUS reporting environment.

Michael Corcoran, IBI's chief marketing officer, cites what he describes as "a really surprising level of interest" in R among Information Builders' customers.

He points to support for the language from several other BI and DW players -- including both Netezza Inc. and JasperSoft Inc. -- which likewise tap R as a predictive analytic engine for their bread-and-butter BI or DW products.

IBI dubs its R entry RStat, which sports a WebFOCUS-like front-end GUI for R. It's a no-brainer value-add, according to Corcoran. Although the R environment itself is free, it's still a (mostly) command-line-only proposition. R GUIs exist, he contends, but -- from the perspective of business users (or lower-level analysts) -- most aren't ready for prime time.

Its technology is strong, Corcoran argues; even established heavyweights such as SAS now support R in their own analytic offerings. What's more, Corcoran points out, the R community has developed thousands of application- or industry-specific models.

It has the analytic features, he claims; it was just missing the sheen.

"The R technology is so well adopted, it's almost a no-brainer for customers," said Corcoran, in an interview this spring. "If you look at it, SAS now integrates with R … because they have to. … It's not just universities anymore that are using [R]; it's financial services, it's retail. Where a few years ago [R] might have been at a disadvantage [relative] to SAS because it's in memory, now that's an advantage because you have these systems with gigabytes of memory."

New Wave R

If IBI is an example of an old-guard vendor that's embraced R, Revolution Analytics is part of a post-R new wave. It was conceived precisely as an effort to develop a business-ready (or business-safe) version of R.

That vision is still very much a work in progress. Right now, Revolution Analytics markets Revolution R 3.0, which grafts a bare-bones GUI (based on Microsoft Corp.'s Visual Studio IDE) onto its implementation of R.

As Erhardt explained, Revolution Analytics is currently working on its next-gen GUI; that offering isn't expected to be available until the end of this year at the earliest, however. By late summer, Revolution expects to deliver new "Big Data" (able to analyze petabytes) and "Web Services" features. Both were developed with business users in mind, Erhardt maintains. "Big Data … is a framework to allow the analysis of arbitrarily large -- petabyte-sized data sets -- at speeds that are orders of magnitude faster than some of the alternatives," he explains. "The Web Services framework [is an] engine that's basically a high-performance platform for delivering R over the Web or into any client that can consume Web services," he continues. "The end client doesn't have to be just a Web browser, it can be things like our GUI, but it can also be things like Excel or a business intelligence tool."

Erhardt says R has a number of benefits in addition to its lower price. For example, most college students take one or more courses in statistics before they graduate; increasingly, more of these students are using R instead of SAS or SPSS. As a result, there's a built-in (and expanding) R knowledge base among new and recent hires, he says.

Like IBI's Corcoran, Erhardt cites the availability of thousands of industry- or domain-specific, R-ready models. On top of this, he observes, R has both a thriving user community and a burgeoning partner community.

At the same time, Erhardt concedes, most customers have been attracted to R because of its low price.

"We will be a fraction of the pricing of those guys. We don't want to have a public pledge [concerning our pricing] per se, but we understand and we hear day in and day out that they are dying under these maintenance contracts, that the prices keep going up year after year for very little incremental capability," he comments. "We don't have any false expectations that we're going to come in and compete with these guys, replace them, etc., but we'll be a complementer."

Is price enough, particularly with a Revolution R offering that (as far as fit and polish are concerned) is still a work in progress? Erhardt thinks so.

"We don't have visions of doing a rip and replace of the major existing providers. It's just not realistic. Our pitch to [customers] is: 'Look, do the next project with us and avoid the next million-dollar bill.' You can do an incremental [project-by-project approach] with us as a way to get your foot in the door and at the same time start developing the ability to move in this direction without taking a million-dollar risk," he says.

"It's also about innovation. With all of the growth in R, [companies are] starting to see their peers use R and … they're also getting R creeping into their organizations by virtue of these recent grads who are using it," Erhardt continues. "Our pitch is to come in and say, 'Hey, the world and business is going in this direction, whether you like it or not. We have a solid product built on this base-level of R, which is so well accepted, and over the next six months we'll be exponentially expanding our capabilities.'"
