Vertica Appliance Puts Data Warehousing in the Cloud

With turnkey deployability and enterprise scalability, officials say Vertica-on-the-Cloud is a quick, painless, and inexpensive way to get into data warehousing

One of the most fascinating aspects of the data warehousing (DW) appliance segment is its sheer diversity, which makes a vendor's ability to distinguish itself vital. Vertica recently explained what gives it the edge: its ability to run on the distributed computing "cloud."

The "cloud" is one of computing's hottest trends. The term refers to a way of distributing computational power and delivering it on demand. Vertica co-founder Andy Palmer stresses that on-demand data warehousing doesn't get much easier -- or more scalable -- than Vertica running on top of Amazon's EC2 service.

"The way that Vertica was designed was to operate natively on this grid kind of computing architecture. For us, our customers were looking at various deployment options. One of the things that was really incredible to us was how seamless it was to operate on the commercial offerings that have become available. We've been working on Amazon for quite a while, and what we found was that the way Vertica was fundamentally designed was to run on a cloud, so we see very little difference in running on the cloud versus running [on-premises] in an enterprise data center."

Back in the day, "grid computing" was a comparatively tough sell, at least for many bread-and-butter enterprise applications, where it was sometimes dismissed as a really cool technology solution in search of a market.

Cloud computing is a different proposition, Palmer says, in part because it is premised on a service-centric pitch rather than a technology-centric one: EC2, for example, is just one of several Web services that Amazon markets under its Amazon Web Services (AWS) umbrella. The salient point, Palmer maintains, is that Vertica's EC2 offering isn't a solution in search of a market but a service that Vertica developed specifically in response to market demand.

"This product actually started as a need within Vertica. We didn't originally go into this saying, 'Let's put Vertica on the cloud.' We were just running out of our own IT and hardware capacities, especially running proofs-of-concept, so we started basically booting up Amazon nodes [to do evaluations or proofs-of-concept]. Customers were coming back and saying, 'You know, if we can do evals on the [EC2] cloud, can't we just keep some of them running?'"

In other words, he says, there are situations -- frequently involving "auxiliary" or non-mission-critical data -- where customers don't want to bring certain kinds of information into their enterprise data warehouses, but nonetheless need to query and analyze it.

"The type of data that people are coming to us with is not necessarily stuff that you'd find in the enterprise data warehouse. It's auxiliary data. It might be public data -- for example, data that's published by financial services firms. It might be data that [a customer] purchased from some other provider. In most cases, the time [in which] they have to analyze it is very small, so they need to be able to quickly bring it up and run their analytics against it."

In this sense, Palmer maintains, Vertica-on-the-Cloud is a completely customer-driven service offering.

"That's really what prompted us to do it. We were getting so much interest [from customers], so we figured, instead of doing it [on an] ad hoc [basis] and putting some scripts together to boot up a few instances, why not offer it as a [pre-packaged] service?" he explains.

"Nowadays, we're seeing the majority of our proofs-of-concept [deployed] on the [EC2] cloud." In some cases, he concedes, that's led to customers opting for cloud-based (as opposed to on-premises) Vertica deployments. "They do the proof-of-concept on the cloud, and then when they decide that they're going to purchase [Vertica], many of them kind of look at each other and say, 'Do we really need to go and buy all of this hardware?' Some of them say, 'Yes, absolutely,' but there are a lot of them -- a surprising number, actually -- that look at each other and shake their heads [no]."

Deploying Vertica on EC2 is relatively straightforward, Palmer says: customers sign up at Vertica's Web site. Rates range from $2,000 per half-TB of data (which runs on a single EC2 node) to $4,000 per TB (which runs on three EC2 nodes). Vertica's EC2 interface can accommodate any BI tool, database, application, or service consumer that supports SQL or the Java Message Service (JMS), he asserts.
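For readers who want to see how the two quoted price points relate, the following is a rough sketch only, not Vertica's actual rate card or sign-up logic. It assumes each quoted figure is a cap (any data set up to that size fits the tier), which the article does not confirm. The names `Tier`, `QUOTED_TIERS`, and `pick_tier` are hypothetical, and because the article does not tie the figures to a billing period, none is assumed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    max_tb: float    # largest data set the tier is assumed to cover, in TB
    ec2_nodes: int   # EC2 nodes the article says the tier runs on
    price_usd: int   # quoted rate in US dollars


# The only two price points quoted in the article; nothing beyond them is assumed.
QUOTED_TIERS = (
    Tier(max_tb=0.5, ec2_nodes=1, price_usd=2000),
    Tier(max_tb=1.0, ec2_nodes=3, price_usd=4000),
)


def pick_tier(data_tb: float) -> Tier:
    """Return the smallest quoted tier that covers data_tb (hypothetical helper)."""
    if data_tb <= 0:
        raise ValueError("data size must be positive")
    for tier in QUOTED_TIERS:
        if data_tb <= tier.max_tb:
            return tier
    # The article quotes no rates above 1 TB, so we refuse to guess.
    raise ValueError("no quoted rate for data sets larger than 1 TB")


if __name__ == "__main__":
    tier = pick_tier(0.75)
    print(f"{tier.ec2_nodes} EC2 nodes at ${tier.price_usd:,}")
```

Under these assumptions, a 0.75 TB data set would land in the three-node tier. Note also that the quoted rate per TB is the same at both points ($2,000 for half a terabyte, $4,000 for one), so the larger tier adds nodes without changing the per-TB figure.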

In this respect, Vertica's on-demand analytic offering differs from full-fledged on-demand analytic services marketed by Visual Mining Inc. and other SaaS players: it doesn't expose a GUI of any kind. This might hurt it, Palmer concedes, but there are certainly cases in which it helps. "There are a bunch of these sort of host-it analytic companies out there now, where they sort of deliver whatever GUI they want you to use, and we've seen a lot of demands [from customers] saying, 'I have my data, I have the tools I want to use, I just want a really screaming-fast database that doesn't cost a lot of money.'"

More to the point, Palmer maintains, Vertica-on-the-Cloud provides a quick, painless, and comparatively inexpensive way for DW neophytes to get into data warehousing.

"There are no big upfront costs that you have to incur in order to try it out. We have seen quite a lot of interest from the market in [what we're calling] Analytic SaaS. You know, from people who develop an expertise in doing analysis, they host the hardware somewhere [right now] and this really gives them the next level of offering. It's on-demand, there's no commitment, no upfront cost, and they get a really fast database," he points out.

"It's immediately available. It's [billable] on a monthly basis, so you can disconnect it when or if you want to."