Teradata Will Offer Cloudera Option for Hadoop Appliance

Teradata said users of its Big Data analytics appliance will soon be able to use Cloudera Inc.'s Hadoop distribution, which joins the current Hortonworks Data Platform 2.3 option.

Cloudera Enterprise 5.4 will be added as an option for the new Teradata Appliance for Hadoop 5 sometime in the third quarter of this year.

Teradata said its Hadoop appliance is the first in the industry to be configurable, allowing customization to meet enterprise requirements for analytic performance or data capacity. "Running Hadoop on an appliance offers significant benefits, but as Hadoop workloads become more sophisticated, so too must the appliance," said company executive Chris Twogood in a recent blog post. "Our new appliance has evolved alongside Hadoop usage scenarios while giving IT organizations more freedom of choice to run diverse workloads."

Three configuration options are available: performance, optimized for live number-crunching; capacity, for rarely accessed "cold" data; and balanced, a compromise between the two.

With the performance configuration, Teradata said, "The appliance can be optimized to run streaming applications like Spark, Storm, and SQL-Hadoop engines such as Presto and Impala. It is optimized for intensive computational workloads with more CPU and memory, and smaller storage disks. The appliance leverages Intel Core Processor (Haswell) technology that delivers significant performance and robust analytics."

The Teradata Appliance for Hadoop 5 (source: Teradata)

The capacity configuration is aimed at enterprises that want to store large volumes of infrequently accessed data, for archiving or other purposes. It uses lower-cost storage and provides enough computing horsepower for workloads such as long-running extract, transform and load (ETL) jobs and analytics.

The balanced configuration sits between those two options, offering a compromise between performance and capacity. "It provides cost savings with high capacity drives and is well-suited for ETL and analytics that are more CPU intensive with less demanding I/O requirements," the company said.

The appliance also includes an optimized version of SUSE Linux 11, connectors to facilitate high-speed data transfer, and a 40 Gb/s InfiniBand BYNET V5 network. Each node in an appliance is supplied with 4 TB hard disk drives.

Teradata said the appliance works with the company's Unified Data Architecture (UDA) framework, which is designed to let enterprises work with all types of data across multiple Teradata systems.

"We are seeing increasing interest in appliance-based deployments of Hadoop as more large enterprises adopt appliances to reduce the total cost of ownership and increase availability," said Cloudera CEO Tom Reilly. "We are excited that Cloudera Enterprise will be offered on a Teradata appliance with its advanced hardware engineering and world class support. Cloudera Enterprise 5.4 reflects critical investments in a production-ready customer experience through governance, security and performance. It also includes support for a significant number of updated open-standards components -- including Apache Spark 1.3, Impala 2.2 and Apache HBase 1.0."

About the Author

David Ramel is the editor of Visual Studio Magazine.