Cloudera Offers Docker Container to Get Started with Hadoop

To provide another simplified onramp for getting started with enterprise Apache Hadoop solutions, Cloudera Inc. is now offering a Docker container image.

Cloudera, one of the leading distributors of Hadoop-based software, is adding the Docker image, currently a beta, to its existing solutions that help companies explore or test the technology. These include a QuickStart VM (virtual machine) and the Cloudera Live demo cluster running on the Amazon Web Services Inc. (AWS) cloud.

"Docker is different from other platforms you may have used: it works with Linux containers," Cloudera's Sean Mackrory said in a Cloudera Engineering blog post this week announcing the new product. "While 'virtual machine' software typically simulates or isolates access to hardware so a guest operating system can run, a 'container' is really just a partition of the host operating system. Each container has its own view of the filesystem and its own set of resources, but it's really running on the same Linux kernel as the rest of the system. This approach is similar to that of BSD jails or Solaris zones."

Cloudera said it's providing the Docker image as an alternative QuickStart solution to complement its successful QuickStart VM. The VM was originally developed as a demo environment but evolved into a more general-purpose environment for developers and others, offering:

  • A way to ramp up on and learn new CDH features and components.
  • An easy-to-deploy Hadoop training environment for newcomers.
  • An appliance for continuous integration/API testing.
  • A sandbox to prototype new ideas and applications.
  • A platform for demonstrating your own software product.

While the QuickStart VM has been available for virtualization platforms such as VMware and VirtualBox, and as a disk image usable by KVM and others, Docker has emerged as an alternative to traditional VM images.

"Therefore today, we're pleased to announce the availability of a Cloudera QuickStart Docker image!" Mackrory said. "If you or your organization is using Docker, this image may provide the ideal lightweight, disposable environment for learning and exploring new technology, playing with new ideas, and for doing continuous integration before testing at scale. (However, Cloudera recommends using a more realistic test environment before moving to production.)"

The image and accompanying documentation are available on the Docker Hub. The Docker Hub cloud service, just upgraded in September, provides a registry for storing and sharing more than 200,000 Docker images, along with other services such as private repositories, automated builds and organizational collaboration tools.

"Just like the QuickStart VM, this Docker image (currently a beta) includes all of CDH, and you can optionally add-on the free edition of Cloudera Manager or even a 60-day trial of Cloudera Enterprise," Mackrory said. "Have Docker map port 80 to your host and hit it with your browser, and you'll also find an end-to-end tutorial with sample data included in the image."
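For readers who want to try the port mapping Mackrory describes, the following is a minimal sketch that assembles a `docker run` command publishing container port 80 on the host. The image name, hostname, privileged setting and startup script are assumptions based on the image's Docker Hub documentation rather than anything stated in the blog post quoted here, and the helper `quickstart_command` is hypothetical, so check the current documentation before relying on any of them.

```python
import shlex

# Assumptions, not taken from the article: the image name, hostname,
# privileged flag and startup script follow the image's Docker Hub
# documentation as recalled and may have changed. Verify before use.
IMAGE = "cloudera/quickstart"
STARTUP = "/usr/bin/docker-quickstart"


def quickstart_command(host_port=80, container_port=80):
    """Build a `docker run` argument list that publishes the tutorial port.

    `-p HOST:CONTAINER` is Docker's standard port-publishing option.
    """
    for port in (host_port, container_port):
        if not 0 < port < 65536:
            raise ValueError("ports must be in the range 1..65535")
    return [
        "docker", "run",
        "--hostname=quickstart.cloudera",  # assumed hostname
        "--privileged=true",               # assumed to be required
        "-t", "-i",                        # interactive terminal
        "-p", f"{host_port}:{container_port}",
        IMAGE, STARTUP,
    ]


if __name__ == "__main__":
    # Print the command for copy and paste instead of running it.
    print(shlex.join(quickstart_command()))
```

If port 80 is already in use on the host, calling `quickstart_command(host_port=8080)` yields a mapping of `8080:80`, and the tutorial would then be reachable at port 8080 in the browser.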

About the Author

David Ramel is the editor of Visual Studio Magazine.