
Enterprises Increase Use of Apache Kafka for Streaming Big Data Analytics

A brand-new survey shows enterprises are increasingly using open source Apache Kafka technology for real-time, streaming Big Data analytics.

The study was published today during the first-ever Kafka Summit by Confluent Inc.

The company, which is hosting the summit, was founded by the technology's creators, who developed Kafka while working at LinkedIn. It provides the Kafka Connect tool.

Apache Kafka was open sourced in 2011 and is now available on GitHub.

Enjoying a dramatic rise in popularity, Kafka is a multi-faceted tool that does many things, including facilitating the real-time processing of streaming Big Data. Or, as Confluent says:

Apache Kafka is an open source technology that acts as a real-time, fault tolerant, highly scalable messaging system. It is widely adopted for use cases ranging from collecting user activity data, logs, application metrics, stock ticker data and device instrumentation. Its key strength is its ability to make high volume data available as a real-time stream for consumption in systems with very different requirements -- from batch systems like Hadoop, to real-time systems that require low-latency access, to stream processing engines that transform the data streams as they arrive. This infrastructure lets you build around a single central nervous system transmitting messages to all the different systems and applications within a company.
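To make the "single stream, many consumers" idea concrete, here is a minimal in-memory sketch in Python. It is a toy model only, not Kafka's implementation or client API: the `TopicLog` class, its methods and the consumer names are hypothetical, invented for illustration. The assumption it illustrates is that records sit in one append-only log while each consumer tracks its own read position, so a low-latency reader and a batch loader can consume the same data at different paces.

```python
from collections import defaultdict


class TopicLog:
    """Toy append-only log (hypothetical; not the Kafka API)."""

    def __init__(self):
        self._records = []
        # Each consumer name maps to the next offset it will read.
        self._offsets = defaultdict(int)

    def append(self, record):
        """Add a record to the end of the log and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, consumer, max_records=10):
        """Return up to max_records unread records for this consumer."""
        start = self._offsets[consumer]
        batch = self._records[start:start + max_records]
        self._offsets[consumer] = start + len(batch)
        return batch


log = TopicLog()
for event in ["page_view", "click", "purchase"]:
    log.append(event)

# A low-latency consumer reads one record at a time.
realtime = log.poll("realtime-dashboard", max_records=1)

# A batch consumer reads everything available, unaffected by the other reader.
batch = log.poll("batch-loader")

# The low-latency consumer resumes from where it left off.
realtime_rest = log.poll("realtime-dashboard")
```

Because reading does not remove records from the log, adding another consuming system in this sketch requires no change to the producer side, which is the loose coupling the description points to.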

However you describe it, it's becoming increasingly indispensable to enterprise analytics, according to the new study. "Eighty eight percent of the survey respondents indicated that Kafka would be a mission-critical part of their data and application infrastructure by 2017," Confluent said in announcing survey results. The company earlier this month commissioned Researchscape International to conduct the survey, which polled more than 100 Kafka users worldwide coming from 16 different industries. Most respondents were developers or architects.

Figure: Data Sources (source: Confluent)

"We see more and more organizations embracing real-time data, and stream processing is at the heart of that shift as it enables them to understand their data the instant it arrives," said Jay Kreps, one of Kafka's co-creators and the CEO and co-founder of Confluent. "Kafka has become increasingly popular as a central platform for managing this stream data because it lets adopters create new products and services to provide added value for users and their customers."

Stream processing is the most common use of the technology, reported by 72 percent of respondents, and 68 percent expect that use case to gain traction in the next year. Other use cases include messaging, data integration and log aggregation, each reported by 52 to 57 percent of respondents.

Applications powered by Kafka include:

  • Application monitoring (60 percent).
  • Data warehousing (51 percent).
  • Asynchronous applications (47 percent).
  • System monitoring (39 percent).
  • Recommendation/decisioning engines (35 percent).
  • Customer preferences (27 percent).
  • Security/fraud detection (26 percent).
  • Internet of Things (IoT) applications (20 percent).
  • Communications systems (16 percent).
  • Dynamic pricing applications (12 percent).

Respondents reported several benefits from using Kafka, including:

  • 67 percent said "Kafka helps our applications work together in a loosely coupled manner."
  • 59 percent said "We use Kafka as underlying data infrastructure for stream processing."
  • 58 percent noted the "improved scalability of applications" Kafka brings.
  • 51 percent said "High volumes of data are now available in real-time -- we've been able to move beyond batch processing."

As for challenges in developing with Kafka, there was no strong consensus beyond a lack of talent: 19 percent of respondents reported "lack of developers with Kafka skills" and 16 percent reported "access to qualified technical support for developing and operating Kafka." The next-highest category (except for "other") was "no significant challenges."

Figure: Kafka Satisfaction (source: Confluent)

To address that skills shortage, 65 percent of responding organizations plan to hire employees with Kafka skills in the next year. The largest group among the projected hirers (36 percent) indicated they will hire two to three employees. Some 10 percent will reportedly hire more than six employees.

"We're overwhelmed with the response we've received from the Kafka community -- their feedback is instrumental to how we approach the development of new capabilities that will meet evolving demands and use cases," said Neha Narkhede, co-founder and CTO of Confluent and one of Kafka's co-creators. "As Kafka Summit begins, we're excited for members of the community to have the chance to share their stories and learn from one another as we work together to ensure that Kafka reaches its full potential as core infrastructure technology."

About the Author

David Ramel is an editor and writer for Converge360.
