Q&A: Akka 101
What makes Akka so valuable for building today’s distributed applications?
What makes Akka so valuable for building today’s distributed applications? How is it different from other middleware technologies, and where is it headed? To understand the basics of this middleware toolkit and runtime solution, we spoke with Jonas Bonér, the co-founder of Typesafe and creator of Akka. Jonas has also been an active contributor to open source projects including the AspectWerkz Aspect-Oriented Programming (AOP) framework and the Eclipse AspectJ project.
Enterprise Strategies: What is Akka and what problems does it aim to solve?
Jonas Bonér: Akka is an event-driven middleware toolkit and runtime for building high-performance, scalable, and fault-tolerant distributed applications. It is implemented in Scala but has APIs for both Java and Scala. Akka decouples business logic from low-level mechanisms such as threads, locks, and non-blocking IO and makes it easier to write applications that scale both up and out.
How is Akka different from some of the other middleware technologies currently available?
In my opinion, most middleware products available today have two major problems. First, they try to shield the programmer from the network by providing different levels of "transparent" distributed computing using RPC, distributed objects, etc. Second, they try to emulate consistency and atomicity on top of the network using 2PC, XA, Paxos, etc. These two problems make systems very hard to scale and reduce their resilience to failure.
Akka takes a very different approach by embracing distributed computing at its core and making the essence of distributed computing -- asynchronous message passing and eventual consistency -- first-class constructs using Actors. This explicit programming model makes it easier to understand what is going on and easier to reason about things, because you get a system that is not trying to pretend to be something it is not. This also allows you to create very dynamic, elastic, loosely-coupled, event-driven systems with resilience to failure built in.
How does it help programmers of high-performance computing systems?
Akka provides a rich toolkit for addressing concurrency, scalability, and high-availability concerns. Akka's Actors are extremely lightweight (you can easily create millions of Actors in a single application), reactive, and event-driven processes with a dedicated mailbox (message queue). They communicate using asynchronous message passing and are managed by a scheduler that runs the Actors on a shared thread pool.
Actors consume no resources other than memory while waiting for new messages, and the fully asynchronous paradigm enables the creation of applications that never block threads needlessly. This means that Akka fully utilizes your hardware out of the box, and your Akka application is, by definition, highly concurrent and parallelizable. If you outgrow a single box, you can easily stretch your application to span multiple boxes, all through simple configuration.
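The mechanics described above -- a per-actor mailbox drained by a shared thread pool, so an idle actor holds no thread -- can be sketched in plain Scala. This is a conceptual toy under stated assumptions, not the Akka API; the names `MiniActor` and `Counter` are invented for this example.

```scala
import java.util.concurrent.{ConcurrentLinkedQueue, CountDownLatch, Executors}
import java.util.concurrent.atomic.AtomicBoolean

// MiniActor (hypothetical, not the Akka API): each actor owns a mailbox,
// but borrows a thread from the shared pool only while it has messages.
abstract class MiniActor(pool: java.util.concurrent.ExecutorService) {
  private val mailbox   = new ConcurrentLinkedQueue[Any]()
  private val scheduled = new AtomicBoolean(false)

  def receive(msg: Any): Unit // the actor's behavior

  // Asynchronous send: enqueue, then make sure someone drains the mailbox.
  final def !(msg: Any): Unit = { mailbox.add(msg); trySchedule() }

  // At most one pool task drains this mailbox at a time, preserving the
  // one-message-at-a-time guarantee of the actor model.
  private def trySchedule(): Unit =
    if (!mailbox.isEmpty && scheduled.compareAndSet(false, true))
      pool.execute(() => {
        var m = mailbox.poll()
        while (m != null) { receive(m); m = mailbox.poll() }
        scheduled.set(false)
        trySchedule() // a message may have raced in after the last poll
      })
}

val pool = Executors.newFixedThreadPool(2)
val done = new CountDownLatch(100)

// A counter actor: its mutable state is touched only from inside receive.
class Counter extends MiniActor(pool) {
  var total = 0
  def receive(msg: Any): Unit = msg match {
    case n: Int => total += n; done.countDown()
    case _      => ()
  }
}

val counter = new Counter
(1 to 100).foreach(counter ! _) // 100 asynchronous sends
done.await()                    // wait until all messages are processed
println(counter.total)          // 5050
pool.shutdown()
```

The key property the sketch illustrates is that sending never blocks the sender, and the `Counter` consumes no thread between messages.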
In addition to its Actors, Akka has a whole suite of other concurrency tools such as Futures, Dataflow, Agents, STM, Transactors, and much more. All these tools raise the abstraction level and make it easier to write, understand, and maintain concurrent, scalable, fault-tolerant applications. Instead of messing around with very low-level constructs, you think in terms of higher-level concepts such as message flows and transactions.
What is usually solved by use of low-level plumbing in standard enterprise applications becomes workflow in Akka. You start to think about how the data flows in the systems rather than how to get the concurrency and scalability exactly right.
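The Futures mentioned above are a good example of that raised abstraction level; they were later folded into Scala's standard library as `scala.concurrent.Future`. A small sketch of composing two independent asynchronous computations without touching a thread or lock directly (the values and the price/quantity framing are invented for illustration):

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

// Two independent computations start concurrently on the execution context.
val price    = Future { 42 } // e.g. a remote price lookup
val quantity = Future { 10 } // e.g. a database read

// Compose them declaratively; no explicit thread management anywhere.
val total: Future[Int] = for {
  p <- price
  q <- quantity
} yield p * q

// Block only at the outermost edge (e.g. in a test); real code would
// register a callback or pipe the result to an Actor instead.
val result = Await.result(total, 5.seconds)
println(result) // 420
```

Note that you describe the flow of data (a price times a quantity) rather than the mechanics of getting the concurrency right.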
Are there any drawbacks to using Akka?
One potential drawback is that not all developers are used to programming using asynchronous message passing -- i.e., without a call stack -- which can take a while to get used to. Another potential drawback is that programming with Akka is all-in in terms of concurrency. You can only communicate with an Actor by sending it a message, so even though it is nice that you get concurrency "for free" as part of the model, you also get it everywhere, which is something that you might not always want.
Also, testing concurrent systems is much harder than testing sequential procedures, though Akka’s TestKit leverages the power of the actor model to make this problem much more tractable than for lock-based concurrency.
What is the “let it crash” model?
One problem in Java is that you are only given a single thread of control, so if this thread crashes with an exception, you are in trouble. This means that you need to make all error handling explicit within this single thread. To make things worse, exceptions do not propagate between threads, so there is no way of even finding out that something has failed. This leads to defensive programming, with error handling tangled with business logic scattered all over the application. We can do better than this.
Instead, in Akka you architect your application in what we call “supervisor hierarchies,” where each Actor is a (lightweight) thread of control. For example, when you create an Actor C (for child) from within an Actor P (for parent), P becomes the parent and supervisor of C. This means that if Actor C fails with an exception, Actor P will be notified and can take action, such as restarting Actor C (i.e., the exceptions are propagated between threads). What this gives you is a tree of supervised Actors, a so-called supervisor hierarchy.
This naturally yields a non-defensive way of programming: when an Actor fails, instead of trying to trap the error and recover within the thread of control, the Actor simply dies and lets someone else -- the supervisor -- deal with the error, a fail-fast approach that is also called "let it crash."
If a supervisor can’t recover from an error, it will escalate the error by propagating it up the hierarchy, and hopefully an Actor higher in the hierarchy will be configured to deal with the error. If not, then the whole application will fail. What you get is a system that can heal itself at runtime because failure has become a valid and well-supported state of each computational unit.
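The restart decision a supervisor makes can be sketched in a few lines of plain Scala. This shows only the decision logic, not Akka's supervisor API; `supervise`, `maxRestarts`, and the flaky child below are invented for illustration.

```scala
import scala.util.{Failure, Success, Try}

// A supervisor: run the child; on failure, restart it from scratch,
// up to maxRestarts times, then give up (escalate in real Akka).
def supervise[A](maxRestarts: Int)(child: () => A): Either[Throwable, A] = {
  var attempt = 0
  var last: Either[Throwable, A] = Left(new IllegalStateException("never ran"))
  while (attempt <= maxRestarts && last.isLeft) {
    last = Try(child()) match {
      case Success(v) => Right(v)
      case Failure(e) => attempt += 1; Left(e) // crash: let it, then restart
    }
  }
  last
}

// A flaky child: it fails twice, then succeeds. Note that its body
// contains no error handling at all -- that is the supervisor's job.
var calls = 0
val result = supervise(maxRestarts = 3) { () =>
  calls += 1
  if (calls < 3) throw new RuntimeException(s"crash #$calls")
  "recovered"
}
println(result) // Right(recovered)
```

The business logic (the child) stays free of error handling, while the recovery policy lives entirely in the supervisor -- the separation that "let it crash" is about.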
Where is Akka headed in, say, the next year or two?
With Akka 2, we have reached a solid foundation upon which we have started building layers of more advanced functionality.
For example, what we are working on right now -- clustering -- will give our users an even greater degree of fault tolerance, dynamism, and elasticity. It provides cluster membership in which nodes can join and leave the cluster at any point in time, with the cluster automatically repartitioning and rebalancing the system according to the number of nodes available at that moment. It will also support recovery from node crashes by failing over the Actors that resided on the crashed node to a new node, forwarding all messages that were sent during the migration, and finally redirecting all references to those Actors to point to the new location.
After that, we will work on things like Actor state persistence, built-in support for Event Sourcing/CQRS, support for compute grid, and real-time data analytics.
We are also putting a lot of effort into the Typesafe Console, our monitoring and management solution, and will add things like Actor and cluster visualization as well as visualization of how messages flow between Actors, across nodes, etc. We will also work on feeding all these live statistics back into the running system to allow it to make more intelligent decisions. We are not running out of interesting and useful things to work on and explore.