Concurrency with Actors
The actor model of computing was introduced by Carl Hewitt in 1973. This model makes it easy to tackle the problem of concurrency and state management. The model treats an “actor” as a primitive unit of computing.
An actor is composed of three things: a mailbox, a behaviour and some state. I personally like to think of an actor as a small computer which can do only one thing at a time. An actor’s job is to process incoming messages in the order they arrive, one message at a time. Since an actor can receive messages faster than it can process them, it needs to maintain a queue of pending messages. This queue is called the “mailbox”. What an actor does with a message is called the “behaviour” of that actor. The behaviour of an actor doesn't have to be static: a message can ask the actor to change its behaviour for the next message. The “state” of an actor is its internal … well … state, which can only be modified by sending a message to that actor.
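To make the three parts concrete, here is a minimal sketch in plain Python with asyncio. It is only an illustration, not how any real actor library is built. The CounterActor class and the message names ("increment", "pause", "resume", "stop") are made up for this example.

```python
import asyncio


class CounterActor:
    """A toy actor: a mailbox, a current behaviour and some private state."""

    def __init__(self):
        self.mailbox = asyncio.Queue()   # messages wait here in arrival order
        self.behaviour = self.counting   # the handler used for the next message
        self.count = 0                   # state, only touched by the handlers
        self.ignored = []                # state, records messages dropped while paused

    def send(self, message):
        # Sending only enqueues the message; nothing is processed here.
        self.mailbox.put_nowait(message)

    async def run(self):
        # Process one message at a time, in the order they arrived.
        while True:
            message = await self.mailbox.get()
            if message == "stop":
                break
            self.behaviour(message)

    def counting(self, message):
        if message == "increment":
            self.count += 1
        elif message == "pause":
            self.behaviour = self.paused   # change behaviour for the next message

    def paused(self, message):
        if message == "resume":
            self.behaviour = self.counting
        else:
            self.ignored.append(message)


async def demo():
    actor = CounterActor()
    for message in ["increment", "pause", "increment", "resume", "increment", "stop"]:
        actor.send(message)
    await actor.run()
    return actor.count, actor.ignored


if __name__ == "__main__":
    print(asyncio.run(demo()))  # (2, ['increment'])
```

The second "increment" arrives while the actor is in its paused behaviour, so it is recorded as ignored instead of being counted.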
Actors are lightweight
Actors are lightweight abstractions on top of threads. We all know that threads are expensive. It is not possible to create millions of threads on a single machine with a typical hardware configuration. This is why we need lightweight abstractions over threads. Actors are cheap and you can create millions of them. This opens up new possibilities. For example, if you have a million entities in your system, you can create an actor for each entity and make that actor responsible for managing operations on that entity. An actor is sleeping by default and only wakes up and consumes CPU when it gets a message. So even if you spawn millions of actors, only the ones which are processing messages are awake and the rest of them are sleeping.
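A rough way to get a feel for this is to use asyncio coroutines as stand-ins for actors. Real actor runtimes schedule actors differently, so treat this as an analogy only. The entity_actor function and the "update" and "stop" messages are hypothetical names.

```python
import asyncio


async def entity_actor(entity_id, mailbox, processed):
    while True:
        # The coroutine is suspended here, using no CPU, until a message arrives.
        message = await mailbox.get()
        if message == "stop":
            return
        processed.append((entity_id, message))


async def demo(n_actors=10_000):
    processed = []
    mailboxes = [asyncio.Queue() for _ in range(n_actors)]
    tasks = [
        asyncio.create_task(entity_actor(i, mailboxes[i], processed))
        for i in range(n_actors)
    ]
    await asyncio.sleep(0)  # let every actor start and go to sleep on its empty mailbox

    # Only three of the actors receive work; the others stay asleep.
    for i in (7, 42, n_actors - 1):
        mailboxes[i].put_nowait("update")

    # Shut everything down so the demo terminates.
    for mailbox in mailboxes:
        mailbox.put_nowait("stop")
    await asyncio.gather(*tasks)
    return processed


if __name__ == "__main__":
    print(sorted(asyncio.run(demo())))
```

Ten thousand of these run comfortably in a single thread, which would not be true of ten thousand operating system threads on most machines.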
Actors are asynchronous & non-blocking
Unlike the usual “method invocation” interaction familiar to us, communication among actors can only be message based. Message based communication is asynchronous. When actor A sends a message to actor B, execution continues immediately after sending the message without waiting for any feedback or acknowledgement. Any acknowledgement or data from actor B has to come back as a message via actor A's mailbox. Meanwhile, actor A is either sleeping or processing other messages. Asynchronous communication is the key to building highly performant systems. Because actor A never sits blocked waiting for a reply, it uses its resources efficiently and can sustain a high load.
Sending a message to an actor does not mean or guarantee that the message has been processed by the other actor. It does not even mean that the message has been received by the other actor. It just means a message has been sent. So sending a message is very fast. Acknowledgements, however, can be necessary in some use cases. In those cases, the receiver can send an acknowledgement to the sender just like any other domain message.
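Here is a small sketch of fire-and-forget sending with an acknowledgement that travels back as an ordinary message. It again uses asyncio queues as mailboxes. The message shape (a dict with "type", "id" and "reply_to" keys) is an assumption made for this example, not a convention of any library.

```python
import asyncio


async def actor_b(mailbox):
    while True:
        message = await mailbox.get()
        if message["type"] == "stop":
            return
        await asyncio.sleep(0.01)  # pretend to do some work
        reply_to = message.get("reply_to")
        if reply_to is not None:
            # The acknowledgement is just another message in the sender's mailbox.
            reply_to.put_nowait({"type": "ack", "id": message["id"]})


async def demo():
    events = []
    a_mailbox = asyncio.Queue()
    b_mailbox = asyncio.Queue()
    b_task = asyncio.create_task(actor_b(b_mailbox))

    # Sending returns immediately; at this point B has not even looked at the message.
    b_mailbox.put_nowait({"type": "save", "id": 1, "reply_to": a_mailbox})
    events.append("sent")

    # A learns about the outcome only when the ack shows up in its own mailbox.
    ack = await a_mailbox.get()
    events.append(f"ack {ack['id']}")

    b_mailbox.put_nowait({"type": "stop"})
    await b_task
    return events


if __name__ == "__main__":
    print(asyncio.run(demo()))  # ['sent', 'ack 1']
```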
Actors allow thread safety
Thread safety for actor state is achieved by two properties of actors:
- The only way to read/write state of an actor is by sending a message to that actor
- An actor can process only one message at a time
This allows us to freely read and modify state without having to worry about thread synchronisation, race conditions and acquiring locks ourselves.
Letting you build concurrent & stateful systems is the biggest selling point of actors.
Only one message at a time… but concurrently
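The two properties above can be sketched with a thread-backed actor. This is a toy, assuming one dedicated thread per actor, which real actor libraries avoid. The CounterActor class and its tuple messages are hypothetical. Note that the locks have not disappeared: queue.Queue does its own locking internally, so the synchronisation lives in the mailbox instead of in our code.

```python
import queue
import threading


class CounterActor:
    def __init__(self):
        self._mailbox = queue.Queue()  # thread-safe queue acting as the mailbox
        self._count = 0                # state, only touched by the actor's own thread
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def send(self, message):
        self._mailbox.put(message)

    def join(self):
        self._thread.join()

    def _run(self):
        # One message at a time, so the state needs no lock of its own.
        while True:
            message = self._mailbox.get()
            kind = message[0]
            if kind == "increment":
                self._count += 1
            elif kind == "get":
                message[1].put(self._count)  # reading state is also done via a message
            elif kind == "stop":
                return


def demo(n_threads=8, per_thread=1000):
    actor = CounterActor()

    def producer():
        for _ in range(per_thread):
            actor.send(("increment",))

    producers = [threading.Thread(target=producer) for _ in range(n_threads)]
    for thread in producers:
        thread.start()
    for thread in producers:
        thread.join()

    reply = queue.Queue()
    actor.send(("get", reply))
    total = reply.get()
    actor.send(("stop",))
    actor.join()
    return total


if __name__ == "__main__":
    print(demo())  # 8000
```

Eight threads hammer the same counter, yet no increment is lost, because every update is applied by the actor's single line of execution.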
This was a tricky one and took me a while to understand fully. When you hear “one message at a time” for the first time, it’s easy to build a wrong mental model of actor execution flow.
Let’s take an example. The actor has some messages in its mailbox and it is processing the first one. As part of processing this message, it has to make an IO call. What should the actor do while the IO call completes? Should it pick up the next message from the mailbox? Or block the current thread of execution so that the next message is picked up only after this message is “completely processed”?
The actor should pick up the next message in the mailbox and not block the current execution flow. As seen in the diagram, we still have only one line of execution and there is no parallel message processing at any given time. So it is still only one message at a time, but the actor is free to pick up and process new messages while it’s waiting for IO to complete.
Note that IO calls can complete in a different order than the order in which they were started. As a result, response messages from the actor (if any) can come back in a different order than the order in which the requests arrived.
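This flow can be imitated with a single asyncio event loop, which also has only one line of execution. The sketch below is an approximation of the idea and not a real actor runtime. The fake_io helper and its delays are invented, with asyncio.sleep standing in for an IO call.

```python
import asyncio


async def fake_io(delay):
    await asyncio.sleep(delay)  # stand-in for a network or disk call


async def handle(name, delay, events):
    events.append(f"start {name}")
    await fake_io(delay)        # while this waits, the actor can take the next message
    events.append(f"done {name}")


async def actor(mailbox, events):
    in_flight = []
    while True:
        message = await mailbox.get()
        if message is None:     # None is this sketch's stop signal
            break
        name, delay = message
        in_flight.append(asyncio.create_task(handle(name, delay, events)))
        await asyncio.sleep(0)  # let the handler run up to its IO call, then move on
    await asyncio.gather(*in_flight)


async def demo():
    events = []
    mailbox = asyncio.Queue()
    for message in [("A", 0.1), ("B", 0.01), None]:
        mailbox.put_nowait(message)
    await actor(mailbox, events)
    return events


if __name__ == "__main__":
    print(asyncio.run(demo()))  # ['start A', 'start B', 'done B', 'done A']
```

Message A is picked up first, but its IO call is slower, so B finishes before A even though nothing ever ran in parallel.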
Actors give you more control
The message passing communication of actors comes with its costs. It is definitely not as straightforward as method invocation. More complexity means more effort and maintenance cost. What it offers in return is more control.
There are certain use cases which are better suited to messages. Let’s take an example. We want to represent a bank account with withdrawal & deposit functionality in a system. A withdrawal should only happen if the account has a sufficient balance. Withdrawals and deposits can both take a few seconds, as the actor has to talk to an external service. The business requires that while a withdrawal is in progress, another withdrawal request should wait until the previous one is finished. However, deposit requests don’t have to wait for other deposit or withdrawal requests to finish.
Actors provide great tools for such use cases. For example, in this case, once the actor receives a “withdraw” message, it can change its behaviour to “withdrawing” while the IO call to the external service is in progress. In this behaviour, the actor handles “deposit” messages normally, but it keeps collecting all the “withdraw” messages in a queue. Once the current “withdraw” transaction is finished, it changes its behaviour back to “normal”, where it replays all the collected “withdraw” messages in the order they arrived before starting to accept new messages from the mailbox.
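Here is one way the bank account could be sketched in plain Python with asyncio. It is a toy under several assumptions: the external service is faked with asyncio.sleep, its result comes back to the actor as a "deposit_done" or "withdraw_done" message, and the AccountActor class, the DELAYS table and the expected parameter of run are all made up for this example.

```python
import asyncio
from collections import deque

# Made-up latencies for the fake external service, in seconds.
DELAYS = {"deposit_done": 0.01, "withdraw_done": 0.05}


class AccountActor:
    def __init__(self, balance):
        self.mailbox = asyncio.Queue()
        self.balance = balance
        self.behaviour = self.normal
        self.stash = deque()      # withdrawals waiting for the current one to finish
        self.finished = 0
        self.log = []
        self.io_tasks = set()     # keeps references to in-flight IO tasks

    def send(self, message):
        self.mailbox.put_nowait(message)

    async def run(self, expected):
        # Demo helper: stop once `expected` requests have been completed or rejected.
        while self.finished < expected:
            self.behaviour(await self.mailbox.get())

    def start_io(self, done_kind, amount):
        async def call_service():
            await asyncio.sleep(DELAYS[done_kind])
            self.send((done_kind, amount))  # the result returns as a message
        task = asyncio.create_task(call_service())
        self.io_tasks.add(task)
        task.add_done_callback(self.io_tasks.discard)

    def finish(self, entry):
        self.log.append(entry)
        self.finished += 1

    def deposits(self, kind, amount):
        # Shared by both behaviours: deposits never wait for anything.
        if kind == "deposit":
            self.start_io("deposit_done", amount)
        elif kind == "deposit_done":
            self.balance += amount
            self.finish(f"deposited {amount}")

    def normal(self, message):
        kind, amount = message
        if kind != "withdraw":
            self.deposits(kind, amount)
        elif amount > self.balance:
            self.finish(f"rejected {amount}")
        else:
            self.start_io("withdraw_done", amount)
            self.behaviour = self.withdrawing

    def withdrawing(self, message):
        kind, amount = message
        if kind == "withdraw":
            self.stash.append(message)  # wait for the current withdrawal
        elif kind == "withdraw_done":
            self.balance -= amount
            self.finish(f"withdrew {amount}")
            self.behaviour = self.normal
            # Replay stashed withdrawals in arrival order. The first accepted one
            # switches the behaviour again, so the rest get stashed once more.
            pending, self.stash = self.stash, deque()
            for stashed in pending:
                self.behaviour(stashed)
        else:
            self.deposits(kind, amount)


async def demo():
    account = AccountActor(balance=100)
    requests = [("withdraw", 60), ("withdraw", 50), ("deposit", 30), ("withdraw", 200)]
    for request in requests:
        account.send(request)
    await account.run(expected=len(requests))
    return account.balance, account.log


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this run the deposit of 30 overtakes the withdrawals that are waiting, which is why the stashed withdrawal of 50 succeeds once the first one is done, while the withdrawal of 200 is rejected.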
In the next article, we will see actors in action with Scala and Akka.