Actor model

The actor model is a mathematical model of concurrent computation in which the universal primitive is the actor: an autonomous, isolated entity that owns private state and communicates with other actors only by sending immutable, asynchronous messages. The model was introduced by Carl Hewitt, Peter Bishop, and Richard Steiger in the 1973 IJCAI paper A Universal Modular Actor Formalism for Artificial Intelligence. It was motivated initially by the control-structure problems Hewitt encountered while designing the Planner programming language for AI, and by the rising importance of parallel and distributed hardware. Over five decades the model has become the conceptual backbone of fault-tolerant systems software (Erlang, Elixir, Akka, Orleans, Pony, CAF), large-scale online services and infrastructure (WhatsApp, Discord, RabbitMQ, Halo, Riak), and Ray clusters that run distributed training, reinforcement learning, and large-language-model serving.

In the actor model, every computational unit is an actor with three essential capabilities: it can send a finite number of messages to other actors whose addresses it knows, it can spawn a finite number of new actors, and it can designate the behaviour that will be used to process the next message it receives. There is no shared memory and no synchronous handshake; concurrency is expressed entirely through message passing across opaque actor addresses. This radical simplicity, plus the operational discipline of "let it crash" supervision pioneered by Erlang, is what lets actor systems scale to millions of lightweight processes per node and survive both software bugs and hardware failure.

Origins and history

Carl Hewitt began work on the actor formalism in the early 1970s at the MIT Artificial Intelligence Laboratory, where he had completed his PhD in mathematics in 1971 under Seymour Papert, Marvin Minsky, and Mike Paterson. Hewitt had designed Planner, the first programming language built around procedural plans called through pattern-directed invocation. The practical pain of implementing Planner exposed deep issues with control-flow abstractions, backtracking, and concurrency that conventional sequential models could not address. Hewitt has stated that one of the main motivations for the actor model was to understand and resolve these control-structure problems, and that he was also influenced by physics (general relativity and quantum mechanics) and by earlier languages such as Lisp, Simula, and Smalltalk.

The formalism was published with Bishop and Steiger in August 1973 at the Third International Joint Conference on Artificial Intelligence, held at Stanford. The paper proposed a single primitive, the actor (also called a virtual processor, activation frame, or stream), and argued that all computation, sequential and concurrent, could be defined uniformly as actors sending messages to other actors. Subsequent work in the late 1970s by Henry Baker and Hewitt produced the Laws for Communicating Parallel Processes and a fixed-point semantics. Will Clinger gave a denotational model in 1981. Gul Agha's 1986 MIT Press book Actors: A Model of Concurrent Computation in Distributed Systems (based on his doctoral dissertation, researched in Hewitt's group at MIT) established the most widely cited foundation for studying actor systems formally and helped train a generation of researchers who carried the ideas into practical systems work.

The model migrated from theory to industry through a separate route. At Ericsson's Computer Science Laboratory in Stockholm, Joe Armstrong, Robert Virding, and Mike Williams started a project in 1986 to find better ways to program telecom switches. What began as an experiment in adding concurrency to Prolog became a new language called Erlang, named partly after the Danish mathematician A. K. Erlang and partly as a contraction of "Ericsson Language". Erlang's processes turned out to be a clean industrial implementation of Hewitt's actors, with the addition of the OTP libraries and a supervision-tree philosophy for fault tolerance. Ericsson open-sourced Erlang and OTP in December 1998, and the AXD301 ATM switch (built in Erlang) reportedly achieved "nine nines" of availability across more than a million lines of Erlang code. Armstrong's 2003 PhD thesis at the Royal Institute of Technology in Stockholm, Making Reliable Distributed Systems in the Presence of Software Errors, codified the design principles that have since become the lingua franca of actor systems engineering.

Core concepts

The actor model rests on a small vocabulary that has remained stable since 1973.

Actor: a lightweight unit of computation with private state, behaviour, and a mailbox. An actor is not a thread or a process in the operating-system sense; modern runtimes typically multiplex millions of actors onto a smaller pool of OS threads.
Mailbox: an unbounded (or backpressure-controlled) queue of incoming messages owned by exactly one actor. The actor processes messages from its mailbox one at a time, which means there is no internal concurrency inside a single actor; race conditions on its private state are impossible by construction.
Message: an immutable value sent from one actor to another. Messages are the only mechanism for communication. There is no shared memory, no global lock, and no observable side channel.
Address: an opaque, unforgeable reference to an actor. An actor can send to another actor only if it knows the latter's address. Addresses are first-class values: they can be embedded in messages, stored in private state, and passed along, which is the entire mechanism for building dynamic topologies. Addresses are sometimes called actor references or PIDs (Erlang's term).
Behaviour: the function or rule that determines what the actor does next when it receives a message. Behaviour is not fixed for the lifetime of the actor; after each message the actor designates a (possibly different) behaviour for the following message. This is how stateful actors are modelled without mutation in the formal calculus.
Locality: an actor's state is strictly private. No other actor can read or write it. The only way to influence an actor is to send it a message. This is stronger than encapsulation in object-oriented languages because there is no synchronous method call.
Asynchrony: sending a message is non-blocking. The sender does not wait for the receiver to be ready, does not wait for an acknowledgement, and gets no return value. Request-response patterns are built explicitly on top, typically with a correlation identifier and a temporary reply address.
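
The vocabulary above can be made concrete with a small sketch. The following is a minimal illustration using only the Python standard library, not any particular actor runtime: the names Actor, Counter, and demo are hypothetical, the "address" is simply the Python object reference, and message immutability is a convention here (tuples) rather than something the code enforces.

```python
import queue
import threading


class Actor:
    """A minimal actor: private state, a mailbox, one message at a time."""

    _STOP = object()  # private sentinel that ends the message loop

    def __init__(self):
        self._mailbox = queue.Queue()  # unbounded FIFO owned by this actor
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, message):
        """Asynchronous send: enqueue and return immediately, no result."""
        self._mailbox.put(message)

    def stop(self):
        """Ask the actor to finish after it has drained earlier messages."""
        self._mailbox.put(self._STOP)
        self._thread.join()

    def _run(self):
        while True:
            message = self._mailbox.get()
            if message is self._STOP:
                break
            self.receive(message)  # strictly sequential within one actor

    def receive(self, message):
        raise NotImplementedError


class Counter(Actor):
    def __init__(self):
        self._count = 0  # private state, touched only by the actor's own loop
        super().__init__()

    def receive(self, message):
        kind, payload = message
        if kind == "add":
            self._count += payload
        elif kind == "get":
            # payload is a reply address: here, a queue owned by the caller
            payload.put(self._count)


def demo():
    counter = Counter()
    for _ in range(1000):
        counter.send(("add", 1))      # fire-and-forget sends
    reply_to = queue.Queue()          # temporary reply address
    counter.send(("get", reply_to))   # request
    total = reply_to.get(timeout=5)   # the response arrives as a message
    counter.stop()
    return total
```

Because the mailbox is FIFO and the actor handles one message at a time, the "get" request is processed only after all earlier "add" messages, so no lock is needed around the counter.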

The three axioms

In Hewitt's formulation, when an actor receives a message it can perform exactly three kinds of action concurrently:

Send a finite number of messages to other actors whose addresses it has, whether received in the current message, retained in private state, or self-known.
Create a finite number of new actors, each with an initial behaviour. The creating actor obtains the new actor's address and may share it.
Designate the behaviour that will handle the next message in its own mailbox. This is what allows an actor's apparent state to evolve over time despite the underlying calculus being mutation-free.

These three axioms are the entirety of the actor's response semantics. Every other property (transactions, request-response, supervision, location transparency) is built compositionally on top.
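
As a rough illustration of the three axioms, the sketch below models behaviours as plain functions that return the behaviour for the next message. It is a single-threaded toy, not a real runtime: System, counter, collector, and parent are hypothetical names, integer addresses are forgeable (unlike real actor addresses), and the collector writes to a shared list only so the result can be inspected.

```python
from collections import deque


class System:
    """Single-threaded toy runtime that delivers one message at a time."""

    def __init__(self):
        self._behaviours = {}    # address -> current behaviour function
        self._pending = deque()  # (address, message) pairs awaiting delivery
        self._next_address = 0

    def spawn(self, behaviour):
        """Create: register a new actor and return its address."""
        address = self._next_address
        self._next_address += 1
        self._behaviours[address] = behaviour
        return address

    def send(self, address, message):
        """Send: asynchronous; delivery happens later inside run()."""
        self._pending.append((address, message))

    def run(self):
        while self._pending:
            address, message = self._pending.popleft()
            behaviour = self._behaviours[address]
            # Designate: the return value handles the actor's next message.
            self._behaviours[address] = behaviour(self, address, message)


def counter(total):
    """Behaviour of a counter whose 'state' is the closed-over total."""
    def behaviour(system, self_address, message):
        kind, payload = message
        if kind == "add":
            return counter(total + payload)         # new behaviour, no mutation
        if kind == "report":
            system.send(payload, ("total", total))  # payload is a reply address
        return behaviour                            # keep the same behaviour
    return behaviour


def collector(log):
    """Records every message it receives (a test-only side channel)."""
    def behaviour(system, self_address, message):
        log.append(message)
        return behaviour
    return behaviour


def parent(system, self_address, message):
    """On any message, create a child counter and send it three messages."""
    _kind, reply_to = message
    child = system.spawn(counter(0))
    system.send(child, ("add", 2))
    system.send(child, ("add", 3))
    system.send(child, ("report", reply_to))
    return parent


def demo():
    log = []
    system = System()
    sink = system.spawn(collector(log))
    root = system.spawn(parent)
    system.send(root, ("start", sink))
    system.run()
    return log
```

The counter never mutates anything: each "add" message simply designates a new behaviour that closes over the updated total, which is how the formal calculus models state.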

Relationship to other formalisms

The actor model is often compared to lambda calculus and to the family of process calculi.

Lambda calculus models pure functional computation; it has no inherent notion of identity, time, or concurrency. The actor model can simulate lambda calculus but adds the notions of address, mailbox, and asynchronous send. Robin Milner observed that Hewitt's vision was that "a value, an operator on values, and a process should all be the same kind of thing: an actor".

The pi-calculus, developed by Robin Milner, Joachim Parrow, and David Walker in the late 1980s, is a process calculus in which channels are first-class values that can be passed along other channels. Pi-calculus and the actor model are closely related; both support dynamic communication topologies. The main difference is that pi-calculus communication is on named channels (which are themselves passed in messages), while in the actor model messages are addressed to a specific actor.

Communicating Sequential Processes (CSP), introduced by Tony Hoare in 1978, is the most-cited contrast. The headline difference is synchrony: CSP communication is fundamentally a rendezvous, where the sender blocks until the receiver is ready to accept the message on the same channel. Actor messages are asynchronous; the sender places a message in the receiver's mailbox and continues immediately. CSP processes are anonymous and communicate via named channels; actors have identities (addresses), and the model has no separate channel abstraction. Go's goroutines and channels are a CSP-style design; Erlang processes are an actor-style design. The two approaches can simulate each other: buffered channels make CSP sends asynchronous until the buffer fills, and explicit ack protocols turn actor messages into a rendezvous.
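
The last point, that an explicit acknowledgement turns an asynchronous send into a rendezvous, can be sketched with standard-library threads. The function names below (async_send, rendezvous_send, receive) are illustrative assumptions, not part of any CSP or actor library.

```python
import queue
import threading


def async_send(mailbox, payload):
    """Actor-style send: enqueue and continue without waiting."""
    mailbox.put((payload, None))


def rendezvous_send(mailbox, payload, timeout=5.0):
    """CSP-style send emulated with an ack: block until the receiver took it."""
    taken = threading.Event()
    mailbox.put((payload, taken))
    if not taken.wait(timeout):
        raise TimeoutError("receiver never accepted the message")


def receive(mailbox):
    """Take the next message and acknowledge it if the sender is waiting."""
    payload, taken = mailbox.get()
    if taken is not None:
        taken.set()
    return payload


def demo():
    mailbox = queue.Queue()
    events = []

    # Asynchronous sends complete even though nobody is receiving yet.
    async_send(mailbox, "a")
    async_send(mailbox, "b")
    events.append(("queued", mailbox.qsize()))

    def receiver():
        for _ in range(3):
            events.append(("received", receive(mailbox)))

    worker = threading.Thread(target=receiver)
    worker.start()
    rendezvous_send(mailbox, "c")  # returns only once the receiver has taken "c"
    worker.join()
    return events
```

The two asynchronous sends return with both messages still queued, whereas rendezvous_send cannot return until the receiver thread has actually dequeued the message.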

Implementations and languages

Actor-style runtimes exist in nearly every general-purpose language. The table below lists the most influential implementations.

Implementation	Year / Author	Host language / runtime	Notes
Erlang / OTP	1986; open-sourced 1998. Joe Armstrong, Robert Virding, Mike Williams (Ericsson)	BEAM virtual machine	Reference implementation; processes are actors. Used in Ericsson AXD301, WhatsApp, Discord, RabbitMQ, Riak KV, ejabberd.
Elixir	2011, José Valim	BEAM (Erlang VM)	Modern syntax and metaprogramming on top of OTP. Used in Pinterest, Discord (Rust+Elixir), Bleacher Report, fly.io.
Scala actors	2006, Philipp Haller	JVM (Scala 2.1.7)	Original library actors in Scala; later deprecated in favor of Akka.
Akka	2009, Jonas Bonér	JVM (Scala, Java)	Most widely used actor runtime on the JVM. License changed from Apache 2.0 to BSL 1.1 in September 2022.
Apache Pekko	2022, Apache Software Foundation	JVM (Scala, Java)	Apache 2.0 fork of Akka 2.6 created after the BSL license change. Top-level Apache project as of May 2024.
Akka.NET	2013, Roger Johansson, Aaron Stannard	.NET (CLR)	Port of Akka to .NET.
Microsoft Orleans	2010s, Microsoft Research	.NET (CLR)	Introduced "virtual actors" (grains that always exist logically and are activated on demand). Powers Halo 4 and Halo 5 backend on Microsoft Azure since 2011; open-sourced January 2015.
Pony	2014, Sylvan Clebsch (Imperial College London)	Native (LLVM)	Capability-secure type system, ORCA garbage collector. Used by Microsoft Research and in fintech.
C++ Actor Framework (CAF)	Dominik Charousset and others	Native C++	High-throughput actor library with native and network transports.
Proto.Actor	Roger Johansson, Asynkron	Go and .NET	Cross-platform actor framework.
Quasar / Pulsar	Parallel Universe (Ron Pressler)	JVM	Lightweight fibers and actors on the JVM.
Pykka	Stein Magnus Jodal	Python	Actor library for Python; widely used in Mopidy.
Dramatiq	Bogdan Popa	Python	Background-task framework with actor semantics.
Thespian	Kevin Quick	Python	Multi-platform Python actors.
JuliaActors	Julia community	Julia	Actor library for Julia.
Riker, Actix	Rust community	Rust	Actor frameworks in Rust; Actix is widely deployed via the Actix-web HTTP server.
Ray (actor API)	2018, RISELab UC Berkeley	Python, C++	Distributed framework for AI; actors are the stateful primitive. Stewarded by Anyscale.

Properties

Actor systems have a small set of well-known properties that explain why the model survived from a 1973 AI paper into modern cloud infrastructure.

Fault tolerance via "let it crash": Erlang's signature philosophy is that an actor that encounters an error should die quickly rather than try to handle every possible exception inline. A supervisor (itself an actor) monitors child actors and restarts them according to a declared strategy (one-for-one, one-for-all, rest-for-one). This pushes recovery logic out of business code and into a separate supervision tree, which Joe Armstrong's thesis showed produces dramatically more reliable systems.
Location transparency: an actor's address looks the same whether the actor is in the same process, on another machine in the same cluster, or behind a load balancer. The runtime serializes messages and routes them. Code does not have to choose at compile time between local and remote calls. This is the property that makes actor systems naturally distributable.
Massive scalability: Erlang routinely runs hundreds of thousands of processes on a single BEAM node, and WhatsApp publicly reported sustaining over two million concurrent connections per server, where each connection was its own Erlang process. Cluster-wide, actor systems scale by adding nodes; because addresses are location-transparent, actors can be placed on new nodes without changing the code that sends to them.
Encapsulation: because there is no shared memory, an actor's state is private by construction. Data races on actor state are impossible. This makes reasoning about correctness much easier than threads-and-locks.
Single-threaded actor logic: each actor processes one message at a time, so the logic inside an actor is sequential and looks like ordinary code. Concurrency lives at the boundary between actors, not inside them.
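
The "let it crash" idea with a one-for-one restart strategy can be sketched as follows. This is a deliberately simplified, single-threaded illustration and not OTP's or Akka's API: the Supervisor here delivers messages itself, collapsing mailbox delivery and exit signals into one deliver call, and the names Worker, Supervisor, and max_restarts are assumptions made for the example.

```python
class Worker:
    """Child actor: parses integers and keeps a running sum."""

    def __init__(self):
        self.total = 0

    def receive(self, message):
        # No defensive error handling: bad input raises ValueError and the
        # worker "crashes", leaving recovery to the supervisor.
        self.total += int(message)


class Supervisor:
    """One-for-one strategy: restart only the child that crashed."""

    def __init__(self, factories, max_restarts=3):
        self._factories = dict(factories)  # child name -> zero-argument factory
        self._children = {name: make() for name, make in self._factories.items()}
        self.restarts = {name: 0 for name in self._factories}
        self._max_restarts = max_restarts

    def deliver(self, name, message):
        try:
            self._children[name].receive(message)
        except Exception:
            self.restarts[name] += 1
            if self.restarts[name] > self._max_restarts:
                raise  # escalate: this child keeps failing
            # Replace the crashed child with a fresh instance (clean state).
            self._children[name] = self._factories[name]()

    def child(self, name):
        return self._children[name]


def demo():
    supervisor = Supervisor({"a": Worker, "b": Worker})
    messages = [("a", "1"), ("b", "10"), ("a", "oops"), ("a", "2"), ("b", "5")]
    for name, message in messages:
        supervisor.deliver(name, message)
    return (
        supervisor.child("a").total,  # restarted after "oops", then received "2"
        supervisor.child("b").total,  # unaffected by the crash of "a"
        supervisor.restarts["a"],
    )
```

The worker contains no recovery code at all; the crash of child "a" resets only that child, while "b" keeps its accumulated state.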

Limitations

The model is not a silver bullet. Several practical weaknesses recur across implementations.

Mailbox growth and backpressure: in pure asynchronous send, fast producers can fill a slow consumer's mailbox without bound, eventually exhausting memory. Production systems require explicit backpressure schemes (Akka Streams, GenStage in Elixir, bounded queues, ask-with-timeout, reactive flow control).
Awkward request-response: asynchronous semantics make synchronous-looking code ("call A and use the result") clumsy. Idioms such as Akka's ask pattern, Erlang's gen_server:call, or futures wrap a temporary reply actor and a correlation token, but the wrapping is visible to the developer.
Debugging is hard: a message sent and lost, a deadlock between two actors waiting for each other, or an unexpected message ordering can be difficult to reproduce. Tools such as :observer, :recon, Erlang traces, and the Akka diagnostic tools exist precisely because conventional step-debuggers do not capture the cross-actor flow.
Order guarantees are local: Erlang and Akka guarantee that messages from actor A to actor B arrive in the order A sent them, but messages from A to B and from C to B can interleave arbitrarily. Across a cluster, delivery is typically at-most-once, so messages can also be lost when a node or connection fails. Programmers expecting global ordering or guaranteed delivery often get this wrong.
Type safety: traditional Erlang and Akka Classic accept arbitrary message terms in a mailbox, which means many message-mismatch errors only show up at runtime. Akka Typed (stable and the recommended API since Akka 2.6) and Pony's statically typed behaviours and reference capabilities are attempts to fix this; they make the type system aware of which messages an actor can accept.
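
The mailbox-growth problem and the simplest remedy, a bounded mailbox that pushes back on the sender, can be sketched with the standard library. BoundedMailbox, offer, and take are hypothetical names for this illustration; real systems such as Akka Streams or GenStage use richer demand-signalling protocols.

```python
import queue


class BoundedMailbox:
    """Mailbox with a fixed capacity that rejects sends when it is full."""

    def __init__(self, capacity):
        self._queue = queue.Queue(maxsize=capacity)
        self.rejected = 0  # how many sends were refused

    def offer(self, message):
        """Non-blocking send; False tells the producer to slow down or retry."""
        try:
            self._queue.put_nowait(message)
            return True
        except queue.Full:
            self.rejected += 1
            return False

    def take(self):
        """Consumer side: remove the oldest message."""
        return self._queue.get_nowait()


def demo():
    mailbox = BoundedMailbox(capacity=3)
    accepted = [mailbox.offer(n) for n in range(5)]  # fast producer
    mailbox.take()                                   # consumer catches up by one
    retried = mailbox.offer("retry")                 # there is room again
    return accepted, retried, mailbox.rejected
```

Returning False instead of growing the queue makes the overload visible to the producer, which can then drop, retry later, or block, depending on the application.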

Use in AI and machine learning

The actor model has been part of AI from its first publication. The 1973 IJCAI paper was filed under "Artificial Intelligence", and the formalism was explicitly proposed as a foundation for AI computation. For decades the connection was mainly historical: Erlang and Akka grew up serving telecoms and the web, and AI systems were built mostly on shared-memory threads, MPI, or parameter servers.

That changed with the rise of distributed reinforcement learning and large-model training. The pivotal paper is Ray: A Distributed Framework for Emerging AI Applications by Philipp Moritz, Robert Nishihara, Stephanie Wang, Alexey Tumanov, Richard Liaw, Eric Liang, Melih Elibol, Zongheng Yang, William Paul, Michael I. Jordan, and Ion Stoica, presented at OSDI 2018 from UC Berkeley's RISELab. Ray implements a unified distributed runtime with two primitives: stateless tasks (functions) and stateful actors (Python classes). The Ray paper explicitly presents actors as the abstraction for stateful workers such as parameter servers, RL rollout workers, simulators, and serving replicas. The OSDI evaluation reported Ray scaling beyond 1.8 million tasks per second and outperforming specialized systems on several reinforcement-learning workloads.
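
The split between stateless tasks and stateful actors can be illustrated without Ray itself. The sketch below is a standard-library analogy and does not use Ray's API: gradient plays the role of a stateless task run on a thread pool, while the hypothetical ParameterServer class is a stateful actor that owns the weight and changes it only in response to messages.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor


def gradient(weight, sample):
    """Stateless task: gradient of 0.5 * (weight - sample) ** 2 w.r.t. weight."""
    return weight - sample


class ParameterServer:
    """Stateful actor: owns the weight and updates it one message at a time."""

    def __init__(self, weight, learning_rate):
        self._weight = weight
        self._learning_rate = learning_rate
        self._mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, message):
        self._mailbox.put(message)

    def _run(self):
        while True:
            kind, payload = self._mailbox.get()
            if kind == "apply":
                self._weight -= self._learning_rate * payload
            elif kind == "read":
                payload.put(self._weight)  # payload is a reply queue
            elif kind == "stop":
                return


def read(server):
    """Request-response helper: ask the actor for its current weight."""
    reply_to = queue.Queue()
    server.send(("read", reply_to))
    return reply_to.get(timeout=5)


def train(samples, rounds):
    server = ParameterServer(weight=0.0, learning_rate=0.5)
    with ThreadPoolExecutor(max_workers=4) as pool:
        for _ in range(rounds):
            weight = read(server)
            # Fan out stateless tasks, then send one update to the actor.
            grads = list(pool.map(lambda s: gradient(weight, s), samples))
            server.send(("apply", sum(grads) / len(grads)))
    final = read(server)
    server.send(("stop", None))
    return final
```

For example, train([1.0, 2.0, 3.0], rounds=20) moves the weight toward the sample mean of 2.0; the tasks can run in any order because all mutable state lives inside the one actor.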

Ray's actor primitive is the basis for the Ray AI Libraries.

Library	Use of actors
Ray RLlib	Each rollout worker (collecting environment trajectories) is a Ray actor. A central learner actor performs gradient updates. EnvRunner actors and Learner actors compose the algorithm graph.
Ray Tune	Trial-runner actors host individual hyperparameter trials and stream metrics to the head node.
Ray Serve	Each model replica is a Ray actor with autoscaling, multi-model composition, and routing built on actor composition.
Ray Train	Worker actors coordinate distributed training (PyTorch DDP, Horovod, DeepSpeed, FSDP) and own GPU state.
Ray Data	Streaming data execution actors maintain pipelines for training and inference.

Ray is stewarded commercially by Anyscale, founded by the original Berkeley team, which reports that Ray is used in production at OpenAI (training infrastructure for ChatGPT), Uber, Shopify, Spotify, and others. Beyond Ray, modern LLM-serving stacks such as vLLM, TGI (Hugging Face), and SGLang use actor-like patterns internally for KV-cache-bearing replicas, prefix-shared workers, and batch schedulers, even when they do not expose an actor API to the user. The wider Python data and ML ecosystem also uses actor-flavoured patterns: Modin (a pandas replacement that runs on Ray or Dask and uses Ray actors for partitioned dataframe state), Dask Actors (Dask's own stateful API), and, more loosely, tools like FairScale and DeepSpeed that coordinate distributed-training workers on top of NCCL.

Use cases beyond AI

Actor systems have powered some of the largest user-facing services of the last two decades.

Domain	System	Actor runtime	Scale notes
Telecoms	Ericsson AXD301 ATM switch	Erlang/OTP	Reportedly achieved "nine nines" availability; over 1 million lines of Erlang.
Chat	WhatsApp	Erlang/OTP	One Erlang process per user connection; millions of concurrent connections per server; tens of billions of messages per day.
Chat / voice	Discord	Elixir	Real-time chat backbone; Rust for performance-critical paths.
Game backend	Halo 4 / Halo 5	Microsoft Orleans	Virtual-actor cloud services on Azure since 2011, run by 343 Industries.
Distributed data stores and message brokers	Riak KV, RabbitMQ, CockroachDB internals (in part)	Erlang (Riak, RabbitMQ); Go (Cockroach uses actor-like patterns)	Each partition or queue managed as an isolated stateful unit.
IoT	Various	Akka, Erlang	Massive numbers of long-lived connections.
Web frameworks	Akka HTTP, Play Framework, Actix-web	Akka, Actix	Reactive HTTP servers built on actor cores.
Machine learning	Ray, Modin, Dask Actors	Ray	Distributed training, RL, model serving, analytics.

Comparison with other concurrency models

The table contrasts the actor model with the main alternatives in mainstream use today.

Concurrency model	Communication	State	Scheduling	Fault model
Actor model (Erlang, Akka, Ray actors)	Asynchronous messages to opaque addresses	Private per-actor; no shared memory	Runtime scheduler multiplexes many actors onto few OS threads (pre-emptive reduction counting on BEAM, dispatcher thread pools in Akka); per-actor mailbox processed sequentially	Supervised "let it crash"; isolated failure; restart via supervisor tree
Threads + shared memory + locks (POSIX, Java pre-Loom)	Direct reads and writes of shared variables; mutexes, condition variables	Shared, with locks for safety	Pre-emptive OS scheduler	Thread death may corrupt shared state; recovery is manual
CSP / channels (Go goroutines, Occam)	Synchronous send and receive on named channels (rendezvous)	Each goroutine has private stack; shared state is discouraged but possible	M:N scheduler (Go)	Panic in one goroutine kills the program by default
Software Transactional Memory (Haskell STM, Clojure refs)	Reads and writes inside atomic transactions	Shared, but accessed only inside transactions (`atomically` in Haskell, `dosync` in Clojure)	Library or runtime retries conflicting transactions	No fault tolerance built in; STM is correctness-oriented, not failure-oriented
Futures / promises (JavaScript, Java CompletableFuture, Scala Future)	Composable async values; chained callbacks	Closure-captured	Event loop or thread pool	Errors propagate through the future chain; no supervision
Async / await (Python asyncio, C#, Rust async)	Cooperative coroutines; await suspends the current task	Local to the task	Single-threaded event loop or work-stealing executor	Exceptions raise normally; no built-in supervision

The actor model and shared-memory threading sit at opposite ends of the spectrum. CSP and actors are close cousins. STM, futures, and async/await are largely orthogonal concerns that often live inside larger actor systems (Akka's Future, Erlang's gen_server:call).
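
How a future-style call can sit on top of plain asynchronous messages is sketched below with asyncio. This is an illustrative reduction of the ask pattern, not Akka's ask or Erlang's gen_server:call: the names ask and upper_actor are hypothetical, and a per-request reply queue stands in for the temporary reply actor and correlation token.

```python
import asyncio


async def upper_actor(mailbox):
    """Actor loop: one message at a time; replies are ordinary messages."""
    while True:
        payload, reply_to = await mailbox.get()
        if payload is None:  # stop signal
            break
        await reply_to.put(payload.upper())


async def ask(mailbox, payload, timeout=1.0):
    """Send a request and await the reply, giving up after `timeout` seconds."""
    reply_to = asyncio.Queue(maxsize=1)  # temporary reply address
    await mailbox.put((payload, reply_to))
    return await asyncio.wait_for(reply_to.get(), timeout)


async def main():
    mailbox = asyncio.Queue()
    actor_task = asyncio.create_task(upper_actor(mailbox))
    # Two concurrent requests; each gets its own reply address.
    results = await asyncio.gather(ask(mailbox, "ping"), ask(mailbox, "pong"))
    await mailbox.put((None, None))
    await actor_task
    return results


def demo():
    return asyncio.run(main())
```

The caller sees an awaitable that resolves to a value or times out, while the actor itself still only ever receives and sends messages.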

Recent developments

The model has continued to evolve. The Reactive Manifesto, written by Jonas Bonér, Dave Farley, Roland Kuhn, and Martin Thompson (first published in 2013 and revised as version 2.0 in 2014), codified the design principles (responsive, resilient, elastic, message-driven) that the actor community had been practising since Erlang. The Reactive Foundation followed in 2019 to host related projects such as RSocket.

The Akka licensing change in September 2022 was probably the biggest community shock to the actor world in recent memory. Lightbend moved Akka from Apache 2.0 to Business Source License 1.1, requiring a paid commercial license for production use by companies above a revenue threshold. The Apache Software Foundation accepted Apache Pekko, an Apache 2.0 fork of Akka 2.6, into incubation in October 2022; Pekko graduated to top-level Apache project status in May 2024. Many production users either migrated to Pekko, paid Lightbend's per-core fee, or rewrote against another runtime.

Virtual actors and serverless actors, popularized by Microsoft Orleans, have spread to other platforms (Dapr Actors, Cloudflare Durable Objects). The shared idea is that the developer never explicitly creates or destroys an actor; the runtime activates an actor on first use, persists state across crashes, and deactivates idle actors automatically. Cloudflare Durable Objects in particular look like a globally distributed virtual-actor runtime running on V8 isolates.
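
The virtual-actor lifecycle described above (activate on first use, persist state, deactivate when idle) can be sketched in a few lines. This toy is not the Orleans, Dapr, or Durable Objects API: VirtualActorRuntime and CounterGrain are hypothetical names, a dictionary stands in for durable storage, and a message counter stands in for wall-clock idle time.

```python
class CounterGrain:
    """A virtual actor whose whole state is a single integer."""

    def __init__(self, saved_state):
        self.count = 0 if saved_state is None else saved_state

    def receive(self, message):
        if message == "increment":
            self.count += 1

    def snapshot(self):
        return self.count


class VirtualActorRuntime:
    """Toy runtime: actors are addressed by key and activated on demand."""

    def __init__(self, actor_class, idle_limit):
        self._actor_class = actor_class
        self._idle_limit = idle_limit  # ticks without messages before deactivation
        self._active = {}              # key -> (actor instance, last-used tick)
        self._storage = {}             # key -> persisted state
        self._clock = 0                # advances by one per delivered message

    def send(self, key, message):
        self._clock += 1
        if key in self._active:
            actor, _ = self._active[key]
        else:
            # Activate on first use, restoring any persisted state.
            actor = self._actor_class(self._storage.get(key))
        actor.receive(message)
        self._storage[key] = actor.snapshot()  # persist after every message
        self._active[key] = (actor, self._clock)
        self._deactivate_idle()

    def _deactivate_idle(self):
        for key, (_, last_used) in list(self._active.items()):
            if self._clock - last_used >= self._idle_limit:
                del self._active[key]  # state survives in self._storage

    def is_active(self, key):
        return key in self._active

    def stored_state(self, key):
        return self._storage.get(key)


def demo():
    runtime = VirtualActorRuntime(CounterGrain, idle_limit=2)
    runtime.send("alice", "increment")
    runtime.send("alice", "increment")
    runtime.send("bob", "increment")
    runtime.send("bob", "increment")      # "alice" has now been idle for 2 ticks
    alice_active = runtime.is_active("alice")
    runtime.send("alice", "increment")    # transparently reactivated from storage
    return alice_active, runtime.stored_state("alice"), runtime.stored_state("bob")
```

The caller never creates or destroys "alice"; it only sends to a key, and the runtime decides when an in-memory instance exists.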

WebAssembly has become a substrate for actor systems. wasmCloud (a CNCF project) treats every function as a small actor running inside a Wasm sandbox, with capabilities provided by host plug-ins. Lunatic (lunatic.solutions), an Erlang-flavoured Wasm runtime, similarly runs code compiled to WebAssembly, typically from Rust, as supervised actors.

In the AI infrastructure space, Ray's adoption has accelerated. Anyscale reports that several large foundation-model training clusters (including parts of OpenAI's training infrastructure) sit on Ray, and the Ray ecosystem (Ray Train, Ray Serve, Ray Data, RLlib) has become a standard way to compose stateless tasks and stateful actor replicas for both training and inference at scale.
