🧪 Experimenting with Event Sourcing in Erlang using pure functional principles, gen_server-based aggregates, and pluggable Event Store backends.
I'm a big fan of Erlang/OTP and Event Sourcing, and I strongly believe that the Actor Model and Event Sourcing are a natural fit. This repository is my way of exploring how these two concepts can work together in practice.
As an experiment, this repo won't cover every facet of event sourcing in depth, but it should provide some insights and spark ideas on the potential of this approach in Erlang.
- Kernel OTP app: packaged as an OTP application with a supervision tree (es_kernel_sup) that boots a dynamic aggregate supervisor and a singleton aggregate manager. Configure stores via the application env, start with application:ensure_all_started/1, and dispatch commands through es_kernel:dispatch/1 (a sys.config sketch follows this list).
- Aggregate: a reusable gen_server harness that keeps domain logic pure while delegating event sourcing boilerplate.
- Aggregate Manager: a singleton router that spins up aggregates on demand via the dynamic supervisor, rehydrates them from persisted events, monitors them, and passivates idle instances.
- Event Store: a behaviour-driven abstraction with drop-in backends so you can pick the storage engine that fits your deployment.
- Snapshots: automatic checkpointing at configurable intervals to avoid replaying entire streams.
- Passivation: idle aggregates are shut down cleanly and rehydrate from the store on the next command.
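For reference, a minimal sys.config sketch wiring those pieces to the in-memory ETS backend. The keys (event_store, snapshot_store, snapshot_interval) are the ones used later in this README; the values shown are just an example.

```erlang
%% sys.config sketch: point es_kernel at the in-memory ETS backend and
%% take a snapshot every 10 events. Swap es_store_ets for another backend
%% module as more become available.
[
  {es_kernel, [
    {event_store, es_store_ets},
    {snapshot_store, es_store_ets},
    {snapshot_interval, 10}
  ]}
].
```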
This project is a work in progress, and I welcome any feedback or contributions. If you're interested in Event Sourcing, Erlang/OTP, or both, feel free to reach out!
Start the Erlang shell and run the following commands to play with the es_xp example application. es_xp depends on es_store_ets and es_kernel, so application:ensure_all_started(es_xp) brings up the full stack; commands are dispatched through es_kernel:dispatch/1.
%% Interactive demo showcasing the event sourcing engine.
%%
%% Usage:
%% rebar3 shell < apps/es_xp/examples/demo_bank.script
%%
%% This script configures es_kernel to use the in-memory ETS backend for
%% both events and snapshots, then starts the es_xp application. es_xp
%% depends on es_store_ets and es_kernel, so `application:ensure_all_started/1`
%% brings up the full stack automatically.
%% Explicitly set the store backends (override defaults if needed)
application:load(es_kernel),
application:set_env(es_kernel, event_store, es_store_ets),
application:set_env(es_kernel, snapshot_store, es_store_ets),
io:format("~n[1] starting es_xp application (and dependencies)~n", []),
{ok, Started} = application:ensure_all_started(es_xp),
io:format(" -> ~p~n", [Started]),
AccountId = <<"123">>,
Dispatch = fun(Type, Amount) ->
Command =
es_contract_command:new(
bank_account,
Type,
AccountId,
0,
#{},
#{amount => Amount}
),
es_kernel:dispatch(Command)
end,
io:format("[2] deposit $100~n", []),
io:format(" -> ~p~n", [Dispatch(deposit, 100)]),
io:format("[3] withdraw $10~n", []),
io:format(" -> ~p~n", [Dispatch(withdraw, 10)]),
io:format("[4] withdraw $1000 (should fail)~n", []),
io:format(" -> ~p~n", [Dispatch(withdraw, 1000)]),
ok.

This project is structured around the core principles of Event Sourcing:
- All changes are represented as immutable events.
- Aggregates handle commands and apply events to evolve their state.
- State is rehydrated by replaying historical events. Possible optimizations include snapshots and caching.
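To make these principles concrete, here is a sketch of a pure domain module for the bank_account aggregate used in the demo above. The handle_command/2 and apply_event/2 names match the sequence diagram later in this README; the command/event shapes and return values are assumptions, not the real es_contract_* contracts.

```erlang
%% Illustrative pure domain module. The command/event shapes and return
%% values are assumptions, not the real es_contract_* records.
-module(bank_account).
-export([init_state/0, handle_command/2, apply_event/2]).

%% Hypothetical initial state.
init_state() -> #{balance => 0}.

%% Decide: turn a command into events or reject it. No side effects.
handle_command(#{type := deposit, payload := #{amount := Amount}}, _State) ->
    {ok, [#{type => deposited, amount => Amount}]};
handle_command(#{type := withdraw, payload := #{amount := Amount}},
               #{balance := Balance}) when Amount =< Balance ->
    {ok, [#{type => withdrawn, amount => Amount}]};
handle_command(#{type := withdraw}, _State) ->
    {error, insufficient_funds}.

%% Evolve: fold one event into the state. Also pure.
apply_event(#{type := deposited, amount := Amount}, State = #{balance := B}) ->
    State#{balance := B + Amount};
apply_event(#{type := withdrawn, amount := Amount}, State = #{balance := B}) ->
    State#{balance := B - Amount}.
```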
es_kernel ships as an OTP application. Start it with application:ensure_all_started(es_kernel) after configuring event_store and snapshot_store (both default to es_store_ets).
- Start the configured store backends first (e.g., es_store_ets:start/0) so tables/processes exist before aggregates boot. The es_xp example app handles this by depending on es_store_ets and es_kernel and starting them via application:ensure_all_started(es_xp) (a standalone startup sketch follows this list).
- The top-level supervisor (es_kernel_sup) starts:
  - es_kernel_aggregate_sup: a dynamic supervisor that spawns es_kernel_aggregate processes on demand.
  - es_kernel_mgr_aggregate: a registered singleton that routes commands to aggregates and keeps track of live PIDs.
- Commands are dispatched through the public API es_kernel:dispatch/1, which forwards to the registered manager.
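A minimal standalone startup (without the es_xp example app) could look like the sketch below, assuming the ETS backend and the environment keys described above.

```erlang
%% Sketch: bring up es_kernel on its own (the es_xp demo does the
%% equivalent automatically). The store backend is started first so its
%% tables exist before any aggregate boots.
es_store_ets:start(),
application:load(es_kernel),
application:set_env(es_kernel, event_store, es_store_ets),
application:set_env(es_kernel, snapshot_store, es_store_ets),
{ok, _Started} = application:ensure_all_started(es_kernel).
```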
The store layer separates domain logic from persistence concerns through two behaviour contracts: es_contract_event_store for event persistence and es_contract_snapshot_store for snapshot optimization. Backends implement these behaviours, and aggregates interact with stores through the unified es_kernel_store API.
The event store is the heart of event sourcing persistence, designed as a behaviour (es_contract_event_store) that backends implement. It guarantees ordering (events replayed in sequence), atomicity (all-or-nothing persistence), and immutability (events never modified).
Required Callbacks:
% Appends events to a stream with monotonic sequence ordering
-callback append(StreamId, Events) -> ok
when StreamId :: es_contract_event:stream_id(),
Events :: [es_contract_event:t()].
% Replays events in sequence order, applying a fold function
-callback fold(StreamId, Fun, Acc0, Range) -> Acc1
when StreamId :: es_contract_event:stream_id(),
Fun :: fun((Event :: es_contract_event:t(), AccIn) -> AccOut),
Acc0 :: term(),
Range :: es_contract_range:range(),
Acc1 :: term(),
AccIn :: term(),
AccOut :: term().
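To make the contract concrete, here is a sketch of what a naive in-memory backend implementing these callbacks might look like. It is not the shipped es_store_ets module, and it ignores the Range argument for brevity.

```erlang
%% Illustrative backend sketch (not the shipped es_store_ets module).
%% Each stream is stored as an ordered list of events under its StreamId.
-module(my_ets_event_store).
-behaviour(es_contract_event_store).

-export([start/0, append/2, fold/4]).

start() ->
    ets:new(?MODULE, [named_table, public, set]),
    ok.

%% Append events at the tail of the stream in a single insert, preserving
%% the order in which they were produced.
append(StreamId, Events) ->
    Existing =
        case ets:lookup(?MODULE, StreamId) of
            [{_, Old}] -> Old;
            [] -> []
        end,
    true = ets:insert(?MODULE, {StreamId, Existing ++ Events}),
    ok.

%% Replay the stream in order, folding each event into the accumulator.
%% The Range argument is ignored here; a real backend would honour it.
fold(StreamId, Fun, Acc0, _Range) ->
    case ets:lookup(?MODULE, StreamId) of
        [{_, Events}] -> lists:foldl(Fun, Acc0, Events);
        [] -> Acc0
    end.
```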
Snapshots provide optional performance optimization by checkpointing aggregate state, avoiding full event replay from the start of the stream. They remain secondary to events, which are always the source of truth.

Required Callbacks:
% Persists a snapshot; returns warning instead of crashing on failure
-callback store(Snapshot) -> ok | {warning, Reason}
when Snapshot :: es_contract_snapshot:t(),
Reason :: term().
% Retrieves the most recent snapshot for a stream
-callback load_latest(StreamId) -> {ok, Snapshot} | {error, not_found}
when StreamId :: es_contract_snapshot:stream_id(),
Snapshot :: es_contract_snapshot:t().

How Snapshots Work:
- On aggregate startup: load_latest/1 fetches the most recent snapshot (if available).
- Replay optimization: only events after the snapshot sequence are replayed via fold/4 (sketched after this list).
- Automatic checkpointing: when snapshot_interval is configured (e.g., 10), snapshots are saved at sequence multiples (10, 20, 30, ...).
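Putting both behaviours together, rehydration might look roughly like this sketch. The snapshot map keys, the init_state/0 callback, and the {From, infinity} range shape are assumptions; the real representations live in the es_contract_* modules.

```erlang
%% Rehydration sketch. Assumptions (not the real contracts): the snapshot
%% exposes state/sequence as map keys, the domain module has init_state/0,
%% and a range is written as {From, infinity}.
rehydrate(Module, Store, StreamId) ->
    {State0, From} =
        case Store:load_latest(StreamId) of
            {ok, #{state := SnapState, sequence := Seq}} -> {SnapState, Seq + 1};
            {error, not_found} -> {Module:init_state(), 0}
        end,
    %% Replay only the events recorded after the snapshot sequence.
    Store:fold(StreamId,
               fun(Event, Acc) -> Module:apply_event(Event, Acc) end,
               State0,
               {From, infinity}).
```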
Configuring Snapshots:
Set snapshot_interval per aggregate or globally via the es_kernel application environment:
% Global default in sys.config
{es_kernel, [{snapshot_interval, 10}]}
% Or programmatically
application:set_env(es_kernel, snapshot_interval, 10)

Setting snapshot_interval => 0 (the default) disables automatic snapshotting.
- Support event subscriptions for real-time updates.
- Implement snapshot retention policies (e.g., keep only last N snapshots).
| Backend | Status | Capabilities | Highlights | Ideal use cases |
|---|---|---|---|---|
| ETS | ✅ Ready | Events + snapshots | In-memory tables backed by the BEAM VM, blazing-fast reads/writes, zero external dependencies. | Local development, benchmarks, ephemeral environments where latency matters more than durability. |
| File (pedagogical) | ✅ Ready | Events + snapshots | Plain files (one Erlang term per line) under a configurable root dir, zero dependencies. | Learning/teaching runs where you want to peek at persisted state without external services. |
| Mnesia | ✅ Ready | Events + snapshots | Distributed, transactional, and replicated storage built into Erlang/OTP. | Clusters that need lightweight distribution without introducing an external database. |
| PostgreSQL | 🛠️ Planned | Events + snapshots | Durable SQL store with strong transactional guarantees and easy horizontal scaling. | Production setups that already rely on Postgres or need rock-solid consistency. |
| MongoDB | 🛠️ Planned | Events + snapshots | Flexible document database with built-in replication and sharding. | Event streams that benefit from schemaless payload storage or multi-region clusters. |
The aggregate is implemented as a gen_server that encapsulates domain logic and delegates event persistence to a pluggable Event Store (e.g. ETS or Mnesia).
The core idea is to separate concerns between domain behavior and infrastructure. To achieve this, the system is structured into three main components:
- 🧩 Domain Module: a pure module that implements domain-specific logic via behaviour callbacks.
- ⚙️ aggregate: the glue that bridges domain logic and infrastructure (event sourcing logic, event persistence, etc.).
- 📦 gen_server: the OTP mechanism that provides lifecycle management and message orchestration.
The aggregate provides:
- A behaviour for domain-specific modules to implement.
- A generic OTP gen_server that:
- Rehydrates state from events on startup (with optional snapshot loading).
- Processes commands to produce events.
- Applies events to evolve internal state.
- Automatically passivates (shuts down) after inactivity.
- Saves snapshots at configurable intervals for optimization.
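In other words, a domain module only supplies pure callbacks. A plausible behaviour declaration is sketched below; the callback names match the sequence diagram that follows, while the exact types and return values are assumptions.

```erlang
%% Sketch of the callbacks a domain module supplies. The names match the
%% sequence diagram below; the exact types and returns are assumptions.
-module(aggregate_behaviour_sketch).

-callback init_state() -> State :: term().
-callback handle_command(Command :: es_contract_command:t(), State :: term()) ->
    {ok, Events :: [es_contract_event:t()]} | {error, Reason :: term()}.
-callback apply_event(Event :: es_contract_event:t(), State :: term()) ->
    NewState :: term().
```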
The following diagram shows how the system processes a command using the event-sourced aggregate infrastructure.
sequenceDiagram
actor User
participant Kernel as es_kernel (API)
participant AggMgr as es_kernel_mgr_aggregate
participant AggSup as es_kernel_aggregate_sup
participant Agg as es_kernel_aggregate
participant DomainModule as AggregateModule (callback)
User ->> Kernel: es_kernel:dispatch(Command)
Kernel ->> AggMgr: gen_server:call(?MODULE, Command)
alt aggregate not running
AggMgr ->> AggSup: start_aggregate(Module, Store, Id, Opts)
AggSup -->> AggMgr: {ok, Pid}
end
AggMgr ->> Agg: es_kernel_aggregate:execute(Pid, Command)
Agg ->> DomainModule: handle_command(Command, State)
Agg ->> Agg: persist_events(Store, Events)
loop For each Event
Agg ->> DomainModule: apply_event(Event, State)
end
Each aggregate instance (a gen_server) is automatically passivated (i.e., stopped) after a period of inactivity.
This helps:
- Free up memory in long-lived systems
- Keep the number of live processes bounded
- Rehydrate state on demand from the event store
Passivation is configured via a timeout value when the aggregate is started (defaults to 5000 ms):
es_kernel_aggregate:start_link(Module, Store, Id, #{timeout => 10000}).

When no messages are received within the timeout window:
- A passivate message is sent to the process.
- The aggregate process exits normally (stop).
- Its state is discarded.
- Future commands will cause the manager to rehydrate it from persisted events.
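One way to wire this sequence with plain OTP primitives is sketched below; it illustrates the mechanism and is not the actual es_kernel_aggregate code (do_execute/2 is a placeholder).

```erlang
%% Passivation sketch (illustrative, not the actual es_kernel_aggregate
%% code). A timer is re-armed after each command; if it fires before the
%% next message, the process receives 'passivate' and stops normally.
handle_call({execute, Command}, _From,
            State = #{timeout := Timeout, timer := OldTimer}) ->
    erlang:cancel_timer(OldTimer),
    Timer = erlang:send_after(Timeout, self(), passivate),
    %% do_execute/2 is a placeholder for the real command handling.
    {reply, do_execute(Command, State), State#{timer := Timer}}.

handle_info(passivate, State) ->
    %% Discard in-memory state; the next command makes the manager
    %% rehydrate the aggregate from persisted events.
    {stop, normal, State}.
```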
Snapshots provide a performance optimization for aggregate rehydration by avoiding the need to replay all events from the beginning of a stream.
How it works:
- On startup, the aggregate:
  - Attempts to load the latest snapshot from the store
  - If found, initializes state from the snapshot
  - Replays only events that occurred after the snapshot sequence
- During command processing, snapshots are automatically created when:
  - A snapshot_interval is configured (e.g., 10)
  - The current sequence number is a multiple of the interval (see the check sketched below)
  - For example, with snapshot_interval => 10, snapshots are saved at sequences 10, 20, 30, etc.
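The checkpointing rule itself boils down to a one-line check; a hypothetical helper illustrating it:

```erlang
%% Hypothetical helper illustrating the checkpointing rule: snapshot
%% whenever an interval is configured (> 0) and the current sequence is
%% a multiple of it, e.g. 10, 20, 30 with snapshot_interval => 10.
should_snapshot(_Seq, 0) -> false;
should_snapshot(Seq, Interval) -> Seq rem Interval =:= 0.
```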
Configuration:
% Create aggregate with snapshots every 10 events
es_kernel_aggregate:start_link(
bank_account_aggregate,
es_store_ets,
<<"account-123">>,
#{
timeout => 5000,
snapshot_interval => 10 % Save snapshot every 10 events
}
).

Setting snapshot_interval => 0 (the default) disables automatic snapshotting.
The aggregate manager is a gen_server started by es_kernel_sup and registered as es_kernel_mgr_aggregate. It routes es_contract_command:t() to the right aggregate process, spins up instances via the dynamic supervisor, and monitors them to keep its registry clean. The store context is taken from the es_kernel application environment, and the public entrypoint es_kernel:dispatch/1 forwards to this singleton.
The manager is responsible for:
- Routing commands to the appropriate aggregate process across all aggregate types.
- Starting aggregate instances on demand through es_kernel_aggregate_sup.
- Monitoring aggregate processes (for passivation or crashes) and cleaning up registry entries when they terminate.
The aggregate manager maintains a mapping of {AggregateType, AggregateId} to aggregate process PIDs. When a command is received (typically via es_kernel:dispatch/1):
- It extracts the aggregate_type and aggregate_id from the command map.
- The aggregate_type is resolved to its implementing module via es_kernel_registry.
- The internal pids map is checked for an existing aggregate instance.
- If none exists, the manager asks es_kernel_aggregate_sup to start the aggregate, monitors the new PID, and stores it in the registry (see the sketch after this list).
- The command is forwarded to the aggregate via es_kernel_aggregate:execute/2.
- When an aggregate terminates (passivation or crash), the monitor 'DOWN' message removes it from the registry so a new command will recreate it if needed.
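A condensed sketch of that lookup-or-start path is shown below; it is illustrative only (the 'DOWN' cleanup is omitted, and es_kernel_registry:resolve/1 is an assumed function name).

```erlang
%% Illustrative handle_call clause for the manager (not the real module).
%% Pids maps {AggregateType, AggregateId} to a live aggregate process.
handle_call({dispatch, Command}, _From, State = #{pids := Pids, store := Store}) ->
    #{aggregate_type := Type, aggregate_id := Id} = Command,
    Module = es_kernel_registry:resolve(Type),   %% assumed registry function name
    Pid =
        case maps:find({Type, Id}, Pids) of
            {ok, Existing} ->
                Existing;
            error ->
                {ok, New} = es_kernel_aggregate_sup:start_aggregate(Module, Store, Id, #{}),
                erlang:monitor(process, New),
                New
        end,
    Reply = es_kernel_aggregate:execute(Pid, Command),
    {reply, Reply, State#{pids := maps:put({Type, Id}, Pid, Pids)}}.
```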
flowchart LR
%% Supervision tree
Kernel((es_kernel<br>application)):::app
KernelSup([es_kernel_sup]):::supervisor
AggSup([es_kernel_aggregate_sup]):::supervisor
AggMgr([es_kernel_mgr_aggregate<br>singleton]):::manager
%% Aggregate Instances
AggOrder((order<br>order-123)):::aggregate
AggUser((user<br>user-456)):::aggregate
Kernel --> KernelSup
KernelSup -.->|supervises| AggSup
KernelSup -.->|supervises| AggMgr
AggSup -.->|starts| AggOrder
AggSup -.->|starts| AggUser
AggMgr -->|cmd| AggOrder
AggMgr -->|cmd| AggUser
AggMgr -.->|monitor| AggOrder
AggMgr -.->|monitor| AggUser
AggOrder -->|apply<br>events| AggOrder
AggUser -->|apply<br>events| AggUser
classDef supervisor stroke-dasharray: 5 5,stroke-width:2;
classDef manager stroke:#5b4ae0,stroke-width:2;
classDef aggregate stroke:#222,stroke-width:1.5;
classDef app stroke:#0b67a3,stroke-width:2;
The manager options are applied to the aggregates it starts:
- timeout: inactivity timeout passed to aggregates.
- now_fun: function to provide timestamps for events/snapshots.
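For illustration, such an options map might look like this; how it is supplied (start options vs. application env) depends on your setup.

```erlang
%% Illustrative options map built from the keys listed above.
Opts = #{
    timeout => 10000,   %% passivate after 10 seconds of inactivity
    now_fun => fun() -> erlang:system_time(millisecond) end
}.
```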
- rebar3 compile
- rebar3 eunit
- rebar3 do dialyzer, fmt --check

dialyzer runs the type analysis, while fmt --check makes sure all Erlang sources are already formatted.