The database that appeared from nowhere
This is the SICP chapter. We’re going to start with something embarrassingly simple — a key-value store implemented as a handler — and progressively add features. At each step the additions are small, local handler refinements. By the end we’ll have a small distributed database with transactions, persistence, and replication, and at no point will we have written code that announces itself as building a database. We will have built one in the negative space of solving small problems one at a time.
The chapter is here to make a specific argument: that effect handlers are not just a way to implement coroutines, schedulers, and protocol stacks — they’re a way of composing arbitrarily complex stateful systems out of small, independently-testable pieces. The database is incidental; the structural property is the point.
Step 1: A key-value store
effect KV {
get : key -> option value
set : key -> value -> ()
del : key -> ()
}
handler in_memory_kv = {
let store = Hash.empty () in
{
| KV.get k, c -> resume c (Hash.find store k)
| KV.set k v, c -> Hash.insert store k v; resume c ()
| KV.del k, c -> Hash.remove store k; resume c ()
| return v -> v (* the computation's final result passes through unchanged *)
}
}
The handler holds a hash table. The three operations do what you’d expect. Forty-some lines including the hash table, none of which is interesting. We’ve built a hash table with a typed interface; nobody is impressed yet.
Notice the new syntax: handler X = { let s = ... in { clauses } }. This is a handler with its own local state, allocated when the handler is installed and freed when it goes out of scope. The hash table store is private to this handler — nothing outside can see or mutate it directly. Access is only through KV.get, KV.set, KV.del. This is the encapsulation property that OO uses class boundaries for and Rust uses module boundaries for; here it’s the handler boundary.
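To make the boundary concrete, here is a minimal usage sketch; the only access path to the store is through the effects themselves:

with in_memory_kv in {
  perform (KV.set "x" 10);
  perform (KV.get "x")   (* => Some 10; the hash table itself is unreachable from here *)
}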
Step 2: Add transactions
A transaction is “do a sequence of operations atomically — either they all happen or none do.” The straightforward extension is to add a new effect:
effect Txn {
begin : () -> ()
commit : () -> ()
rollback : () -> ()
}
And a new handler that wraps the kv handler, buffering writes until commit:
handler with_transactions = {
let mutable pending : (key * op) list option = None in (* None = not in a txn; op = Set of value | Del *)
{
| Txn.begin (), c ->
pending <- Some [];
resume c ()
| Txn.commit (), c ->
(match pending with
| None -> resume c () (* not in txn, no-op *)
| Some ops ->
List.iter apply_op (List.rev ops); (* apply_op forwards each buffered op outward via perform *)
pending <- None;
resume c ())
| Txn.rollback (), c ->
pending <- None;
resume c ()
| KV.set k v, c ->
(match pending with
| Some ops ->
pending <- Some ((k, Set v) :: ops);
resume c ()
| None ->
perform (KV.set k v); (* pass through *)
resume c ())
| KV.del k, c ->
(match pending with
| Some ops ->
pending <- Some ((k, Del) :: ops);
resume c ()
| None ->
perform (KV.del k);
resume c ())
| KV.get k, c ->
(* Reads see uncommitted writes from this txn; lookup_in_pending
   scans the buffered ops for the most recent entry for k. *)
(match lookup_in_pending pending k with
| Some Tombstone -> resume c None (* deleted in this txn *)
| Some (Value v) -> resume c (Some v) (* written in this txn *)
| None -> resume c (perform (KV.get k))) (* fall through to the store *)
}
}
The user writes:
with in_memory_kv in {
with with_transactions in {
perform (Txn.begin ());
perform (KV.set "x" 10);
perform (KV.set "y" 20);
perform (Txn.commit ())
}
}
The transaction handler is between the user and the in-memory kv handler. It intercepts every KV.set, KV.del, and KV.get to honor transaction semantics. On commit, it forwards the buffered operations to the underlying kv. On rollback, it discards them.
The kv handler doesn’t know transactions exist. The user doesn’t know how transactions are implemented. The handler boundary is the contract.
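To see the contract from the user's side, here is a rollback sketch: the buffered writes never reach the underlying store, and reads inside the transaction observe the buffer.

with in_memory_kv in {
  with with_transactions in {
    perform (KV.set "x" 1);      (* outside any txn: applied immediately *)
    perform (Txn.begin ());
    perform (KV.set "x" 2);
    perform (KV.del "x");
    perform (KV.get "x");        (* => None: reads see the buffered delete *)
    perform (Txn.rollback ());
    perform (KV.get "x")         (* => Some 1: the store never saw the txn *)
  }
}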
Step 3: Add persistence
Suppose we want changes to survive a crash. The natural place is another handler between transactions and storage:
handler journaling (j : journal) = {
| KV.set k v, c ->
journal_write j (Set k v);
perform (KV.set k v);
resume c ()
| KV.del k, c ->
journal_write j (Del k);
perform (KV.del k);
resume c ()
| KV.get k, c ->
resume c (perform (KV.get k)) (* pass-through *)
}
This handler logs every mutation to a journal before forwarding to the underlying handler. On crash, we can replay the journal to reconstruct state.
Stack it:
with in_memory_kv in {
with journaling (open_journal "kv.log") in {
with with_transactions in {
user_code ()
}
}
}
The order matters. Reading the stack from outermost to innermost: the in-memory store is the base; journaling logs every operation that reaches it; transactions buffer until commit. When the user performs KV.set, the innermost handler, transactions, sees it first and buffers it; on commit, the buffered operations are forwarded outward, first to journaling (logged), then to the in-memory store (applied).
On crash and restart, we read the journal at boot, replay each operation against a fresh in-memory store, and we’re back where we were. Recovery is “fold over the journal.”
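As a sketch, with journal_read as an assumed helper returning the logged operations oldest-first:

(* Boot-time recovery: fold the journal into a fresh store. In a real
   boot sequence the rest of the system would run inside this scope. *)
let recover (j : journal) =
  with in_memory_kv in {
    List.iter (fun op ->
      match op with
      | Set k v -> perform (KV.set k v)
      | Del k   -> perform (KV.del k)
    ) (journal_read j)   (* journal_read is an assumption *)
  }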
We’ve added durability without modifying the previous handlers. Each handler is a thin wrapper around the layer below; new behaviors are introduced by sliding new handlers into the stack.
Step 4: Add replication
Now suppose we want changes to be applied on multiple cores, so failure of one core doesn’t lose data. Yet another handler:
handler replication (peers : core_id list) = {
| KV.set k v, c ->
List.iter (fun peer ->
perform (NoC.send peer (Wrap_KvOp (Set k v)))
) peers;
perform (KV.set k v);
resume c ()
| KV.del k, c ->
List.iter (fun peer ->
perform (NoC.send peer (Wrap_KvOp (Del k)))
) peers;
perform (KV.del k);
resume c ()
| KV.get k, c ->
resume c (perform (KV.get k))
}
For every mutation, this handler sends a copy of the operation to each peer core via the NoC. The peers run their own kv handlers (with the same journaling handler stacked, presumably); they receive the message in their NoC arrival handler and apply it to their local store.
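On the receiving side, a peer's apply loop might look like the sketch below; NoC.recv and the loop form are assumptions standing in for the NoC chapter's arrival machinery.

(* Peer-side sketch: pull replicated ops off the NoC and apply them
   to the local stack. NoC.recv and loop are assumed constructs. *)
with in_memory_kv in {
  with journaling (open_journal "replica.log") in {
    loop {
      match perform (NoC.recv ()) with
      | Wrap_KvOp (Set k v) -> perform (KV.set k v)
      | Wrap_KvOp (Del k)   -> perform (KV.del k)
    }
  }
}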
We’ve added replication without modifying anything else. The kv handler is unchanged. The journaling handler is unchanged. The transaction handler is unchanged. The user code is unchanged.
This is fire-and-forget replication, which is the wrong semantics for a real database — you want acknowledgments, conflict resolution, quorum, leader election, the whole consensus apparatus. But the shape of the solution is already there. To add a “wait for acks from N peers” semantic, the replication handler captures the user’s continuation, files it under pending acks, and resumes it once enough acks have arrived (sketched below). This is the exact same pattern as TCP’s recv waiting for data: continuation as the wait slot. The handler grows; the layers around it don’t.
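A sketch of that shape, not a working replica: Wrap_KvOp_Acked, NoC.ack, and fresh_id are invented names here, and real code would also need timeouts and duplicate-ack handling.

handler replication_acked (peers : core_id list) (quorum : int) = {
  let waiting = Hash.empty () in (* op id -> (acks still needed, parked user continuation) *)
  {
  | KV.set k v, c ->
      let id = fresh_id () in
      List.iter (fun peer ->
        perform (NoC.send peer (Wrap_KvOp_Acked (id, Set k v)))
      ) peers;
      perform (KV.set k v);
      Hash.insert waiting id (quorum, c)   (* no resume: the user parks here *)
  | NoC.ack id, c_ack ->
      (match Hash.find waiting id with
       | Some (1, c) -> Hash.remove waiting id; resume c_ack (); resume c ()
       | Some (n, c) -> Hash.insert waiting id (n - 1, c); resume c_ack ()
       | None -> resume c_ack ())
  }
}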
What we just built
Let me count what’s in the stack now:
with in_memory_kv in {
with journaling (open_journal "kv.log") in {
with replication [peer1; peer2] in {
with with_transactions in {
user_code ()
}
}
}
}
Four handlers, each maybe 30-50 lines. The user gets transactions, durability, and replication by writing perform KV.set. The handlers compose. Each handler is independently testable, independently swappable, independently inspectable.
The database is not anywhere. There’s no database.cml file. There’s no class called Database. There’s no module that announces itself. The database is the emergent property of stacking these four handlers. The user can call this configuration whatever they want; whatever it’s called, it has the properties of a database.
The general pattern
The structural move that’s been made repeatedly:
- Identify a behavior that some users want (transactions, durability, replication).
- Write a handler that provides that behavior by intercepting the relevant effects, doing the necessary bookkeeping, and forwarding to the layer below.
- Stack the handler.
The handler is the abstraction unit. The stack is the program. The layers compose because their interfaces (the effects) are typed and the type system enforces compatibility.
This is the same move that:
- A kernel makes when it adds preemption by installing a Tick handler (chapter 11).
- A protocol stack makes when it layers IP on Link on Hardware (chapter 15).
- A debugger makes when it interposes a tracing handler (sketched after this list).
- A test harness makes when it stubs out side-effecting layers.
- A profiler makes when it counts effects.
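For instance, the tracing handler from the list above is a few lines of pure interposition; Log.info is an assumed effect.

(* Trace every KV effect, then forward it unchanged. *)
handler kv_trace = {
  | KV.get k, c   -> perform (Log.info ("get " ^ k)); resume c (perform (KV.get k))
  | KV.set k v, c -> perform (Log.info ("set " ^ k)); perform (KV.set k v); resume c ()
  | KV.del k, c   -> perform (Log.info ("del " ^ k)); perform (KV.del k); resume c ()
}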
The reason it composes well: handlers are first-class language constructs with type-checked interfaces. Adding a layer doesn’t require modifying any other layer’s source. Removing a layer is just as easy. Reordering them changes semantics in well-defined ways, as the next sketch shows.
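One concrete reordering: slide journaling inside transactions and the journal becomes an intent log (every attempted write, including ones later rolled back) rather than a commit log.

with in_memory_kv in {
  with with_transactions in {
    with journaling (open_journal "intent.log") in {
      user_code ()   (* journaling now sees writes before they are buffered *)
    }
  }
}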
In OO, the closest analog is the decorator pattern, which is hampered by the lack of type-level enforcement and the need for boilerplate. In functional programming without effects, the closest analog is monad transformers, which are hampered by the lift/wrap bookkeeping that makes them tedious in practice. Effect handlers, given a type system that tracks them, make the composition free.
The honest caveats
A few things this chapter glossed over that matter for a real database:
Concurrency control. With transactions and concurrent users, you need isolation (serializable, read-committed, snapshot, etc.). The transaction handler shown is single-user. Multi-user requires either a lock manager handler or an MVCC-style version-tracking handler. Both are doable; both add complexity inside the handler. The composition story doesn’t change; the inside of one handler grows.
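To gesture at the shape, a lock-manager layer would reuse the park-the-continuation pattern; the Lock effect and handler below are assumptions, not a worked design.

effect Lock {
  acquire : key -> ()
  release : key -> ()
}

handler lock_manager = {
  let held = Hash.empty () in   (* key -> queue of parked waiters *)
  {
  | Lock.acquire k, c ->
      (match Hash.find held k with
       | None         -> Hash.insert held k []; resume c ()     (* free: take it *)
       | Some waiters -> Hash.insert held k (waiters @ [c]))    (* park, no resume *)
  | Lock.release k, c ->
      (match Hash.find held k with
       | Some (w :: rest) -> Hash.insert held k rest; resume c (); resume w ()
       | _                -> Hash.remove held k; resume c ())
  }
}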
Crash recovery semantics. The journaling handler shown is write-ahead logging. To get the right durability guarantee, the journal must be flushed to durable storage before acknowledging the write. The handler should perform a Storage.fsync effect; the handler that interprets Storage.fsync decides what durability means (write to actual flash, write to a NoC peer, etc.).
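Amended, assuming Storage.fsync takes the journal (the effect name is the text's suggestion; the signature is an assumption):

handler journaling_wal (j : journal) = {
  | KV.set k v, c ->
      journal_write j (Set k v);
      perform (Storage.fsync j);   (* durable before we acknowledge *)
      perform (KV.set k v);
      resume c ()
  | KV.del k, c ->
      journal_write j (Del k);
      perform (Storage.fsync j);
      perform (KV.del k);
      resume c ()
  | KV.get k, c ->
      resume c (perform (KV.get k))
}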
Replication consistency. Fire-and-forget gives no consistency guarantee. Strong consistency requires consensus (Paxos, Raft) which is substantially harder. The handler structure can express consensus, but a real implementation is hundreds of lines per replica. The structural argument of this chapter holds; the engineering argument doesn’t extend to “we trivially get consensus.”
The point of the chapter isn’t that databases are easy. The point is that the compositional structure of a database — layers, with clearly typed interfaces between them, each layer adding capability — is what effect handlers naturally express. Real systems have always been built this way; the design we’ve assembled makes the structure explicit and the composition cheap.
What this chapter committed to
A key-value store, transactions, persistence, and (sketched) replication, built as four stackable handlers. The recognition that this composition pattern is the same one used for kernel preemption, protocol layering, and other apparently-distinct concerns. The acknowledgment that the structural ease does not eliminate the essential complexity of real database engineering.
The next chapter takes the same compositional move into concurrency: building a coroutines runtime, and discovering that what we’ve built is the runtime we already had.