PandaStack

Sandbox lifecycle

Born, used, paused, hibernated, forked, snapshotted, deleted — every state a sandbox can be in.

A sandbox is a single Firecracker microVM with a dedicated kernel, rootfs, network namespace, and lifecycle. Sandboxes are identified by a UUID (e.g. 278a4f42-3467-4424-98e6-a547646dd0fd) and always belong to a workspace.

States

                    create
                       │
                       ▼
                   creating
                       │ snapshot restored, SSH ready
                       ▼
                    running ◄────────────┐
                       │                 │
                 pause │                 │ resume
                       ▼                 │
                    paused ──────────────┘
                       │ hibernate
                       ▼
                  hibernated  (memory written to disk, VM stopped)
                       │ wake
                       ▼
                    running

   any state ──► failed   (orchestrator marks unhealthy sandboxes)
   any state ──► deleted  (rootfs purged, slot released)
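
The diagram can also be read as a transition table. The following is a minimal sketch of that reading, not PandaStack's actual implementation: the action names "ready", "fail", and "delete" are hypothetical labels for the unlabeled edges, and treating deleted as terminal is an assumption the diagram does not state.

```python
from enum import Enum


class State(Enum):
    CREATING = "creating"
    RUNNING = "running"
    PAUSED = "paused"
    HIBERNATED = "hibernated"
    FAILED = "failed"
    DELETED = "deleted"


# (current state, action) -> next state, transcribed from the diagram.
# "ready" is a hypothetical name for "snapshot restored, SSH ready".
TRANSITIONS = {
    (State.CREATING, "ready"): State.RUNNING,
    (State.RUNNING, "pause"): State.PAUSED,
    (State.PAUSED, "resume"): State.RUNNING,
    (State.PAUSED, "hibernate"): State.HIBERNATED,
    (State.HIBERNATED, "wake"): State.RUNNING,
}


def next_state(current: State, action: str) -> State:
    """Return the state reached from `current` via `action`, or raise ValueError."""
    # Assumption: once deleted, a sandbox accepts no further transitions.
    if current is State.DELETED:
        raise ValueError("deleted is treated as terminal")
    # "any state -> failed" and "any state -> deleted" from the diagram.
    if action == "fail":
        return State.FAILED
    if action == "delete":
        return State.DELETED
    try:
        return TRANSITIONS[(current, action)]
    except KeyError:
        raise ValueError(
            f"cannot {action} a sandbox in state {current.value}"
        ) from None
```

For example, next_state(State.PAUSED, "hibernate") returns State.HIBERNATED, while next_state(State.RUNNING, "wake") raises ValueError because the diagram has no such edge.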

Lifetime guarantees

  • Cold create → running: P50 250 ms, P99 350 ms (snapshot mode on XFS+reflink).
  • Pause → resume: < 5 ms (a plain Firecracker VM pause; memory stays resident and nothing is written to disk).
  • Hibernate → wake: ~150 ms (write memory state to disk, restore from it).
  • Fork: ~50 ms per child (memory CoW + rootfs reflink).

Lease semantics

Every running sandbox holds a lease in the shared store. The agent that owns it refreshes the lease every 10 s. If the lease expires (agent crash, network partition), the scheduler marks the sandbox failed and releases its NATID slot.

This is what makes a multi-node cluster safe: a dead agent's sandboxes don't linger as zombies.
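
The lease mechanics can be sketched as follows. This is an illustrative in-memory model under stated assumptions, not the real shared store: the class and method names are hypothetical, and the 30 s TTL is an assumption (the text only gives the 10 s refresh interval). The clock is passed in explicitly so the behavior is easy to follow.

```python
from dataclasses import dataclass

REFRESH_INTERVAL_S = 10.0  # from the text: agents refresh every 10 s
LEASE_TTL_S = 30.0         # assumption: the TTL is not stated in the text


@dataclass
class Lease:
    sandbox_id: str
    agent_id: str
    expires_at: float


class LeaseStore:
    """Hypothetical stand-in for the shared store that holds leases."""

    def __init__(self) -> None:
        self._leases: dict[str, Lease] = {}

    def refresh(self, sandbox_id: str, agent_id: str, now: float) -> None:
        """Called by the owning agent; pushes the expiry forward by one TTL."""
        self._leases[sandbox_id] = Lease(sandbox_id, agent_id, now + LEASE_TTL_S)

    def reap_expired(self, now: float) -> list[str]:
        """Called by the scheduler; returns sandbox IDs whose lease lapsed.

        A real scheduler would mark each of these failed and release its slot.
        """
        expired = [sid for sid, lease in self._leases.items()
                   if lease.expires_at <= now]
        for sid in expired:
            del self._leases[sid]
        return sorted(expired)
```

In this model, a sandbox whose agent stops refreshing is reaped one TTL after its last refresh, while sandboxes on healthy agents keep pushing their expiry forward and are never returned by reap_expired.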

Health monitor

While a sandbox is running, the agent runs a background health monitor that:

  • Probes the SSH socket every 5 s.
  • Checks Firecracker's /state endpoint every 30 s.
  • Tracks RSS + vCPU usage for metrics.
  • Marks the sandbox failed after 3 consecutive failed probes.
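
The "3 consecutive failed probes" rule can be sketched as a small counter. This is a minimal illustration, not the agent's actual code: ProbeTracker is a hypothetical name, and using one shared counter for all probe types is an assumption, since the text does not say whether SSH and Firecracker probes are counted separately.

```python
FAILURE_THRESHOLD = 3  # from the text: failed after 3 consecutive failed probes


class ProbeTracker:
    """Tracks consecutive probe failures for one sandbox (hypothetical helper)."""

    def __init__(self, threshold: int = FAILURE_THRESHOLD) -> None:
        self.threshold = threshold
        self.consecutive_failures = 0
        self.failed = False

    def record(self, ok: bool) -> bool:
        """Record one probe result; return True once the sandbox counts as failed."""
        if self.failed:
            # Assumption: failed is sticky; later successes do not revive it.
            return True
        if ok:
            # Any successful probe resets the streak.
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.threshold:
                self.failed = True
        return self.failed
```

Feeding it two failures, one success, then three failures marks the sandbox failed only on the last probe, because the success in the middle resets the streak.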

You can subscribe to lifecycle changes via the events stream.
