2 changes: 2 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default.

9 changes: 9 additions & 0 deletions architecture/compute-runtimes.md
@@ -36,6 +36,15 @@ when a sandbox create request asks for GPU resources.
| VM | Experimental microVM isolation. | Per-sandbox libkrun VM. | Managed endpoint-backed driver. The gateway spawns `openshell-driver-vm`, waits for its Unix socket, and then consumes it through the same remote `compute_driver.proto` path used by unmanaged endpoint drivers. The VM driver boots a cached bootstrap `rootfs.ext4`, prepares requested OCI images inside a bootstrap VM with `umoci`, attaches the prepared image disk read-only, and gives each sandbox a writable `overlay.ext4` for merged-root changes and runtime material. The driver persists each accepted launch request beside the overlay and restarts those VMs on driver startup without recreating the overlay. |
| Extension | Out-of-tree drivers operated alongside the gateway. | Whatever boundary the driver implements. | Selected by a non-reserved custom `compute_drivers = ["<name>"]` entry with `[openshell.drivers.<name>].socket_path`, or at launch time by pairing `--drivers <name>` with `--compute-driver-socket=<path>`. Reserved built-in names such as `vm`, `docker`, `podman`, and `kubernetes` cannot be used as unmanaged socket endpoints. The gateway connects to a UDS the operator already provisioned, runs `GetCapabilities`, logs the advertised `driver_name`, and dispatches all sandbox lifecycle calls through `compute_driver.proto`. The driver process and socket lifecycle are operator-owned; the gateway does not spawn, supervise, or remove unmanaged extension drivers. The trust boundary is the socket's filesystem permissions: the operator must ensure only the gateway uid can read/write it. |

The Docker driver treats container labels as discovery metadata rather than
proof of ownership. It reconciles a persisted sandbox against the recorded
container ID and a driver-issued, privately journaled instance generation. The
NemoClaw compatibility handoff journals an exact old-ID/new-ID intent and
requires two consistent observations before adoption; the same generation can
recover a missed overlap without deleting durable sandbox state. Missing or
incorrect generations, legacy containers, and ambiguous candidates fail closed
with an operator-visible warning.

Per-sandbox CPU and memory values currently enter the driver layer through
template resource limits. Docker and Podman apply them as runtime limits.
Kubernetes mirrors each limit into the matching request. VM accepts the fields
2 changes: 2 additions & 0 deletions crates/openshell-driver-docker/Cargo.toml
@@ -21,6 +21,8 @@ tracing = { workspace = true }
bytes = { workspace = true }
serde = { workspace = true }
serde_json = { workspace = true }
sha2 = { workspace = true }
uuid = { workspace = true }
prost-types = { workspace = true }
bollard = { version = "0.20" }
tar = "0.4"
43 changes: 43 additions & 0 deletions crates/openshell-driver-docker/README.md
@@ -19,6 +19,49 @@ Desktop, OrbStack, and macOS-hosted gateways, those names use Docker's
`host-gateway` alias. On native Linux Docker, the gateway also binds the bridge
gateway IP so containers can call back to the host process.

## Container Instance Ownership

Docker labels identify containers for discovery and cleanup. They do not grant
authority to represent a persisted sandbox because another Docker client can
copy them. The driver records the authoritative container ID from the Docker
create response and restores it from `Sandbox.status.agent_pod` at gateway
startup. New containers also receive a driver-issued instance-generation label.
That label is not a secret; it is checked against an owner-only journal under
`$XDG_STATE_HOME/openshell/docker-sandbox-instances/<sha256(namespace)>/`, so
generic copied identity labels are insufficient to authorize a replacement.
The local Docker driver assumes that Docker-daemon access remains trusted host
authority.

The journal is written after Docker returns the created container ID and before
the driver publishes it as managed. It durably records the current ID, earlier
IDs, generation, and any exact replacement intent. A crash before the first
journal write fails closed as unresolved ownership. A crash after a replacement
journal commit can recover even if the public sandbox status still contains the
previous ID.
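
The recovery rule in this paragraph can be sketched as a small resolution function. This is a minimal illustration, not the driver's code: `JournalEntry`, `Resolution`, and `resolve_owner` are hypothetical names, and the treatment of a status ID that the journal has never seen (unresolved) is an assumption consistent with the fail-closed wording above.

```rust
/// Hypothetical in-memory view of one journal record.
#[derive(Debug)]
struct JournalEntry {
    /// Container ID the journal currently treats as authoritative.
    current_id: String,
    /// IDs that were authoritative before a committed replacement.
    earlier_ids: Vec<String>,
}

/// Hypothetical outcome of reconciling the journal with the public status.
#[derive(Debug, PartialEq, Eq)]
enum Resolution {
    /// Ownership is established for this container ID.
    Owned(String),
    /// No trustworthy record exists; the driver fails closed.
    Unresolved,
}

/// Resolves the authoritative container ID from the journal and the ID
/// stored in the public sandbox status.
fn resolve_owner(journal: Option<&JournalEntry>, status_id: &str) -> Resolution {
    match journal {
        // Crash before the first journal write: nothing to trust.
        None => Resolution::Unresolved,
        // Status and journal agree on the current ID.
        Some(j) if j.current_id == status_id => Resolution::Owned(j.current_id.clone()),
        // Status still holds a previous ID: the committed journal wins.
        Some(j) if j.earlier_ids.iter().any(|id| id == status_id) => {
            Resolution::Owned(j.current_id.clone())
        }
        // Assumption: an ID the journal never recorded is not authoritative.
        Some(_) => Resolution::Unresolved,
    }
}
```

In this sketch, a status that lags behind a committed replacement resolves to the journal's current ID, while a missing journal never yields ownership.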

The driver permits one compatibility handoff used by NemoClaw's Docker GPU
patch. The current authoritative container must remain present in the `exited`
state under the exact `<canonical-name>-nemoclaw-gpu-backup-<digits>` name while
exactly one active replacement uses the canonical name and carries the recorded
generation. The first consistent observation journals an intent for those exact
container IDs; a subsequent consistent observation adopts the replacement. If
polling observes the gap or misses the overlap, the durable sandbox is retained.
One canonical, generation-matching replacement can then be authorized and
adopted through the same two-observation intent without resurrecting a skeletal
sandbox record.
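
The backup-name check and the two-observation rule described above can be sketched as follows. All names here (`is_gpu_backup_name`, `Intent`, `Step`, `Handoff`) are hypothetical, and two details are assumptions rather than documented behavior: the `<digits>` suffix is taken to mean one or more ASCII digits, and a second observation that names a different ID pair is treated as a conflict.

```rust
/// Returns true when `candidate` is exactly
/// `<canonical>-nemoclaw-gpu-backup-<digits>`.
/// Assumption: `<digits>` means one or more ASCII digits.
fn is_gpu_backup_name(canonical: &str, candidate: &str) -> bool {
    const INFIX: &str = "-nemoclaw-gpu-backup-";
    candidate
        .strip_prefix(canonical)
        .and_then(|rest| rest.strip_prefix(INFIX))
        .map_or(false, |digits| {
            !digits.is_empty() && digits.bytes().all(|b| b.is_ascii_digit())
        })
}

/// Exact old-ID/new-ID pair journaled after the first consistent observation.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Intent {
    old_id: String,
    new_id: String,
}

/// Result of feeding one consistent observation into the tracker.
#[derive(Debug, PartialEq, Eq)]
enum Step {
    /// First observation: the intent was recorded, nothing is adopted yet.
    IntentRecorded,
    /// Second observation of the same pair: the replacement is adopted.
    Adopted,
    /// The observation names a different pair than the recorded intent.
    Conflict,
}

/// Hypothetical per-sandbox handoff tracker.
#[derive(Debug, Default)]
struct Handoff {
    intent: Option<Intent>,
}

impl Handoff {
    /// Applies one observation of (exited backup ID, active replacement ID).
    fn observe(&mut self, old_id: &str, new_id: &str) -> Step {
        let seen = Intent {
            old_id: old_id.to_owned(),
            new_id: new_id.to_owned(),
        };
        let step = match &self.intent {
            None => Step::IntentRecorded,
            Some(prev) if *prev == seen => Step::Adopted,
            Some(_) => Step::Conflict,
        };
        match &step {
            Step::IntentRecorded => self.intent = Some(seen),
            // Adoption completes the handoff, so the intent is cleared.
            Step::Adopted => self.intent = None,
            // Assumption: a conflicting observation leaves the intent in place.
            Step::Conflict => {}
        }
        step
    }
}
```

A single observation therefore never adopts a replacement on its own. The sketch keeps the intent in memory only; the README states that the real intent is journaled durably.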

Ambiguous candidates, terminal replacements, interrupted intents, and missing
or incorrect generations fail closed as ownership conflicts. Containers created
by an older driver without a journaled generation cannot use the external
handoff; recreate them through OpenShell first so the driver can establish the
generation. Explicit sandbox deletion removes every exact identity match, then
clears the driver's in-memory and journaled ownership.

Missing instances, ignored unowned containers, duplicate candidates, rejected
replacements, and accepted handoff or rollback transitions produce deduplicated
warnings and warning-level driver platform events. Notices generated before a
watcher subscribes remain eligible for delivery on the first watched snapshot.
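
One way to picture the deduplication and late-subscriber behavior is a buffer that remembers which notices it has already recorded and holds undelivered ones until a snapshot is watched. This is a sketch under assumptions: `NoticeBuffer`, its string keys, and both method names are hypothetical and do not reflect the driver's actual event types.

```rust
use std::collections::HashSet;

/// Hypothetical buffer for ownership warnings.
#[derive(Debug, Default)]
struct NoticeBuffer {
    /// Keys of every notice recorded so far, used for deduplication.
    seen: HashSet<String>,
    /// Notices recorded but not yet handed to a watcher, in arrival order.
    pending: Vec<String>,
}

impl NoticeBuffer {
    /// Records a warning once. Returns false when the key was already seen.
    fn warn(&mut self, key: &str) -> bool {
        if self.seen.insert(key.to_owned()) {
            self.pending.push(key.to_owned());
            true
        } else {
            false
        }
    }

    /// Returns every undelivered notice and empties the pending queue,
    /// as the first watched snapshot would.
    fn drain_for_snapshot(&mut self) -> Vec<String> {
        std::mem::take(&mut self.pending)
    }
}
```

Notices recorded before any watcher exists stay in `pending`, so the first call to `drain_for_snapshot` still delivers them, and repeated warnings for the same key are recorded only once.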

## Container Contract

The driver-controlled container settings are part of the sandbox security