The Session ID Becomes a Primitive: Inside AetherFS's Bet on Portable Agent Workspaces

Published: Nov 6, 2025

Topic: Engineering

A workspace that travels across HTTP, gRPC, and a local FUSE mount looks unremarkable until you try to build an agent workflow without one.

Consider a concrete scenario, drawn from how production agent workflows actually run today. A backend orchestrator spawns an agent to investigate a bug in a 4 GB monorepo. The agent creates a workspace, runs a partial build, and pauses while waiting for a model response. A second agent forks from the first to test a different hypothesis. A human engineer, alerted by a Slack notification, mounts the original workspace locally to spot-check the diff. The first agent resumes, hands the result to a third agent for evaluation, and shuts down. The second agent's branch is preserved for replay. Total elapsed wall-clock time: twelve minutes.

Now build that scenario without a portable session abstraction. Every transition in the sequence becomes a serialization problem: snapshot to S3, push a container image, generate a tarball, hope dependencies still install on the receiving side. The simplicity of the original sequence depends on a single primitive, an addressable workspace, that most agent platforms today do not have. AetherFS is building it. The question is whether that primitive ends up shaped like a URL, a container image, or something genuinely new.

The Pre-Session Era

The workflows above are not aspirational. Some version of each is running in production at every serious agent platform today. The implementations, however, are universally awkward.

Snapshotting a workspace currently looks like one of three patterns, all imperfect. The first is to push a tarball or zip archive to object storage and pass the URI. This works for small workspaces and breaks down past a few hundred megabytes, both because of upload latency and because dependency state often does not survive serialization. The second is to commit the state to a Git branch and pass the SHA, which is clean for source code but incomplete for caches, build outputs, and any state the agent generated that does not belong in version control. The third is to snapshot a container image and push it to a registry, which is comprehensive but can take five to ten minutes for a meaningful workspace, defeating the purpose of fast handoff.

Each pattern requires the receiving side to materialize the state from scratch. A new VM boots, pulls the image or the tarball, restores dependencies, and resumes. The time and cost overhead per handoff scales with the size of the workspace, not with the size of the change.
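The scaling argument can be made concrete with back-of-the-envelope arithmetic. The sketch below is a toy cost model, not a measurement: the 100 MB/s throughput and the 20 MB delta are assumed values chosen for illustration, and only the 4 GB workspace size comes from the scenario above. It ignores VM boot and dependency restoration, which would only widen the gap.

```python
# Toy cost model for a single workspace handoff. All figures except the
# 4 GB workspace size are illustrative assumptions, not measurements.

GB = 10**9
MB = 10**6


def snapshot_handoff_seconds(workspace_bytes: int, bandwidth: int) -> float:
    """Sender uploads the whole workspace, receiver downloads all of it."""
    return 2 * workspace_bytes / bandwidth


def delta_handoff_seconds(changed_bytes: int, bandwidth: int) -> float:
    """Only the bytes that changed since the shared base cross the wire."""
    return changed_bytes / bandwidth


workspace = 4 * GB    # the monorepo from the scenario
delta = 20 * MB       # assumed size of the agent's changes
bandwidth = 100 * MB  # assumed sustained throughput, in bytes per second

snapshot_cost = snapshot_handoff_seconds(workspace, bandwidth)  # 80.0 seconds
delta_cost = delta_handoff_seconds(delta, bandwidth)            # 0.2 seconds
```

Under these assumptions the snapshot path pays for every byte of the workspace twice, while the delta path pays only for what changed. Doubling the workspace doubles the first number and leaves the second untouched.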

The teams building agent platforms know this is bad. The workarounds are visible in their architecture diagrams: dedicated snapshot services, custom S3 caching layers, image-layering tricks to reduce push time. Each platform has invented some version of the same set of compromises. None of them has a primitive that obviously replaces the pattern, because the primitive does not exist as a generally available component.

What a Session ID Wants to Be

The shape that emerges, when you build the primitive properly, is closer to a URL than a container image. A session ID identifies a specific workspace state. It is portable, addressable, and short. It does not encode the contents of the workspace, only a reference to them. The contents live in a content-addressable store, deduplicated across all sessions on the same base.
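A minimal sketch of that shape, for illustration only: `ContentStore` and `SessionRegistry` are hypothetical names invented here, not AetherFS types, and the in-memory dictionaries stand in for whatever storage the real system uses. The point is that the session ID is an opaque handle to a manifest of digests, and that identical content across sessions is stored once.

```python
import hashlib
import uuid


class ContentStore:
    """Blobs keyed by the SHA-256 of their bytes, so equal content is stored once."""

    def __init__(self) -> None:
        self._blobs = {}

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(digest, data)  # no-op if already present
        return digest

    def get(self, digest: str) -> bytes:
        return self._blobs[digest]

    def blob_count(self) -> int:
        return len(self._blobs)


class SessionRegistry:
    """Maps short opaque session IDs to manifests of path -> content digest."""

    def __init__(self, store: ContentStore) -> None:
        self._store = store
        self._manifests = {}

    def create(self, files: dict) -> str:
        session_id = uuid.uuid4().hex  # opaque: reveals nothing about contents
        self._manifests[session_id] = {
            path: self._store.put(data) for path, data in files.items()
        }
        return session_id

    def read(self, session_id: str, path: str) -> bytes:
        return self._store.get(self._manifests[session_id][path])


store = ContentStore()
registry = SessionRegistry(store)

base_files = {"src/main.py": b"print('hello')\n", "README.md": b"# demo\n"}
first = registry.create(base_files)
# A second session that differs from the first in exactly one file.
second = registry.create({**base_files, "README.md": b"# demo, edited\n"})
```

Two sessions with two files each end up as three blobs in the store, because the unchanged `src/main.py` is shared. The IDs themselves stay the same length no matter how large the workspace grows.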

In the AetherFS design, a session is a copy-on-write overlay over a shared base. Forking is an O(1) metadata operation regardless of base size, because the fork does not copy bytes; it creates a new overlay that shares the base by reference. Materializing a session for read or write does not require pulling the entire base into local storage; only the blocks actually touched are read through the FUSE layer, with the content-addressable store underneath handling cache and dedup. The session ID is the key to all of this. Pass it, and the receiving side can mount the workspace through gRPC, HTTP, WebDAV, or FUSE without any prior coordination.
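The constant-time fork can be sketched with a chain of layers. This is an assumption-laden toy, not the AetherFS implementation: `Layer` and `Session` are hypothetical names, files are whole byte strings rather than blocks, and deletions are left out. Forking freezes the current layer and gives each side a fresh empty layer on top of it, so no file data is copied.

```python
class Layer:
    """One set of writes stacked on an optional parent layer."""

    def __init__(self, parent=None):
        self.parent = parent
        self.writes = {}


class Session:
    """A view over a chain of layers; only the top layer is ever written."""

    def __init__(self, top: Layer) -> None:
        self._top = top

    def read(self, path: str) -> bytes:
        layer = self._top
        while layer is not None:  # newest write wins
            if path in layer.writes:
                return layer.writes[path]
            layer = layer.parent
        raise FileNotFoundError(path)

    def write(self, path: str, data: bytes) -> None:
        self._top.writes[path] = data

    def fork(self) -> "Session":
        # Freeze the current top layer. Both sessions continue on new empty
        # layers that point at it, so the cost is independent of base size.
        frozen = self._top
        self._top = Layer(parent=frozen)
        return Session(Layer(parent=frozen))


base = Layer()
base.writes["README.md"] = b"v1"

original = Session(Layer(parent=base))
original.write("notes.txt", b"hypothesis A")

branch = original.fork()
branch.write("notes.txt", b"hypothesis B")  # visible only in the branch
original.write("build.log", b"ok")          # written after the fork
```

After the fork, each session sees its own version of `notes.txt`, both still read `README.md` from the shared base, and the branch never sees `build.log`. In this toy, read cost grows with the depth of the layer chain; a real system would presumably index or flatten layers, which is outside what this sketch shows.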

The architectural commitment that matters here is that the session is a first-class entity in the system, not a serialized snapshot. This sounds like a small distinction. It is the entire distinction. A snapshot is a copy. A session is a reference. The cost of producing and moving a snapshot scales linearly with workspace size. The cost of passing a reference stays constant no matter how large the workspace is. The asymptotic difference shows up in workflows that involve frequent handoff, which is most agent workflows above a trivial scale.

The team has described a goal of creating sessions in single-digit seconds, with the marginal cost of each additional session approaching the cost of a metadata insert.

Three Workflows Sessions Unlock

Three workflows are worth describing in detail because each one breaks the existing per-VM model in a different way.

The first is parallel exploration. An agent platform tasked with finding the best fix for a bug spawns 30 forks of the same base session, each running a different proposed solution. Each fork costs a metadata operation, not a clone. The forks run in parallel, each writing only its delta. At the end, the orchestrator inspects the results, picks the best one, and discards the rest. Without session-level forking, this workflow is prohibitively expensive: 30 cold VM starts, 30 dependency installations, 30 full repository clones. With it, the workflow is cheap enough to be the default.
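The fan-out pattern can be sketched with the standard library alone. Here `collections.ChainMap` stands in for a forkable workspace, since `new_child()` creates an empty writable layer over a shared parent without copying it. Everything else is hypothetical scaffolding: `explore`, `make_candidate`, and the scoring function are invented for illustration, and a real orchestrator would run agents and test suites rather than lambdas.

```python
from collections import ChainMap
from concurrent.futures import ThreadPoolExecutor


def explore(base: ChainMap, candidates, evaluate):
    """Run every candidate in its own fork and return (best_score, best_fork)."""

    def run(candidate):
        fork = base.new_child()  # empty delta; the base is shared by reference
        candidate(fork)          # the candidate writes only into its own delta
        return evaluate(fork), fork

    with ThreadPoolExecutor() as pool:
        results = list(pool.map(run, candidates))
    # Forks that lose are simply dropped; nothing was copied, so nothing to clean up.
    return max(results, key=lambda result: result[0])


def make_candidate(index: int):
    """Hypothetical stand-in for one proposed fix."""

    def apply(workspace) -> None:
        workspace["fix.patch"] = f"candidate-{index}"

    return apply


def score(workspace) -> int:
    """Hypothetical stand-in for a test suite: candidate 17 scores highest."""
    index = int(workspace["fix.patch"].rsplit("-", 1)[1])
    return -abs(index - 17)


base = ChainMap({"src/main.py": "print('bug')"})  # treated as read-only
best_score, best_fork = explore(base, [make_candidate(i) for i in range(30)], score)
```

Each of the 30 forks holds a single-entry delta, and the base is untouched afterwards. The sketch treats the base as read-only by convention; enforcing that is left to the real substrate.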

The second is human-in-the-loop review without environment drift. A human engineer reviewing an agent's work today usually pulls the diff into their local environment and tries to reproduce the conditions. This is reliably terrible. Dependency versions drift, environment variables differ, build caches are missing. With a portable session, the human mounts the exact session the agent used, with the exact state at the moment the agent finished. The review happens in the actual environment, not a reconstruction of it.

The third is deterministic replay. For evaluation pipelines and debugging, the same agent task started from the same session ID produces a reproducible result, modulo the model's own non-determinism. The substrate guarantees the input state is identical. This matters for regression testing, for benchmark suites, and for diagnosing the inevitable cases where an agent succeeds in development and fails in production.
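One way to check the "identical input state" property before a replay is to fingerprint the workspace and compare. The sketch below is an illustration under assumptions, not a description of how AetherFS verifies state: `state_fingerprint` is a hypothetical helper, and the workspace is modeled as a plain mapping of path to bytes.

```python
import hashlib
from typing import Mapping


def state_fingerprint(files: Mapping[str, bytes]) -> str:
    """Hash paths and contents in sorted order so insertion order is irrelevant."""
    hasher = hashlib.sha256()
    for path in sorted(files):
        name = path.encode("utf-8")
        data = files[path]
        # Length prefixes keep ("ab", "c") distinct from ("a", "bc").
        hasher.update(len(name).to_bytes(8, "big"))
        hasher.update(name)
        hasher.update(len(data).to_bytes(8, "big"))
        hasher.update(data)
    return hasher.hexdigest()


recorded = {"src/main.py": b"x = 1\n", "requirements.txt": b"requests==2.32.0\n"}
# Same files, listed in a different order, as a replay might load them.
replayed = {"requirements.txt": b"requests==2.32.0\n", "src/main.py": b"x = 1\n"}
# A drifted environment: one dependency pin differs.
drifted = {**recorded, "requirements.txt": b"requests==2.31.0\n"}

same_input = state_fingerprint(recorded) == state_fingerprint(replayed)
detected_drift = state_fingerprint(recorded) != state_fingerprint(drifted)
```

A replay harness could record the fingerprint alongside each run and refuse to compare results when fingerprints differ, which separates substrate drift from the model's own non-determinism.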

Each of these workflows is technically possible without session primitives. None is economical without them. The shift, when it happens, will look less like a new feature and more like a new class of workflows becoming available because the cost structure stopped fighting them.

The Composability Problem

A session ID is only as useful as the systems that recognize it. A URL became foundational not because the abstraction was clever but because every system on the web agreed to handle URLs the same way. A container image became foundational for the same reason. The session-as-primitive bet only works if a similar agreement happens for agent workspaces.

This is the strategic question underneath AetherFS's architecture. If the session abstraction stays proprietary, it remains a useful internal tool but never reaches the network effects that make a primitive durable. If it becomes a standard, even a de facto one, it could end up doing for agent infrastructure what container images did for compute portability.

The AetherFS team's bet, paraphrased from public design materials, is that an open-source Rust core is the right form factor for a substrate that needs to be trusted at the infrastructure layer. The reasoning has two parts. First, infrastructure that touches the filesystem cannot be a black box; serious users will not adopt it without source visibility. Second, an open core lets adjacent vendors (CDEs, evaluation platforms, agent orchestrators) integrate the abstraction without depending on a single vendor for runtime availability. Whether either of these conditions actually drives adoption is what the next 18 months of evidence will show.

The Forward Bet

What follows is speculative, and flagged as such. The session ID may end up as the foundational primitive of agent infrastructure, in the same way the URL was for the web and the container image was for cloud computing. It may also be a useful abstraction that gets absorbed into a larger platform stack and never reaches general visibility. Both outcomes are consistent with the technical merit of the design. The deciding factor is adoption, which is downstream of distribution, which is downstream of timing.