Bash

@flow-state-dev/tools/bash — Execute bash commands in a sandboxed workspace with automatic resource-backed persistence.

Why this exists

Agents that write code, run scripts, or manage files need a real filesystem. But agent state should be portable and persistent, not tied to a particular machine. The bash tool bridges these two needs: files live as framework resources for persistence, get materialized into a sandbox for execution, and sync back after mutations.

The sandbox is the execution surface. Resource collections are the source of truth. Bash mounts each collection at its pattern prefix, routes writes back by longest-prefix match, and keeps a reserved ./tmp/ directory for scratch that never persists.

Recommended path — `createBashCapability`

Almost every real setup wants a capability: a bundle of tool blocks and dynamic context guidance, wired in with a single uses: [bashCap] on a generator.

import { createBashCapability } from "@flow-state-dev/tools/bash";

export const bashCap = createBashCapability({
  provider: { type: "local" },
});

That's it. No sessionResources, no collectionKey — bash auto-discovers every ResourceCollectionRef installed on the block's resource context at runtime and mounts each at its pattern prefix:

Collection pattern	Workspace path
`artifacts/**`	`/workspace/artifacts/...`
`skills/**`	`/workspace/skills/...`
`logs/**`	`/workspace/logs/...`

Writes are routed back to the owning collection by longest-prefix match. The agent creating /workspace/skills/new-thing/SKILL.md in bash persists to the skills collection; /workspace/artifacts/report.md persists to artifacts.
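The routing rule can be sketched in a few lines. This is a minimal illustration, not the framework's implementation: the Mount shape, prefixOf, routeWrite, and the nested skill-drafts collection are hypothetical names introduced here, and paths are assumed to be relative to /workspace/.

```typescript
// Hypothetical shape: a mount pairs a collection key with its workspace-relative prefix.
interface Mount {
  key: string;
  prefix: string;
}

// Derive the mount prefix from a collection pattern: "skills/**" -> "skills/".
function prefixOf(pattern: string): string {
  const star = pattern.indexOf("*");
  return star === -1 ? pattern : pattern.slice(0, star);
}

// Pick the mount with the longest prefix that matches the path.
function routeWrite(relativePath: string, mounts: Mount[]): Mount | undefined {
  let best: Mount | undefined;
  for (const mount of mounts) {
    const matches = relativePath.startsWith(mount.prefix);
    if (matches && (best === undefined || mount.prefix.length > best.prefix.length)) {
      best = mount;
    }
  }
  return best;
}

const exampleMounts: Mount[] = [
  { key: "artifacts", prefix: prefixOf("artifacts/**") },
  { key: "skills", prefix: prefixOf("skills/**") },
  // Hypothetical nested collection, included only to show the longest-prefix tie-break.
  { key: "skill-drafts", prefix: prefixOf("skills/drafts/**") },
];

console.log(routeWrite("skills/new-thing/SKILL.md", exampleMounts)?.key); // "skills"
console.log(routeWrite("skills/drafts/idea.md", exampleMounts)?.key);     // "skill-drafts"
console.log(routeWrite("notes.txt", exampleMounts)?.key);                 // undefined
```

A path that matches no mount returns undefined, which corresponds to the orphan case: nothing owns the file, so nothing persists it.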

Attach the capability to a generator alongside whatever caps install your collections:

import { generator } from "@flow-state-dev/core";

const assistant = generator({
  name: "assistant",
  model: "preset/medium",
  prompt: "You can run bash commands to read and write files.",
  uses: [artifactsCap, skillsCap, bashCap],
});

Bash inherits artifacts from artifactsCap and skills from skillsCap — no coordination needed between capabilities.

The workspace layout

/workspace/
  artifacts/        # mounted collection, writes persist
  skills/           # mounted collection, writes persist
  tmp/              # scratch; never persists; silent
  <anywhere else>   # dropped on flush with a console warning

Orphan files (outside any mounted collection and outside ./tmp/) are explicitly dropped with a warning. The guidance text auto-generated by the capability tells the agent this up front so it doesn't try to write "somewhere".

Narrowing what gets mounted

An explicit collections list overrides auto-discovery:

createBashCapability({
  provider: { type: "local" },
  collections: ["artifacts"],  // only mount this one, ignore the rest
});

Strings are shorthand for writable mounts. To mount a collection as read-only — edits stay local to the sandbox, never flush back — use the object form:

createBashCapability({
  provider: { type: "local" },
  collections: [
    "artifacts",
    { key: "skills", writable: false },
  ],
});

For the "everything except X" pattern, use exclude:

createBashCapability({
  provider: { type: "local" },
  exclude: ["logs"],
});
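To make the three options concrete, here is a rough model of how a mount list could be resolved from them. The resolveMounts function and its types are hypothetical and exist only for illustration. The sketch also assumes that exclude is applied only when collections is absent; the source does not say how the two interact.

```typescript
// Hypothetical types mirroring the option shapes shown above.
type CollectionSpec = string | { key: string; writable: boolean };

interface MountOptions {
  collections?: CollectionSpec[];
  exclude?: string[];
}

interface ResolvedMount {
  key: string;
  writable: boolean;
}

// discovered: collection keys found by auto-discovery at runtime.
function resolveMounts(discovered: string[], options: MountOptions = {}): ResolvedMount[] {
  if (options.collections !== undefined) {
    // An explicit list wins over auto-discovery; bare strings mean writable mounts.
    return options.collections.map((spec) =>
      typeof spec === "string"
        ? { key: spec, writable: true }
        : { key: spec.key, writable: spec.writable },
    );
  }
  // Assumption: exclude filters the auto-discovered set only.
  const excluded = new Set(options.exclude ?? []);
  return discovered
    .filter((key) => !excluded.has(key))
    .map((key) => ({ key, writable: true }));
}

const discoveredKeys = ["artifacts", "skills", "logs"];
console.log(resolveMounts(discoveredKeys, { exclude: ["logs"] }));
console.log(resolveMounts(discoveredKeys, {
  collections: ["artifacts", { key: "skills", writable: false }],
}));
```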

Sandbox providers

Each provider implements the same Sandbox interface. Swapping providers changes where commands run without touching tool or sync logic.

Provider	Config	Best for	Real filesystem?
Local FS	`{ type: "local", cwd?: string }`	Development, local agents	Yes
just-bash	`{ type: "just-bash", python?: true, ... }`	Testing, lightweight analysis	No (in-memory)
Vercel	`{ type: "vercel", Sandbox, sandboxId? }`	Production, cloud execution	Yes (remote)
Upstash	`{ type: "upstash", client, boxId? }`	Remote sandbox	Yes (remote)
MOAT	`{ type: "moat", grants, allowHosts, ... }`	Local container isolation with credential injection	Yes (host, bind-mounted)
Custom	`{ type: "custom", sandbox: Sandbox }`	Anything else	You decide

just-bash takes a python: true toggle to expose python3 in the sandbox, and a network config to gate external HTTP access.

Adapters that wrap third-party SDKs (vercel, upstash) take the SDK from the consumer rather than dynamically importing it. Pass the Sandbox class for Vercel:

import { Sandbox } from "@vercel/sandbox";
import { createBashCapability } from "@flow-state-dev/tools/bash";

createBashCapability({
  provider: { type: "vercel", Sandbox },
});

This keeps @flow-state-dev/tools free of a peer dep on the SDK and gives bundlers (webpack, Vercel's nft file tracer) a real static import to follow when building for deployment. See apps/kitchen-sink/flows/chat-agent/shared/capabilities/bash.ts for the canonical environment-aware pattern.

For deployment configuration — OIDC Federation vs. the static VERCEL_TOKEN triple, the BASH_PROVIDER opt-in env var, and cost notes for public demos — see the Deploying to Vercel guide.

MOAT (local container isolation)

What it is

The MOAT provider runs commands in a container — an isolated Linux environment, similar to Docker — on the same machine as the agent. The host directory you point it at is bind-mounted into the container, so reads and writes go to a real filesystem you own. Outbound network calls flow through a credential-injecting proxy: the agent process never sees API keys, but requests to whitelisted hosts arrive with the right tokens attached.

MOAT is a separate CLI (majorcontext/moat) that the host operator installs once. The framework spawns it; it does not bundle or auto-install it.

When to use it

- You want OS-level isolation without sending the workspace off-host. MOAT is faster than the Vercel or Upstash sandboxes because there is no network round trip per command, but slower than local because container start adds a few seconds on the first call.
- You do not want the agent process to inherit your shell environment. The local provider exposes every variable in process.env to whatever the agent runs; MOAT's container starts with a clean environment.
- You are running on macOS 15+ on Apple Silicon (native containers) or any Linux host with Docker installed.

Skip it if you only need an in-memory test sandbox (just-bash is lighter), if you want zero install footprint (local requires nothing), or if you are already deploying on serverless (Vercel/Upstash fit better).

Installing MOAT

@flow-state-dev/tools does not bundle or auto-install the moat CLI. The host operator installs it once, then the framework spawns it like any other binary. Follow the official install guide for your platform.

After installing, confirm the version is 0.4.0 or later (required for moat exec):

moat version --json
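If you want to automate that check, a small comparison helper is enough. The sketch below is illustrative only: isAtLeast is a hypothetical helper, and it takes a plain version string because the shape of the JSON that moat version --json prints is not specified here.

```typescript
// Compare dotted version strings numerically, e.g. "0.10.0" is newer than "0.4.0".
function isAtLeast(version: string, minimum: string): boolean {
  const parse = (value: string): number[] =>
    value
      .replace(/^v/, "")
      .split(".")
      .map((part) => parseInt(part, 10) || 0);

  const actual = parse(version);
  const required = parse(minimum);
  for (let i = 0; i < 3; i++) {
    const a = actual[i] ?? 0;
    const b = required[i] ?? 0;
    if (a !== b) return a > b;
  }
  return true; // all three components are equal
}

console.log(isAtLeast("0.4.0", "0.4.0"));  // true
console.log(isAtLeast("0.3.9", "0.4.0"));  // false
console.log(isAtLeast("0.10.0", "0.4.0")); // true
```

Comparing components as numbers matters: a plain string comparison would rank "0.10.0" below "0.4.0".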

Grant the credentials the agent should be able to reach. The framework only declares which grant names a workspace needs — credentials live with MOAT, not in your code:

moat grant github
moat grant openai

See the MOAT credentials concept page for the full grant model.

Configuration

Minimal:

import { createBashCapability } from "@flow-state-dev/tools/bash";

const bashCap = createBashCapability({
  provider: {
    type: "moat",
    grants: ["github"],
    allowHosts: ["api.github.com"],
  },
});

Expanded:

const bashCap = createBashCapability({
  provider: {
    type: "moat",
    runtime: "docker",
    runName: "fsdev-myflow",
    grants: ["github", "openai"],
    allowHosts: ["api.github.com", "api.openai.com"],
    configPath: "./moat.yaml",
    execTimeoutMs: 120_000,
  },
});

Wiring cleanup

MOAT containers persist between commands. Without a teardown step, every flow request leaks one. The capability returns a handler block that releases the sandbox at request end — wire it into the flow:

import { defineFlow } from "@flow-state-dev/core";

defineFlow({
  // ...
  request: { onFinished: bashCap.cleanupBlock },
});

This is required for the MOAT provider. For local, just-bash, vercel, and upstash it is optional — cleanupBlock is returned unconditionally so the capability shape stays stable across providers.

If you call createBashTool directly outside a flow, the returned sandbox.stop() is the equivalent — call it yourself when you are done.

Persistent containers (local dev)

Cold-starting a MOAT container takes a few seconds. Locally, that adds up. Set a stable runName and persist: true so one container survives across requests — the first call pays the cold-start cost; every subsequent call is a moat exec against the live container:

createBashCapability({
  provider: {
    type: "moat",
    runName: "fsdev-dev",
    persist: true,
    grants: ["github"],
    allowHosts: ["api.github.com"],
  },
});

How it behaves:

- bashCap.cleanupBlock is still wired into request.onFinished, but becomes a no-op for the MOAT side: no moat stop, no moat destroy.
- The next request finds the running container via moat list --json and reuses it.
- The generated moat.yaml in the workspace carries an # fsdev-managed marker, so a later session knows the file is reusable instead of refusing it as user-authored.
- The container outlives the framework process. Reclaim resources with moat stop <runName> or moat clean. Changing grants / allowHosts between runs does not retrofit the live container; stop it first if the policy must change.

Skip persist in production. Each request should start clean there; the few-seconds cold start is the cost of isolation.

Grants

A grant is a credential MOAT holds for a third-party provider (GitHub, OpenAI, an npm registry, etc.). The host operator runs moat grant <provider> once outside the framework; the framework does not store credentials, only declares which named grants the workspace needs. Missing grants surface a clear error before the run starts. See the credentials concept page for how MOAT injects them on outbound requests.

Limits

- Default moat exec timeout is 60 seconds. Override with execTimeoutMs.
- readFile only handles UTF-8 text. Binary files throw MoatBinaryReadError.
- Output is buffered, not streamed. Long-running commands surface their full stdout/stderr at the end.
- A crashed container is not auto-restarted. The next command surfaces the failure.
- Process termination outside the cleanup path (SIGTERM, host crash) leaves the container running. Configure a MOAT-side TTL (moat clean) as a backstop.

Sync lifecycle

On the first bash call in a session:

1. Discover mounts: walk ctx.session.resources, ctx.user.resources, and ctx.org.resources; any ResourceCollectionRef becomes a mount at its pattern prefix. collections / exclude can override.
2. Hydrate: write every mount's entries into the sandbox under its prefix. Seed ./tmp/.keep so the scratch directory exists.
3. Run the command: whatever the agent requested.
4. Flush: walk the workspace with find. For each file:
   - Under a writable mount → upsert to that mount's collection with the prefix stripped.
   - Under a read-only mount → skip.
   - Under ./tmp/ → skip silently.
   - Under nothing known → log a warning and drop.
5. Delete: for each mount, refs whose bare key isn't in the current sandbox walk are removed from that mount's collection.
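The flush and delete rules above amount to a per-file decision plus a set difference. The following is a simplified model under stated assumptions, not the framework's FileSync code: FlushMount, FlushAction, classify, and staleKeys are hypothetical names, and paths are assumed to be workspace-relative with no leading "./".

```typescript
interface FlushMount {
  key: string;
  prefix: string;
  writable: boolean;
}

type FlushAction =
  | { kind: "upsert"; collection: string; bareKey: string }
  | { kind: "skip-read-only" }
  | { kind: "skip-tmp" }
  | { kind: "drop-orphan" };

// Decide what flush does with one file found in the workspace walk.
function classify(relativePath: string, mounts: FlushMount[]): FlushAction {
  // The reserved scratch directory is skipped silently.
  if (relativePath.startsWith("tmp/")) return { kind: "skip-tmp" };

  // Longest matching prefix owns the file.
  const owner: FlushMount | undefined = mounts
    .filter((mount) => relativePath.startsWith(mount.prefix))
    .sort((a, b) => b.prefix.length - a.prefix.length)[0];

  if (owner === undefined) return { kind: "drop-orphan" };
  if (!owner.writable) return { kind: "skip-read-only" };

  // The stored key is the path with the mount prefix stripped.
  return {
    kind: "upsert",
    collection: owner.key,
    bareKey: relativePath.slice(owner.prefix.length),
  };
}

// Delete step for one mount: stored keys that no longer appear in the walk.
function staleKeys(storedKeys: string[], walkedKeys: Set<string>): string[] {
  return storedKeys.filter((key) => !walkedKeys.has(key));
}

const flushMounts: FlushMount[] = [
  { key: "artifacts", prefix: "artifacts/", writable: true },
  { key: "skills", prefix: "skills/", writable: false },
];

console.log(classify("artifacts/report.md", flushMounts)); // upsert into "artifacts" as "report.md"
console.log(classify("skills/a/SKILL.md", flushMounts));   // skip-read-only
console.log(classify("notes.txt", flushMounts));           // drop-orphan
console.log(staleKeys(["report.md", "old.md"], new Set(["report.md"]))); // ["old.md"]
```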

Flush runs after bash and after bash-write-file. It does NOT run after bash-read-file — reads don't change state.

Content hashing

SHA-256 hashes detect changes. Only files whose hash differs from the stored value are written back to resources, so flush is cheap even for large workspaces.
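As an illustration of hash-based change detection, the sketch below uses Node's built-in crypto module. The changedPaths helper and the two Map shapes are assumptions made for this example; they are not the framework's storage model.

```typescript
import { createHash } from "node:crypto";

// Hex-encoded SHA-256 of a file's text content.
function sha256(content: string): string {
  return createHash("sha256").update(content).digest("hex");
}

// Return the paths whose current hash differs from the stored hash.
// files: path -> current content; storedHashes: path -> hash recorded at last sync.
function changedPaths(files: Map<string, string>, storedHashes: Map<string, string>): string[] {
  const changed: string[] = [];
  files.forEach((content, path) => {
    if (storedHashes.get(path) !== sha256(content)) changed.push(path);
  });
  return changed;
}

const stored = new Map<string, string>([["report.md", sha256("draft one")]]);
const current = new Map<string, string>([
  ["report.md", "draft one"], // unchanged: hash matches the stored value
  ["notes.md", "new file"],   // new: no stored hash yet
]);

console.log(changedPaths(current, stored)); // ["notes.md"]
```

A file with no stored hash counts as changed, so new files are written on their first flush.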

Orphan writes

Files the agent creates outside every known path are not persisted. They're logged via console.warn at flush time so the behavior is visible during development. If the agent genuinely needs scratch space, ./tmp/ is the explicit place: writes there are silent and never saved.

Resource definitions

Collections need a state schema that includes path, hash, and updatedAt fields to participate in bash sync:

import { defineResourceCollection } from "@flow-state-dev/core";
import { z } from "zod";

const artifacts = defineResourceCollection({
  pattern: "artifacts/**",
  stateSchema: z.object({
    path: z.string(),
    hash: z.string(),
    updatedAt: z.string(),
  }),
});

File content is stored separately via the resource content system (readContent / writeContent), not in state. State holds metadata only.

Error handling

Scenario	Behavior
Command fails (non-zero exit)	Returns the result with `exitCode`, `stderr`. Does not throw.
File not found on read	Throws. Generator retry can handle transient cases.
Sandbox provider unavailable	Throws on first bash call.
`just-bash` not installed	Falls back to an in-memory Map-based sandbox.
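Because a failed command returns a result instead of throwing, callers branch on exitCode and do not need a try/catch. A minimal sketch, assuming a result object that carries only the two fields named in the table; CommandResult and describeResult are hypothetical names:

```typescript
// Assumed result shape, limited to the fields the table above mentions.
interface CommandResult {
  exitCode: number;
  stderr: string;
}

// Turn a result into a short status line without relying on exceptions.
function describeResult(result: CommandResult): string {
  if (result.exitCode === 0) return "ok";
  return `failed (exit ${result.exitCode}): ${result.stderr.trim()}`;
}

console.log(describeResult({ exitCode: 0, stderr: "" }));
// "ok"
console.log(describeResult({ exitCode: 2, stderr: "ls: cannot access 'missing'\n" }));
// "failed (exit 2): ls: cannot access 'missing'"
```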

Low-level: `createBashTool` for AI SDK integration

createBashCapability is the recommended path. If you need AI SDK provider-native tools directly — for example, to wire bash into Anthropic's native computer-use tool contract — use createBashTool:

import { createBashTool } from "@flow-state-dev/tools/bash";

const { tools, sandbox } = await createBashTool({
  collections: { artifacts: ctx.session.resources.artifacts },
  provider: { type: "local", cwd: "./workspace" },
});

This returns AI SDK tool() objects you can pass to a generator via providerTools. It's a thinner layer than the capability — no auto-discovery, no guidance text, no flush-routing behavior beyond what FileSync provides directly. Reach for it only when you need AI SDK-shaped tools.

Next steps

- Skills: skill bundle files become reachable in bash automatically when bash is on
- Resources: resource system fundamentals
- Collections: resource collection patterns
