Agentic AI Atlas

Page JSON

page:docs-cli-examples

Structured · live

Babysitter CLI & SDK Examples json

Inspect the normalized record payload exactly as the atlas UI reads it.

File · wiki/docs/cli-examples.md · Cluster · wiki

Record JSON

{
  "id": "page:docs-cli-examples",
  "_kind": "Page",
  "_file": "wiki/docs/cli-examples.md",
  "_cluster": "wiki",
  "attributes": {
    "nodeKind": "Page",
    "sourcePath": "docs/cli-examples.md",
    "sourceKind": "repo-docs",
    "title": "Babysitter CLI & SDK Examples",
    "displayName": "Babysitter CLI & SDK Examples",
    "slug": "docs/cli-examples",
    "articlePath": "wiki/docs/cli-examples.md",
    "article": "\n# Babysitter CLI & SDK Examples\n\nThis guide walks through a realistic flow that exercises the `babysitter` CLI and the new deterministic test harness exposed from `@a5c-ai/babysitter-sdk/testing`. The examples assume you are standing in the repo root (or a project that already vendored the CLI + SDK) and that `~/.a5c/runs` is the default runs directory. Set `BABYSITTER_RUNS_SCOPE=repo` if you want repo-local `<repo>/.a5c/runs` instead.\n\n> **Tip:** All CLI paths in this document are rendered with POSIX separators (matching the CLI output convention) even when running on Windows.\n\n---\n\n## 1. Create a run from a process entrypoint\n\n```bash\nbabysitter run:create \\\n  --process-id dev/build \\\n  --entry processes/build/process.mjs#process \\\n  --inputs examples/inputs/build.json \\\n  --prompt \"Build all workspace packages\"\n```\n\nTypical JSON response (`--json`):\n\n```json\n{\n  \"runId\": \"run-20260112-130455\",\n  \"runDir\": \"~/.a5c/runs/run-20260112-130455\",\n  \"process\": {\n    \"processId\": \"dev/build\",\n    \"entry\": \"processes/build/process.mjs#process\"\n  }\n}\n```\n\n---\n\n## 1b. Assign a process to a bare run\n\nWhen a run is created without `--entry` (a bare run), assign a process before iterating:\n\n```bash\nbabysitter run:assign-process .a5c/runs/run-20260112-130455 \\\n  --entry processes/build/process.mjs#process \\\n  --process-id dev/build \\\n  --json\n```\n\n```json\n{\n  \"runId\": \"run-20260112-130455\",\n  \"runDir\": \".a5c/runs/run-20260112-130455\",\n  \"entry\": \"processes/build/process.mjs#process\",\n  \"processId\": \"dev/build\",\n  \"previousEntrypoint\": { \"importPath\": \"bare-run\" },\n  \"assigned\": true\n}\n```\n\n---\n\n## 2. 
Inspect run status\n\n```bash\nbabysitter run:status run-20260112-130455 --json\n```\n\n```json\n{\n  \"state\": \"waiting\",\n  \"lastEvent\": \"RUN_CREATED#0001 2026-01-12T13:04:56.012Z\",\n  \"pendingByKind\": {\n    \"node\": 2\n  },\n  \"metadata\": {\n    \"stateVersion\": 1,\n    \"pendingEffectsByKind\": {\n      \"node\": 2\n    }\n  }\n}\n```\n\nThe CLI prints the same summary in human form when `--json` is omitted:\n\n```\n[run:status] state=waiting last=RUN_CREATED#0001 2026-01-12T13:04:56.012Z pending[node]=2 pending[total]=2 stateVersion=1\n```\n\n---\n\n## 3. Discover pending effects\n\n```bash\nbabysitter task:list run-20260112-130455 --pending\n```\n\n```\n[task:list] pending=2\n- ef-build-001 [node requested] build workspace (taskId=build.workspaces)\n- ef-lint-001 [node requested] lint sources (taskId=lint.sources)\n```\n\nThe JSON variant highlights the run-relative artifact refs (all `/` even on Windows):\n\n```json\n{\n  \"tasks\": [\n    {\n      \"effectId\": \"ef-build-001\",\n      \"status\": \"requested\",\n      \"kind\": \"node\",\n      \"label\": \"build workspace\",\n      \"taskDefRef\": \"tasks/ef-build-001/task.json\",\n      \"resultRef\": null,\n      \"stdoutRef\": null,\n      \"stderrRef\": null\n    }\n  ]\n}\n```\n\n---\n\n## 4. Inspect a specific effect\n\n```bash\nbabysitter task:show run-20260112-130455 ef-build-001 --json\n```\n\nKey fields in the response:\n\n```json\n{\n  \"effect\": {\n    \"effectId\": \"ef-build-001\",\n    \"taskId\": \"build.workspaces\",\n    \"status\": \"requested\",\n    \"stdoutRef\": null\n  },\n  \"task\": {\n    \"kind\": \"node\",\n    \"node\": {\n      \"entry\": \"build/scripts/build-workspace.mjs\",\n      \"args\": [\"--workspace\", \"frontend\"]\n    }\n  },\n  \"result\": null,\n  \"largeResult\": null\n}\n```\n\nWhen `result.json` exceeds 1 MiB the CLI prints `result: see tasks/<id>/result.json` instead of dumping the payload.\n\n---\n\n## 5. 
Dry-run a task result post\n\n```bash\nbabysitter task:post run-20260112-130455 ef-build-001 --status ok --dry-run\n```\n\n```\n[task:post] status=skipped\n```\n\nDry runs preview the mutation and exit `0` without changing on-disk state.\n\n---\n\n## 6. Drive a run without built-in auto-execution\n\nInstead of `run:continue` (removed), loop `run:iterate`, execute pending effects using your own runner (hook/worker/agent), then commit results with `task:post`.\n\n---\n\n## 7. Unit-test a process with the deterministic harness\n\nThe SDK now exports `runToCompletionWithFakeRunner` from `@a5c-ai/babysitter-sdk/testing`. Use it to exercise process logic without invoking real node runners:\n\n```ts\nimport { runToCompletionWithFakeRunner } from \"@a5c-ai/babysitter-sdk/testing\";\nimport { createRun } from \"@a5c-ai/babysitter-sdk\";\nimport path from \"node:path\";\nimport os from \"node:os\";\nimport fs from \"node:fs/promises\";\n\ntest(\"build pipeline converges\", async () => {\n  const runsDir = await fs.mkdtemp(path.join(os.tmpdir(), \"babysitter-tests-\"));\n  const { runDir } = await createRun({\n    runsDir,\n    process: {\n      processId: \"dev/build\",\n      importPath: \"../processes/build/process.mjs\",\n      exportName: \"process\",\n    },\n    inputs: { branch: \"main\" },\n  });\n\n  const result = await runToCompletionWithFakeRunner({\n    runDir,\n    resolve(action) {\n      if (action.kind === \"node\") {\n        return { status: \"ok\", value: { value: action.taskDef.metadata?.value ?? 0 } };\n      }\n      return undefined;\n    },\n  });\n\n  expect(result.status).toBe(\"completed\");\n  expect(result.executed).toHaveLength(2);\n});\n```\n\n* Each fake resolution can provide `stdout`, `stderr`, timestamps, and metadata.\n* If your resolver returns `undefined` for an action, the harness leaves it pending and returns `{ status: \"waiting\", pending: [...] 
}`.\n* Use `maxIterations` (default `100`) to catch runaway loops, and `onIteration(result)` to inspect intermediate states.\n\n---\n\n## 8. Cleaning up run artifacts\n\nAll examples above write into `~/.a5c/runs/<runId>` by default. After a tutorial or test completes, remove the directory (or move it under an archive location) to keep your environment tidy:\n\n```bash\nrm -rf ~/.a5c/runs/run-20260112-130455\n```\n\n---\n\nNeed another scenario documented? Open an issue with the desired flow (CLI flags, harness behavior, etc.) and the team will extend this file. For the deeper specification refer to [`babysitter_cli_surface_spec.md`](./reference/babysitter_cli_surface_spec.md).\n\n---\n\n## Appendix A. Regenerating this walkthrough (deterministic workflow)\n\nThis walkthrough is anchored to the real smoke harness in `packages/sdk/scripts/smoke-cli.js` and the generated traceability index at `docs/generated/cli-examples-verification.md`. When you change CLI output, flags, or wording in this file, use the current repo workflow below from a fresh checkout:\n\n1. **Install dependencies and build the SDK CLI.**\n\n```bash\nnpm ci\nnpm run build --workspace=@a5c-ai/babysitter-sdk\n```\n\n2. **Regenerate repo docs artifacts, including the CLI traceability index.**\n\n```bash\nnpm run docs:prepare\n```\n\n3. **Run the real CLI smoke harness.**\n\n```bash\nnpm run docs:examples:smoke\n```\n\nThis delegates to `npm run smoke:cli --workspace=@a5c-ai/babysitter-sdk`, which stages deterministic fixtures under `packages/sdk/test-fixtures/cli/runs/smoke/`. Add `-- --keep` to the underlying SDK command when you need to inspect the staged run directory after the smoke run finishes.\n\n4. **Run the repo docs checks that validate published command surfaces.**\n\n```bash\nnpm run docs:snippets\nnpm run docs:qa\n```\n\n5. 
**Keep the harness API docs aligned.**\n   - `packages/sdk/src/testing/README.md` and `library/reference/sdk.md` should be updated in the same change when this walkthrough references `runToCompletionWithFakeRunner`, `captureRunSnapshot`, or other deterministic harness APIs.\n   - Run `npm run test --workspace=@a5c-ai/babysitter-sdk` after changing those APIs or their examples.\n\nCLI output intentionally uses POSIX-style paths even on Windows so the published examples stay stable across platforms. Task payloads remain redacted unless `BABYSITTER_ALLOW_SECRET_LOGS=1` is set for verbose JSON inspection.\n",
    "documents": []
  },
  "outgoingEdges": [],
  "incomingEdges": []
}
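
Sections 3, 5, and 6 of the article above describe listing pending effects, previewing a result post, and driving the loop with your own runner. The following is a minimal sketch of the selection step only, assuming the `task:list --json` response has the shape shown in section 3. The helper names `pendingEffects` and `dryRunPostArgs` are hypothetical and are not part of the Babysitter SDK. The `ef-lint-001` entry is filled in by analogy with `ef-build-001`, and the sketch only builds argument lists without invoking the CLI.

```typescript
// Shape of one entry in the `task:list --json` response (copied from section 3).
interface TaskEntry {
  effectId: string;
  status: string;
  kind: string;
  label: string;
  taskDefRef: string | null;
  resultRef: string | null;
  stdoutRef: string | null;
  stderrRef: string | null;
}

interface TaskListResponse {
  tasks: TaskEntry[];
}

// Hypothetical helper: keep only effects that are still waiting for a result.
function pendingEffects(response: TaskListResponse): TaskEntry[] {
  return response.tasks.filter((task) => task.status === "requested");
}

// Hypothetical helper: argument list for the dry-run post shown in section 5.
function dryRunPostArgs(runId: string, effectId: string): string[] {
  return ["task:post", runId, effectId, "--status", "ok", "--dry-run"];
}

// Illustrative payload; the second entry is assumed to mirror the first.
const sampleResponse: TaskListResponse = {
  tasks: [
    {
      effectId: "ef-build-001",
      status: "requested",
      kind: "node",
      label: "build workspace",
      taskDefRef: "tasks/ef-build-001/task.json",
      resultRef: null,
      stdoutRef: null,
      stderrRef: null,
    },
    {
      effectId: "ef-lint-001",
      status: "requested",
      kind: "node",
      label: "lint sources",
      taskDefRef: "tasks/ef-lint-001/task.json",
      resultRef: null,
      stdoutRef: null,
      stderrRef: null,
    },
  ],
};

// One argument list per pending effect, ready to hand to a process spawner.
const commands: string[][] = pendingEffects(sampleResponse).map((task) =>
  dryRunPostArgs("run-20260112-130455", task.effectId),
);

for (const args of commands) {
  console.log(["babysitter", ...args].join(" "));
}
```

Keeping the selection and argument-building steps as pure functions makes them easy to unit-test before wiring them to a real runner. Dropping `--dry-run` from the argument list would turn the preview into an actual post, so that switch is best left to the caller.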
