Agentic AI Atlas

Trace Identifiers And Evidence JSON

Inspect the normalized record payload exactly as the atlas UI reads it.

File · wiki/docs/testing/trace-identifiers-and-evidence.md
Cluster · wiki

Record JSON

{
  "id": "page:docs-testing-trace-identifiers-and-evidence",
  "_kind": "Page",
  "_file": "wiki/docs/testing/trace-identifiers-and-evidence.md",
  "_cluster": "wiki",
  "attributes": {
    "nodeKind": "Page",
    "sourcePath": "docs/testing/trace-identifiers-and-evidence.md",
    "sourceKind": "repo-docs",
    "title": "Trace Identifiers And Evidence",
    "displayName": "Trace Identifiers And Evidence",
    "slug": "docs/testing/trace-identifiers-and-evidence",
    "articlePath": "wiki/docs/testing/trace-identifiers-and-evidence.md",
    "article": "\n# Trace Identifiers And Evidence\n\nUse this document as the evidence checklist for tests described in [Primary Flow Data Paths](./primary-flow-data-paths.md). A scenario should not be marked E2E unless it records the identifiers needed to join the agent session, hook events, Babysitter run state, and transport trace.\n\n## Identifier Spine\n\n| Identifier | Owner | Where it appears | Why it matters |\n| --- | --- | --- | --- |\n| `agentMuxRunId` / `runId` | Agent-mux | CLI result, gateway runtime state, event log filename or event body | Joins agent-mux session events to launch/transport evidence |\n| `agentMuxSessionId` / `sessionId` | Agent-mux/external harness | CLI args, session runtime, harness transcript | Proves continuity across prompts, plugin command, and hook events |\n| `babysitterRunId` / SDK `runId` | Babysitter SDK and babysitter-agent | `run:create` output, `.a5c/runs/<runId>/`, `babysitter-agent` progress events | Primary key for SDK journal, tasks, and terminal state |\n| `runDir` | Babysitter SDK | `run:create` output, `babysitter-agent` progress events | Filesystem root for journal, tasks, outputs, and replay state |\n| `babysitterSessionId` | SDK session binding or harness adapter | `session:init`, `session:associate`, run-create session block, hooks env | Joins harness session to SDK run loop |\n| `effectId` | Babysitter SDK | `run:iterate` next actions, `task:list`, `task:post`, `tasks/<effectId>/` | Joins requested work to posted results |\n| `taskId` / `stepId` | Babysitter process runtime | `task:list`, task definition refs | Names process step semantics independently of generated effect ID |\n| `UnifiedHookEvent.execution.sessionId` | Hooks-mux | Normalized hook event JSON | Joins native hook event to agent or Babysitter session |\n| `UnifiedHookEvent.execution.toolCallId` | Hooks-mux/native harness | Tool hook payloads and normalized event | Joins tool call ready/result pairs and handler decisions |\n| `event.seq` | 
Agent-mux gateway event log | `packages/agent-mux/gateway/src/runs/event-log.ts` event entries | Orders session events and detects gaps/truncation |\n| Transport request/trace ID | Transport-mux | Proxy request logs, trace query/headers, upstream metadata | Joins provider request/stream to agent-mux launch/session |\n\n## Environment And Hook Context\n\n| Variable or payload field | Produced by | Consumed by | Required assertion |\n| --- | --- | --- | --- |\n| `AGENT_SESSION_ID` | Hooks-mux bootstrap/session persistence or SDK harness adapter | Hook handlers, child commands, SDK session binding | Equals the scenario session ID and is stable across hook invocations |\n| `AGENT_ADAPTER` | Hooks-mux normalized execution context | Hook handlers and trace artifacts | Equals selected adapter such as `claude`, `codex`, or `gemini` |\n| `AGENT_WORKSPACE_ROOT` | Hooks-mux execution context | Hook handlers and subprocesses | Equals expected workspace/cwd |\n| `AGENT_TRANSCRIPT_PATH` | Harness-native payload where available | Hook handlers and evidence collector | Points to redacted transcript artifact when available |\n| `AGENT_CAPABILITIES_JSON` | Hooks-mux handler runner | Hook handlers | Captures adapter capability gate decisions |\n| `HOOKS_PROXY_EVENT` | Hooks-mux handler runner | Hook handlers | JSON equals the normalized event given on stdin |\n| `CLAUDE_ENV_FILE` | Claude native hook environment | Hooks-mux propagation backend | Contains exported persisted env after bootstrap or handler result |\n| `HOOKS_PROXY_ENV_FILE` | Generic hooks-mux env propagation | Hooks-mux propagation backend | Contains persisted env when native env file is not provider-specific |\n| `HOOKS_PROXY_SESSION_ID` | Adapter enrichment/fallback | Normalizer | Matches native session ID when adapter enriches env from stdin |\n| `HOOKS_PROXY_TOOL_NAME` / `HOOKS_PROXY_TOOL_CALL_ID` | Adapter enrichment | Normalizer/handler env | Matches native tool payload values |\n\n## Evidence Bundles By 
Flow\n\n### Agent-Mux Plugin Path\n\nA passing artifact bundle should include:\n\n- `agent-mux` invocation: command, selected adapter, model, cwd, prompt digest, `runId`, session mode.\n- Agent-mux event log: ordered `seq`, `ts`, `source`, event type, session/run IDs, terminal event.\n- Harness/plugin setup: `babysitter harness:install <harness>` and `babysitter harness:install-plugin <harness>` output or a cached precondition artifact.\n- Plugin command transcript: user command such as `/babysitter:call`, plugin dispatch evidence, assistant/tool result.\n- Babysitter SDK run evidence: `runId`, `runDir`, `run:iterate` output, `task:list`, `task:post`, terminal journal state.\n- Hook evidence: normalized session/tool/stop event, stop-hook decision, handler env snapshot with secrets redacted.\n\n### Babysitter-Agent Runtime Path\n\nA passing artifact bundle should include:\n\n- `babysitter-agent call` or `babysitter-agent create-run` command and parsed options.\n- Progress events for planning/process path, run creation, session binding, iteration start, effect resolution, and completion.\n- Selected harness/backend: `agent-core` for internal primary tests, external harness name for bridge tests.\n- Generated/provided process path and process fingerprint or file digest.\n- SDK `runId`, `runDir`, session binding result, pending effects, posted task results, terminal state.\n- Redacted model/provider trace for model-backed runs, or mock transcript for no-model runs.\n\n### SDK Run/Session Loop\n\nA passing artifact bundle should include:\n\n- `babysitter run:create --json` output with `runId`, `runDir`, `entry`, `processId`, and session block if bound.\n- `.a5c/runs/<runId>/` file listing or archived subset: metadata, journal/events, tasks.\n- `babysitter run:iterate --json` outputs for each iteration.\n- `babysitter task:list --pending --json` before each post.\n- `babysitter task:post --json` output for every `effectId` resolved by the test.\n- Final `run:status` or 
terminal journal event proving completion/failure.\n\n### Hooks-Mux Path\n\nA passing artifact bundle should include:\n\n- Raw native hook fixture or redacted live stdin payload.\n- CLI command: `a5c-hooks-mux bootstrap` or `a5c-hooks-mux invoke --adapter <name> --native-event <event>`.\n- Adapter capabilities and mapping support level (`native`, `lossy`, `unsupported`).\n- Normalized `UnifiedHookEvent` with `adapter`, `phase`, `rawEventName`, `supportLevel`, and `execution` fields.\n- Handler plan and child-process result; include stdout/stderr and timeout status.\n- Merged hook result, persisted env/context diff, and native renderer output.\n\n### Transport-Mux Path\n\nA passing artifact bundle should include:\n\n- Agent-mux launch decision: native provider vs transport proxy, `proxyNeeded`, reason, route, and redacted env diff.\n- Transport-mux route request: method, path, query/trace flag, upstream target, status code.\n- Stream evidence: first byte/event, at least one delta, final event, cancellation/timeout case where applicable.\n- Correlation to agent-mux `runId` or session ID.\n- Explicit statement that Babysitter completion is out of scope unless a `babysitterRunId` and SDK terminal state are also present.\n\n## Redaction Rules\n\n- Never store provider API keys, OAuth tokens, cookies, or raw auth headers.\n- Store model/provider names, endpoint family, status code, request shape, token counts, and timing metadata only after redaction.\n- Prompt/transcript artifacts may store prompt digests and bounded excerpts; full live transcripts require a fixture-safe redaction pass.\n- Hook env snapshots must include `AGENT_*` and `HOOKS_PROXY_*` correlation variables but remove credential variables.\n\n## Failure Classification\n\n| Failure class | Example | How to report |\n| --- | --- | --- |\n| Setup failure | Harness/plugin install fails | Mark setup lane failed; do not claim runtime E2E attempted |\n| Capability skip | Codex plugin manager unsupported | Mark 
skipped with adapter capability artifact |\n| Session correlation failure | Hook event session ID differs from agent-mux session ID | Fail E2E and attach both IDs plus raw/normalized hook evidence |\n| SDK run failure | `run:iterate` emits `RUN_FAILED` | Fail Babysitter run path; attach journal and last effect result |\n| Hook normalization failure | Native event maps to wrong phase/support level | Fail hooks-mux lane; attach raw payload and `UnifiedHookEvent` |\n| Transport failure | Proxy stream times out or loses final event | Fail transport lane; attach route trace and agent-mux session state |\n| Provider failure | Live model returns auth/quota error | Mark model-backed infra failure; keep no-model lane separate |\n\n## Minimal Artifact Naming\n\nUse deterministic artifact names so CI and local runs can be compared:\n\n| Artifact | Suggested name |\n| --- | --- |\n| Agent-mux event log | `agent-mux-events-<agentMuxRunId>.ndjson` |\n| Babysitter run summary | `babysitter-run-<babysitterRunId>.json` |\n| Babysitter task bundle | `babysitter-tasks-<babysitterRunId>.json` |\n| Hook normalized event | `hooks-mux-<adapter>-<nativeEvent>-<sessionId>.json` |\n| Hook handler result | `hooks-mux-handler-<effect-or-tool-id>.json` |\n| Transport trace | `transport-mux-trace-<agentMuxRunId>.json` |\n| Redaction report | `redaction-report-<scenario-id>.json` |\n\n## Scenario Completion Checklist\n\nBefore a scenario is labeled complete, verify:\n\n- [ ] The primary path is declared: agent-mux plugin, babysitter-agent runtime, SDK run loop, hooks-mux fixture, or transport-mux route.\n- [ ] All required identifiers for that path are present and joinable.\n- [ ] The terminal condition is owned by the correct layer.\n- [ ] Any capability gate or model credential requirement is explicit.\n- [ ] Redaction completed before artifacts are uploaded.\n- [ ] The scenario names which permutation IDs from [Stack Permutations](./stack-permutations.md) and which primary flow IDs from 
[Primary Flow Data Paths](./primary-flow-data-paths.md) it covers.\n",
    "documents": []
  },
  "outgoingEdges": [],
  "incomingEdges": [
    {
      "from": "page:docs-testing",
      "to": "page:docs-testing-trace-identifiers-and-evidence",
      "kind": "contains_page"
    }
  ]
}
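
The article's "Minimal Artifact Naming" table and its "Session correlation failure" row can be illustrated with a small sketch. This is not code from the repository. The helper names (`artifact_name`, `classify_session_correlation`), the dictionary keys, the character whitelist, and the shape of the returned failure report are assumptions made for illustration. Only the name templates, the `execution.sessionId` field, and the failure-class label come from the article.

```python
import re

# Name templates copied from the article's "Minimal Artifact Naming" table.
# The dictionary keys are hypothetical labels chosen for this sketch.
ARTIFACT_TEMPLATES = {
    "agent-mux-events": "agent-mux-events-{agentMuxRunId}.ndjson",
    "babysitter-run": "babysitter-run-{babysitterRunId}.json",
    "babysitter-tasks": "babysitter-tasks-{babysitterRunId}.json",
    "hooks-mux-event": "hooks-mux-{adapter}-{nativeEvent}-{sessionId}.json",
    "transport-mux-trace": "transport-mux-trace-{agentMuxRunId}.json",
}

# Assumed safety rule (not stated in the article): identifiers placed in
# file names may contain only letters, digits, dot, underscore, and hyphen.
_SAFE_ID = re.compile(r"[A-Za-z0-9._-]+")


def artifact_name(kind: str, **ids: str) -> str:
    """Build a deterministic artifact file name from correlation IDs."""
    template = ARTIFACT_TEMPLATES[kind]
    for key, value in ids.items():
        if not _SAFE_ID.fullmatch(value):
            raise ValueError(f"unsafe value for {key}: {value!r}")
    # str.format raises KeyError if a required identifier is missing.
    return template.format(**ids)


def classify_session_correlation(hook_event: dict, agent_mux_session_id: str):
    """Return None when the IDs join, else a report carrying both IDs.

    Reads `execution.sessionId` from a normalized hook event, as named in
    the Identifier Spine table. The report shape is an assumption.
    """
    hook_session_id = hook_event.get("execution", {}).get("sessionId")
    if hook_session_id == agent_mux_session_id:
        return None
    return {
        "failureClass": "Session correlation failure",
        "hookSessionId": hook_session_id,
        "agentMuxSessionId": agent_mux_session_id,
    }


if __name__ == "__main__":
    # "run-123", "s1", and "s2" are made-up identifiers for the demo.
    print(artifact_name("agent-mux-events", agentMuxRunId="run-123"))
    print(classify_session_correlation({"execution": {"sessionId": "s1"}}, "s2"))
```

Rejecting unexpected characters before formatting keeps a malformed identifier from producing a path outside the artifact directory. A real collector might instead escape or hash such values.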
