II.
Page overview
Reference · livepage:docs-testing-trace-identifiers-and-evidence
Trace Identifiers And Evidence overview
Inspect the raw attributes, linked wiki pages, and inbound or outbound graph edges for page:docs-testing-trace-identifiers-and-evidence.
Attributes
nodeKind
Page
sourcePath
docs/testing/trace-identifiers-and-evidence.md
sourceKind
repo-docs
title
Trace Identifiers And Evidence
displayName
Trace Identifiers And Evidence
slug
docs/testing/trace-identifiers-and-evidence
articlePath
wiki/docs/testing/trace-identifiers-and-evidence.md
article
# Trace Identifiers And Evidence
Use this document as the evidence checklist for tests described in [Primary Flow Data Paths](./primary-flow-data-paths.md). A scenario should not be marked E2E unless it records the identifiers needed to join the agent session, hook events, Babysitter run state, and transport trace.
## Identifier Spine
| Identifier | Owner | Where it appears | Why it matters |
| --- | --- | --- | --- |
| `agentMuxRunId` / `runId` | Agent-mux | CLI result, gateway runtime state, event log filename or event body | Joins agent-mux session events to launch/transport evidence |
| `agentMuxSessionId` / `sessionId` | Agent-mux/external harness | CLI args, session runtime, harness transcript | Proves continuity across prompts, plugin command, and hook events |
| `babysitterRunId` / SDK `runId` | Babysitter SDK and babysitter-agent | `run:create` output, `.a5c/runs/<runId>/`, `babysitter-agent` progress events | Primary key for SDK journal, tasks, and terminal state |
| `runDir` | Babysitter SDK | `run:create` output, `babysitter-agent` progress events | Filesystem root for journal, tasks, outputs, and replay state |
| `babysitterSessionId` | SDK session binding or harness adapter | `session:init`, `session:associate`, run-create session block, hooks env | Joins harness session to SDK run loop |
| `effectId` | Babysitter SDK | `run:iterate` next actions, `task:list`, `task:post`, `tasks/<effectId>/` | Joins requested work to posted results |
| `taskId` / `stepId` | Babysitter process runtime | `task:list`, task definition refs | Names process step semantics independently of generated effect ID |
| `UnifiedHookEvent.execution.sessionId` | Hooks-mux | Normalized hook event JSON | Joins native hook event to agent or Babysitter session |
| `UnifiedHookEvent.execution.toolCallId` | Hooks-mux/native harness | Tool hook payloads and normalized event | Joins tool call ready/result pairs and handler decisions |
| `event.seq` | Agent-mux gateway event log | `packages/agent-mux/gateway/src/runs/event-log.ts` event entries | Orders session events and detects gaps/truncation |
| Transport request/trace ID | Transport-mux | Proxy request logs, trace query/headers, upstream metadata | Joins provider request/stream to agent-mux launch/session |
## Environment And Hook Context
| Variable or payload field | Produced by | Consumed by | Required assertion |
| --- | --- | --- | --- |
| `AGENT_SESSION_ID` | Hooks-mux bootstrap/session persistence or SDK harness adapter | Hook handlers, child commands, SDK session binding | Equals the scenario session ID and is stable across hook invocations |
| `AGENT_ADAPTER` | Hooks-mux normalized execution context | Hook handlers and trace artifacts | Equals selected adapter such as `claude`, `codex`, or `gemini` |
| `AGENT_WORKSPACE_ROOT` | Hooks-mux execution context | Hook handlers and subprocesses | Equals expected workspace/cwd |
| `AGENT_TRANSCRIPT_PATH` | Harness-native payload where available | Hook handlers and evidence collector | Points to redacted transcript artifact when available |
| `AGENT_CAPABILITIES_JSON` | Hooks-mux handler runner | Hook handlers | Captures adapter capability gate decisions |
| `HOOKS_PROXY_EVENT` | Hooks-mux handler runner | Hook handlers | JSON equals the normalized event given on stdin |
| `CLAUDE_ENV_FILE` | Claude native hook environment | Hooks-mux propagation backend | Contains exported persisted env after bootstrap or handler result |
| `HOOKS_PROXY_ENV_FILE` | Generic hooks-mux env propagation | Hooks-mux propagation backend | Contains persisted env when native env file is not provider-specific |
| `HOOKS_PROXY_SESSION_ID` | Adapter enrichment/fallback | Normalizer | Matches native session ID when adapter enriches env from stdin |
| `HOOKS_PROXY_TOOL_NAME` / `HOOKS_PROXY_TOOL_CALL_ID` | Adapter enrichment | Normalizer/handler env | Matches native tool payload values |
## Evidence Bundles By Flow
### Agent-Mux Plugin Path
A passing artifact bundle should include:
- `agent-mux` invocation: command, selected adapter, model, cwd, prompt digest, `runId`, session mode.
- Agent-mux event log: ordered `seq`, `ts`, `source`, event type, session/run IDs, terminal event.
- Harness/plugin setup: `babysitter harness:install <harness>` and `babysitter harness:install-plugin <harness>` output or a cached precondition artifact.
- Plugin command transcript: user command such as `/babysitter:call`, plugin dispatch evidence, assistant/tool result.
- Babysitter SDK run evidence: `runId`, `runDir`, `run:iterate` output, `task:list`, `task:post`, terminal journal state.
- Hook evidence: normalized session/tool/stop event, stop-hook decision, handler env snapshot with secrets redacted.
### Babysitter-Agent Runtime Path
A passing artifact bundle should include:
- `babysitter-agent call` or `babysitter-agent create-run` command and parsed options.
- Progress events for planning/process path, run creation, session binding, iteration start, effect resolution, and completion.
- Selected harness/backend: `agent-core` for internal primary tests, external harness name for bridge tests.
- Generated/provided process path and process fingerprint or file digest.
- SDK `runId`, `runDir`, session binding result, pending effects, posted task results, terminal state.
- Redacted model/provider trace for model-backed runs, or mock transcript for no-model runs.
### SDK Run/Session Loop
A passing artifact bundle should include:
- `babysitter run:create --json` output with `runId`, `runDir`, `entry`, `processId`, and session block if bound.
- `.a5c/runs/<runId>/` file listing or archived subset: metadata, journal/events, tasks.
- `babysitter run:iterate --json` outputs for each iteration.
- `babysitter task:list --pending --json` before each post.
- `babysitter task:post --json` output for every `effectId` resolved by the test.
- Final `run:status` or terminal journal event proving completion/failure.
### Hooks-Mux Path
A passing artifact bundle should include:
- Raw native hook fixture or redacted live stdin payload.
- CLI command: `a5c-hooks-mux bootstrap` or `a5c-hooks-mux invoke --adapter <name> --native-event <event>`.
- Adapter capabilities and mapping support level (`native`, `lossy`, `unsupported`).
- Normalized `UnifiedHookEvent` with `adapter`, `phase`, `rawEventName`, `supportLevel`, and `execution` fields.
- Handler plan and child-process result; include stdout/stderr and timeout status.
- Merged hook result, persisted env/context diff, and native renderer output.
### Transport-Mux Path
A passing artifact bundle should include:
- Agent-mux launch decision: native provider vs transport proxy, `proxyNeeded`, reason, route, and redacted env diff.
- Transport-mux route request: method, path, query/trace flag, upstream target, status code.
- Stream evidence: first byte/event, at least one delta, final event, cancellation/timeout case where applicable.
- Correlation to agent-mux `runId` or session ID.
- Explicit statement that Babysitter completion is out of scope unless a `babysitterRunId` and SDK terminal state are also present.
## Redaction Rules
- Never store provider API keys, OAuth tokens, cookies, or raw auth headers.
- Store model/provider names, endpoint family, status code, request shape, token counts, and timing metadata only after redaction.
- Prompt/transcript artifacts may store prompt digests and bounded excerpts; full live transcripts require a fixture-safe redaction pass.
- Hook env snapshots must include `AGENT_*` and `HOOKS_PROXY_*` correlation variables but remove credential variables.
## Failure Classification
| Failure class | Example | How to report |
| --- | --- | --- |
| Setup failure | Harness/plugin install fails | Mark setup lane failed; do not claim runtime E2E attempted |
| Capability skip | Codex plugin manager unsupported | Mark skipped with adapter capability artifact |
| Session correlation failure | Hook event session ID differs from agent-mux session ID | Fail E2E and attach both IDs plus raw/normalized hook evidence |
| SDK run failure | `run:iterate` emits `RUN_FAILED` | Fail Babysitter run path; attach journal and last effect result |
| Hook normalization failure | Native event maps to wrong phase/support level | Fail hooks-mux lane; attach raw payload and `UnifiedHookEvent` |
| Transport failure | Proxy stream times out or loses final event | Fail transport lane; attach route trace and agent-mux session state |
| Provider failure | Live model returns auth/quota error | Mark model-backed infra failure; keep no-model lane separate |
## Minimal Artifact Naming
Use deterministic artifact names so CI and local runs can be compared:
| Artifact | Suggested name |
| --- | --- |
| Agent-mux event log | `agent-mux-events-<agentMuxRunId>.ndjson` |
| Babysitter run summary | `babysitter-run-<babysitterRunId>.json` |
| Babysitter task bundle | `babysitter-tasks-<babysitterRunId>.json` |
| Hook normalized event | `hooks-mux-<adapter>-<nativeEvent>-<sessionId>.json` |
| Hook handler result | `hooks-mux-handler-<effect-or-tool-id>.json` |
| Transport trace | `transport-mux-trace-<agentMuxRunId>.json` |
| Redaction report | `redaction-report-<scenario-id>.json` |
## Scenario Completion Checklist
Before a scenario is labeled complete, verify:
- [ ] The primary path is declared: agent-mux plugin, babysitter-agent runtime, SDK run loop, hooks-mux fixture, or transport-mux route.
- [ ] All required identifiers for that path are present and joinable.
- [ ] The terminal condition is owned by the correct layer.
- [ ] Any capability gate or model credential requirement is explicit.
- [ ] Redaction completed before artifacts are uploaded.
- [ ] The scenario names which permutation IDs from [Stack Permutations](./stack-permutations.md) and which primary flow IDs from [Primary Flow Data Paths](./primary-flow-data-paths.md) it covers.
documents
[]
Outgoing edges
None.
Incoming edges
contains_page1
- page:docs-testing·PageTesting Strategy