Agentic AI Atlas by a5c.ai

Trace Identifiers And Evidence

Use this document as the evidence checklist for tests described in Primary Flow Data Paths. A scenario should not be marked E2E unless it records the identifiers needed to join the agent session, hook events, Babysitter run state, and transport trace.

Identifier Spine

| Identifier | Owner | Where it appears | Why it matters |
| --- | --- | --- | --- |
| agentMuxRunId / runId | Agent-mux | CLI result, gateway runtime state, event log filename or event body | Joins agent-mux session events to launch/transport evidence |
| agentMuxSessionId / sessionId | Agent-mux/external harness | CLI args, session runtime, harness transcript | Proves continuity across prompts, plugin command, and hook events |
| babysitterRunId / SDK runId | Babysitter SDK and babysitter-agent | run:create output, .a5c/runs/<runId>/, babysitter-agent progress events | Primary key for SDK journal, tasks, and terminal state |
| runDir | Babysitter SDK | run:create output, babysitter-agent progress events | Filesystem root for journal, tasks, outputs, and replay state |
| babysitterSessionId | SDK session binding or harness adapter | session:init, session:associate, run-create session block, hooks env | Joins harness session to SDK run loop |
| effectId | Babysitter SDK | run:iterate next actions, task:list, task:post, tasks/<effectId>/ | Joins requested work to posted results |
| taskId / stepId | Babysitter process runtime | task:list, task definition refs | Names process step semantics independently of generated effect ID |
| UnifiedHookEvent.execution.sessionId | Hooks-mux | Normalized hook event JSON | Joins native hook event to agent or Babysitter session |
| UnifiedHookEvent.execution.toolCallId | Hooks-mux/native harness | Tool hook payloads and normalized event | Joins tool call ready/result pairs and handler decisions |
| event.seq | Agent-mux gateway event log | packages/agent-mux/gateway/src/runs/event-log.ts event entries | Orders session events and detects gaps/truncation |
| Transport request/trace ID | Transport-mux | Proxy request logs, trace query/headers, upstream metadata | Joins provider request/stream to agent-mux launch/session |
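
As a rough illustration of how the spine can be used, the sketch below checks that an evidence bundle carries the identifiers its declared path needs. Only the identifier names come from the table above. The path keys, the `REQUIRED_IDS` grouping, the `transportTraceId` key, and the flat-dict bundle shape are assumptions made for this example, not part of any a5c package.

```python
# Hypothetical mapping from a declared primary path to the identifiers a
# bundle must carry. The grouping per path is an assumption for illustration.
REQUIRED_IDS = {
    "agent-mux-plugin": ["agentMuxRunId", "agentMuxSessionId", "babysitterRunId", "runDir"],
    "sdk-run-loop": ["babysitterRunId", "runDir", "effectId"],
    "transport-mux": ["agentMuxRunId", "transportTraceId"],
}


def missing_identifiers(path: str, bundle: dict) -> list[str]:
    """Return the required identifiers that are absent or empty in the bundle."""
    return [key for key in REQUIRED_IDS[path] if not bundle.get(key)]


# Example bundle with made-up values; babysitterRunId is deliberately missing.
bundle = {"agentMuxRunId": "run-1", "agentMuxSessionId": "sess-1", "runDir": ".a5c/runs/r-9"}
print(missing_identifiers("agent-mux-plugin", bundle))  # ['babysitterRunId']
```

A non-empty result would mean the scenario cannot yet be joined end to end and should not be marked E2E.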

Environment And Hook Context

| Variable or payload field | Produced by | Consumed by | Required assertion |
| --- | --- | --- | --- |
| AGENT_SESSION_ID | Hooks-mux bootstrap/session persistence or SDK harness adapter | Hook handlers, child commands, SDK session binding | Equals the scenario session ID and is stable across hook invocations |
| AGENT_ADAPTER | Hooks-mux normalized execution context | Hook handlers and trace artifacts | Equals selected adapter such as claude, codex, or gemini |
| AGENT_WORKSPACE_ROOT | Hooks-mux execution context | Hook handlers and subprocesses | Equals expected workspace/cwd |
| AGENT_TRANSCRIPT_PATH | Harness-native payload where available | Hook handlers and evidence collector | Points to redacted transcript artifact when available |
| AGENT_CAPABILITIES_JSON | Hooks-mux handler runner | Hook handlers | Captures adapter capability gate decisions |
| HOOKS_PROXY_EVENT | Hooks-mux handler runner | Hook handlers | JSON equals the normalized event given on stdin |
| CLAUDE_ENV_FILE | Claude native hook environment | Hooks-mux propagation backend | Contains exported persisted env after bootstrap or handler result |
| HOOKS_PROXY_ENV_FILE | Generic hooks-mux env propagation | Hooks-mux propagation backend | Contains persisted env when native env file is not provider-specific |
| HOOKS_PROXY_SESSION_ID | Adapter enrichment/fallback | Normalizer | Matches native session ID when adapter enriches env from stdin |
| HOOKS_PROXY_TOOL_NAME / HOOKS_PROXY_TOOL_CALL_ID | Adapter enrichment | Normalizer/handler env | Matches native tool payload values |
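
Two of the assertions above (session ID equality and stability, and `HOOKS_PROXY_EVENT` matching stdin) could be checked along these lines. This is a minimal sketch: the function names and the idea of capturing each handler's environment as a plain dict are assumptions of the example, and the event payload is made up.

```python
import json


def check_hook_env(env: dict, scenario_session_id: str, stdin_event: dict) -> list[str]:
    """Return assertion failures for one hook invocation's env snapshot."""
    failures = []
    if env.get("AGENT_SESSION_ID") != scenario_session_id:
        failures.append("AGENT_SESSION_ID does not equal the scenario session ID")
    # Compare parsed JSON so key order and whitespace do not matter.
    try:
        proxied = json.loads(env.get("HOOKS_PROXY_EVENT", ""))
    except (json.JSONDecodeError, TypeError):
        proxied = None
    if proxied != stdin_event:
        failures.append("HOOKS_PROXY_EVENT does not equal the normalized stdin event")
    return failures


def session_id_is_stable(env_snapshots: list[dict]) -> bool:
    """True when every snapshot reports the same non-empty AGENT_SESSION_ID."""
    ids = {snap.get("AGENT_SESSION_ID") for snap in env_snapshots}
    return len(ids) == 1 and None not in ids and "" not in ids


# Made-up event and env for illustration.
event = {"adapter": "claude", "execution": {"sessionId": "sess-1"}}
env = {"AGENT_SESSION_ID": "sess-1", "HOOKS_PROXY_EVENT": json.dumps(event)}
print(check_hook_env(env, "sess-1", event))  # []
```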

Evidence Bundles By Flow

Agent-Mux Plugin Path

A passing artifact bundle should include:

  • agent-mux invocation: command, selected adapter, model, cwd, prompt digest, runId, session mode.
  • Agent-mux event log: ordered seq, ts, source, event type, session/run IDs, terminal event.
  • Harness/plugin setup: babysitter harness:install <harness> and babysitter harness:install-plugin <harness> output or a cached precondition artifact.
  • Plugin command transcript: user command such as /babysitter:call, plugin dispatch evidence, assistant/tool result.
  • Babysitter SDK run evidence: runId, runDir, run:iterate output, task:list, task:post, terminal journal state.
  • Hook evidence: normalized session/tool/stop event, stop-hook decision, handler env snapshot with secrets redacted.
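
The "ordered seq" and "terminal event" requirements for the event log can be verified mechanically. The sketch below assumes an NDJSON log whose entries expose `seq` and `type` keys, that `seq` increases by exactly one, and that terminal event types are passed in by the caller. The real entry shape in event-log.ts may differ, and the event type names used here are hypothetical.

```python
import json


def check_event_log(ndjson_text: str, terminal_types: set[str]) -> list[str]:
    """Check seq continuity and the presence of a terminal event at the end."""
    events = [json.loads(line) for line in ndjson_text.splitlines() if line.strip()]
    problems = []
    # Compare each event with its successor to detect gaps or reordering.
    for prev, cur in zip(events, events[1:]):
        if cur["seq"] != prev["seq"] + 1:
            problems.append(f"seq jumps from {prev['seq']} to {cur['seq']}")
    if not events or events[-1]["type"] not in terminal_types:
        problems.append("log does not end with a terminal event")
    return problems


# Made-up log with a gap between seq 2 and seq 4.
log = "\n".join([
    '{"seq": 1, "type": "run.started"}',
    '{"seq": 2, "type": "message"}',
    '{"seq": 4, "type": "run.completed"}',
])
print(check_event_log(log, {"run.completed", "run.failed"}))
# ['seq jumps from 2 to 4']
```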

Babysitter-Agent Runtime Path

A passing artifact bundle should include:

  • babysitter-agent call or babysitter-agent create-run command and parsed options.
  • Progress events for planning/process path, run creation, session binding, iteration start, effect resolution, and completion.
  • Selected harness/backend: agent-core for internal primary tests, external harness name for bridge tests.
  • Generated/provided process path and process fingerprint or file digest.
  • SDK runId, runDir, session binding result, pending effects, posted task results, terminal state.
  • Redacted model/provider trace for model-backed runs, or mock transcript for no-model runs.

SDK Run/Session Loop

A passing artifact bundle should include:

  • babysitter run:create --json output with runId, runDir, entry, processId, and session block if bound.
  • .a5c/runs/<runId>/ file listing or archived subset: metadata, journal/events, tasks.
  • babysitter run:iterate --json outputs for each iteration.
  • babysitter task:list --pending --json before each post.
  • babysitter task:post --json output for every effectId resolved by the test.
  • Final run:status or terminal journal event proving completion/failure.
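
One way to confirm that every pending effect was actually resolved is to compare the effect IDs seen in the pending listings with those that have post evidence. The sketch assumes the IDs have already been extracted from the `task:list --pending --json` and `task:post --json` outputs. Those output shapes are not specified here, so parsing is deliberately left out, and `unresolved_effects` is a hypothetical helper name.

```python
def unresolved_effects(pending_lists: list[list[str]], posted: list[str]) -> set[str]:
    """Effect IDs that appeared as pending but have no task:post evidence."""
    seen = {effect_id for pending in pending_lists for effect_id in pending}
    return seen - set(posted)


# Made-up effect IDs: one pending listing per iteration.
pending_per_iteration = [["e1"], ["e2", "e3"]]
posted_ids = ["e1", "e3"]
print(unresolved_effects(pending_per_iteration, posted_ids))  # {'e2'}
```

An empty set, together with a terminal journal event, would support a claim that the run loop completed.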

Hooks-Mux Path

A passing artifact bundle should include:

  • Raw native hook fixture or redacted live stdin payload.
  • CLI command: a5c-hooks-mux bootstrap or a5c-hooks-mux invoke --adapter <name> --native-event <event>.
  • Adapter capabilities and mapping support level (native, lossy, unsupported).
  • Normalized UnifiedHookEvent with adapter, phase, rawEventName, supportLevel, and execution fields.
  • Handler plan and child-process result; include stdout/stderr and timeout status.
  • Merged hook result, persisted env/context diff, and native renderer output.
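
A structural check on the normalized event might look like the following. The field names and the three support levels are taken from the list above. Treating the fields as top-level keys of a dict, and the sample `phase` value, are assumptions of this sketch.

```python
REQUIRED_FIELDS = ("adapter", "phase", "rawEventName", "supportLevel", "execution")
SUPPORT_LEVELS = {"native", "lossy", "unsupported"}


def validate_unified_event(event: dict) -> list[str]:
    """Report missing fields and support levels outside the documented set."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in event]
    level = event.get("supportLevel")
    if level is not None and level not in SUPPORT_LEVELS:
        problems.append(f"unknown supportLevel: {level}")
    return problems


# Made-up event; "partial" is not one of the documented support levels.
sample = {
    "adapter": "claude",
    "phase": "pre-tool",  # hypothetical phase name
    "rawEventName": "PreToolUse",
    "supportLevel": "partial",
    "execution": {"sessionId": "sess-1"},
}
print(validate_unified_event(sample))  # ['unknown supportLevel: partial']
```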

Transport-Mux Path

A passing artifact bundle should include:

  • Agent-mux launch decision: native provider vs transport proxy, proxyNeeded, reason, route, and redacted env diff.
  • Transport-mux route request: method, path, query/trace flag, upstream target, status code.
  • Stream evidence: first byte/event, at least one delta, final event, cancellation/timeout case where applicable.
  • Correlation to agent-mux runId or session ID.
  • Explicit statement that Babysitter completion is out of scope unless a babysitterRunId and SDK terminal state are also present.
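
The stream evidence requirement (at least one delta, and a final event) can be expressed as a small check over captured stream events. The `kind` key and the `delta`/`final` labels are hypothetical. A real capture would map provider-specific event names onto them first.

```python
def check_stream(events: list[dict]) -> list[str]:
    """Verify a captured stream has at least one delta and ends with a final event."""
    kinds = [event["kind"] for event in events]
    if not kinds:
        return ["stream produced no events"]
    problems = []
    if "delta" not in kinds:
        problems.append("no delta event observed")
    if kinds[-1] != "final":
        problems.append("stream did not end with a final event")
    return problems


# Made-up capture of a stream that was cut off before its final event.
truncated = [{"kind": "start"}, {"kind": "delta"}]
print(check_stream(truncated))  # ['stream did not end with a final event']
```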

Redaction Rules

  • Never store provider API keys, OAuth tokens, cookies, or raw auth headers.
  • Store model/provider names, endpoint family, status code, request shape, token counts, and timing metadata only after redaction.
  • Prompt/transcript artifacts may store prompt digests and bounded excerpts; full live transcripts require a fixture-safe redaction pass.
  • Hook env snapshots must include AGENT_* and HOOKS_PROXY_* correlation variables but remove credential variables.
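
For hook env snapshots, a redaction pass could keep only the correlation variables and drop anything credential-like. The sketch below is stricter than the rule requires, because it uses an allowlist of the two prefixes and so also discards unrelated variables. The credential name pattern is a heuristic assumed for this example, not a complete denylist.

```python
import re

KEEP_PREFIXES = ("AGENT_", "HOOKS_PROXY_")
# Heuristic only (assumption): names containing these words are treated as credentials.
CREDENTIAL_PATTERN = re.compile(r"KEY|TOKEN|SECRET|PASSWORD|COOKIE|AUTH", re.IGNORECASE)


def redact_env(env: dict[str, str]) -> dict[str, str]:
    """Keep AGENT_* and HOOKS_PROXY_* variables; drop credential-like names."""
    kept = {}
    for name, value in env.items():
        if CREDENTIAL_PATTERN.search(name):
            continue  # removed even if the name carries a kept prefix
        if name.startswith(KEEP_PREFIXES):
            kept[name] = value
    return kept


# Made-up environment for illustration.
raw_env = {
    "AGENT_SESSION_ID": "sess-1",
    "HOOKS_PROXY_TOOL_NAME": "Bash",
    "ANTHROPIC_API_KEY": "redacted-example",
    "PATH": "/usr/bin",
}
print(redact_env(raw_env))
# {'AGENT_SESSION_ID': 'sess-1', 'HOOKS_PROXY_TOOL_NAME': 'Bash'}
```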

Failure Classification

| Failure class | Example | How to report |
| --- | --- | --- |
| Setup failure | Harness/plugin install fails | Mark setup lane failed; do not claim runtime E2E attempted |
| Capability skip | Codex plugin manager unsupported | Mark skipped with adapter capability artifact |
| Session correlation failure | Hook event session ID differs from agent-mux session ID | Fail E2E and attach both IDs plus raw/normalized hook evidence |
| SDK run failure | run:iterate emits RUN_FAILED | Fail Babysitter run path; attach journal and last effect result |
| Hook normalization failure | Native event maps to wrong phase/support level | Fail hooks-mux lane; attach raw payload and UnifiedHookEvent |
| Transport failure | Proxy stream times out or loses final event | Fail transport lane; attach route trace and agent-mux session state |
| Provider failure | Live model returns auth/quota error | Mark model-backed infra failure; keep no-model lane separate |

Minimal Artifact Naming

Use deterministic artifact names so CI and local runs can be compared:

| Artifact | Suggested name |
| --- | --- |
| Agent-mux event log | agent-mux-events-<agentMuxRunId>.ndjson |
| Babysitter run summary | babysitter-run-<babysitterRunId>.json |
| Babysitter task bundle | babysitter-tasks-<babysitterRunId>.json |
| Hook normalized event | hooks-mux-<adapter>-<nativeEvent>-<sessionId>.json |
| Hook handler result | hooks-mux-handler-<effect-or-tool-id>.json |
| Transport trace | transport-mux-trace-<agentMuxRunId>.json |
| Redaction report | redaction-report-<scenario-id>.json |
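
Generating the names from one helper keeps CI and local runs consistent. The sketch below fills in the suggested patterns. The function names and dictionary keys are hypothetical, and the IDs in the example are made up.

```python
def artifact_names(agent_mux_run_id: str, babysitter_run_id: str, scenario_id: str) -> dict[str, str]:
    """Build the run-scoped artifact names from the suggested patterns."""
    return {
        "agent_mux_event_log": f"agent-mux-events-{agent_mux_run_id}.ndjson",
        "babysitter_run_summary": f"babysitter-run-{babysitter_run_id}.json",
        "babysitter_task_bundle": f"babysitter-tasks-{babysitter_run_id}.json",
        "transport_trace": f"transport-mux-trace-{agent_mux_run_id}.json",
        "redaction_report": f"redaction-report-{scenario_id}.json",
    }


def hook_event_name(adapter: str, native_event: str, session_id: str) -> str:
    """Name for one normalized hook event artifact."""
    return f"hooks-mux-{adapter}-{native_event}-{session_id}.json"


names = artifact_names("run-1", "bs-7", "scenario-a")
print(names["agent_mux_event_log"])  # agent-mux-events-run-1.ndjson
print(hook_event_name("claude", "PreToolUse", "sess-1"))
# hooks-mux-claude-PreToolUse-sess-1.json
```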

Scenario Completion Checklist

Before a scenario is labeled complete, verify:

  • [ ] The primary path is declared: agent-mux plugin, babysitter-agent runtime, SDK run loop, hooks-mux fixture, or transport-mux route.
  • [ ] All required identifiers for that path are present and joinable.
  • [ ] The terminal condition is owned by the correct layer.
  • [ ] Any capability gate or model credential requirement is explicit.
  • [ ] Redaction completed before artifacts are uploaded.
  • [ ] The scenario names which permutation IDs from Stack Permutations and which primary flow IDs from Primary Flow Data Paths it covers.
