Primary Flow Data Paths

This document maps the main flows that the rebuilt E2E strategy should prove. It is intentionally data-path oriented: every flow names the caller, command/API boundary, state that must be created, hook/session artifacts that should exist, and the identifiers that let a test join evidence across packages.

Primary Configuration

The primary configuration has two valid runtime paths and one shared hook/trace layer:

| Path | Primary target | What it proves | What it must not claim |
| --- | --- | --- | --- |
| Agent-mux plugin path | Claude Code first; Codex only when capability-gated plugin support is available | A real external harness session can be launched through agent-mux, the Babysitter plugin can run a `/babysitter:call`-style session command, and the resulting Babysitter run reaches a terminal state | It does not prove babysitter-agent runtime orchestration and does not use `babysitter-agent create-run` |
| Babysitter-agent runtime path | `babysitter-agent call` / `babysitter-agent create-run` with agent-core internal backend, plus external-harness bridge where selected | The runtime can understand intent, create or reuse a process, create and bind a Babysitter SDK run, iterate effects, resolve tasks, and complete | It does not install external harness plugins; `babysitter harness:install` belongs to SDK setup, not this path |
| Hooks and transport layer | hooks-mux and transport-mux alongside either runtime path | Native hook payloads normalize into `UnifiedHookEvent`, handlers receive traceable env/stdin, and provider traffic can be proxied/recorded where configured | Hooks-mux does not own agent-mux sessions; transport-mux does not own Babysitter run state |

Flow A: Agent-Mux Plugin Session To Babysitter Run

This is the primary plugin E2E for Claude Code. Codex uses the same shape only after an explicit capability gate proves plugin install/support for the Codex adapter.

text
operator / CI
  -> babysitter harness:install claude
  -> babysitter harness:install-plugin claude
  -> agent-mux CLI (`amux run` or launch path)
  -> agent-mux adapter/runtime session
  -> external harness process (Claude Code primary)
  -> Babysitter plugin command inside the harness session
  -> Babysitter SDK run creation / iteration
  -> hooks-mux native hook normalization and stop-hook evidence
  -> terminal Babysitter run state and agent-mux event log evidence

Data Path

| Step | Boundary | Data passed | Required evidence |
| --- | --- | --- | --- |
| 1 | SDK setup CLI | Harness name and plugin target via `babysitter harness:install` and `babysitter harness:install-plugin` | Install JSON or log, installed plugin path, marketplace/registry entry, idempotency result |
| 2 | Agent-mux invocation | Agent name, prompt, `--session`, `--run-id`, cwd/env/model flags from `packages/agent-mux/cli/src/commands/run.ts` | agent-mux run ID, selected adapter, cwd, model, prompt digest, session mode |
| 3 | Agent-mux gateway/runtime | Session runtime and event log under `packages/agent-mux/gateway/src/runs/session-runtime.ts` and `packages/agent-mux/gateway/src/runs/event-log.ts` | Event-log file or API events with monotonic `seq`, `source`, `ts`, event type, `runId` |
| 4 | External harness | Native harness session ID, native hook payloads, tool calls, stop/session events | Harness transcript/session ID, native hook payload fixture or redacted live artifact |
| 5 | Babysitter plugin command | `/babysitter:call` or equivalent Babysitter-enabled session command posted in the harness | Assistant/tool transcript showing command, plugin dispatch evidence, created Babysitter `runId` |
| 6 | SDK run loop | `run:create`, `run:iterate`, pending effects, `task:post`, terminal completion | `.a5c/runs/<runId>/`, journal/events, `tasks/<effectId>/result.json`, terminal status |
| 7 | Hook bridge | hooks-mux normalizes session/tool/stop hooks and injects `AGENT_*` env | `UnifiedHookEvent`, handler stdin/stdout, `AGENT_SESSION_ID`, stop-hook result |
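
The event-log evidence in step 3 could be checked with a small helper like the one below. This is a minimal sketch, not the agent-mux implementation: the `EventLogEntry` shape and the `checkEventLog` name are hypothetical, assembled only from the field names listed in the table, and "monotonic" is assumed to mean strictly increasing `seq`.

```typescript
// Hypothetical event shape, built from the field names in the table (assumption).
interface EventLogEntry {
  seq: number;
  source: string;
  ts: string;
  type: string;
  runId: string;
}

// Returns a list of problems; an empty list means the log passed both checks.
function checkEventLog(entries: EventLogEntry[], expectedRunId: string): string[] {
  const problems: string[] = [];
  if (entries.length === 0) {
    problems.push("event log is empty");
  }
  let previousSeq = Number.NEGATIVE_INFINITY;
  for (const entry of entries) {
    // Assumption: seq must strictly increase from one entry to the next.
    if (entry.seq <= previousSeq) {
      problems.push(`seq ${entry.seq} is not greater than ${previousSeq}`);
    }
    if (entry.runId !== expectedRunId) {
      problems.push(`seq ${entry.seq} has runId ${entry.runId}, expected ${expectedRunId}`);
    }
    previousSeq = entry.seq;
  }
  return problems;
}
```

Returning a list of problems rather than throwing on the first one lets a test report every ordering or correlation defect in a single run.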

Assertions

  • The agent-mux runId and session ID are recorded before the Babysitter plugin command runs.
  • The Babysitter plugin command creates or resumes exactly one Babysitter runId for the scenario.
  • The Babysitter runId appears in final output and maps to an existing .a5c/runs/<runId>/ directory.
  • At least one hook artifact proves stop/session handling, not just assistant text.
  • The final state is terminal: RUN_COMPLETED or equivalent completed status from the SDK run, not merely a successful model reply.
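
The assertions above can be expressed as one check over a collected evidence bundle. The sketch below is illustrative only: `FlowAEvidence`, its field names, and the `"stop"`/`"session"` hook kinds are assumptions, and it accepts only `RUN_COMPLETED` as the completed status even though the SDK may expose an equivalent one.

```typescript
// Hypothetical evidence bundle; every field name here is an assumption.
interface FlowAEvidence {
  agentMuxRunId: string;
  sessionId: string;
  idsRecordedAtSeq: number; // event-log seq where runId and session ID were recorded
  pluginCommandAtSeq: number; // event-log seq where the plugin command ran
  babysitterRunIds: string[]; // distinct Babysitter run IDs seen in the scenario
  runDirExists: boolean; // whether .a5c/runs/<runId>/ exists for the reported run
  hookKinds: string[]; // kinds of hook artifacts collected, e.g. "stop"
  finalStatus: string; // status reported by the SDK run
}

function checkFlowA(evidence: FlowAEvidence): string[] {
  const failures: string[] = [];
  if (evidence.agentMuxRunId === "" || evidence.sessionId === "") {
    failures.push("agent-mux runId or session ID is missing");
  }
  if (evidence.idsRecordedAtSeq >= evidence.pluginCommandAtSeq) {
    failures.push("runId/session ID were not recorded before the plugin command");
  }
  if (evidence.babysitterRunIds.length !== 1) {
    failures.push(`expected exactly one Babysitter runId, got ${evidence.babysitterRunIds.length}`);
  }
  if (!evidence.runDirExists) {
    failures.push("no .a5c/runs/<runId>/ directory for the reported runId");
  }
  const hasStopOrSession = evidence.hookKinds.some(
    (kind) => kind === "stop" || kind === "session",
  );
  if (!hasStopOrSession) {
    failures.push("no hook artifact proves stop/session handling");
  }
  // A successful model reply is not enough; the SDK run itself must be completed.
  if (evidence.finalStatus !== "RUN_COMPLETED") {
    failures.push(`final status ${evidence.finalStatus} is not a completed run`);
  }
  return failures;
}
```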

Flow B: Babysitter-Agent Runtime Create-Run

This path tests @a5c-ai/babysitter-agent as the runtime owner. It is separate from agent-mux plugin setup.

text
operator / CI
  -> babysitter-agent call/create-run
  -> PhaseUnderstandIntent / PhasePlanProcess
  -> process definition in workspace `.a5c/processes` or provided `--process`
  -> Babysitter SDK `createRun`
  -> session binding for selected harness/backend
  -> PhaseOrchestration loop
  -> effect resolution through internal `agent-core` or external harness bridge
  -> SDK `commitEffectResult` / task result files
  -> terminal run completion

Data Path

| Step | Boundary | Data passed | Required evidence |
| --- | --- | --- | --- |
| 1 | babysitter-agent CLI | `call`, `create-run`, `yolo`, `plan`, `resume-run`; args parsed in `packages/babysitter-agent/src/cli/dispatch.ts` | Invocation command, selected harness, workspace, model, max iterations, output mode |
| 2 | Create-run coordinator | `handleHarnessCreateRun` in `packages/babysitter-agent/src/harness/internal/createRun/index.ts` | Progress events for planning, process path, run creation, session binding |
| 3 | Planning phase | Prompt, workspace context, selected harness, compression config | Process file path, process fingerprint or generated process report, optional planning conversation summary |
| 4 | SDK run creation | `createRun` through `packages/sdk/src/cli/main/runCreate.ts` or SDK API | `runId`, `runDir`, process ID, entrypoint, inputs path, non-interactive metadata |
| 5 | Session binding | Selected harness session ID from `resolveHarnessSessionIdForBinding` and SDK session state | Babysitter session ID, state file, run/session association, harness name |
| 6 | Orchestration loop | `orchestrateIteration`, pending `EffectActions`, `resolveEffect`, `commitEffectResult` | Iteration count, pending effect IDs, task IDs, task result refs, stdout/stderr refs |
| 7 | Effect execution | Internal agent-core for internal harnesses; external bridge for external harnesses | Model/provider trace redacted, backend name, task result JSON, errors/retries if any |
| 8 | Terminal state | SDK journal and completion proof | `RUN_COMPLETED`, final summary, completion proof only after terminal state |

Assertions

  • Runtime tests invoke babysitter-agent, not babysitter harness:install.
  • The selected harness/backend is recorded (agent-core for the internal primary path; external harness bridge only for explicit external-harness tests).
  • The created or resumed runId is bound to a session and appears in SDK state and final output.
  • Every pending effect has a posted result or a declared failure, keyed by effectId.
  • A terminal Babysitter state is the pass condition.
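
The effect-resolution assertion reduces to a set difference keyed by `effectId`. The following is a minimal sketch under stated assumptions: `EffectOutcome` and `findUnresolvedEffects` are hypothetical names, and `"error"` is an assumed status value for a declared failure (only `ok` appears in this document).

```typescript
// Hypothetical outcome record; "ok" appears in this document, "error" is assumed.
interface EffectOutcome {
  effectId: string;
  status: "ok" | "error";
}

// Returns the pending effect IDs that have neither a posted result nor a declared failure.
function findUnresolvedEffects(
  pendingEffectIds: string[],
  outcomes: EffectOutcome[],
): string[] {
  const resolvedIds = outcomes.map((outcome) => outcome.effectId);
  return pendingEffectIds.filter((effectId) => resolvedIds.indexOf(effectId) === -1);
}
```

A test would pass only when the returned list is empty, which keeps a silently dropped effect from hiding behind a terminal status.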

Flow C: SDK Run/Session Loop

This is the deterministic contract shared by both runtime paths.

text
babysitter run:create
  -> .a5c/runs/<runId>/ metadata + journal
  -> optional session binding through harness adapter
  -> babysitter run:iterate
  -> pending effects under tasks/<effectId>/task.json
  -> babysitter task:post
  -> result refs under tasks/<effectId>/
  -> repeated run:iterate
  -> RUN_COMPLETED / RUN_FAILED
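
The on-disk layout implied by this flow can be captured in a few path helpers so tests do not hard-code strings. This is a sketch only: the helper names are hypothetical, and the layout is inferred from the paths mentioned in this document (`.a5c/runs/<runId>/`, `tasks/<effectId>/task.json`, `tasks/<effectId>/result.json`) rather than taken from the SDK source.

```typescript
// Hypothetical helpers; the layout is inferred from paths named in this document.
function runDirFor(workspace: string, runId: string): string {
  return [workspace, ".a5c", "runs", runId].join("/");
}

// Pending effect definition written by run:iterate (assumed location).
function taskFileFor(runDir: string, effectId: string): string {
  return `${runDir}/tasks/${effectId}/task.json`;
}

// Result written by task:post (assumed location).
function resultFileFor(runDir: string, effectId: string): string {
  return `${runDir}/tasks/${effectId}/result.json`;
}
```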

Command Boundaries

| Command | Owner | State created or read | Evidence key |
| --- | --- | --- | --- |
| `babysitter run:create --process-id ... --entry ... --inputs ...` | SDK CLI | Run directory, run metadata, initial journal, optional session binding | `runId`, `runDir`, `entry`, `processId`, `session.sessionId` |
| `babysitter session:init --session-id ...` | SDK CLI | Session state file | `stateFile`, `iteration`, max iterations |
| `babysitter session:associate --session-id ... --run-id ...` | SDK CLI | Session file updated with run ID | `stateFile`, `runId` |
| `babysitter run:iterate <runDir>` | SDK CLI/runtime | Replayed state, emitted effects, terminal events | `iteration`, `status`, `nextActions[].effectId` |
| `babysitter task:list <runDir> --pending` | SDK CLI/runtime | Pending task index | `effectId`, `taskId`, `stepId`, `kind`, `taskDefRef` |
| `babysitter task:post <runDir> <effectId> --status ok --value <file>` | SDK CLI/runtime | Task result, stdout/stderr refs, effect resolution journal event | `effectId`, `resultRef`, `status` |
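
The iterate/post cycle can be sketched as a loop over an injected driver, which keeps the example independent of how the CLI is actually spawned. Everything named here is an assumption for illustration: `SdkDriver`, `IterateResult`, and `driveRun` are hypothetical, and the sketch assumes the `status` field carries `RUN_COMPLETED` or `RUN_FAILED` once the run is terminal.

```typescript
// Hypothetical result shape, using the evidence keys listed for run:iterate.
interface IterateResult {
  status: string;
  nextActions: { effectId: string }[];
}

// Hypothetical driver; each method stands in for one CLI command from the table.
interface SdkDriver {
  iterate(runDir: string): IterateResult; // babysitter run:iterate <runDir>
  postOk(runDir: string, effectId: string): void; // babysitter task:post ... --status ok
}

// Assumption: these two strings are how a terminal run is reported.
function isTerminalStatus(runStatus: string): boolean {
  return runStatus === "RUN_COMPLETED" || runStatus === "RUN_FAILED";
}

function driveRun(
  driver: SdkDriver,
  runDir: string,
  maxIterations: number,
): { status: string; iterations: number; posted: string[] } {
  const posted: string[] = [];
  for (let iteration = 1; iteration <= maxIterations; iteration++) {
    const result = driver.iterate(runDir);
    if (isTerminalStatus(result.status)) {
      return { status: result.status, iterations: iteration, posted };
    }
    // Resolve every pending effect before iterating again.
    for (const action of result.nextActions) {
      driver.postOk(runDir, action.effectId);
      posted.push(action.effectId);
    }
  }
  throw new Error(`run did not reach a terminal state within ${maxIterations} iterations`);
}
```

Bounding the loop with `maxIterations` means a run that never terminates fails the test explicitly instead of hanging the lane.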

Flow D: Hooks-Mux Native Hook Path

Hooks-mux is the canonical hook-normalization and handler fan-out layer.

text
native harness hook payload on stdin
  -> `a5c-hooks-mux bootstrap` or `a5c-hooks-mux invoke`
  -> adapter loader (the matching hooks-mux adapter package for the selected harness)
  -> adapter normalizer
  -> `UnifiedHookEvent`
  -> handler plan + child-process handlers
  -> merged hook result
  -> session env/context persistence
  -> native renderer output back to harness

Data Path

| Step | Boundary | Data passed | Required evidence |
| --- | --- | --- | --- |
| 1 | Native hook | Claude/Codex/Gemini/etc. JSON stdin and native event name | Raw hook payload fixture or redacted live payload |
| 2 | CLI entry | `bootstrap`, `invoke`, or `exec` in `packages/hooks-mux/cli/src/cli/commands` | CLI args, adapter name, native event, explicit session override if any |
| 3 | Adapter load | `loadAdapter` resolves package and capabilities | Adapter name, capability JSON, phase mappings |
| 4 | Normalize | Adapter builds `UnifiedHookEvent` from `packages/hooks-mux/core/src/types/event.ts` | `version`, `adapter`, `phase`, `rawEventName`, `supportLevel`, `execution.*` |
| 5 | Handler execution | `runPlan` injects event on stdin and context env into child handlers | Handler command, `HOOKS_PROXY_EVENT`, `AGENT_SESSION_ID`, `AGENT_ADAPTER`, timeout/result |
| 6 | Merge and persist | Merge result updates session persisted env/context vars | `persistEnv`, `contextVars`, `unsetEnv`, session file diff |
| 7 | Render | Adapter renderer writes native hook output | Native decision/output JSON and dropped/degraded fields |

Assertions

  • Tests assert both the raw native event name and the canonical phase.
  • UnifiedHookEvent.execution.sessionId matches the session used by agent-mux or Babysitter where the flow crosses that boundary.
  • Stop-hook tests assert recursion guard/stop behavior explicitly.
  • Handler env contains AGENT_SESSION_ID and AGENT_ADAPTER; sensitive provider keys are redacted from artifacts.
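
A fixture-based test for these assertions might combine a field check with an env redaction step before artifacts are written. The sketch below is not the hooks-mux API: `UnifiedHookEventSketch` mirrors only the field names listed in this document, the helper names are hypothetical, and the `KEY|TOKEN|SECRET` pattern is an assumed redaction rule.

```typescript
// Hypothetical mirror of the fields named in this document; not the real type.
interface UnifiedHookEventSketch {
  version: string;
  adapter: string;
  phase: string;
  rawEventName: string;
  supportLevel: string;
  execution: { sessionId: string };
}

type HandlerEnv = { [key: string]: string | undefined };

// Assumption: variables whose names contain KEY, TOKEN, or SECRET are sensitive.
function redactEnv(env: HandlerEnv): HandlerEnv {
  const redacted: HandlerEnv = {};
  for (const key of Object.keys(env)) {
    redacted[key] = /KEY|TOKEN|SECRET/i.test(key) ? "[REDACTED]" : env[key];
  }
  return redacted;
}

function checkHookEvidence(
  hookEvent: UnifiedHookEventSketch,
  handlerEnv: HandlerEnv,
  expected: { rawEventName: string; phase: string; sessionId: string },
): string[] {
  const failures: string[] = [];
  // Both the native name and the canonical phase must match, not just one of them.
  if (hookEvent.rawEventName !== expected.rawEventName) {
    failures.push("raw native event name mismatch");
  }
  if (hookEvent.phase !== expected.phase) {
    failures.push("canonical phase mismatch");
  }
  if (hookEvent.execution.sessionId !== expected.sessionId) {
    failures.push("execution.sessionId mismatch");
  }
  for (const requiredName of ["AGENT_SESSION_ID", "AGENT_ADAPTER"]) {
    if (!handlerEnv[requiredName]) {
      failures.push(`${requiredName} missing from handler env`);
    }
  }
  return failures;
}
```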

Flow E: Transport-Mux Assisted Agent-Mux Launch

Transport-mux owns provider/proxy transport, not Babysitter run state. Its primary E2E use is to prove that an agent-mux launch can route provider traffic through a configured transport proxy and still complete a model-backed session.

text
agent-mux launch/run
  -> launch decision: native provider vs transport-mux proxy
  -> transport-mux HTTP/SSE route
  -> upstream provider or mock transport
  -> streamed/non-streamed response
  -> agent-mux session event log
  -> optional hooks-mux events from harness runtime

Assertions

  • Agent-mux launch evidence includes proxyNeeded/proxyReason or equivalent launch decision metadata.
  • Transport evidence includes route, upstream target, status code, stream completion/cancellation, timeout behavior, and redacted auth metadata.
  • The transport trace is correlated to an agent-mux runId or session ID.
  • The transport test does not claim Babysitter completion unless a Babysitter run ID and terminal SDK state are also present.
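
The correlation and no-overclaiming rules can be written as two small predicates. This is a sketch under assumptions: `TransportEvidence` and its field names are hypothetical, and a completion claim is taken to require `RUN_COMPLETED` specifically.

```typescript
// Hypothetical evidence record; field names are assumptions for illustration.
interface TransportEvidence {
  route: string;
  upstreamTarget: string;
  statusCode: number;
  correlatedRunId?: string; // agent-mux runId attached to the transport trace
  correlatedSessionId?: string; // agent-mux session ID attached to the transport trace
  babysitterRunId?: string;
  babysitterStatus?: string;
}

// The trace must join to the agent-mux run by runId or by session ID.
function isCorrelated(
  evidence: TransportEvidence,
  agentMuxRunId: string,
  agentMuxSessionId: string,
): boolean {
  return (
    evidence.correlatedRunId === agentMuxRunId ||
    evidence.correlatedSessionId === agentMuxSessionId
  );
}

// A transport test may claim Babysitter completion only with a run ID and terminal state.
function mayClaimBabysitterCompletion(evidence: TransportEvidence): boolean {
  return evidence.babysitterRunId !== undefined && evidence.babysitterStatus === "RUN_COMPLETED";
}
```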

Valid Primary Test Set

| ID | Flow | Lane | Minimum proof |
| --- | --- | --- | --- |
| PF-1 | SDK run/session loop | No-model | Create run, list pending task, post result, complete run, inspect journal |
| PF-2 | Hooks-mux Claude fixture | No-model | Session/tool/stop hook fixtures normalize and render; handler env contains trace IDs |
| PF-3 | Hooks-mux Codex fixture | No-model | Session/tool aliases normalize, lossy/native support levels match mapping, handler env is present |
| PF-4 | Agent-mux mock session | No-model | `runId`, session event log, ordered events, terminal session output |
| PF-5 | Transport-mux mock route | No-model | Proxy route roundtrip, stream and non-stream artifacts, timeout/cancel fixture |
| PF-6 | Babysitter-agent internal | Model-backed or controlled fake model | `babysitter-agent call`/`create-run`, agent-core backend, SDK run terminal state |
| PF-7 | Agent-mux + Claude + Babysitter plugin | Model-backed | Harness/plugin installed, `/babysitter:call`, agent-mux session log, SDK run terminal state, stop hook evidence |
| PF-8 | Agent-mux + Codex + Babysitter plugin | Capability-gated model-backed | Same as PF-7 only after plugin support is proven; otherwise skip evidence must cite capability gate |
| PF-9 | Agent-mux + transport-mux live stream | Model-backed | Launch decision, proxy trace, streamed response, agent-mux session completion |
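
Lane selection, including the PF-8 rule that a skip must cite its capability gate, could be modeled as below. This is a hedged sketch: the `Lane` values, `PrimaryTest`, `planLane`, and the capability name used in the usage note are all hypothetical and do not reflect an existing runner.

```typescript
type Lane = "no-model" | "model-backed" | "capability-gated";

// Hypothetical test descriptor; requiredCapability applies to capability-gated lanes.
interface PrimaryTest {
  id: string;
  lane: Lane;
  requiredCapability?: string;
}

interface LaneDecision {
  id: string;
  action: "run" | "skip";
  reason: string;
}

function planLane(
  test: PrimaryTest,
  modelAvailable: boolean,
  capabilities: string[],
): LaneDecision {
  if (test.lane === "no-model") {
    return { id: test.id, action: "run", reason: "no model required" };
  }
  if (!modelAvailable) {
    return { id: test.id, action: "skip", reason: "model-backed lane disabled" };
  }
  if (test.lane === "capability-gated") {
    const capability = test.requiredCapability ?? "";
    if (capabilities.indexOf(capability) === -1) {
      // The skip reason names the gate so the skip itself is reviewable evidence.
      return { id: test.id, action: "skip", reason: `capability gate not satisfied: ${capability}` };
    }
  }
  return { id: test.id, action: "run", reason: "prerequisites satisfied" };
}
```

For example, `planLane({ id: "PF-8", lane: "capability-gated", requiredCapability: "codex-plugin" }, true, [])` yields a skip whose reason cites the assumed `codex-plugin` gate, while PF-1 runs regardless of model availability.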

Source Map

| Area | Source files to inspect first |
| --- | --- |
| Agent-mux CLI and sessions | `packages/agent-mux/cli/src/commands/run.ts`, `packages/agent-mux/cli/src/commands/launch.ts`, `packages/agent-mux/gateway/src/runs/session-runtime.ts`, `packages/agent-mux/gateway/src/runs/event-log.ts` |
| Babysitter-agent runtime | `packages/babysitter-agent/src/cli/dispatch.ts`, `packages/babysitter-agent/src/cli/commands/harness/createRun.ts`, `packages/babysitter-agent/src/harness/internal/createRun/index.ts`, `packages/babysitter-agent/src/harness/internal/createRun/orchestration/effects.ts` |
| SDK run/session loop | `packages/sdk/src/cli/main/runCreate.ts`, `packages/sdk/src/cli/main/taskCommands.ts`, `packages/sdk/src/cli/commands/session/init.ts`, `packages/sdk/src/cli/commands/session/associate.ts` |
| Hooks-mux | `packages/hooks-mux/cli/src/cli/commands/invoke.ts`, `packages/hooks-mux/cli/src/cli/bootstrap-runtime.ts`, `packages/hooks-mux/core/src/types/event.ts`, `packages/hooks-mux/core/src/normalizer/runner.ts` |
| Transport-mux | `packages/transport-mux/src/index.ts`, `packages/transport-mux/tests/e2e/http-roundtrip.test.ts`, `packages/transport-mux/tests/runtime.test.ts` |
