Page: wiki/docs/harness-features-backlog/roadmap.md

Harness Features Roadmap

147 gaps organized into 7 milestones. Each milestone has a goal, unlocks specific capabilities, and respects dependency ordering. Gaps within a milestone can be worked in parallel unless noted.
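The "respects dependency ordering" rule can be checked mechanically. The sketch below is illustrative only: the `Gap` record and `ordering_violations` helper are hypothetical names, not part of any babysitter API, and the sample holds just a handful of rows from the tables in this roadmap. It flags any gap whose dependency sits in a later milestone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gap:
    """Hypothetical record for one roadmap row (not an existing babysitter type)."""
    gap_id: str
    milestone: int
    depends_on: tuple[str, ...] = ()


# A few rows copied from the milestone tables in this roadmap.
GAPS = [
    Gap("GAP-PROC-004", 0),
    Gap("GAP-JSON-001", 1),
    Gap("GAP-JSON-002", 1, ("GAP-JSON-001",)),
    Gap("GAP-TOOLS-018", 2, ("GAP-PROC-004",)),
    Gap("GAP-USER-012", 3, ("GAP-TOOLS-018",)),
]


def ordering_violations(gaps):
    """Return (gap, dependency) pairs where the dependency is in a later milestone."""
    milestone_of = {g.gap_id: g.milestone for g in gaps}
    return [
        (g.gap_id, dep)
        for g in gaps
        for dep in g.depends_on
        if milestone_of[dep] > g.milestone
    ]


print(ordering_violations(GAPS))  # prints []
```

An empty result means every sampled gap depends only on work in the same or an earlier milestone.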

---

M0: Quick Wins and Foundations

**Goal**: Ship small, no-prerequisite fixes that immediately improve tool parity and process validation. No architectural changes -- just better defaults.

**Unlocks**: Tool feature parity for existing agentic tools, process parameter validation.

| Gap | Title | Effort | Priority |
|---|---|---|---|
| GAP-TOOLS-035 | Grep Output Modes and Context Params | S | Medium |
| GAP-TOOLS-033 | Runtime Configuration Tool | S | Low |
| GAP-TOOLS-038 | Ask Tool Interaction Model Alignment | S | Low |
| GAP-TOOLS-007 | JS/TS REPL Tool | S | Low |
| GAP-PROC-004 | Process Parameter Schemas and Validation | S | Medium |

**Estimated scope**: 5 gaps, all S effort. ~1 week.

---

M1: Core Infrastructure

**Goal**: Build the foundational systems that almost everything else depends on. Prompt strata, governance, session model, JSON API, capability routing, and streaming capture. These are the load-bearing walls.

**Unlocks**: Structured prompt composition, policy-based governance, programmatic run management, session-run relationships, harness capability awareness, live output from dispatched tasks.

| Gap | Title | Effort | Priority | Depends On |
|---|---|---|---|---|
| GAP-PROMPT-001 | Prompt Strata Model | L | Critical | -- |
| GAP-SEC-001 | Governance Policy Layer | L | Critical | -- |
| GAP-SESSION-001 | Session-to-Run One-to-Many | L | Critical | -- |
| GAP-HADAPT-001 | Capability-Based Task Routing | L | Critical | -- |
| GAP-SUBOBS-001 | Streaming Output Capture | L | Critical | -- |
| GAP-JSON-001 | JSON API for Run Creation | L | Critical | -- |
| GAP-JSON-002 | JSON Effect Dispatch Protocol | L | Critical | GAP-JSON-001 |
| GAP-STATE-008 | Run Health Model | M | High | -- |
| GAP-REMOTE-007 | Host Contract Layer | L | High | -- |
| GAP-PAR-009 | Parallel Effect Execution Strategies | M | High | -- |
| GAP-ROUTE-003 | Effect Result Caching and Dedup | M | Medium | -- |

**Estimated scope**: 11 gaps (7 Critical, 3 High, 1 Medium). ~6-8 weeks.
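Since gaps within a milestone can be worked in parallel unless a dependency says otherwise, the M1 table implies a simple schedule. The following sketch groups the M1 gaps into parallel "waves" using the standard-library `graphlib` module. The `parallel_waves` helper is a hypothetical name, and only same-milestone dependencies from the table are modeled.

```python
from graphlib import TopologicalSorter

# M1 gaps mapped to their same-milestone dependencies (from the table above).
M1_DEPS = {
    "GAP-PROMPT-001": set(),
    "GAP-SEC-001": set(),
    "GAP-SESSION-001": set(),
    "GAP-HADAPT-001": set(),
    "GAP-SUBOBS-001": set(),
    "GAP-JSON-001": set(),
    "GAP-JSON-002": {"GAP-JSON-001"},
    "GAP-STATE-008": set(),
    "GAP-REMOTE-007": set(),
    "GAP-PAR-009": set(),
    "GAP-ROUTE-003": set(),
}


def parallel_waves(deps):
    """Group gaps into waves; every gap in a wave can start once the prior wave is done."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = parallel_waves(M1_DEPS)
print(len(waves))  # prints 2
print(waves[1])    # prints ['GAP-JSON-002']
```

Under these assumptions, ten of the eleven M1 gaps can start immediately, and only GAP-JSON-002 has to wait for GAP-JSON-001.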

---

M2: Observability and Control

**Goal**: See what's happening during orchestration and control it. Health monitoring, cost tracking, effect cancellation, progress tracking, structured status views, and the embedded SDK dashboard foundation.

**Unlocks**: Operators can monitor run health in real-time, track costs per effect, cancel runaway tasks, see structured status, and get progress updates from subagents. Breakpoint approval chains work.

| Gap | Title | Effort | Priority | Depends On |
|---|---|---|---|---|
| GAP-SUBOBS-002 | Subagent Progress Tracking | M | High | M1: SUBOBS-001 |
| GAP-SUBOBS-003 | Per-Subagent Token and Cost Tracking | M | High | M1: SUBOBS-001 |
| GAP-TOOLS-030 | Effect Cancellation | M | High | -- |
| GAP-TOOLS-036 | Bash Background Execution | S | Medium | GAP-TOOLS-030 |
| GAP-OBS-001 | Run Health Snapshot | M | High | M1: STATE-008 |
| GAP-OBS-004 | Policy Decision Trail | M | High | M1: SEC-001 |
| GAP-OBS-NEW-001 | Dashboard Webhook and Alert System | M | High | M1: STATE-008 |
| GAP-UX-005 | Structured Orchestration Status View | M | High | M1: STATE-008 |
| GAP-UX-006 | Pending Work Inspector | M | High | -- |
| GAP-USER-006 | Real-Time Cost Tracking | M | High | GAP-SUBOBS-003, GAP-SESSION-004 |
| GAP-SESSION-002 | Session State Persistence and History | M | High | M1: SESSION-001 |
| GAP-SESSION-004 | Session-Level Cost and Budgets | M | High | M1: SESSION-001, GAP-SUBOBS-003 |
| GAP-JSON-003 | JSON Breakpoint Interaction API | M | High | M1: JSON-001 |
| GAP-JSON-004 | JSON Session Management API | M | High | M1: JSON-001 |
| GAP-BRK-001 | Breakpoint Approval Chains | M | High | M1: SEC-001 |
| GAP-SEC-003 | Permission Request and Denial Hooks | L | High | M1: SEC-001 |
| GAP-SEC-005 | Approval Posture Model | M | High | M1: SEC-001, GAP-SEC-003 |
| GAP-PROMPT-002 | Deterministic Capability Projection | M | High | M1: PROMPT-001 |
| GAP-PROMPT-005 | Continuity Overlays for Resume | M | High | M1: PROMPT-001, M1: STATE-008 |
| GAP-TOOLS-014 | Programmatic Task CRUD Beyond CLI | M | High | M1: JSON-001 |
| GAP-TOOLS-018 | Structured Planning Phase | M | High | M0: PROC-004 |
| GAP-PROMPT-008 | Coding Philosophy Prompt Section | S | High | M1: PROMPT-001 |
| GAP-PROMPT-009 | Tool Preference and Usage Rules | S | High | M1: PROMPT-001 |
| GAP-PROMPT-010 | Safety and Reversibility Prompt Framework | S | High | M1: PROMPT-001 |
| GAP-PROMPT-011 | Output Efficiency Rules | S | Medium | M1: PROMPT-001 |
| GAP-PROMPT-012 | Git Safety Protocol Prompt Section | S | Medium | M1: PROMPT-001 |

**Estimated scope**: 26 gaps (mostly M effort). ~10-12 weeks.
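The "Depends On" column mixes three notations: `--` for no dependency, a full `GAP-...` ID for a gap in the same milestone, and `M<n>: <ID>` (without the `GAP-` prefix) for a gap in another milestone. A minimal parser for that cell format might look like the sketch below. The function name `parse_depends_on` is hypothetical, and the notation rules are inferred from the tables rather than from a documented schema.

```python
import re


def parse_depends_on(cell, own_milestone):
    """Parse a 'Depends On' cell into a list of (milestone, gap_id) pairs.

    Assumed notation, inferred from the tables:
      '--'              -> no dependencies
      'GAP-TOOLS-030'   -> dependency in the same milestone
      'M1: SUBOBS-001'  -> dependency in milestone 1, 'GAP-' prefix omitted
    """
    if cell.strip() in ("", "--"):
        return []
    deps = []
    for part in cell.split(","):
        part = part.strip()
        match = re.fullmatch(r"M(\d+):\s*(\S+)", part)
        if match:
            deps.append((int(match.group(1)), "GAP-" + match.group(2)))
        else:
            deps.append((own_milestone, part))
    return deps


# GAP-SESSION-004 row from the M2 table.
print(parse_depends_on("M1: SESSION-001, GAP-SUBOBS-003", own_milestone=2))
# prints [(1, 'GAP-SESSION-001'), (2, 'GAP-SUBOBS-003')]
```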

---

M3: Multi-Harness Orchestration

**Goal**: Route effects to the right harness, run tasks in parallel across harnesses, isolate work in worktrees, compose processes, and support delegation policies. This is where babysitter becomes a true multi-harness orchestrator.

**Unlocks**: Tasks automatically routed to the best harness for the job. Parallel execution across multiple harnesses. Git worktree isolation. Process chaining. Model selection per task. Fallback chains when a harness is unavailable.

| Gap | Title | Effort | Priority | Depends On |
|---|---|---|---|---|
| GAP-AGENT-001 | Sub-Harness Invocation with Isolation | XL | High | M1: HADAPT-001, M1: SUBOBS-001 |
| GAP-PAR-001 | Concurrent Effect Execution | L | High | M1: PAR-009 |
| GAP-PAR-002 | Async Effect Execution | L | High | GAP-PAR-001, M2: SUBOBS-002 |
| GAP-PAR-003 | Multi-Harness Parallel Dispatch | XL | High | GAP-PAR-001, GAP-PAR-002, M1: HADAPT-001 |
| GAP-TOOLS-017 | Git Worktree Isolation | L | High | GAP-PAR-001, GAP-AGENT-001 |
| GAP-HADAPT-002 | Model Selection Per Task | M | High | M1: HADAPT-001 |
| GAP-HADAPT-004 | Harness Fallback Chains | M | High | M1: HADAPT-001 |
| GAP-AGENT-005 | Cross-Run Communication | L | High | GAP-AGENT-001 |
| GAP-AGENT-006 | Cross-Run State Sharing | L | High | M1: SESSION-001 |
| GAP-AGENT-008 | Harness Selection Policies | M | High | M1: HADAPT-001 |
| GAP-ROUTE-001 | Smart Effect Routing Engine | XL | High | M1: HADAPT-001 |
| GAP-PROC-001 | Process Chaining and Pipelines | M | High | M0: PROC-004 |
| GAP-PROC-002 | Process Nesting and Sub-Process | L | High | GAP-AGENT-001 |
| GAP-TOOLS-023 | Multi-Step Workflow Composition | L | High | GAP-PROC-001, GAP-PROC-002 |
| GAP-PERF-001 | Prompt Caching (Ephemeral) | L | Critical | M1: PROMPT-001 |
| GAP-PERF-002 | Session Compaction | XL | Critical | M1: PROMPT-001 |
| GAP-PERF-005 | Cache-Aware Prompt Assembly | L | High | M1: PROMPT-001, GAP-PERF-001 |
| GAP-PERF-008 | Structured Continuity State | L | High | M2: PROMPT-005 |
| GAP-STATE-003 | Session State Persistence | L | High | M1: SESSION-001 |
| GAP-STATE-001 | Long-Term Memory Extraction | L | High | M2: SESSION-002 |
| GAP-USER-001 | Operator Command Layer | L | High | M2: UX-005 |
| GAP-USER-012 | Plan Mode with Verification | M | High | M2: TOOLS-018 |
| GAP-TOOLS-008 | Web Search Agentic Tool | M | Medium | M1: HADAPT-001 |

**Estimated scope**: 23 gaps (2 Critical, 20 High, 1 Medium; includes 4 XL). ~12-16 weeks.

---

M4: MCP and External Integration

**Goal**: Connect babysitter to the outside world. MCP tool discovery and invocation, channel messaging (Slack/Gmail/Calendar), remote sessions, event triggers, and streaming protocols. Babysitter becomes a platform, not just a local orchestrator.

**Unlocks**: MCP server tools callable from processes. Slack/Gmail/Calendar integration via MCP channels. Breakpoint approval from Slack. Remote WebSocket sessions. Webhook-triggered runs. Daemon mode for always-on orchestration.

| Gap | Title | Effort | Priority | Depends On |
|---|---|---|---|---|
| GAP-TOOLS-025 | MCP Tool Discovery and Invocation | M | High | M1: HADAPT-001, GAP-REMOTE-006 |
| GAP-REMOTE-006 | MCP Client Integration | L | Medium | M1: SEC-001 |
| GAP-MCPC-001 | MCP Channel Inbound Messaging | L | High | GAP-TOOLS-025 |
| GAP-MCPC-002 | MCP Channel Outbound Messaging | M | High | GAP-MCPC-001 |
| GAP-MCPC-003 | Channel Permission Relay | L | High | GAP-MCPC-001, GAP-BRK-002 |
| GAP-MCPC-004 | MCP Server Management UI | M | Medium | GAP-TOOLS-025 |
| GAP-TOOLS-031 | MCP Resource Browsing and Reading | M | Medium | GAP-TOOLS-025 |
| GAP-TOOLS-032 | MCP Authentication (OAuth) | L | Medium | GAP-TOOLS-025 |
| GAP-TOOLS-034 | Dynamic Tool Discovery and Search | M | Medium | GAP-TOOLS-025 |
| GAP-JSON-005 | JSON Event Stream (SSE/WebSocket) | L | High | M1: JSON-001, GAP-REMOTE-008 |
| GAP-REMOTE-001 | Daemon Mode | XL | High | -- |
| GAP-REMOTE-003 | Remote Sessions (WebSocket) | XL | High | M1: REMOTE-007, GAP-JSON-005 |
| GAP-REMOTE-008 | Streaming Orchestration Protocol | L | Medium | M1: REMOTE-007 |
| GAP-REMOTE-009 | Host-Mediated Interaction | L | Medium | M1: REMOTE-007 |
| GAP-REMOTE-004 | Cron Triggers and Scheduling | L | Medium | -- |
| GAP-TOOLS-020 | Scheduled Orchestration Triggers | L | Medium | GAP-REMOTE-001, GAP-REMOTE-004 |
| GAP-TOOLS-021 | External Event Triggers | L | Medium | GAP-REMOTE-001, GAP-TOOLS-020 |
| GAP-BRK-002 | Breakpoint Delegation to External Systems | L | High | M2: JSON-003, M2: OBS-NEW-001 |
| GAP-SEC-002 | Trust Classes for Plugins | L | High | M1: SEC-001 |
| GAP-SEC-006 | OAuth Integration | L | Medium | M1: SEC-001 |
| GAP-TOOLS-028 | Sleep/Delay Effect Enhancement | S | Low | GAP-TOOLS-020 |

**Estimated scope**: 21 gaps (includes 2 XL). ~10-14 weeks.

---

M5: Rich UI and Experience

**Goal**: Build the Ink/React rendering foundation and all the UI components that make orchestration a first-class visual experience. Structured diffs, effect trees, streaming panels, message rendering, embedded SDK dashboard with drill-down.

**Unlocks**: Rich terminal UI for orchestration. Visual effect trees. Structured diff rendering. Streaming output panels. Subagent drill-down in the embedded SDK dashboard. Operator mode selection.

| Gap | Title | Effort | Priority | Depends On |
|---|---|---|---|---|
| GAP-UX-001 | Ink/React Terminal Rendering Foundation | L | High | -- |
| GAP-UX-001a | Effect Tree Visualization | M | High | GAP-UX-001 |
| GAP-UX-001b | Structured Diff Rendering | M | Medium | GAP-UX-001 |
| GAP-UX-001c | Permission and Breakpoint Approval UI | M | High | GAP-UX-001, M2: BRK-001 |
| GAP-UX-001d | Message Type Rendering | L | Medium | GAP-UX-001 |
| GAP-UX-001e | Progress and Status Line | S | High | GAP-UX-001 |
| GAP-UX-001f | Streaming Output Panels | L | High | GAP-UX-001, M1: SUBOBS-001 |
| GAP-SUBOBS-005 | Dashboard Subagent Drill-Down | L | Medium | M2: SUBOBS-002, M2: SUBOBS-003 |
| GAP-OBS-002 | Phase Timeline Visualization | M | Medium | M2: OBS-001 |
| GAP-OBS-003 | Prompt Plan Observability | M | Medium | M1: PROMPT-001 |
| GAP-OBS-005 | Context Introspection | M | Medium | M2: SESSION-004 |
| GAP-OBS-008 | Agent Progress Summarization | M | Medium | M2: OBS-001 |
| GAP-OBS-NEW-002 | Dashboard API for External Dashboards | L | Medium | M1: JSON-001 |
| GAP-UX-007 | Rich Breakpoint Interaction | M | Medium | M2: SEC-005 |
| GAP-UX-008 | Resume Dashboard | M | Medium | M2: PROMPT-005 |
| GAP-UX-009 | Failure Triage View | M | Medium | M2: OBS-001 |
| GAP-UX-010 | Typed Effect Interaction Patterns | M | Medium | M2: JSON-003 |
| GAP-UX-011 | Command Discoverability | M | Medium | -- |
| GAP-UX-014 | Operator Mode Selection | M | Medium | M1: PROMPT-001 |
| GAP-PERF-004 | Streaming Message Rendering | L | High | M1: SUBOBS-001 |
| GAP-PERF-006 | Incremental Orchestration Streaming | L | Medium | M4: JSON-005 |
| GAP-TOOLS-029 | Structured Output Tool | M | Medium | GAP-UX-001b, GAP-UX-001d |
| GAP-TOOLS-037 | Fetch Content Processing | M | Low | -- |
GAP-TOOLS-037Fetch Content ProcessingMLow--

**Estimated scope**: 23 gaps. ~10-14 weeks.

---

M6: Platform and Ecosystem

**Goal**: CC plugin compatibility, marketplace protocol, auto-update, trust model, process versioning, memory systems, and remaining polish. Babysitter becomes a full platform with an ecosystem.

**Unlocks**: CC plugins run on babysitter. Marketplace browsing and install. Plugin trust and blocklist. Process versioning and migration. Long-term memory consolidation. Session sharing. Run forking. Full audit export.

| Gap | Title | Effort | Priority | Depends On |
|---|---|---|---|---|
| GAP-ECO-001 | CC Plugin Compatibility Layer | XL | Critical | GAP-ECO-002, GAP-ECO-003 |
| GAP-ECO-002 | CC Marketplace Protocol Support | L | High | -- |
| GAP-ECO-003 | Plugin Trust and Blocklist | M | High | M1: SEC-001 |
| GAP-ECO-004 | Plugin Auto-Update and Versioning | M | Medium | GAP-ECO-002 |
| GAP-ECO-005 | Plugin Validation and Diagnostics | S | Medium | GAP-ECO-001 |
| GAP-AGENT-003 | Process Orchestration with Effect Routing | XL | High | M3: AGENT-001, M3: ROUTE-001 |
| GAP-AGENT-004 | Built-in Process Templates | L | Medium | M3: HADAPT-002 |
| GAP-AGENT-007 | Delegation Policy Layer | L | Medium | M1: SEC-001, M1: HADAPT-001 |
| GAP-HADAPT-003 | Cost-Based Routing Policies | L | High | M1: HADAPT-001, M2: SESSION-004 |
| GAP-HADAPT-005 | Harness Health and Circuit Breaker | M | Medium | M1: HADAPT-001 |
| GAP-SUBOBS-004 | Subagent Health and Timeout Monitoring | M | Medium | M1: SUBOBS-001, M2: SUBOBS-003 |
| GAP-PROC-003 | Process Versioning and Migration | L | Medium | -- |
| GAP-STATE-002 | Memory Consolidation | L | Medium | M3: STATE-001 |
| GAP-STATE-006 | Session Rewind and History | L | Medium | M3: STATE-003 |
| GAP-ROUTE-002 | Effect Priority and Scheduling | M | Medium | M1: PAR-009 |
| GAP-PAR-005 | Parallel File Operations | M | Medium | M3: PAR-001 |
| GAP-PAR-006 | Streaming Parallelism | M | Medium | M3: PAR-001 |
| GAP-PAR-010 | Fork-Join Process Pattern | L | Medium | M3: PAR-003, M3: PROC-002 |
| GAP-PERF-007 | Aggressive Parallelism | L | Medium | M3: PAR-001 |
| GAP-RUN-001 | Run Comparison and Diffing | M | Medium | -- |
| GAP-RUN-002 | Run Archival and Restore | M | Low | -- |
| GAP-RUN-003 | Run Forking and Branching | L | Medium | GAP-STATE-006 |
| GAP-SESSION-003 | Session Templates and Presets | M | Medium | M1: SESSION-001 |
| GAP-SESSION-005 | Session Sharing and Collaboration | L | Low | M2: SESSION-002, M4: REMOTE-003 |
| GAP-OBS-006 | Analytics and Feature Flags | L | Medium | GAP-ECO-004 |
| GAP-OBS-007 | Audit Export | M | Medium | M2: OBS-004 |
| GAP-USER-017 | Plugin Management Integration | M | High | GAP-ECO-001 |
| GAP-SEC-004 | Sandbox Toggle | M | Medium | M1: SEC-001 |
| GAP-SEC-007 | Privacy Settings | M | Medium | M1: SEC-001 |
| GAP-PROMPT-003 | Runtime Personality Overlays | M | Medium | M1: PROMPT-001 |
| GAP-PROMPT-004 | Prompt Inspection Tooling | M | Medium | M1: PROMPT-001 |
| GAP-PROMPT-006 | Instructions Loaded Hook | M | Medium | M1: PROMPT-001 |
| GAP-PROMPT-007 | Context Compression Families | L | Medium | M1: PROMPT-001 |
| GAP-TOOLS-012 | LSP Integration | L | High | M3: ROUTE-001, M0: PROC-004 |
| GAP-TOOLS-026 | Structured User Interaction from Effects | M | Medium | M2: JSON-003 |
| GAP-TOOLS-027 | Skill Discovery from Process Definitions | M | Medium | M1: HADAPT-001 |
| GAP-PROF-001 | Auto-Configure from User Profile | M | Medium | GAP-ECO-004 |
| GAP-BRK-003 | Breakpoint Analytics and SLA Tracking | S | Low | M2: OBS-004 |

**Estimated scope**: 38 gaps (includes 2 XL). ~16-20 weeks.
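For a rough sense of the overall timeline, the per-milestone week estimates above can be summed. The sketch below assumes the milestones run strictly one after another, which is a pessimistic simplification since some milestones may overlap. The `ESTIMATES` table is just a transcription of the "Estimated scope" lines.

```python
# (low, high) week estimates transcribed from each milestone's "Estimated scope" line.
ESTIMATES = {
    "M0": (1, 1),
    "M1": (6, 8),
    "M2": (10, 12),
    "M3": (12, 16),
    "M4": (10, 14),
    "M5": (10, 14),
    "M6": (16, 20),
}

# Assumption: strictly sequential milestones, no overlap.
total_low = sum(low for low, _ in ESTIMATES.values())
total_high = sum(high for _, high in ESTIMATES.values())

print(f"{total_low}-{total_high} weeks")  # prints 65-85 weeks
```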

---

Milestone Summary

| Milestone | Gaps | Goal | Cumulative |
|---|---|---|---|
| **M0** Quick Wins | 5 | Tool parity polish + process validation | 5 |
| **M1** Core Infrastructure | 11 | Foundational systems everything depends on | 16 |
| **M2** Observability & Control | 26 | See what's happening, control it | 42 |
| **M3** Multi-Harness Orchestration | 23 | Route, parallelize, compose across harnesses | 65 |
| **M4** MCP & External Integration | 21 | Connect to outside world: MCP, channels, remote | 86 |
| **M5** Rich UI & Experience | 23 | Visual orchestration experience | 109 |
| **M6** Platform & Ecosystem | 38 | Full platform with plugin ecosystem | 147 |
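The Cumulative column is a running total of the Gaps column, which is easy to verify with a short check. The list below is simply the Gaps column transcribed in milestone order.

```python
from itertools import accumulate

# Gaps per milestone, M0 through M6, from the summary table.
GAPS_PER_MILESTONE = [5, 11, 26, 23, 21, 23, 38]

cumulative = list(accumulate(GAPS_PER_MILESTONE))
print(cumulative)  # prints [5, 16, 42, 65, 86, 109, 147]
```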

Dependency Graph (Milestones)

```
M0 (Quick Wins) ──────────────────────────────────────────┐
  │                                                        │
  v                                                        │
M1 (Core Infrastructure) ─────────────────────────────┐    │
  │                                                    │    │
  ├──> M2 (Observability & Control) ──────────┐        │    │
  │                                            │        │    │
  ├──> M3 (Multi-Harness Orchestration) <──────┤        │    │
  │         │                                  │        │    │
  │         ├──> M4 (MCP & External) <─────────┘        │    │
  │         │                                           │    │
  │         └──> M5 (Rich UI) <─────────────────────────┘    │
  │                   │                                      │
  └──> M6 (Platform & Ecosystem) <───────────────────────────┘
```

M3 and M4 can partially overlap (MCP client work can start while multi-harness is in progress). M5 can start its foundation (GAP-UX-001) any time after M1. M6 is the long tail -- work items can be pulled forward if priorities shift.
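The milestone graph can also be expressed as data and queried. The sketch below encodes one reading of the ASCII diagram as a predecessor map (the edge list is an interpretation, not an authoritative source) and uses the standard-library `graphlib` module to produce a valid milestone order. The `ready_after` helper is a hypothetical name.

```python
from graphlib import TopologicalSorter

# Milestone -> milestones it waits on, as read from the diagram (an interpretation).
PREDECESSORS = {
    "M0": set(),
    "M1": {"M0"},
    "M2": {"M1"},
    "M3": {"M1", "M2"},
    "M4": {"M2", "M3"},
    "M5": {"M1", "M3"},
    "M6": {"M0", "M1", "M5"},
}


def ready_after(completed):
    """Milestones not yet done whose predecessors are all in the completed set."""
    return sorted(
        m for m, preds in PREDECESSORS.items()
        if m not in completed and preds <= completed
    )


# Any order returned here respects every edge in PREDECESSORS.
order = list(TopologicalSorter(PREDECESSORS).static_order())
print(order)
print(ready_after({"M0", "M1"}))  # prints ['M2']
```

This milestone-level model is deliberately coarse. It treats each milestone as all-or-nothing, so it cannot express the partial overlaps described above, such as starting GAP-UX-001 right after M1 or beginning MCP client work while M3 is in progress.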

Critical Path

The fastest path to production-grade multi-harness orchestration:

```
M0 → M1 (PROMPT-001 + HADAPT-001 + SESSION-001 + JSON-001/002)
   → M2 (SUBOBS-002/003 + TOOLS-030 + OBS-001)
   → M3 (AGENT-001 + PAR-001/002/003 + PERF-001/002)
```

Everything else enhances this core. The critical blockers are:

1. **GAP-PROMPT-001** (Prompt Strata) -- 19 gaps depend on it
2. **GAP-HADAPT-001** (Capability Routing) -- 15 gaps depend on it
3. **GAP-SESSION-001** (Session Model) -- 8 gaps depend on it
4. **GAP-SEC-001** (Governance) -- 12 gaps depend on it
5. **GAP-SUBOBS-001** (Streaming Capture) -- 7 gaps depend on it
6. **GAP-JSON-001** (JSON API) -- 7 gaps depend on it
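Blocker counts like these come from counting how many gaps name a given gap as a dependency. The sketch below shows the counting technique on a small sample of M2 rows only, so its numbers are not expected to match the figures above, which may also include gaps from other milestones or transitive dependents. The `direct_dependents` helper is a hypothetical name.

```python
from collections import Counter

# Sample of M2 rows: gap -> gaps it depends on (IDs normalized to the GAP- form).
DEPENDS_ON = {
    "GAP-PROMPT-002": ["GAP-PROMPT-001"],
    "GAP-PROMPT-005": ["GAP-PROMPT-001", "GAP-STATE-008"],
    "GAP-PROMPT-008": ["GAP-PROMPT-001"],
    "GAP-OBS-004": ["GAP-SEC-001"],
    "GAP-BRK-001": ["GAP-SEC-001"],
    "GAP-SEC-003": ["GAP-SEC-001"],
    "GAP-SEC-005": ["GAP-SEC-001", "GAP-SEC-003"],
}


def direct_dependents(depends_on):
    """Count, for each gap, how many other gaps list it as a direct dependency."""
    counts = Counter()
    for deps in depends_on.values():
        counts.update(deps)
    return counts


counts = direct_dependents(DEPENDS_ON)
print(counts.most_common(2))  # prints [('GAP-SEC-001', 4), ('GAP-PROMPT-001', 3)]
```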
