Agentic AI Atlas by a5c.ai

Harness Features Roadmap

147 gaps organized into 7 milestones. Each milestone has a goal, unlocks specific capabilities, and respects dependency ordering. Gaps within a milestone can be worked in parallel unless noted.

---

M0: Quick Wins and Foundations

**Goal**: Ship small improvements with no prerequisites that immediately raise tool parity and tighten process validation. No architectural changes -- just better defaults.

**Unlocks**: Tool feature parity for existing agentic tools, process parameter validation.

| Gap | Title | Effort | Priority |
|---|---|---|---|
| GAP-TOOLS-035 | Grep Output Modes and Context Params | S | Medium |
| GAP-TOOLS-033 | Runtime Configuration Tool | S | Low |
| GAP-TOOLS-038 | Ask Tool Interaction Model Alignment | S | Low |
| GAP-TOOLS-007 | JS/TS REPL Tool | S | Low |
| GAP-PROC-004 | Process Parameter Schemas and Validation | S | Medium |

**Estimated scope**: 5 gaps, all S effort. ~1 week.

---

M1: Core Infrastructure

**Goal**: Build the foundational systems that almost everything else depends on: prompt strata, governance, the session model, the JSON API, capability routing, and streaming capture. These are the load-bearing walls.

**Unlocks**: Structured prompt composition, policy-based governance, programmatic run management, session-run relationships, harness capability awareness, live output from dispatched tasks.

| Gap | Title | Effort | Priority | Depends On |
|---|---|---|---|---|
| GAP-PROMPT-001 | Prompt Strata Model | L | Critical | -- |
| GAP-SEC-001 | Governance Policy Layer | L | Critical | -- |
| GAP-SESSION-001 | Session-to-Run One-to-Many | L | Critical | -- |
| GAP-HADAPT-001 | Capability-Based Task Routing | L | Critical | -- |
| GAP-SUBOBS-001 | Streaming Output Capture | L | Critical | -- |
| GAP-JSON-001 | JSON API for Run Creation | L | Critical | -- |
| GAP-JSON-002 | JSON Effect Dispatch Protocol | L | Critical | GAP-JSON-001 |
| GAP-STATE-008 | Run Health Model | M | High | -- |
| GAP-REMOTE-007 | Host Contract Layer | L | High | -- |
| GAP-PAR-009 | Parallel Effect Execution Strategies | M | High | -- |
| GAP-ROUTE-003 | Effect Result Caching and Dedup | M | Medium | -- |

**Estimated scope**: 11 gaps (7 Critical, 3 High, 1 Medium). ~6-8 weeks.

---

M2: Observability and Control

**Goal**: See what's happening during orchestration and control it. Health monitoring, cost tracking, effect cancellation, progress tracking, structured status views, and the embedded SDK dashboard foundation.

**Unlocks**: Operators can monitor run health in real time, track costs per effect, cancel runaway tasks, see structured status, and get progress updates from subagents. Breakpoint approval chains work.

| Gap | Title | Effort | Priority | Depends On |
|---|---|---|---|---|
| GAP-SUBOBS-002 | Subagent Progress Tracking | M | High | M1: SUBOBS-001 |
| GAP-SUBOBS-003 | Per-Subagent Token and Cost Tracking | M | High | M1: SUBOBS-001 |
| GAP-TOOLS-030 | Effect Cancellation | M | High | -- |
| GAP-TOOLS-036 | Bash Background Execution | S | Medium | GAP-TOOLS-030 |
| GAP-OBS-001 | Run Health Snapshot | M | High | M1: STATE-008 |
| GAP-OBS-004 | Policy Decision Trail | M | High | M1: SEC-001 |
| GAP-OBS-NEW-001 | Dashboard Webhook and Alert System | M | High | M1: STATE-008 |
| GAP-UX-005 | Structured Orchestration Status View | M | High | M1: STATE-008 |
| GAP-UX-006 | Pending Work Inspector | M | High | -- |
| GAP-USER-006 | Real-Time Cost Tracking | M | High | GAP-SUBOBS-003, GAP-SESSION-004 |
| GAP-SESSION-002 | Session State Persistence and History | M | High | M1: SESSION-001 |
| GAP-SESSION-004 | Session-Level Cost and Budgets | M | High | M1: SESSION-001, GAP-SUBOBS-003 |
| GAP-JSON-003 | JSON Breakpoint Interaction API | M | High | M1: JSON-001 |
| GAP-JSON-004 | JSON Session Management API | M | High | M1: JSON-001 |
| GAP-BRK-001 | Breakpoint Approval Chains | M | High | M1: SEC-001 |
| GAP-SEC-003 | Permission Request and Denial Hooks | L | High | M1: SEC-001 |
| GAP-SEC-005 | Approval Posture Model | M | High | M1: SEC-001, GAP-SEC-003 |
| GAP-PROMPT-002 | Deterministic Capability Projection | M | High | M1: PROMPT-001 |
| GAP-PROMPT-005 | Continuity Overlays for Resume | M | High | M1: PROMPT-001, M1: STATE-008 |
| GAP-TOOLS-014 | Programmatic Task CRUD Beyond CLI | M | High | M1: JSON-001 |
| GAP-TOOLS-018 | Structured Planning Phase | M | High | M0: PROC-004 |
| GAP-PROMPT-008 | Coding Philosophy Prompt Section | S | High | M1: PROMPT-001 |
| GAP-PROMPT-009 | Tool Preference and Usage Rules | S | High | M1: PROMPT-001 |
| GAP-PROMPT-010 | Safety and Reversibility Prompt Framework | S | High | M1: PROMPT-001 |
| GAP-PROMPT-011 | Output Efficiency Rules | S | Medium | M1: PROMPT-001 |
| GAP-PROMPT-012 | Git Safety Protocol Prompt Section | S | Medium | M1: PROMPT-001 |

**Estimated scope**: 26 gaps (mostly M effort). ~10-12 weeks.
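
The "Depends On" cells use a compact notation: `--` for no prerequisite, a full `GAP-…` ID for a gap that appears in the same milestone table, and `Mn: SHORT-ID` for a gap in milestone `n`. A minimal sketch of parsing that notation follows. The function name, the regular expression, and the rule that a bare `GAP-…` reference stays in the current milestone are assumptions inferred from the tables, not part of any documented tooling.

```python
import re

# Optional "Mn: " milestone prefix, optional "GAP-" prefix, then the short ID.
# Short IDs look like SUBOBS-001, OBS-NEW-001, or UX-001b.
REF_PATTERN = re.compile(
    r"^(?:M(?P<milestone>\d+):\s*)?(?:GAP-)?(?P<gap>[A-Z]+(?:-[A-Z]+)*-\d+[a-z]?)$"
)


def parse_depends_on(cell, own_milestone):
    """Turn one "Depends On" cell into a list of (milestone, gap_id) pairs.

    Assumption: a reference without an "Mn:" prefix points at a gap in the
    same milestone as the row being parsed.
    """
    cell = cell.strip()
    if cell in ("", "--"):
        return []
    refs = []
    for part in cell.split(","):
        match = REF_PATTERN.match(part.strip())
        if match is None:
            raise ValueError(f"unrecognised dependency reference: {part!r}")
        milestone = int(match["milestone"]) if match["milestone"] else own_milestone
        refs.append((milestone, "GAP-" + match["gap"]))
    return refs


if __name__ == "__main__":
    # The GAP-SESSION-004 row from the M2 table.
    print(parse_depends_on("M1: SESSION-001, GAP-SUBOBS-003", own_milestone=2))
```

Normalizing every reference to a `(milestone, gap_id)` pair makes it straightforward to check that each prefix names the milestone where the referenced gap is actually listed.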

---

M3: Multi-Harness Orchestration

**Goal**: Route effects to the right harness, run tasks in parallel across harnesses, isolate work in worktrees, compose processes, and support delegation policies. This is where babysitter becomes a true multi-harness orchestrator.

**Unlocks**: Tasks automatically routed to the best harness for the job. Parallel execution across multiple harnesses. Git worktree isolation. Process chaining. Model selection per task. Fallback chains when a harness is unavailable.

| Gap | Title | Effort | Priority | Depends On |
|---|---|---|---|---|
| GAP-AGENT-001 | Sub-Harness Invocation with Isolation | XL | High | M1: HADAPT-001, M1: SUBOBS-001 |
| GAP-PAR-001 | Concurrent Effect Execution | L | High | M1: PAR-009 |
| GAP-PAR-002 | Async Effect Execution | L | High | GAP-PAR-001, M2: SUBOBS-002 |
| GAP-PAR-003 | Multi-Harness Parallel Dispatch | XL | High | GAP-PAR-001, GAP-PAR-002, M1: HADAPT-001 |
| GAP-TOOLS-017 | Git Worktree Isolation | L | High | GAP-PAR-001, GAP-AGENT-001 |
| GAP-HADAPT-002 | Model Selection Per Task | M | High | M1: HADAPT-001 |
| GAP-HADAPT-004 | Harness Fallback Chains | M | High | M1: HADAPT-001 |
| GAP-AGENT-005 | Cross-Run Communication | L | High | GAP-AGENT-001 |
| GAP-AGENT-006 | Cross-Run State Sharing | L | High | M1: SESSION-001 |
| GAP-AGENT-008 | Harness Selection Policies | M | High | M1: HADAPT-001 |
| GAP-ROUTE-001 | Smart Effect Routing Engine | XL | High | M1: HADAPT-001 |
| GAP-PROC-001 | Process Chaining and Pipelines | M | High | M0: PROC-004 |
| GAP-PROC-002 | Process Nesting and Sub-Process | L | High | GAP-AGENT-001 |
| GAP-TOOLS-023 | Multi-Step Workflow Composition | L | High | GAP-PROC-001, GAP-PROC-002 |
| GAP-PERF-001 | Prompt Caching (Ephemeral) | L | Critical | M1: PROMPT-001 |
| GAP-PERF-002 | Session Compaction | XL | Critical | M1: PROMPT-001 |
| GAP-PERF-005 | Cache-Aware Prompt Assembly | L | High | M1: PROMPT-001, GAP-PERF-001 |
| GAP-PERF-008 | Structured Continuity State | L | High | M2: PROMPT-005 |
| GAP-STATE-003 | Session State Persistence | L | High | M1: SESSION-001 |
| GAP-STATE-001 | Long-Term Memory Extraction | L | High | M2: SESSION-002 |
| GAP-USER-001 | Operator Command Layer | L | High | M2: UX-005 |
| GAP-USER-012 | Plan Mode with Verification | M | High | M2: TOOLS-018 |
| GAP-TOOLS-008 | Web Search Agentic Tool | M | Medium | M1: HADAPT-001 |

**Estimated scope**: 23 gaps (2 Critical, 20 High, 1 Medium; includes 4 XL). ~12-16 weeks.
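
Gaps within a milestone can run in parallel except where one depends on another. As a rough sketch of how the M3 rows might be grouped into parallel waves, the following uses Python's standard `graphlib` module (Python 3.9+) on a subset of the intra-milestone dependencies from the table above. The names `M3_INTERNAL_DEPS` and `parallel_waves` are illustrative, and cross-milestone prerequisites such as `M2: SUBOBS-002` are assumed to be complete already.

```python
from graphlib import TopologicalSorter

# Subset of M3: each gap maps to the M3 gaps it depends on.
# Dependencies on earlier milestones are deliberately left out.
M3_INTERNAL_DEPS = {
    "GAP-AGENT-001": set(),
    "GAP-PAR-001": set(),
    "GAP-PAR-002": {"GAP-PAR-001"},
    "GAP-PAR-003": {"GAP-PAR-001", "GAP-PAR-002"},
    "GAP-TOOLS-017": {"GAP-PAR-001", "GAP-AGENT-001"},
    "GAP-PROC-001": set(),
    "GAP-PROC-002": {"GAP-AGENT-001"},
    "GAP-TOOLS-023": {"GAP-PROC-001", "GAP-PROC-002"},
}


def parallel_waves(deps):
    """Group gaps into waves; every gap in a wave can start at the same time."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the dependencies loop
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything unblocked right now
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave finished to unblock the next
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(parallel_waves(M3_INTERNAL_DEPS), start=1):
        print(f"wave {number}: {', '.join(wave)}")
```

For this subset the sketch yields three waves: the gaps with no M3 prerequisites first, then `GAP-PAR-002`, `GAP-PROC-002`, and `GAP-TOOLS-017`, and finally `GAP-PAR-003` and `GAP-TOOLS-023`.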

---

M4: MCP and External Integration

**Goal**: Connect babysitter to the outside world. MCP tool discovery and invocation, channel messaging (Slack/Gmail/Calendar), remote sessions, event triggers, and streaming protocols. Babysitter becomes a platform, not just a local orchestrator.

**Unlocks**: MCP server tools callable from processes. Slack/Gmail/Calendar integration via MCP channels. Breakpoint approval from Slack. Remote WebSocket sessions. Webhook-triggered runs. Daemon mode for always-on orchestration.

| Gap | Title | Effort | Priority | Depends On |
|---|---|---|---|---|
| GAP-TOOLS-025 | MCP Tool Discovery and Invocation | M | High | M1: HADAPT-001, GAP-REMOTE-006 |
| GAP-REMOTE-006 | MCP Client Integration | L | Medium | M1: SEC-001 |
| GAP-MCPC-001 | MCP Channel Inbound Messaging | L | High | GAP-TOOLS-025 |
| GAP-MCPC-002 | MCP Channel Outbound Messaging | M | High | GAP-MCPC-001 |
| GAP-MCPC-003 | Channel Permission Relay | L | High | GAP-MCPC-001, GAP-BRK-002 |
| GAP-MCPC-004 | MCP Server Management UI | M | Medium | GAP-TOOLS-025 |
| GAP-TOOLS-031 | MCP Resource Browsing and Reading | M | Medium | GAP-TOOLS-025 |
| GAP-TOOLS-032 | MCP Authentication (OAuth) | L | Medium | GAP-TOOLS-025 |
| GAP-TOOLS-034 | Dynamic Tool Discovery and Search | M | Medium | GAP-TOOLS-025 |
| GAP-JSON-005 | JSON Event Stream (SSE/WebSocket) | L | High | M1: JSON-001, GAP-REMOTE-008 |
| GAP-REMOTE-001 | Daemon Mode | XL | High | -- |
| GAP-REMOTE-003 | Remote Sessions (WebSocket) | XL | High | M1: REMOTE-007, GAP-JSON-005 |
| GAP-REMOTE-008 | Streaming Orchestration Protocol | L | Medium | M1: REMOTE-007 |
| GAP-REMOTE-009 | Host-Mediated Interaction | L | Medium | M1: REMOTE-007 |
| GAP-REMOTE-004 | Cron Triggers and Scheduling | L | Medium | -- |
| GAP-TOOLS-020 | Scheduled Orchestration Triggers | L | Medium | GAP-REMOTE-001, GAP-REMOTE-004 |
| GAP-TOOLS-021 | External Event Triggers | L | Medium | GAP-REMOTE-001, GAP-TOOLS-020 |
| GAP-BRK-002 | Breakpoint Delegation to External Systems | L | High | M2: JSON-003, M2: OBS-NEW-001 |
| GAP-SEC-002 | Trust Classes for Plugins | L | High | M1: SEC-001 |
| GAP-SEC-006 | OAuth Integration | L | Medium | M1: SEC-001 |
| GAP-TOOLS-028 | Sleep/Delay Effect Enhancement | S | Low | GAP-TOOLS-020 |

**Estimated scope**: 21 gaps (includes 2 XL). ~10-14 weeks.

---

M5: Rich UI and Experience

**Goal**: Build the Ink/React rendering foundation and all the UI components that make orchestration a first-class visual experience. Structured diffs, effect trees, streaming panels, message rendering, embedded SDK dashboard with drill-down.

**Unlocks**: Rich terminal UI for orchestration. Visual effect trees. Structured diff rendering. Streaming output panels. Subagent drill-down in the embedded SDK dashboard. Operator mode selection.

| Gap | Title | Effort | Priority | Depends On |
|---|---|---|---|---|
| GAP-UX-001e | Progress and Status Line | S | High | GAP-UX-001 |
| GAP-UX-001 | Ink/React Terminal Rendering Foundation | L | High | -- |
| GAP-UX-001a | Effect Tree Visualization | M | High | GAP-UX-001 |
| GAP-UX-001b | Structured Diff Rendering | M | Medium | GAP-UX-001 |
| GAP-UX-001c | Permission and Breakpoint Approval UI | M | High | GAP-UX-001, M2: BRK-001 |
| GAP-UX-001d | Message Type Rendering | L | Medium | GAP-UX-001 |
| GAP-UX-001f | Streaming Output Panels | L | High | GAP-UX-001, M1: SUBOBS-001 |
| GAP-SUBOBS-005 | Dashboard Subagent Drill-Down | L | Medium | M2: SUBOBS-002, M2: SUBOBS-003 |
| GAP-OBS-002 | Phase Timeline Visualization | M | Medium | M2: OBS-001 |
| GAP-OBS-003 | Prompt Plan Observability | M | Medium | M1: PROMPT-001 |
| GAP-OBS-005 | Context Introspection | M | Medium | M2: SESSION-004 |
| GAP-OBS-008 | Agent Progress Summarization | M | Medium | M2: OBS-001 |
| GAP-OBS-NEW-002 | Dashboard API for External Dashboards | L | Medium | M1: JSON-001 |
| GAP-UX-007 | Rich Breakpoint Interaction | M | Medium | M2: SEC-005 |
| GAP-UX-008 | Resume Dashboard | M | Medium | M2: PROMPT-005 |
| GAP-UX-009 | Failure Triage View | M | Medium | M2: OBS-001 |
| GAP-UX-010 | Typed Effect Interaction Patterns | M | Medium | M2: JSON-003 |
| GAP-UX-011 | Command Discoverability | M | Medium | -- |
| GAP-UX-014 | Operator Mode Selection | M | Medium | M1: PROMPT-001 |
| GAP-PERF-004 | Streaming Message Rendering | L | High | M1: SUBOBS-001 |
| GAP-PERF-006 | Incremental Orchestration Streaming | L | Medium | M4: JSON-005 |
| GAP-TOOLS-029 | Structured Output Tool | M | Medium | GAP-UX-001b, GAP-UX-001d |
| GAP-TOOLS-037 | Fetch Content Processing | M | Low | -- |

**Estimated scope**: 23 gaps. ~10-14 weeks.

---

M6: Platform and Ecosystem

**Goal**: CC plugin compatibility, marketplace protocol, auto-update, trust model, process versioning, memory systems, and remaining polish. Babysitter becomes a full platform with an ecosystem.

**Unlocks**: CC plugins run on babysitter. Marketplace browsing and install. Plugin trust and blocklist. Process versioning and migration. Long-term memory consolidation. Session sharing. Run forking. Full audit export.

| Gap | Title | Effort | Priority | Depends On |
|---|---|---|---|---|
| GAP-ECO-001 | CC Plugin Compatibility Layer | XL | Critical | GAP-ECO-002, GAP-ECO-003 |
| GAP-ECO-002 | CC Marketplace Protocol Support | L | High | -- |
| GAP-ECO-003 | Plugin Trust and Blocklist | M | High | M1: SEC-001 |
| GAP-ECO-004 | Plugin Auto-Update and Versioning | M | Medium | GAP-ECO-002 |
| GAP-ECO-005 | Plugin Validation and Diagnostics | S | Medium | GAP-ECO-001 |
| GAP-AGENT-003 | Process Orchestration with Effect Routing | XL | High | M3: AGENT-001, M3: ROUTE-001 |
| GAP-AGENT-004 | Built-in Process Templates | L | Medium | M3: HADAPT-002 |
| GAP-AGENT-007 | Delegation Policy Layer | L | Medium | M1: SEC-001, M1: HADAPT-001 |
| GAP-HADAPT-003 | Cost-Based Routing Policies | L | High | M1: HADAPT-001, M2: SESSION-004 |
| GAP-HADAPT-005 | Harness Health and Circuit Breaker | M | Medium | M1: HADAPT-001 |
| GAP-SUBOBS-004 | Subagent Health and Timeout Monitoring | M | Medium | M1: SUBOBS-001, M2: SUBOBS-003 |
| GAP-PROC-003 | Process Versioning and Migration | L | Medium | -- |
| GAP-STATE-002 | Memory Consolidation | L | Medium | M3: STATE-001 |
| GAP-STATE-006 | Session Rewind and History | L | Medium | M3: STATE-003 |
| GAP-ROUTE-002 | Effect Priority and Scheduling | M | Medium | M1: PAR-009 |
| GAP-PAR-005 | Parallel File Operations | M | Medium | M3: PAR-001 |
| GAP-PAR-006 | Streaming Parallelism | M | Medium | M3: PAR-001 |
| GAP-PAR-010 | Fork-Join Process Pattern | L | Medium | M3: PAR-003, M3: PROC-002 |
| GAP-PERF-007 | Aggressive Parallelism | L | Medium | M3: PAR-001 |
| GAP-RUN-001 | Run Comparison and Diffing | M | Medium | -- |
| GAP-RUN-002 | Run Archival and Restore | M | Low | -- |
| GAP-RUN-003 | Run Forking and Branching | L | Medium | GAP-STATE-006 |
| GAP-SESSION-003 | Session Templates and Presets | M | Medium | M1: SESSION-001 |
| GAP-SESSION-005 | Session Sharing and Collaboration | L | Low | M2: SESSION-002, M4: REMOTE-003 |
| GAP-OBS-006 | Analytics and Feature Flags | L | Medium | GAP-ECO-004 |
| GAP-OBS-007 | Audit Export | M | Medium | M2: OBS-004 |
| GAP-USER-017 | Plugin Management Integration | M | High | GAP-ECO-001 |
| GAP-SEC-004 | Sandbox Toggle | M | Medium | M1: SEC-001 |
| GAP-SEC-007 | Privacy Settings | M | Medium | M1: SEC-001 |
| GAP-PROMPT-003 | Runtime Personality Overlays | M | Medium | M1: PROMPT-001 |
| GAP-PROMPT-004 | Prompt Inspection Tooling | M | Medium | M1: PROMPT-001 |
| GAP-PROMPT-006 | Instructions Loaded Hook | M | Medium | M1: PROMPT-001 |
| GAP-PROMPT-007 | Context Compression Families | L | Medium | M1: PROMPT-001 |
| GAP-TOOLS-012 | LSP Integration | L | High | M3: ROUTE-001, M0: PROC-004 |
| GAP-TOOLS-026 | Structured User Interaction from Effects | M | Medium | M2: JSON-003 |
| GAP-TOOLS-027 | Skill Discovery from Process Definitions | M | Medium | M1: HADAPT-001 |
| GAP-PROF-001 | Auto-Configure from User Profile | M | Medium | GAP-ECO-004 |
| GAP-BRK-003 | Breakpoint Analytics and SLA Tracking | S | Low | M2: OBS-004 |

**Estimated scope**: 38 gaps (includes 2 XL). ~16-20 weeks.

---

Milestone Summary

| Milestone | Gaps | Goal | Cumulative |
|---|---|---|---|
| **M0** Quick Wins | 5 | Tool parity polish + process validation | 5 |
| **M1** Core Infrastructure | 11 | Foundational systems everything depends on | 16 |
| **M2** Observability & Control | 26 | See what's happening, control it | 42 |
| **M3** Multi-Harness Orchestration | 23 | Route, parallelize, compose across harnesses | 65 |
| **M4** MCP & External Integration | 21 | Connect to outside world: MCP, channels, remote | 86 |
| **M5** Rich UI & Experience | 23 | Visual orchestration experience | 109 |
| **M6** Platform & Ecosystem | 38 | Full platform with plugin ecosystem | 147 |
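
The Cumulative column is a running total of the per-milestone gap counts. A small sketch of that arithmetic follows, with the counts copied from the summary; the variable names are illustrative.

```python
from itertools import accumulate

# Gap counts per milestone, as listed in the Milestone Summary.
MILESTONE_GAPS = {"M0": 5, "M1": 11, "M2": 26, "M3": 23, "M4": 21, "M5": 23, "M6": 38}

# Running total in milestone order (dicts preserve insertion order).
cumulative = dict(zip(MILESTONE_GAPS, accumulate(MILESTONE_GAPS.values())))

if __name__ == "__main__":
    for milestone, total in cumulative.items():
        print(f"{milestone}: {total}")
```

The last running total equals the 147 gaps stated in the introduction, which is a quick consistency check to repeat whenever a gap moves between milestones.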

Dependency Graph (Milestones)

```
M0 (Quick Wins) ──────────────────────────────────────────┐
  │                                                        │
  v                                                        │
M1 (Core Infrastructure) ─────────────────────────────┐    │
  │                                                    │    │
  ├──> M2 (Observability & Control) ──────────┐        │    │
  │                                            │        │    │
  ├──> M3 (Multi-Harness Orchestration) <──────┤        │    │
  │         │                                  │        │    │
  │         ├──> M4 (MCP & External) <─────────┘        │    │
  │         │                                           │    │
  │         └──> M5 (Rich UI) <─────────────────────────┘    │
  │                   │                                      │
  └──> M6 (Platform & Ecosystem) <───────────────────────────┘
```

M3 and M4 can partially overlap (MCP client work can start while multi-harness is in progress). M5 can start its foundation (GAP-UX-001) any time after M1. M6 is the long tail -- work items can be pulled forward if priorities shift.
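
Milestone-level ordering can also be cross-checked against the gap tables: whenever a row references a gap from another milestone, its own milestone depends on that one. The sketch below derives such edges from a few sample rows taken from the tables. The row format and the function name are assumptions made for illustration, and the sample is far from the full set of 147 gaps.

```python
from collections import defaultdict

# (gap, home milestone, milestones referenced in its "Depends On" cell)
SAMPLE_ROWS = [
    ("GAP-SUBOBS-002", 2, [1]),      # M1: SUBOBS-001
    ("GAP-TOOLS-018", 2, [0]),       # M0: PROC-004
    ("GAP-PAR-002", 3, [3, 2]),      # GAP-PAR-001, M2: SUBOBS-002
    ("GAP-REMOTE-003", 4, [1, 4]),   # M1: REMOTE-007, GAP-JSON-005
    ("GAP-PERF-006", 5, [4]),        # M4: JSON-005
    ("GAP-SESSION-005", 6, [2, 4]),  # M2: SESSION-002, M4: REMOTE-003
]


def milestone_edges(rows):
    """Map each milestone to the sorted list of other milestones it relies on."""
    edges = defaultdict(set)
    for _gap, home, referenced in rows:
        for other in referenced:
            if other != home:  # same-milestone references add no edge
                edges[home].add(other)
    return {milestone: sorted(deps) for milestone, deps in sorted(edges.items())}


if __name__ == "__main__":
    edges = milestone_edges(SAMPLE_ROWS)
    print(edges)
    # Every edge should point at an earlier milestone.
    print(all(dep < milestone for milestone, deps in edges.items() for dep in deps))
```

If an edge ever points at a later milestone, either the row belongs in a later milestone or the reference carries the wrong milestone prefix.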

Critical Path

The fastest path to production-grade multi-harness orchestration:

```
M0 → M1 (PROMPT-001 + HADAPT-001 + SESSION-001 + JSON-001/002)
   → M2 (SUBOBS-002/003 + TOOLS-030 + OBS-001)
   → M3 (AGENT-001 + PAR-001/002/003 + PERF-001/002)
```

Everything else enhances this core. The critical blockers are:

1. **GAP-PROMPT-001** (Prompt Strata) -- 19 gaps depend on it
2. **GAP-HADAPT-001** (Capability Routing) -- 15 gaps depend on it
3. **GAP-SESSION-001** (Session Model) -- 8 gaps depend on it
4. **GAP-SEC-001** (Governance) -- 12 gaps depend on it
5. **GAP-SUBOBS-001** (Streaming Capture) -- 7 gaps depend on it
6. **GAP-JSON-001** (JSON API) -- 7 gaps depend on it
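
Dependent counts of this kind can be recomputed from the "Depends On" columns. A minimal sketch follows, assuming the tables have been loaded into a mapping from each gap to its prerequisites. The excerpt covers only six rows, so its numbers are much smaller than the full-roadmap figures, and this page does not state whether those figures count direct or transitive dependents. All names in the code are illustrative.

```python
from collections import defaultdict

# Excerpt of the roadmap: gap -> gaps it depends on (full IDs).
DEPENDS_ON = {
    "GAP-PROMPT-002": ["GAP-PROMPT-001"],
    "GAP-PROMPT-005": ["GAP-PROMPT-001", "GAP-STATE-008"],
    "GAP-PERF-001": ["GAP-PROMPT-001"],
    "GAP-PERF-005": ["GAP-PROMPT-001", "GAP-PERF-001"],
    "GAP-PERF-008": ["GAP-PROMPT-005"],
    "GAP-UX-008": ["GAP-PROMPT-005"],
}


def direct_dependents(depends_on):
    """Invert the mapping: prerequisite -> set of gaps that list it directly."""
    dependents = defaultdict(set)
    for gap, prerequisites in depends_on.items():
        for prerequisite in prerequisites:
            dependents[prerequisite].add(gap)
    return dependents


def transitive_dependents(depends_on, root):
    """Collect every gap that is blocked by `root`, directly or indirectly."""
    dependents = direct_dependents(depends_on)
    seen = set()
    stack = [root]
    while stack:
        current = stack.pop()
        for child in dependents.get(current, ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


if __name__ == "__main__":
    root = "GAP-PROMPT-001"
    print(len(direct_dependents(DEPENDS_ON)[root]), "direct dependents in the excerpt")
    print(len(transitive_dependents(DEPENDS_ON, root)), "transitive dependents in the excerpt")
```

In the excerpt, `GAP-PROMPT-001` has four direct dependents and six once `GAP-PERF-008` and `GAP-UX-008` are reached through `GAP-PROMPT-005`, which shows how far the two counting methods can diverge.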
