Harness Features Backlog: Gap Analysis (Restructured)
Executive Summary
This gap analysis identifies feature gaps in the **Babysitter orchestration platform** from the babysitter-native perspective. All gaps are framed around what the orchestration platform needs -- not what any specific host harness (such as Claude Code, abbreviated CC below) provides.
This restructured version replaces the original numbered category files with a one-file-per-gap directory structure under gaps/.
Restructuring Summary
| Action | Count | Description |
|---|---|---|
| **Kept** | 40 | Original gaps retained with babysitter framing |
| **Reframed** | 37 | Good concepts reframed for babysitter orchestration model (incl. tools-capabilities) |
| **New** | 50 | Babysitter-native gaps not in original analysis |
| **Removed** | 52 | CC-centric gaps (host harness concerns, not orchestrator) |
| **Merged** | 18 | Duplicates consolidated (incl. GAP-TOOLS-013 into GAP-AGENT-001, GAP-TOOLS-015 into GAP-AGENT-005) |
Key Statistics
| Metric | Value |
|---|---|
| **Total Gaps** | 147 |
| **Critical** | 10 |
| **High** | 67 |
| **Medium** | 62 |
| **Low** | 8 |
| **Missing** | 95 |
| **Partial** | 52 |
Effort Distribution
| Effort | Count |
|---|---|
| S (Small) | 15 |
| M (Medium) | 68 |
| L (Large) | 56 |
| XL (Extra Large) | 8 |
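The severity, status, and effort breakdowns above should each account for all 147 gaps. A minimal sketch of that consistency check follows; the counts are copied from the tables, while the helper names (`sumCounts`, `findMismatches`) are hypothetical and not part of the Babysitter SDK.

```typescript
// Counts copied from the Key Statistics and Effort Distribution tables.
const TOTAL_GAPS = 147;

type Counts = Record<string, number>;

const severity: Counts = { critical: 10, high: 67, medium: 62, low: 8 };
const status: Counts = { missing: 95, partial: 52 };
const effort: Counts = { S: 15, M: 68, L: 56, XL: 8 };

// Hypothetical helper: add up every count in one breakdown.
function sumCounts(counts: Counts): number {
  return Object.values(counts).reduce((acc, n) => acc + n, 0);
}

// Hypothetical helper: names of breakdowns whose counts do not match the total.
function findMismatches(all: Record<string, Counts>, total: number): string[] {
  return Object.entries(all)
    .filter(([, counts]) => sumCounts(counts) !== total)
    .map(([name]) => name);
}

const mismatches = findMismatches({ severity, status, effort }, TOTAL_GAPS);
console.log(mismatches.length === 0 ? "breakdowns are consistent" : mismatches);
```

A check like this could run whenever gap files are added or re-triaged, so the summary tables do not drift from the underlying files.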
Category Index
| Category | Dir | Gaps | Focus |
|---|---|---|---|
| Prompt Engineering | prompt-engineering/ | 12 | Prompt strata, caching, personality, inspection, coding philosophy, tool preferences, safety, output efficiency, git safety |
| Performance | performance/ | 7 | Caching, compaction, streaming, continuity |
| Parallelization | parallelization/ | 7 | Concurrent effects, async execution, strategies |
| Observability | observability/ | 8 | Health, timeline, audit, analytics |
| Security | security/ | 7 | Governance, trust, permissions, privacy |
| Ecosystem | ecosystem/ | 5 | CC plugin compatibility, marketplace protocol, trust/blocklist, auto-update, validation |
| Agent Delegation | agent-delegation/ | 7 | Sub-harness, communication, state sharing |
| State Continuity | state-continuity/ | 5 | Memory, session state, health model |
| Remote Integration | remote-integration/ | 7 | Daemon, remote, scheduling, contracts |
| JSON Interaction | json-interaction/ | 5 | **NEW**: Programmatic API, effect protocol, streaming |
| Subagent Observability | subagent-observability/ | 5 | **NEW**: Streaming capture, progress, cost tracking |
| Harness Adaptation | harness-adaptation/ | 5 | **NEW**: Capability routing, model selection, fallback |
| Session Management | session-management/ | 5 | **NEW**: Multi-run sessions, templates, budgets |
| MCP Channels | mcp-channels/ | 4 | **NEW**: Channel messaging, permissions relay, MCP server management |
| User Experience | user-experience/ | 19 | Orchestrator UX: rich rendering (Ink/React foundation + 6 sub-gaps), interaction patterns, status, breakpoints |
| Tools & Capabilities | tools-capabilities/ | 23 | Orchestrator-delegated capabilities: tool parity (grep/bash/fetch enhancements), MCP, worktrees, planning, scheduling, skills |
| Process Composition | process-composition/ | 4 | **NEW**: Chaining, nesting, versioning, schemas |
| Effect Routing | effect-routing/ | 3 | **NEW**: Smart routing, priority, caching |
| Breakpoint Workflows | breakpoint-workflows/ | 3 | **NEW**: Approval chains, delegation, analytics |
| Run Lifecycle | run-lifecycle/ | 3 | **NEW**: Comparison, archival, forking |
| Observer Integration | observer-integration/ | 2 | **NEW**: Webhooks, external dashboard API |
| Profile Orchestration | profile-orchestration/ | 1 | **NEW**: Auto-configure from user profile |
Critical Gaps (10)
1. **GAP-PROMPT-001** -- Prompt Strata Model
2. **GAP-SEC-001** -- Governance Policy Layer
3. **GAP-PERF-001** -- Prompt Caching (Ephemeral)
4. **GAP-PERF-002** -- Session Compaction
5. **GAP-JSON-001** -- JSON API for Run Creation
6. **GAP-JSON-002** -- JSON Effect Dispatch Protocol
7. **GAP-SUBOBS-001** -- Streaming Output Capture
8. **GAP-HADAPT-001** -- Capability-Based Task Routing
9. **GAP-SESSION-001** -- Session-to-Run One-to-Many
10. **GAP-ECO-001** -- CC Plugin Compatibility Layer
What Was Removed (and Why)
The following categories of gaps were removed because they are host harness concerns, not orchestration platform concerns:
- ~~**Rich TUI/Ink/React rendering**~~ -- **RESTORED** as GAP-UX-001: babysitter's own observer dashboard, process visualization, and CLI output need rich rendering (not a host harness feature)
- **Voice mode / speech input** -- host harness feature
- **Vim mode / custom keybindings** -- host harness feature
- **Companion/buddy mode** -- host harness delight feature
- **Theme/output styling** -- host harness feature
- **Session teleport** -- CC-specific implementation
- **CC-specific tool implementations** (PowerShell, REPL, Notebook, Todo, Brief, Monitor)
- **CC-specific team/agent tools** (TeamCreate, TeamDelete, SendMessage as tool)
- **Desktop/mobile handoff, Chrome extension, IDE bridge** -- host harness features
- **Parity items** -- tools at or near parity (file R/W, edit, glob, notebook); partial-parity tools now have explicit enhancement gaps (GAP-TOOLS-035 through 038)
Related Documents
| Document | Description |
|---|---|
| Roadmap | **START HERE** -- 7 milestones with dependency ordering and critical path |
| Priority Matrix | All gaps ranked by impact-to-effort ratio |
| Implementation Recommendations | Phased implementation plan |
| Prompt Phrasing Analysis | CC prompt text to adopt |
| Prompt Phrasing Implementation | Copy-paste-ready prompt sections |
| Tools Coverage Map | CC (42 tools) vs babysitter (16 tools) |
| Harness Strengths | Where babysitter excels |
| Glossary & References | Terms and file paths |
How to Use This Backlog
1. Start with the Roadmap for milestone-based execution order
2. Check the Priority Matrix for impact-to-effort ranking
3. Browse category directories under gaps/ for detailed analysis
4. Each gap file is self-contained with description, current/target state, dependencies, and recommendations
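To make the workflow above concrete, here is a rough sketch of how one gap file's metadata might be modeled and ranked by impact-to-effort ratio. Everything in it is an assumption for illustration: the `GapRecord` field names, the numeric weights, the use of severity as a stand-in for impact, and the `GAP-EXAMPLE-*` IDs are all hypothetical. The actual scoring used by the Priority Matrix is not specified here.

```typescript
// Hypothetical shape of one gap record; field names are illustrative only.
type Severity = "critical" | "high" | "medium" | "low";
type Status = "missing" | "partial";
type Effort = "S" | "M" | "L" | "XL";

interface GapRecord {
  id: string;
  title: string;
  severity: Severity;
  status: Status;
  effort: Effort;
  dependsOn: string[]; // IDs of gaps that should land first
}

// Assumed weights: this backlog does not define a numeric scale.
const IMPACT: Record<Severity, number> = { critical: 8, high: 4, medium: 2, low: 1 };
const COST: Record<Effort, number> = { S: 1, M: 2, L: 4, XL: 8 };

// Higher score means more impact per unit of effort.
function impactToEffort(gap: GapRecord): number {
  return IMPACT[gap.severity] / COST[gap.effort];
}

// Return a new array sorted by descending score; the input is not mutated.
function rankGaps(gaps: GapRecord[]): GapRecord[] {
  return [...gaps].sort((a, b) => impactToEffort(b) - impactToEffort(a));
}

// Placeholder records, not real backlog entries.
const sample: GapRecord[] = [
  { id: "GAP-EXAMPLE-001", title: "Example A", severity: "critical", status: "missing", effort: "L", dependsOn: [] },
  { id: "GAP-EXAMPLE-002", title: "Example B", severity: "high", status: "partial", effort: "S", dependsOn: [] },
  { id: "GAP-EXAMPLE-003", title: "Example C", severity: "medium", status: "missing", effort: "XL", dependsOn: ["GAP-EXAMPLE-001"] },
];

console.log(rankGaps(sample).map((g) => g.id));
```

Under these assumed weights a high-severity, small-effort gap outranks a critical but large one. A dependency-aware ordering, as the Roadmap describes, would need an additional pass over `dependsOn`.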