# Two-Loops Architecture: Understanding Hybrid Agentic Systems

**Version:** 1.1 · **Last Updated:** 2026-01-26 · **Category:** Feature Guide

---

## TL;DR - What You Need to Know

**Skip this section if you just want to USE babysitter.** This document explains the architecture for those who want to understand WHY babysitter works the way it does, or who are building custom processes.

**The key insight:** Babysitter separates "what must happen" (deterministic rules) from "how to do it" (AI reasoning). This makes AI workflows reliable and debuggable.

```
┌─────────────────────────────────────────────────────────────────┐
│  LOOP 1: The Boss (Orchestrator)                                │
│  - "You must pass tests before deploying"                       │
│  - "You have max 10 attempts"                                   │
│  - "Stop and ask for approval at this point"                    │
│                                                                 │
│  LOOP 2: The Worker (AI Agent)                                  │
│  - "Figure out how to make these tests pass"                    │
│  - "Find and fix the bugs"                                      │
│  - "Write the code that solves the problem"                     │
└─────────────────────────────────────────────────────────────────┘
```

**When to read this document:**

- You're building custom processes
- You want to understand guardrails and safety
- You're debugging why a run behaves a certain way
- You're an architect evaluating babysitter for your team

**When to skip this document:**

- You just want to run existing processes
- You're following a tutorial
- You're a beginner (start with Quality Convergence instead)

---

## Overview

Babysitter implements a **Two-Loops Control Plane** architecture that combines:

1. **Symbolic Orchestration** (Process Engine): Deterministic, code-defined control
2. **Agentic Harness** (LLM Runtime): Adaptive, AI-powered work execution

This hybrid approach delivers the best of both worlds: the reliability of deterministic systems with the flexibility of AI reasoning.

### Why Two Loops?

| Single-Loop AI | Two-Loops Hybrid |
| --- | --- |
| Unpredictable behavior | Bounded, testable autonomy |
| Hard to debug | Journaled, replayable execution |
| No safety guarantees | Enforced guardrails and gates |
| "It seems done" | Evidence-driven completion |
| Context degradation | Fresh context per task |

---

## The Core Building Blocks

### A) Symbolic Orchestrator (Process Engine)

The orchestrator is the code-defined process that enforces:

| Responsibility | Example |
| --- | --- |
| **Ground truth state** | Run is in "implementation" phase |
| **Progression rules** | Must pass tests before deployment |
| **Invariants** | Never modify production directly |
| **Budgets** | Max 10 iterations, 30 min timeout |
| **Permissions** | Only write to src/ directory |
| **Quality gates** | Tests, lint, security must pass |
| **Journaling** | Every event recorded for replay |
| **Time travel** | Fork from any point, compare runs |

**The orchestrator owns making execution dependable.**
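
For a concrete feel, here is a minimal sketch of an orchestrator enforcing two of these responsibilities: budgets and quality gates. The process signature and the `implementTask`/`runGatesTask` definitions are hypothetical; only the `ctx` intrinsics come from the SDK reference at the end of this guide.

```javascript
// Sketch only: enforce an iteration budget and a wall-clock timeout around
// agentic work. implementTask and runGatesTask are hypothetical task defs.
export async function boundedImplementation(ctx, { feature }) {
  const startMs = ctx.now().getTime();     // deterministic clock
  const maxIterations = 10;                // budget: max 10 iterations
  const maxWallClockMs = 30 * 60 * 1000;   // budget: 30 min timeout

  let gates = { passed: false, failures: [] };
  for (let i = 0; i < maxIterations && !gates.passed; i++) {
    if (ctx.now().getTime() - startMs > maxWallClockMs) {
      throw new Error('Budget exceeded: 30 minute wall-clock limit');
    }
    const impl = await ctx.task(implementTask, { feature, feedback: gates.failures });
    gates = await ctx.task(runGatesTask, { impl });   // quality gate
    ctx.log('Gate check', { iteration: i, passed: gates.passed });
  }
  return gates;
}
```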

### B) Agent Harness (LLM Runtime)

The harness is not "just an LLM call." Modern harnesses include:

| Capability | Description |
| --- | --- |
| Iterative planning | Plan → Execute → Replan |
| Tool calling | Files, terminal, search, code execution |
| Command execution | Parse results, handle errors |
| Incremental fixes | Iterate until checks pass |
| Structured artifacts | Plans, diffs, summaries |
| Multi-step reasoning | With constraints |
| Sub-agents | Delegation inside the harness |

**The harness owns solving fuzzy parts and adapting to feedback.**

### C) Symbolic Logic Surfaces (Shared Capabilities)

Symbolic logic appears in **multiple places**, all consistent:

1. **Inside the orchestrator** (stage transitions, invariants, gates, budgets)
2. **As symbolic tools** callable by the harness (policy checks, gate evaluation)
3. **As symbolic tasks** callable by orchestration (validators, analyzers)

```javascript
// Symbolic logic as an orchestrator rule (using a loop for retry)
let impl;
for (let iteration = 0; iteration < maxIterations; iteration++) {
  impl = await ctx.task(implementTask, { feature });
  const testResults = await ctx.task(runTestsTask, { impl });

  if (testResults.passed) break; // Success - exit loop
  // Loop continues with feedback from the failed tests
}

// Symbolic logic as a tool callable by the harness
const allowed = await ctx.task(policyCheckTask, {
  action: 'modifyFile',
  path: '/etc/config.json'
});

// Symbolic logic as a validation task
const gateResult = await ctx.task(securityGateTask, {
  files: impl.filesModified
});
```

---

## The Two Loops in Detail

### Loop 1: Orchestration Loop (Symbolic)

A process stepper that progresses a run through explicit stages.

**Typical Cycle:**

```
1. Reconstruct "what is true" from the journal
2. Determine what stage the run is in
3. Check gates/constraints/budgets
4. Choose the next allowed transition
5. Emit the next effect (or wait)
6. Record results back into the journal
```

**This loop is about:** control, safety, repeatability, traceability.
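
As an illustration only, the cycle above might look like the following stepper. `applyEvent`, `transitions`, `gatesFor`, and `emitEffect` are hypothetical stand-ins for the engine's internals, not babysitter APIs.

```javascript
// Hypothetical stepper: one pass of the orchestration loop.
async function step(journal, initialState) {
  const state = journal.reduce(applyEvent, initialState);      // 1-2. reconstruct truth and stage
  for (const next of transitions[state.stage]) {               // 4. candidate transitions
    const blocked = gatesFor(next).some(g => !g.check(state)); // 3. gates/constraints/budgets
    if (blocked) continue;
    const result = await emitEffect(next, state);              // 5. emit the next effect
    journal.push({ type: 'transition', to: next, result });    // 6. record back into the journal
    return next;
  }
  return null; // nothing allowed yet: wait for a signal or approval
}
```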

### Loop 2: Agentic Loop (Harness)

A tool-using reasoning loop that iterates until reaching a local objective.

**Typical Cycle:**

```
1. Read current objective + constraints
2. Decide what evidence is needed
3. Call tools, inspect results
4. Update plan or actions
5. Produce an output (patch, plan, answer, report)
```

**This loop is about:** solving the task when information is incomplete.
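
Again as a sketch: the shape of the harness loop, where `llm.decide`, `tools`, `objectiveMet`, and `produceOutput` are hypothetical placeholders for whatever runtime hosts the agent.

```javascript
// Hypothetical harness loop: reason, act, observe, repeat.
async function agenticLoop(objective, constraints) {
  const observations = [];
  while (!objectiveMet(objective, observations)) {
    // 1-2. read objective + constraints, decide what evidence is needed
    const action = await llm.decide({ objective, constraints, observations });
    // 3. call a tool and inspect the result
    const result = await tools[action.tool](action.args);
    // 4. the next decision incorporates this observation
    observations.push({ action, result });
  }
  return produceOutput(observations); // 5. patch, plan, answer, or report
}
```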

---

## What Goes Where?

The design challenge is deciding **which execution decisions are deterministic/symbolic** and **which are adaptive/agentic**.

### Put in Symbolic Logic When...

These decisions must be **stable, enforceable, and auditable**:

| Decision Type | Examples |
| --- | --- |
| **Safety/permissions** | What actions are allowed |
| **Budgets/limits** | Time, cost, tool call limits |
| **State transitions** | What stage you're in |
| **Concurrency rules** | What can run in parallel |
| **Retry/timeout policy** | What happens on failure |
| **Idempotency** | Avoid double execution |
| **Quality gates** | What proof is required |
| **Compliance/audit** | Logging requirements |

### Put in Agent Harness When...

These decisions benefit from **flexible reasoning**:

| Decision Type | Examples |
| --- | --- |
| **Ambiguous instructions** | "Make it better" |
| **Uncertain approach** | Multiple valid solutions |
| **Search/discovery** | Find relevant files |
| **Drafting** | Code, docs, analyses |
| **Debugging** | Iterate against tool results |
| **Summarizing** | Compress evidence |
| **Proposing** | Candidate solutions |

### The Mixed Zone

Many tasks are mixed. The pattern is:

- **Symbolic logic defines the envelope** (constraints + gates + budgets)
- **Harness explores inside that envelope** (implements, debugs, refines)
- **Both can invoke symbolic rules** (nothing is guesswork)

```javascript
// Mixed: Harness works, orchestrator validates (loop-based retry)
let securityPassed = false;
let lastSecurityResult = null;
for (let iteration = 0; iteration < maxIterations && !securityPassed; iteration++) {
  const impl = await ctx.task(implementTask, {
    feature,
    constraints: {
      allowedPaths: ['src/**'],
      forbiddenPatterns: ['eval(', 'exec('],
      maxFilesModified: 10
    },
    // Pass the previous gate's feedback on retry iterations
    feedback: iteration > 0 ? lastSecurityResult.recommendations : null
  });

  // Orchestrator enforces the gate
  const securityResult = await ctx.task(securityGateTask, { impl });
  securityPassed = securityResult.passed;
  lastSecurityResult = securityResult;
}
```

---

## The Four Guardrail Layers

Guardrails are a **layered approach**, not a single feature.

### Layer A: Capability Guardrails (What's Possible)

Define what tools and actions exist.

```javascript
const capabilityConfig = {
  allowedTools: ['read', 'write', 'shell', 'search'],
  pathRestrictions: ['src/**', 'tests/**'],
  networkAccess: 'none',
  permissions: 'read-write',
  destructiveActions: 'require-confirmation'
};
```

### Layer B: Budget Guardrails (How Far)

Prevent runaway execution.

```javascript
const budgetConfig = {
  maxToolCalls: 100,
  maxWallClockMinutes: 30,
  maxTokenSpend: 50000,
  maxIterations: 10,
  rateLimits: { apiCalls: '10/minute' }
};
```

### Layer C: Policy Guardrails (What's Allowed)

Rules that define acceptable behavior.

```javascript
const policyConfig = {
  rules: [
    'never exfiltrate secrets',
    'never modify production directly',
    'always run tests before merge',
    'security scans required for dependencies'
  ]
};
```

### Layer D: Behavioral Guardrails (How Decisions Are Made)

Structural consistency in outputs.

```javascript
const behavioralConfig = {
  requireStructuredOutputs: true,
  requireEvidenceCitations: true,
  requireUncertaintyDeclaration: true,
  outputSchemas: { /* JSON schemas */ }
};
```
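
As one illustrative shape for an entry in `outputSchemas` (not a fixed contract), here is a JSON Schema matching the implementer output fields used later in this guide:

```javascript
// Hypothetical example: a JSON Schema the harness output must validate
// against before the step is accepted.
const implementationOutputSchema = {
  type: 'object',
  required: ['filesModified', 'summary', 'confidence'],
  properties: {
    filesModified: { type: 'array', items: { type: 'string' } },
    summary: { type: 'string' },
    confidence: { type: 'number', minimum: 0, maximum: 1 }
  }
};
```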

---

## Quality Gates: Turning Agentic Work into Reliable Outcomes

Quality gates convert "it seems done" into "it is done."

### The Evidence-Driven Pattern

Each phase must end with:

| Component | Description |
| --- | --- |
| **Artifact** | The work product (patch, doc, config, report) |
| **Evidence** | Proof it meets requirements (logs, test output, checks) |

**If you don't have evidence, you don't have completion.**
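
A minimal sketch of what that pairing might look like in practice; the field names are illustrative, not an SDK contract:

```javascript
// Sketch: a phase result carries both the artifact and its evidence.
const phaseResult = {
  artifact: {
    patch: 'implementation.patch',
    filesModified: ['src/auth.ts']
  },
  evidence: {
    tests: { passed: 142, failed: 0, log: 'test-output.txt' },
    lint: { passed: true },
    securityScan: { vulnerabilities: 0 }
  }
};

// A gate refuses to accept the phase without passing evidence
if (!phaseResult.evidence || phaseResult.evidence.tests.failed > 0) {
  throw new Error('No evidence of completion: gate failed');
}
```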

### Common Gated Steps

| Gate Type | What It Validates |
| --- | --- |
| Unit tests | Individual functions work |
| Integration tests | Components work together |
| System tests | End-to-end behavior |
| Acceptance tests | User requirements met |
| Lint/formatting | Code style compliance |
| Type checking | Type safety |
| Static analysis | Potential bugs |
| Security scans | Vulnerabilities |
| Reproducibility | Clean run in fresh env |
| Diff review | No forbidden file changes |
| Performance | Meets thresholds |

### Where Gates Live (Consistent Everywhere)

```javascript
// In the orchestrator: loop-based retry on gate failures
let impl;
let gateResults = { passed: false, failures: [] };
for (let i = 0; i < maxIterations && !gateResults.passed; i++) {
  impl = await ctx.task(implementTask, { feature, feedback: gateResults.failures });
  gateResults = await ctx.task(runGatesTask, { impl });
}

// As a symbolic tool: the harness pre-checks during work
const gateResult = await checkGate(impl);
if (!gateResult.passed) {
  // The harness can immediately attempt a repair
  await repairIssues(gateResult.failures);
}

// As a symbolic task: verify evidence objectively
const evidence = await ctx.task(gateValidatorTask, { impl });
```

### Human Approval Gates

For high-impact steps, include explicit checkpoints:

```javascript
// Plan approval before execution
await ctx.breakpoint({
  question: 'Review the plan. Approve to proceed with implementation?',
  title: 'Plan Approval',
  context: { /* ... */ }
});

// Diff approval before merge
await ctx.breakpoint({
  question: `Review the diff (${diff.linesChanged} lines). Approve to merge?`,
  title: 'Merge Approval'
});

// Deployment approval
await ctx.breakpoint({
  question: 'Quality: 92/100. Deploy to production?',
  title: 'Production Deployment'
});
```

---

## The Journal: Making Execution Testable

A journaled control plane turns agentic behavior into something you can:

| Capability | Value |
| --- | --- |
| **Replay** | Debug by re-running |
| **Inspect** | See exactly what happened |
| **Diff** | Compare across forks |
| **Audit** | Compliance evidence |
| **Analyze** | Failure pattern detection |

### What's Journaled

| Event Type | Example |
| --- | --- |
| **Inputs/signals** | Initial requirements |
| **Stage transitions** | "planning" → "implementation" |
| **Requested actions** | `writeFile('/src/auth.ts', ...)` |
| **Results** | Action succeeded, 42 lines written |
| **Artifacts** | plan.md, implementation.patch |
| **Evidence** | Test results, gate outcomes |
| **Gate outcomes** | Security: PASS, Tests: PASS |
| **Approvals** | User approved at breakpoint |
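
For illustration, a few such entries might look like the following. The exact event format is defined by the Journal System guide; this shape is hypothetical:

```javascript
// Illustrative journal entries mirroring the event types above.
const journal = [
  { seq: 1, type: 'stage.transition', from: 'planning', to: 'implementation' },
  { seq: 2, type: 'action.requested', action: 'writeFile', path: '/src/auth.ts' },
  { seq: 3, type: 'action.result', status: 'succeeded', linesWritten: 42 },
  { seq: 4, type: 'gate.outcome', gate: 'security', result: 'PASS' },
  { seq: 5, type: 'approval', breakpoint: 'Merge Approval', approved: true }
];
```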

---

## Prompt Quality is Determinism Engineering

In a two-loop system, prompts are **configuration for the harness**.

### Why Prompt Quality Matters

Better prompts reduce:

- Output variance
- Tool misuse
- Hidden assumptions
- Inconsistent formatting
- Unpredictable branching

Better prompts improve:

- Repeatability
- Debuggability
- Fork comparisons
- Safe automation

### The Real Goal: Structural Consistency

You don't need identical wording. You need consistent:

- Decision formats
- Priorities
- Stop/ask conditions
- Evidence standards

### Prompt Versioning

Treat harness prompts like engineering surfaces:

```javascript
const promptVersion = '2.1.0';

const implementerPrompt = {
  version: promptVersion,
  role: 'senior software engineer',
  task: 'Implement feature according to specification',
  constraints: [
    'Follow existing code patterns',
    'Write tests for all public functions',
    'Document complex logic',
    'Ask for clarification if requirements are ambiguous'
  ],
  outputFormat: {
    type: 'object',
    required: ['filesModified', 'summary', 'confidence']
  }
};
```

---

## Common Failure Modes and Fixes

### 1. Everything is Agentic

**Symptom:** Unpredictable behavior, hard to debug, inconsistent safety.

**Fix:** Move gates, budgets, and invariants into symbolic orchestration.

### 2. Everything is Symbolic

**Symptom:** Brittle workflows, poor adaptation, high maintenance.

**Fix:** Delegate fuzzy decisions and exploration to the harness.

### 3. Hidden State

**Symptom:** The harness "remembers" things the system never logged.

**Fix:** Journal what matters; the system's truth must be reconstructible.

### 4. Wide Tool Surface

**Symptom:** Tool confusion, increased risk, unpredictable results.

**Fix:** Keep tools small, stable, and well-described.

### 5. No Explicit Evidence Requirements

**Symptom:** "Done" claims without proof.

**Fix:** Define completion as artifact + evidence, enforced by gates.

---

## The Doctrine

If you define only a few principles, make them these:

1. **The orchestrator owns** run progression, journaling, and phase boundaries
2. **Symbolic logic owns** constraints, permissions, budgets, and gates
3. **The harness owns** adaptive work inside constraints
4. **Guardrails are enforced** by symbolic checks, not informal intentions
5. **Quality is evidence-driven**, not assertion-driven
6. **Prompts are versioned** control surfaces for harness behavior
7. **The journal is the source** of truth for replay, audit, and forking

---

## Getting Started

If you're building from scratch:

1. **Define phases** (a small symbolic process)
2. **Define effects/tools** available in each phase
3. **Add budgets and permissions**
4. **Decide quality gates per phase**
5. **Add a harness** that can do real work
6. **Journal everything** needed for replay and audit
7. **Add fork + time travel** as first-class operations

**If you do only one thing:** make completion require evidence.
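
Putting the steps together, a skeleton might look like this. The process signature and the task definitions (`planTask`, `implementTask`, `runGatesTask`) are hypothetical; the `ctx` intrinsics are from the SDK reference below.

```javascript
// Skeleton sketch of steps 1-6 as a single process.
export async function featureProcess(ctx, { requirements }) {
  // Phase 1: plan (agentic), gated on human approval
  const plan = await ctx.task(planTask, { requirements });
  await ctx.breakpoint({ question: 'Approve this plan?', title: 'Plan Approval' });

  // Phase 2: implement (agentic) inside a budget, gated on evidence
  let gates = { passed: false, failures: [] };
  for (let i = 0; i < 10 && !gates.passed; i++) {     // budget: max 10 iterations
    const impl = await ctx.task(implementTask, { plan, feedback: gates.failures });
    gates = await ctx.task(runGatesTask, { impl });   // completion requires evidence
    ctx.log('Gates', { iteration: i, passed: gates.passed });
  }
  if (!gates.passed) throw new Error('Budget exhausted without passing gates');
  return gates; // artifact + evidence are recorded in the journal
}
```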

---

## Process Library Examples

### Spec-Driven Development

`methodologies/spec-driven-development.js`

Implements the full two-loops pattern:

- **Symbolic:** Constitution validation, plan-constitution alignment, consistency analysis
- **Agentic:** Specification writing, planning, implementation
- **Gates:** Every phase has approval breakpoints

### V-Model

`methodologies/v-model.js`

Heavy on symbolic verification:

- **Four test levels** designed before implementation
- **Traceability matrix** ensures complete coverage
- **Safety levels** adjust rigor

### GSD Iterative Convergence

`gsd/iterative-convergence.js`

Feedback-driven quality loop:

- **Implement → Score → Feedback → Repeat**
- **Breakpoints** at quality thresholds
- **Plateau detection** for early exit (sketched below)
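
A sketch of that loop, with the task names, the score threshold, and the plateau rule all illustrative rather than taken from the process source:

```javascript
// Hypothetical convergence loop with plateau detection.
let best = 0;
let stalled = 0;
let feedback = null;
for (let i = 0; i < maxIterations; i++) {
  const impl = await ctx.task(implementTask, { feature, feedback }); // Implement
  const review = await ctx.task(scoreTask, { impl });                // Score
  feedback = review.feedback;                                        // Feedback
  if (review.score >= 90) break;           // quality threshold reached
  stalled = review.score > best ? 0 : stalled + 1;
  best = Math.max(best, review.score);
  if (stalled >= 3) break;                 // plateau: no improvement, exit early
}
```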

---

## Related Documentation

- Quality Convergence - Five quality gate types and 90-score pattern
- Best Practices - Workflow design and guardrail patterns
- Process Definitions - Creating your own processes
- Journal System - Event sourcing and replay
- Breakpoints - Human-in-the-loop approval

---

## Summary

The Two-Loops architecture enables bounded, testable autonomy:

- **Orchestration Loop** provides control, safety, and traceability
- **Agentic Loop** provides capability, adaptation, and problem-solving
- **Quality Gates** turn "seems done" into "is done" with evidence
- **Guardrails** enforce rules at capability, budget, policy, and behavioral levels
- **Journaling** makes everything replayable and auditable

When done well, you get **autonomy that is bounded, testable, and steadily improvable**.

---

## SDK API Quick Reference

The complete list of SDK intrinsics (functions available on `ctx`):

| Function | Purpose | Example |
| --- | --- | --- |
| `ctx.task(taskDef, args)` | Execute a task | `await ctx.task(buildTask, { target: 'dist' })` |
| `ctx.breakpoint(opts)` | Pause for human approval | `await ctx.breakpoint({ question: 'Deploy?', title: 'Approval' })` |
| `ctx.parallel.all([...])` | Run tasks in parallel | `await ctx.parallel.all([() => ctx.task(a), () => ctx.task(b)])` |
| `ctx.parallel.map(arr, fn)` | Map over an array in parallel | `await ctx.parallel.map(files, f => ctx.task(lint, { file: f }))` |
| `ctx.sleepUntil(iso8601)` | Pause until a specific time | `await ctx.sleepUntil('2026-01-27T10:00:00Z')` |
| `ctx.log(msg, data?)` | Log a message to the journal | `ctx.log('Quality score', { score: 85 })` |
| `ctx.now()` | Get the current time (deterministic) | `const ts = ctx.now().getTime()` |
| `ctx.runId` | Current run identifier | `const id = ctx.runId` |

**Important:** There is NO `ctx.retry()`. Use loops for retry logic:

```javascript
// Correct: loop-based retry
let passed = false;
let feedback = null;
for (let i = 0; i < maxIterations && !passed; i++) {
  const result = await ctx.task(implementTask, { feedback });
  passed = result.testsPass;
  feedback = result.errors;
}
```

---

## What To Do Next

Based on your role, here's your next step:

| If you are... | Do this next |
| --- | --- |
| **Beginner** | Read Quality Convergence for the core iteration pattern |
| **Building processes** | Study Best Practices for workflow design |
| **Debugging a run** | Check Journal System to understand event sourcing |
| **Adding approvals** | See Breakpoints for human-in-the-loop patterns |
| **Evaluating for a team** | Review the Four Guardrail Layers section above |
