# Babysitter Plugin Troubleshooting Guide
> Comprehensive troubleshooting guide for the Babysitter plugin, organized by symptom.
**Version:** 1.0.0
**Last Updated:** 2026-02-03
---
## Table of Contents
1. [Quick Diagnostic Commands](#quick-diagnostic-commands)
2. [Run Stuck in "Waiting" State](#1-run-stuck-in-waiting-state)
3. [Tasks Failing Silently](#2-tasks-failing-silently)
4. [Hooks Not Executing](#3-hooks-not-executing)
5. [Journal Corruption](#4-journal-corruption)
6. [Session Loops Not Working](#5-session-loops-not-working)
7. [Installation/Verification Failures](#6-installationverification-failures)
8. [Permission Errors](#7-permission-errors)
9. [FAQ](#faq)
10. [Getting Help](#getting-help)
---
## Quick Diagnostic Commands
Run these commands first to get an overview of your setup:
```bash
# Check installation status
# Check runtime health
# Check SDK CLI version
npx -y @a5c-ai/babysitter-sdk@latest --version
# Check a specific run status
CLI="npx -y @a5c-ai/babysitter-sdk@latest"
$CLI run:status <runId> --json
# View recent events for a run
$CLI run:events <runId> --limit 20 --reverse
# List pending tasks
$CLI task:list <runId> --pending --json
```
---
## 2. Tasks Failing Silently
### Symptoms
- Run completes but expected work was not done
- Tasks show `status: "completed"` but no output
- No error messages visible
- Process appears to skip steps
### Diagnosis
**Step 1: Check task status and result**
```bash
CLI="npx -y @a5c-ai/babysitter-sdk@latest"
# List all tasks
$CLI task:list <runId> --json
# Show specific task details
$CLI task:show <runId> <effectId> --json
```
**Step 2: Inspect task logs**
```bash
# View task stdout
cat .a5c/runs/<runId>/tasks/<effectId>/stdout.log
# View task stderr
cat .a5c/runs/<runId>/tasks/<effectId>/stderr.log
# View task result
cat .a5c/runs/<runId>/tasks/<effectId>/result.json | jq '.'
```
**Step 3: Check journal events for the task**
```bash
$CLI run:events <runId> --json | jq '.events[] | select(.effectId == "<effectId>")'
```
**Step 4: Verify task definition**
```bash
cat .a5c/runs/<runId>/tasks/<effectId>/task.json | jq '.'
cat .a5c/runs/<runId>/tasks/<effectId>/inputs.json | jq '.'
```
### Solution
**Missing logs:**
If `stdout.log` or `stderr.log` is empty or missing:
1. Check that the task script exists and is executable
2. Verify the entry path in the task definition is correct
3. Ensure the working directory is correct
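The checks above can be scripted. The following is a minimal sketch and not part of the Babysitter CLI: `check_task_script` is a hypothetical helper, and the script path and working directory you pass in are assumed to come from your own task definition.

```bash
#!/usr/bin/env bash
# Hypothetical helper: report why a task script might produce no logs.
# Usage: check_task_script <script-path> [working-directory]
check_task_script() {
  local script="$1" workdir="${2:-.}" path
  if [ ! -d "$workdir" ]; then
    echo "working directory not found: $workdir"
    return 1
  fi
  # Resolve relative script paths against the working directory.
  case "$script" in
    /*) path="$script" ;;
    *)  path="$workdir/$script" ;;
  esac
  if [ ! -f "$path" ]; then
    echo "script not found: $path"
    return 1
  fi
  if [ ! -x "$path" ]; then
    echo "script exists but is not executable: $path"
    return 1
  fi
  echo "ok: $path"
}
```

A script that is always launched through an interpreter (for example `node main.js`) does not strictly need the executable bit, so treat the last message as a hint rather than a definite failure.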
**Task script errors:**
1. Run the task script manually to see errors:
```bash
node .a5c/runs/<runId>/code/main.js
```
2. Check for syntax errors or missing dependencies
**Incorrect task definition:**
1. Review the process file (`main.js`) for task definitions
2. Verify task input/output paths are correct
3. Check that `io.outputJsonPath` points to an existing directory
**Re-run a failed task:**
```bash
# Mark the task as error to retry
$CLI task:post <runId> <effectId> --status error --json
# Then run the next iteration
$CLI run:iterate <runId> --json --iteration <n>
```
### Prevention
- Always check exit codes in task scripts
- Log to both stdout and stderr appropriately
- Use structured JSON output for task results
- Add validation in process files before task execution
- Test task scripts independently before integrating
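As an illustration of the first three points, here is a minimal sketch of a wrapper that checks the exit code of a step, logs to stderr, and prints one JSON object to stdout. The `run_step` name and the result fields (`step`, `ok`, `exitCode`) are assumptions made for this example; they are not a schema defined by the SDK.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper: run one step, log to stderr, emit a JSON result on stdout.
# The step name is not JSON-escaped, so keep it to plain letters, digits, and dashes.
# Usage: run_step <name> <command> [args...]
run_step() {
  local name="$1"; shift
  local code=0
  echo "[task] starting step: $name" >&2
  # Send the command's own stdout to stderr so stdout carries only the JSON result.
  "$@" >&2 || code=$?
  if [ "$code" -eq 0 ]; then
    printf '{"step":"%s","ok":true,"exitCode":0}\n' "$name"
  else
    echo "[task] step failed: $name (exit $code)" >&2
    printf '{"step":"%s","ok":false,"exitCode":%d}\n' "$name" "$code"
  fi
  return "$code"
}
```

Calling `run_step build npm run build` would then leave a single machine-readable line on stdout while the build output and any failure message go to stderr.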
---
## 3. Hooks Not Executing
### Symptoms
- Expected hook behavior does not occur
- No log output from hooks
- `on-run-start`, `on-task-complete`, etc. not triggering
- Custom hooks are ignored
### Diagnosis
**Step 1: Verify hook is executable**
```bash
ls -la .a5c/hooks/<hook-name>/
ls -la plugins/babysitter-unified/hooks/<hook-name>.sh
```
Hooks must have the executable bit set (`-rwxr-xr-x`).
**Step 2: Check hook discovery order**
Hooks are discovered in this priority order:
1. `.a5c/hooks/<hook-name>/` (per-repo, highest priority)
2. `~/.config/babysitter/hooks/<hook-name>/` (per-user)
3. `plugins/babysitter-unified/hooks/<hook-name>.sh` (maintained plugin source)
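To see which of these locations exist for a given hook, a small helper can walk them in the same order. This is a sketch only: `list_hook_locations` is a hypothetical name, and it shows which candidates are present, not how the dispatcher chooses between them.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the hook locations that exist, highest priority first.
# Usage: list_hook_locations <hook-name> [repo-root] [home-dir]
list_hook_locations() {
  local hook="$1" repo="${2:-.}" home="${3:-$HOME}"
  local candidate
  for candidate in \
    "$repo/.a5c/hooks/$hook" \
    "$home/.config/babysitter/hooks/$hook" \
    "$repo/plugins/babysitter-unified/hooks/$hook.sh"; do
    if [ -e "$candidate" ]; then
      echo "$candidate"
    fi
  done
}
```

An empty result for a hook name means none of the three locations contains it, which usually points to a naming or placement problem rather than a permissions problem.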
**Step 3: Test hook manually**
```bash
# Test with sample payload
echo '{"runId":"test-123","status":"completed"}' | .a5c/hooks/on-run-complete/my-hook.sh
```
**Step 4: Check hook dispatcher**
```bash
# Test the maintained plugin hook entrypoint
echo '{"session_id":"test-123"}' | plugins/babysitter-unified/hooks/session-start.sh
```
**Step 5: Check hook registration (for Claude Code hooks)**
```bash
cat plugins/babysitter-unified/plugin.json | jq '.hooks'
```
### Solution
**Make hooks executable:**
```bash
chmod +x .a5c/hooks/<hook-name>/*.sh
chmod +x plugins/babysitter-unified/hooks/*.sh
```
**Fix hook script errors:**
1. Check for syntax errors:
```bash
bash -n .a5c/hooks/<hook-name>/my-hook.sh
```
2. Ensure the shebang is correct (`#!/bin/bash` or `#!/usr/bin/env bash`)
3. Verify jq is installed (required for JSON parsing)
**Correct hook output format:**
Hooks must:
- Output JSON to stdout (for result data)
- Output logs to stderr (not stdout, to avoid JSON parsing errors)
- Exit with code 0 for success
Example correct hook:
```bash
#!/bin/bash
set -euo pipefail
PAYLOAD=$(cat)
RUN_ID=$(echo "$PAYLOAD" | jq -r '.runId')
# Log to stderr (visible but not captured as result)
echo "[my-hook] Processing run: $RUN_ID" >&2
# Output JSON to stdout
echo '{"ok": true}'
exit 0
```
**Debug hook execution:**
Add debug logging to your hook:
```bash
#!/bin/bash
set -euo pipefail
# Debug: log payload to a file
cat > /tmp/hook-debug-$$.json
# Log execution
echo "[DEBUG] Hook executed at $(date)" >&2
echo "[DEBUG] Payload saved to /tmp/hook-debug-$$.json" >&2
```
### Prevention
- Always test hooks manually before relying on them
- Use `set -euo pipefail` at the start of hooks
- Keep stdout for JSON output only
- Log to stderr for debugging
- Document hook purpose and expected payload
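The first, third, and fourth points can be combined into a quick manual test. The sketch below uses a hypothetical helper, `check_hook`, that feeds a payload to a hook, lets stderr pass through, and applies only a rough shape check to stdout (it does not parse the JSON).

```bash
#!/usr/bin/env bash
# Hypothetical helper: run a hook with a payload and apply a rough output check.
# Usage: check_hook <hook-script> <json-payload>
check_hook() {
  local hook="$1" payload="$2" out code=0
  # Capture stdout only; stderr (logs) passes through to the terminal.
  out="$(printf '%s' "$payload" | "$hook")" || code=$?
  if [ "$code" -ne 0 ]; then
    echo "FAIL: exit code $code"
    return 1
  fi
  # Rough check: stdout must begin and end like a single JSON object or array.
  case "$out" in
    '{'*'}'|'['*']') echo "PASS: $out" ;;
    *) echo "FAIL: stdout does not look like a single JSON value: $out"; return 1 ;;
  esac
}
```

A hook that prints log lines to stdout before its JSON fails this check, which is the same mistake that causes JSON parsing errors in practice. For a strict check, pipe the captured output through `jq`.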
---
## 4. Journal Corruption
### Symptoms
- `run:status` returns errors or unexpected data
- Run cannot be resumed
- State cache is out of sync with journal
- Events appear missing or duplicated
### Diagnosis
**Step 1: Verify journal integrity**
```bash
CLI="npx -y @a5c-ai/babysitter-sdk@latest"
# List all events (will error if corrupted)
$CLI run:events <runId> --json
```
**Step 2: Check journal files directly**
```bash
# List journal files
ls -la .a5c/runs/<runId>/journal/
# Validate each event file is valid JSON
for f in .a5c/runs/<runId>/journal/*.json; do
  if ! jq '.' "$f" > /dev/null 2>&1; then
    echo "Corrupted: $f"
  fi
done
```
**Step 3: Check state cache**
```bash
# State cache should be rebuildable from journal
cat .a5c/runs/<runId>/state/state.json | jq '.'
```
**Step 4: Check for incomplete writes**
```bash
# Look for partial/truncated files
find .a5c/runs/<runId>/journal/ -name "*.json" -size 0
```
### Solution
**Rebuild state cache:**
The state cache (`state/state.json`) is derived from the journal and can be safely deleted:
```bash
rm .a5c/runs/<runId>/state/state.json
# Next CLI command will rebuild it
$CLI run:status <runId>
```
**Remove corrupted event files:**
If specific journal files are corrupted:
1. Identify the corrupted file(s)
2. Check if the event is critical (breakpoint release, task result, etc.)
3. If non-critical, remove the file:
```bash
rm .a5c/runs/<runId>/journal/<corrupted-file>.json
```
4. Rebuild state:
```bash
rm .a5c/runs/<runId>/state/state.json
$CLI run:status <runId>
```
**Restore from backup:**
If the journal is heavily corrupted:
1. If using git, restore from a previous commit:
```bash
git checkout HEAD~1 -- .a5c/runs/<runId>/journal/
```
2. If you have backups, restore the journal directory
**Start fresh:**
If recovery is not possible, create a new run:
```bash
$CLI run:create \
--process-id <same-process-id> \
--entry <same-entry> \
--inputs <same-inputs>
```
### Prevention
- Do not manually edit journal files
- Use atomic file operations (the SDK does this automatically)
- Back up critical runs before major operations
- Use git to track run directories (journal is append-only and merge-friendly)
- Monitor disk space to prevent incomplete writes
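For the backup step, one simple approach is to archive the whole run directory before touching it. This is a minimal sketch: `backup_run` and the default `.a5c/backups` location are hypothetical choices for this example, not something the SDK provides.

```bash
#!/usr/bin/env bash
# Hypothetical helper: archive a run directory before risky operations.
# Usage: backup_run <run-dir> [backup-dir]   (prints the archive path)
backup_run() {
  local run_dir="${1%/}" backup_dir="${2:-.a5c/backups}"
  local archive
  [ -d "$run_dir" ] || { echo "run directory not found: $run_dir" >&2; return 1; }
  mkdir -p "$backup_dir" || return 1
  archive="$backup_dir/$(basename "$run_dir")-$(date +%Y%m%dT%H%M%S).tar.gz"
  # -C keeps paths inside the archive relative to the parent of the run directory.
  tar -czf "$archive" -C "$(dirname "$run_dir")" "$(basename "$run_dir")" || return 1
  echo "$archive"
}
```

For example, `backup_run .a5c/runs/<runId>` before removing a corrupted journal file leaves a copy that can be unpacked if the removal turns out to be a mistake.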
---
## 5. Session Loops Not Working
### Symptoms
- `/babysitter:babysit` command does not start a loop
- Claude exits immediately instead of continuing
- Iteration counter does not increment
- Completion promise is not detected
- "No active loop" message appears
### Diagnosis
**Step 1: Check session state file**
```bash
# State files are stored per session
ls -la ~/.a5c/state/
# View state file contents
cat ~/.a5c/state/<session-id>.md
```
**Step 2: Check stop hook logs**
```bash
cat /tmp/babysitter-stop-hook.log
```
**Step 3: Verify session ID is available**
The session ID is set by the SessionStart hook. Check if it was persisted:
```bash
# Check if AGENT_SESSION_ID is set
echo "$AGENT_SESSION_ID"
```
**Step 4: Check hook registration**
```bash
cat plugins/babysitter-unified/plugin.json | jq '.hooks'
```
The output should include both `SessionStart` and `Stop` hooks.
**Step 5: Test stop hook manually**
```bash
echo '{"session_id":"test-123","transcript_path":"/tmp/test.jsonl"}' | \
plugins/babysitter-unified/hooks/stop.sh
```
### Solution
**State file not created:**
If the state file is missing, the setup script may have failed:
1. Check that `AGENT_SESSION_ID` is available:
```bash
echo "$AGENT_SESSION_ID"
```
2. If not set, the SessionStart hook may have failed
3. Check hook registration in `plugins/babysitter-unified/plugin.json` (see Step 4 above)
**Session ID not persisted:**
Babysitter first looks for `AGENT_SESSION_ID`. If that is absent, the
SessionStart hook can persist it through `CLAUDE_ENV_FILE` as a fallback.
Check:
1. whether `AGENT_SESSION_ID` is already present in the current Claude session
2. if not, whether `CLAUDE_ENV_FILE` is set
3. whether that file is writable
4. whether the hook is executable:
```bash
chmod +x plugins/babysitter-unified/hooks/session-start.sh
```
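The first three checks can be run in one pass. The sketch below is a hypothetical helper, `check_session_env`; it only inspects the two environment variables named above and does not call any Babysitter code.

```bash
#!/usr/bin/env bash
# Hypothetical helper: report which session-ID mechanism is available.
# Checks AGENT_SESSION_ID first, then falls back to CLAUDE_ENV_FILE.
check_session_env() {
  if [ -n "${AGENT_SESSION_ID:-}" ]; then
    echo "AGENT_SESSION_ID is set"
  elif [ -z "${CLAUDE_ENV_FILE:-}" ]; then
    echo "neither AGENT_SESSION_ID nor CLAUDE_ENV_FILE is set"
    return 1
  elif [ -w "$CLAUDE_ENV_FILE" ]; then
    echo "CLAUDE_ENV_FILE is set and writable: $CLAUDE_ENV_FILE"
  else
    echo "CLAUDE_ENV_FILE is missing or not writable: $CLAUDE_ENV_FILE"
    return 1
  fi
}
```

Run it from the same shell environment the hooks run in; a different terminal may not have the same variables exported.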
**Iteration limit reached too quickly:**
If the loop stops due to "iteration too fast":
```bash
# Check the stop reason in logs
grep "max_iterations_reached\|completion_proof_matched" /tmp/babysitter-stop-hook.log
```
This protection triggers if iterations average under 15 seconds. Ensure your work takes meaningful time.
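As a worked example of the 15-second average, the arithmetic looks like this. The sketch assumes the average is elapsed time divided by iteration count; the stop hook's actual calculation is not shown in this guide, and `avg_iteration_seconds` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: average whole seconds per iteration since the loop started.
# Usage: avg_iteration_seconds <start-epoch> <now-epoch> <iterations>
avg_iteration_seconds() {
  local start="$1" now="$2" iterations="$3"
  if [ "$iterations" -le 0 ]; then
    echo "iterations must be positive" >&2
    return 1
  fi
  echo $(( (now - start) / iterations ))
}

# Example: 5 iterations in 60 seconds average 12 seconds each,
# which is under the 15-second threshold.
avg_iteration_seconds 1000 1060 5   # prints 12
```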
**Completion promise not detected:**
The completion promise must match exactly:
1. Check run status for `completionProof`:
```bash
$CLI run:status <runId> --json | jq '.completionProof'
```
2. Verify that the output contains `<promise>SECRET</promise>`, where `SECRET` matches the `completionProof` value exactly
3. Whitespace is normalized before comparison, but the content must otherwise match
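To check an output by hand, you can extract the promise text and normalize its whitespace before comparing. This is a sketch of that idea under stated assumptions: `extract_promise` and `promise_matches` are hypothetical helpers, the input is assumed to contain at most one `<promise>` pair, and the normalization (collapse runs of whitespace, trim both ends) is a guess at what "normalized" means here.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read text on stdin and print the content of a single
# <promise>...</promise> pair with whitespace collapsed and trimmed.
extract_promise() {
  tr '\n' ' ' \
    | sed -n 's/.*<promise>\(.*\)<\/promise>.*/\1/p' \
    | tr -s '[:space:]' ' ' \
    | sed 's/^ //; s/ $//'
}

# Hypothetical helper: succeed if the promise on stdin equals the expected value.
# Usage: some-output | promise_matches <expected>
promise_matches() {
  local expected="$1" actual
  actual="$(extract_promise)"
  [ -n "$actual" ] && [ "$actual" = "$expected" ]
}
```

For example, `printf '<promise>  MY   SECRET </promise>' | extract_promise` prints `MY SECRET`, which can then be compared with the `completionProof` value from `run:status`.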
**State file corruption:**
If the state file has invalid YAML:
```bash
# Mark the state file inactive to stop the loop
python - <<'PY'
from pathlib import Path
p = Path.home() / '.a5c' / 'state' / '<session-id>.md'
s = p.read_text()
p.write_text(s.replace('active: true', 'active: false', 1))
PY
```
Then start fresh with `/babysitter:babysit`.
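If Python is not available, the same change can be made with `awk`. Like the Python snippet above, this sketch assumes the state file contains the literal text `active: true`; `deactivate_state_file` is a hypothetical helper, and the `.bak` copy is an addition made for safety in this example.

```bash
#!/usr/bin/env bash
# Hypothetical helper: flip the first "active: true" to "active: false" in a state file.
# Keeps a copy of the original next to it as <state-file>.bak.
# Usage: deactivate_state_file <state-file>
deactivate_state_file() {
  local file="$1" tmp
  [ -f "$file" ] || { echo "state file not found: $file" >&2; return 1; }
  tmp="$(mktemp)" || return 1
  # sub() changes only the first match on a line; the "done" flag stops after one line.
  awk '!done && sub(/active: true/, "active: false") { done = 1 } { print }' "$file" > "$tmp" \
    && cp "$file" "$file.bak" \
    && cat "$tmp" > "$file"
  local status=$?
  rm -f "$tmp"
  return "$status"
}
```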
### Prevention
- Always specify `--max-iterations` to prevent infinite loops
- Do not manually edit state files
- Ensure hooks are properly registered and executable
- Test the stop hook independently before relying on it
---
## 6. Installation/Verification Failures
### Symptoms
- "Command not found" errors for babysitter CLI
- Missing dependencies (Node.js, npm, jq)
- Plugin structure errors
### Diagnosis
**Step 1: Run verification script**
```bash
```
**Step 2: Check individual dependencies**
```bash
# Node.js (requires v18+)
node --version
# npm
npm --version
# git
git --version
# jq (required for hooks)
jq --version
```
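The same checks can be scripted so that a missing tool or an old Node.js version is reported explicitly. This is a minimal sketch; `node_major` and `node_version_ok` are hypothetical helpers that parse a version string such as the one printed by `node --version`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: extract the major version from a string such as "v18.19.0".
node_major() {
  local version="${1#v}"
  echo "${version%%.*}"
}

# Hypothetical helper: succeed if the given version string is v18 or later.
node_version_ok() {
  [ "$(node_major "$1")" -ge 18 ]
}

# Report each tool checked in Step 2 as found or missing.
for tool in node npm git jq; do
  if command -v "$tool" > /dev/null 2>&1; then
    echo "found: $tool"
  else
    echo "missing: $tool"
  fi
done
```

With Node.js installed, `node_version_ok "$(node --version)" || echo "Node.js v18+ required"` turns the version requirement into a single check.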
**Step 3: Check SDK CLI**
```bash
npx -y @a5c-ai/babysitter-sdk@latest --version
```
**Step 4: Check plugin structure**
```bash
ls -la plugins/babysitter-unified/
# Should contain: hooks/, skills/, per-harness/, plugin.json, versions.json
```
### Solution
**Install Node.js:**
Node.js v18 or later is required.
```bash
# Download from https://nodejs.org/
# Or use a version manager:
# nvm (Linux/macOS)
nvm install 18
nvm use 18
# fnm (cross-platform)
fnm install 18
fnm use 18
```
**Install jq:**
```bash
# macOS
brew install jq
# Ubuntu/Debian
sudo apt-get install jq
# Windows (Chocolatey)
choco install jq
# Windows (Scoop)
scoop install jq
```
**Install/update SDK CLI:**
```bash
# Install globally
npm install -g @a5c-ai/babysitter-sdk@latest
# Or use npx (no install required)
npx -y @a5c-ai/babysitter-sdk@latest --version
```
**Fix plugin structure:**
If directories are missing, re-clone the plugin:
```bash
git clone https://github.com/a5c-ai/babysitter.git /tmp/babysitter-fresh
npm --prefix /tmp/babysitter-fresh install
npm --prefix /tmp/babysitter-fresh run generate:plugins
```
**Clear npx cache:**
If npx returns stale versions:
```bash
# Remove the npx package cache inside the npm cache directory
rm -rf "$(npm config get cache)/_npx"
# Or specify latest explicitly
npx -y @a5c-ai/babysitter-sdk@latest --version
```
### Prevention
- Use a Node.js version manager (nvm, fnm)
- Pin SDK version in your project (optional)
- Keep the plugin updated with git pull
---
## 7. Permission Errors
### Symptoms
- "Permission denied" when running hooks
- Cannot create state files or directories
- Cannot write to runs directory
- Hook scripts fail with permission errors
### Diagnosis
**Step 1: Check hook permissions**
```bash
ls -la plugins/babysitter-unified/hooks/*.sh
ls -la .a5c/hooks/**/*.sh
```
Hooks should have execute permission (`-rwxr-xr-x` or at least `-rwx------`).
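Rather than reading the `ls` output by eye, `find` can list only the hook scripts that lack the owner execute bit. A minimal sketch, with `find_non_executable_hooks` as a hypothetical helper name:

```bash
#!/usr/bin/env bash
# Hypothetical helper: list *.sh files under a directory that lack the owner execute bit.
# Usage: find_non_executable_hooks <hooks-dir>
find_non_executable_hooks() {
  find "$1" -type f -name '*.sh' ! -perm -u+x
}
```

An empty result from `find_non_executable_hooks .a5c/hooks` means every hook script there is executable by its owner.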
**Step 2: Check directory permissions**
```bash
# Runs directory
ls -la .a5c/runs/
# State directory
ls -la ~/.a5c/state/
# Plugin directory
ls -la plugins/babysitter-unified/
```
**Step 3: Check file ownership**
```bash
ls -la .a5c/
```
Ensure the current user owns the directories.
**Step 4: Test write access**
```bash
# Test runs directory
touch .a5c/runs/.write-test && rm .a5c/runs/.write-test
# Test state directory
touch ~/.a5c/state/.write-test && rm ~/.a5c/state/.write-test
```
### Solution
**Fix hook permissions:**
```bash
# Make all hooks executable
chmod +x plugins/babysitter-unified/hooks/*.sh
chmod +x .a5c/hooks/**/*.sh
```
**Fix directory permissions:**
```bash
# Fix runs directory
chmod 755 .a5c
chmod 755 .a5c/runs
# Fix state directory
chmod 755 ~/.a5c/state
# Create state directory if missing
mkdir -p ~/.a5c/state
chmod 755 ~/.a5c/state
```
**Fix ownership:**
```bash
# Change ownership to current user
sudo chown -R "$(whoami)" .a5c/
sudo chown -R "$(whoami)" plugins/babysitter-unified/
```
**SELinux/AppArmor issues (Linux):**
If using SELinux or AppArmor:
```bash
# Check if SELinux is blocking
ausearch -m avc -ts recent
# Temporarily set permissive mode (for testing only; restore with: sudo setenforce 1)
sudo setenforce 0
```
**Windows-specific:**
On Windows (Git Bash/WSL):
```bash
# Git Bash may not preserve execute bits
# Mark hooks as executable in git
git update-index --chmod=+x plugins/babysitter-unified/hooks/*.sh
```
### Prevention
- Set correct permissions when creating new hooks
- Use version control to preserve permissions
- Create directories with appropriate permissions from the start
- On Windows, consider using WSL for full Unix permissions support
---
## FAQ
### General Questions
**Q: How do I check if babysitter is properly installed?**
A: Run the verification script:
```bash
```
**Q: What Node.js version is required?**
A: Node.js v18 or later is required. Check with `node --version`.
**Q: How do I update the SDK CLI?**
A:
```bash
npm install -g @a5c-ai/babysitter-sdk@latest
# Or use npx which always gets latest:
npx -y @a5c-ai/babysitter-sdk@latest
```
### Run Management
**Q: How do I cancel a running orchestration?**
A: You can:
1. Stop the iteration loop (Ctrl+C if running in terminal)
2. Mark the session state file inactive for in-session loops:
```bash
python - <<'PY'
from pathlib import Path
p = Path.home() / '.a5c' / 'state' / '<session-id>.md'
s = p.read_text()
p.write_text(s.replace('active: true', 'active: false', 1))
PY
```
**Q: How do I resume a failed run?**
A: Use the `run:iterate` command to continue:
```bash
$CLI run:iterate <runId> --json --iteration <next-iteration>
```
**Q: How do I view the full history of a run?**
A:
```bash
$CLI run:events <runId> --limit 100 --json | jq '.events'
```
**Q: Can I run multiple orchestrations in parallel?**
A: Yes. Each run has its own directory and state. For in-session loops, each Claude Code session has isolated state via `AGENT_SESSION_ID`.
### Hooks
**Q: Why is my custom hook not being called?**
A: Check these in order:
1. Hook is in the correct directory (`.a5c/hooks/<hook-name>/`)
2. Hook file ends with `.sh`
3. Hook is executable (`chmod +x`)
4. Hook outputs valid JSON to stdout
**Q: How do I debug a hook?**
A: Test it manually:
```bash
echo '{"runId":"test"}' | ./my-hook.sh
```
Add debug logging to stderr:
```bash
echo "[DEBUG] My message" >&2
```
**Q: What hooks are available?**
A:
- **SDK Lifecycle:** `on-run-start`, `on-run-complete`, `on-run-fail`, `on-task-start`, `on-task-complete`, `on-iteration-start`, `on-iteration-end`, `on-step-dispatch`
- **Process-Level:** `pre-commit`, `pre-branch`, `post-planning`, `on-score`, `on-breakpoint`
### Session Loops
**Q: How do I stop an in-session loop?**
A:
1. Use `--max-iterations` to set a limit
2. Output the completion proof: `<promise>SECRET</promise>`
3. Mark the state file inactive:
```bash
python - <<'PY'
from pathlib import Path
p = Path.home() / '.a5c' / 'state' / '<session-id>.md'
s = p.read_text()
p.write_text(s.replace('active: true', 'active: false', 1))
PY
```
**Q: Where is the session state stored?**
A: `~/.a5c/state/${AGENT_SESSION_ID}.md`
**Q: What happens if I close Claude Code during a loop?**
A: The state file remains. When you restart, the loop will not resume automatically. You can:
- Mark the state file inactive to clear hook blocking while retaining recovery context
- Start a new loop with `/babysitter:babysit`
### Troubleshooting Commands
**Q: What is the most useful diagnostic command?**
A: The health check provides a comprehensive overview:
```bash
```
**Q: How do I get verbose output from the CLI?**
A:
```bash
$CLI run:status <runId> --verbose --json
```
**Q: How do I check what tasks are blocking a run?**
A:
```bash
$CLI task:list <runId> --pending --json | jq '.tasks'
```
---
## Getting Help
### Documentation
- **Plugin Specification:** `plugins/babysitter-unified/plugin.json`
- **Hooks Guide:** `plugins/babysitter-unified/skills/babysit/SKILL.md`
- **SDK Reference:** `packages/sdk/sdk.md`
- **In-Session Loops:** `packages/sdk/src/cli/commands/instructions.ts`
### CLI Help
```bash
npx -y @a5c-ai/babysitter-sdk@latest --help
npx -y @a5c-ai/babysitter-sdk@latest run:create --help
npx -y @a5c-ai/babysitter-sdk@latest task:post --help
```
### Useful Diagnostic Data to Collect
When reporting issues, collect:
1. **System info:**
```bash
```
2. **Health check:**
```bash
```
3. **Run status (if applicable):**
```bash
$CLI run:status <runId> --json
```
4. **Recent events:**
```bash
$CLI run:events <runId> --limit 20 --reverse --json
```
5. **Stop hook logs (for session loops):**
```bash
cat /tmp/babysitter-stop-hook.log
```
### Report Issues
- GitHub Issues: https://github.com/a5c-ai/babysitter/issues
- Include: CLI version, error output, diagnostic data from above
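Part of the diagnostic data can be gathered into one file to attach to an issue. The sketch below is an assumption-laden example: `collect_diagnostics` is a hypothetical helper, and it covers only the tool versions and the stop hook log, using commands and paths that already appear in this guide. Run status and events still need to be added with the CLI commands listed earlier.

```bash
#!/usr/bin/env bash
# Hypothetical helper: write tool versions and the stop hook log tail to one report file.
# Usage: collect_diagnostics <output-file>
collect_diagnostics() {
  local out="$1" tool
  {
    echo "# Tool versions"
    for tool in node npm git jq; do
      if command -v "$tool" > /dev/null 2>&1; then
        echo "$tool: $("$tool" --version 2>&1 | head -n 1)"
      else
        echo "$tool: not found"
      fi
    done
    echo "# Stop hook log (last 50 lines)"
    if [ -f /tmp/babysitter-stop-hook.log ]; then
      tail -n 50 /tmp/babysitter-stop-hook.log
    else
      echo "no stop hook log found"
    fi
  } > "$out"
}
```

Review the file before sharing it, since hook logs can contain paths or session identifiers you may not want to publish.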
---
**Document Metadata:**
- Created: 2026-02-03
- Version: 1.0.0
- Component: Babysitter Plugin Troubleshooting Guide
- Status: Production