Agentic AI Atlas by a5c.ai
Agentic AI Atlas · Quality Gates
page:docs-testing-quality-gates · a5c.ai
Page overview

page:docs-testing-quality-gates

Reference · live

Quality Gates overview

Inspect the raw attributes, linked wiki pages, and inbound or outbound graph edges for page:docs-testing-quality-gates.

Page · Outgoing · 0 · Incoming · 1

Attributes

nodeKind
Page
sourcePath
docs/testing/quality-gates.md
sourceKind
repo-docs
title
Quality Gates
displayName
Quality Gates
slug
docs/testing/quality-gates
articlePath
wiki/docs/testing/quality-gates.md
article
# Quality Gates

These gates define what must be true before a new test lane, workflow, or model-backed scenario is treated as release evidence.

## Gate Matrix

| Gate | Applies to | Required checks | Failure action |
| --- | --- | --- | --- |
| Determinism | No-model tests | No provider secrets, fixed fixtures, repeatable locally, stable timeout budget | Block PR until deterministic |
| Credential guard | Model-backed tests | Explicit secret detection before setup, clear skip reason, no fallback to fake success | Block staging/release if selected job cannot prove setup |
| Artifact redaction | All E2E tests | Secret scan over logs/artifacts, redacted paths, no raw token files | Fail job and suppress unsafe upload |
| Protocol compatibility | Mux tests | Mock and live event streams satisfy the same schema/version | Open compatibility issue before promotion |
| Transport-mux seam evidence | Transport-mux tests | Route matrix, runtime env injection, proxy auth, launch proxy decision, stream transcript, metrics/cache artifact, and invalid-combination boundaries are explicit | Block transport-mux coverage promotion until the missing seam has a direct artifact |
| Runtime completeness | Babysitter-agent E2E | Run creation, session binding, effect emission, task post, terminal state | Block runtime release gate |
| Cost and flake budget | Model-backed tests | Retry policy, duration budget, provider rate-limit classification | Keep scheduled/manual until stable |
| Documentation parity | All lanes | Docs name command, owner, trigger, artifacts, skip/failure semantics | Block workflow merge |

## Adversarial Review Checklist

Every implementation phase should answer these questions before it is accepted:

- What would make this pass without testing the promised behavior?
- Which secret or credential path could leak into logs?
- Which mock assumption could diverge from live Codex or Claude Code behavior?
- Which package boundary is only tested indirectly?
- Did transport-mux traffic actually use proxy routes and injected env, or did the harness call the provider directly?
- Is this test accidentally proving plugin install, harness install, hooks, or Babysitter journal behavior with transport-mux evidence only?
- Which failure would be misclassified as provider flake instead of product regression?
- Which CI trigger would run too often, too late, or not at all?
- Which artifact proves the claim to a reviewer who did not watch the run?

## Promotion Criteria

A test can move from manual to scheduled when it has three consecutive successful runs or one documented provider-side skip with no product failures.

A test can move from scheduled to staging preflight when:

- it has stable credential gating,
- it emits redacted artifacts,
- transport-mux bridge tests include launch-plan JSON, redacted proxy config/env diff, route or stream transcript, metrics/cache snapshot, and provider/harness version metadata when they claim proxy coverage,
- it adds unique evidence not already covered by no-model tests,
- it has an owner for failures,
- it has a bounded runtime and retry policy.

A test can move from staging preflight to release preflight only when it protects a production publish risk that cannot be caught earlier.

## Quarantine And Demotion

Model-backed tests are allowed to start outside required branch protection. They must be demoted or quarantined when reliability falls below release-gate quality.

| Condition | Action |
| --- | --- |
| Two provider-infra failures in seven days | Keep scheduled, remove from required staging checks until root cause is classified |
| One product regression in staging preflight | Keep required and block publish until fixed or explicitly waived |
| Secret redaction failure | Disable artifact upload for that lane and block promotion until redaction test exists |
| Runtime exceeds hard timeout twice | Move to manual diagnostics until scope or timeout budget is redesigned |
| Mock/live schema drift | Block promotion and open a compatibility issue naming the event family |

A quarantined test can return to required status after three consecutive clean scheduled runs and one clean manual rerun by the owning maintainer.
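
The quarantine table and the return rule above can be read as a small decision function. The following is a minimal sketch of that reading, not the project's implementation: `LaneHistory`, its field names, the action strings, and both function names are hypothetical, and treating "two" and "twice" as "at least two" is an assumption.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional

# Hypothetical action labels paraphrasing the "Action" column of the table.
REMOVE_FROM_REQUIRED = "keep scheduled; remove from required staging checks"
BLOCK_PUBLISH = "keep required; block publish until fixed or waived"
DISABLE_UPLOAD = "disable artifact upload; block promotion until redaction test exists"
MOVE_TO_MANUAL = "move to manual diagnostics"


@dataclass
class LaneHistory:
    """Hypothetical per-lane record; every field name is illustrative."""
    provider_infra_failures: List[datetime] = field(default_factory=list)
    staging_product_regressions: int = 0
    redaction_failed: bool = False
    hard_timeout_overruns: int = 0
    drifted_event_family: Optional[str] = None


def demotion_actions(history: LaneHistory, now: datetime) -> List[str]:
    """Return the table actions triggered by a lane's recent history."""
    actions: List[str] = []
    window_start = now - timedelta(days=7)
    recent = [t for t in history.provider_infra_failures
              if window_start <= t <= now]
    if len(recent) >= 2:  # assumption: "two" means "at least two"
        actions.append(REMOVE_FROM_REQUIRED)
    if history.staging_product_regressions >= 1:
        actions.append(BLOCK_PUBLISH)
    if history.redaction_failed:
        actions.append(DISABLE_UPLOAD)
    if history.hard_timeout_overruns >= 2:  # assumption: "twice" means ">= 2"
        actions.append(MOVE_TO_MANUAL)
    if history.drifted_event_family is not None:
        actions.append(
            "block promotion; open compatibility issue for event family "
            f"'{history.drifted_event_family}'"
        )
    return actions


def can_return_to_required(clean_scheduled_streak: int,
                           clean_owner_manual_rerun: bool) -> bool:
    """Three consecutive clean scheduled runs plus one clean owner rerun."""
    return clean_scheduled_streak >= 3 and clean_owner_manual_rerun
```

Returning a list rather than a single action reflects that several conditions can hold for one lane at once; how such overlaps should be prioritized is not specified in the table.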
documents
[]

Outgoing edges

None.

Incoming edges

contains_page · 1
  • page:docs-testing · Page · Testing Strategy

Related pages

No related wiki pages for this record.
