Configuration control — multi-role agents, elevator project, fusion reactor lessons
Summary
The multi-role agent architecture is deployed, the fusion reactor project is complete and has been sent to red-team review, and a new elevator control system project is queued as the first system to run the full improved pipeline. This post also captures lessons from the fusion reactor stress test.
Fusion Reactor: Stress Test Assessment
The Fusion Reactor Control System was deliberately chosen as a stress test for the harness infrastructure. A tokamak reactor control system is among the most complex safety-critical systems conceivable — SIL 4 plasma control, cryogenic magnet protection, high-energy disruption mitigation, and nuclear regulatory compliance. It was selected to expose failure modes in the harness that simpler systems wouldn’t trigger.
What it exposed:
| Problem | Sessions wasted | Root cause | Fix |
|---|---|---|---|
| QC loop (every session ran QC) | 22 sessions (~$40) | SubstrateStateStore.get() returned wrong fact due to missing subject filter | Query with defaultSubject first |
| Homeless requirements (118/302) | Ongoing | QC flow lacked mandatory --document/--section instruction | Added mandatory flags to QC flow |
| Premature first-pass-complete | 1 session | Claude set first-pass-complete on 28-req project | Deterministic guards replaced self-assessment |
| Session timeouts (exit 143) | 3 sessions | 20-min timeout too tight for 48KB prompt | Increased to 25 min |
| complete → idle bypass | 2 sessions | Legacy code mapped complete to idle before flow selection | Removed backwards-compat mapping |
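The QC loop row is easier to see with a small sketch of a lookup that omits the subject filter. Only the names SubstrateStateStore.get() and defaultSubject come from the table above; the fact layout, the method signatures, the `get_unfiltered` helper, and the example subjects and state values are assumptions made for illustration, and the sketch is in Python rather than the harness's own language.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Fact:
    subject: str    # which project the fact is about (assumed field)
    predicate: str  # e.g. "state" (assumed field)
    value: str


class SubstrateStateStore:
    """Toy in-memory store; the real store's API is not shown in this post."""

    def __init__(self, facts: list[Fact], default_subject: str):
        self.facts = facts
        self.default_subject = default_subject

    def get_unfiltered(self, predicate: str) -> Optional[str]:
        # Buggy shape: the first fact with a matching predicate wins,
        # regardless of which subject it belongs to.
        for fact in self.facts:
            if fact.predicate == predicate:
                return fact.value
        return None

    def get(self, predicate: str, subject: Optional[str] = None) -> Optional[str]:
        # Fixed shape: always filter on a subject, using the default
        # subject when the caller does not name one.
        wanted = subject if subject is not None else self.default_subject
        for fact in self.facts:
            if fact.subject == wanted and fact.predicate == predicate:
                return fact.value
        return None


# Hypothetical data: an older project's fact sits ahead of the active one.
store = SubstrateStateStore(
    facts=[
        Fact("surgical-robot", "state", "qc"),
        Fact("fusion-reactor", "state", "decompose"),
    ],
    default_subject="fusion-reactor",
)
```

In this toy setup the unfiltered lookup reports the other project's state, which is the kind of wrong fact that could send every session back into QC, while the filtered lookup returns the active project's state.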
What it validated:
- CCCS quality gates blocked premature transitions (unassignedDoc > 0 held the gate correctly)
- Hazard analysis produced 10+ hazards with SIL allocation
- Cross-domain search surfaced useful analogs (nuclear reactor protection, railway signalling)
- Tool call statistics tracked 50-100 calls per session
- LAST_SESSION_NEXT context passing worked: sessions referenced prior recommendations
- Quality blocker injection told Claude what to fix
Final state: 317 requirements, 373 trace links, 10 diagrams, 0 orphans, 0 homeless. Sent to red-team review (session 433, running on Opus).
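A deterministic gate of the kind described above can be sketched in a few lines. This is a minimal illustration, not the CCCS implementation: the unassignedDoc metric name and the counts come from this post, while the `orphans` key, the function names, and the blocker wording are assumptions.

```python
def gate_blockers(metrics: dict[str, int]) -> list[str]:
    """Return human-readable blockers; an empty list means the gate may open."""
    blockers = []
    if metrics.get("unassignedDoc", 0) > 0:
        blockers.append(
            f"{metrics['unassignedDoc']} requirements have no document/section"
        )
    if metrics.get("orphans", 0) > 0:
        blockers.append(f"{metrics['orphans']} orphans remain")
    return blockers


def may_transition(metrics: dict[str, int]) -> bool:
    # Only measured counts are consulted; the agent's own claim of
    # completion is deliberately not an input.
    return not gate_blockers(metrics)


mid_project = {"unassignedDoc": 118, "orphans": 0}  # the 118/302 homeless state
final_state = {"unassignedDoc": 0, "orphans": 0}    # the reported final state
```

With these inputs the gate holds for `mid_project` and opens for `final_state`. The same blocker strings could be reused as the text injected into the next session's prompt.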
Multi-Role Agent Architecture
Per-flow model, timeout, and protocol overrides deployed. Roles are metadata on flows — no new abstraction.
| Flow | Role | Model | Timeout | Rationale |
|---|---|---|---|---|
| concept | Chief SE | opus | 30 min | Deep reasoning for ConOps, hazard analysis |
| scaffold | Chief SE | opus | 30 min | Functional analysis, architecture decisions |
| decompose | Systems Engineer | sonnet | 25 min | Mechanical requirement creation |
| qc | Quality Engineer | sonnet | 20 min | Metrics checking, lint |
| validate | Verification Engineer | sonnet | 25 min | Test adequacy, scenario tracing |
| review | Chief SE | opus | 30 min | Holistic coherence assessment |
| red-team | Red Team Analyst | opus | 30 min | Adversarial analysis |
The infrastructure supports per-flow protocol files (role-specific system prompts), but none have been created yet. Adding a new role takes one YAML entry plus one protocol .md file, with zero harness code changes.
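One way to picture "roles are metadata on flows" is a defaults-plus-overrides merge. The sketch below mirrors the table above as Python dictionaries instead of the real YAML. The field names, the `resolve` helper, and the choice of sonnet / 25 min as global defaults are assumptions, not the harness's actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FlowConfig:
    role: str
    model: str
    timeout_min: int
    protocol_file: Optional[str] = None  # role-specific prompt; none exist yet


# Assumed global defaults; each flow overrides only what differs.
GLOBAL_DEFAULTS = {"model": "sonnet", "timeout_min": 25, "protocol_file": None}

FLOW_OVERRIDES = {
    "concept":   {"role": "Chief SE", "model": "opus", "timeout_min": 30},
    "scaffold":  {"role": "Chief SE", "model": "opus", "timeout_min": 30},
    "decompose": {"role": "Systems Engineer"},
    "qc":        {"role": "Quality Engineer", "timeout_min": 20},
    "validate":  {"role": "Verification Engineer"},
    "review":    {"role": "Chief SE", "model": "opus", "timeout_min": 30},
    "red-team":  {"role": "Red Team Analyst", "model": "opus", "timeout_min": 30},
}


def resolve(flow: str) -> FlowConfig:
    """Merge the global defaults with one flow's overrides."""
    merged = {**GLOBAL_DEFAULTS, **FLOW_OVERRIDES[flow]}
    return FlowConfig(**merged)
```

Under this layout, adding a role means adding one dictionary entry (one YAML entry in the real config) and, once protocol files exist, pointing `protocol_file` at the new .md file. Nothing in `resolve` changes.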
New Project: Industrial Elevator Control System
Queued as the first project to run the full improved pipeline:
- Concept definition (Chief SE, Opus) — mission, ConOps with operating modes, hazard register per EN 81-20/50, SIL 3 for safety chain, stakeholder roleplay, environment
- Scaffold (Chief SE, Opus) — stakeholder needs from ConOps, system requirements with SIL tags, functional analysis with UHT clustering, spec tree, subsystem decomposition
- Decompose (Systems Engineer, Sonnet) — per-subsystem work driven by spec tree, canonical diagram names, locked section IDs
- QC (Quality Engineer, Sonnet) — CCCS quality gates, per-subsystem completeness via spec tree
- Validate (Verification Engineer, Sonnet) — V-model verification audit + ConOps scenario tracing + safety argument
- Review (Chief SE, Opus) — acceptance assessment with per-subsystem summary and cross-domain insights
- Red-team (Red Team Analyst, Opus) — 9 adversarial checks including safety integrity audit
The elevator is deliberately smaller (~5-6 subsystems, ~10-15 sessions) than the fusion reactor (~8 subsystems, ~45 sessions), so the pipeline can be validated without the cost of another full-scale run.
Additional Fixes
- Removed legacy complete → idle state mapping that bypassed red-team flow
- Session timeout increased from 20 to 25 minutes (prompt grew to 48KB with new sections)
- Quality gate blockers now injected into session prompt so Claude knows what to fix
- Previous session’s “Next” section passed forward as context
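The last two items amount to prepending two labelled sections to each session prompt. A minimal sketch follows, assuming the sections are plain text blocks headed by the LAST_SESSION_NEXT and QUALITY_BLOCKERS names used in this post. The function name, the layout, and the example strings are hypothetical.

```python
from typing import Optional


def build_context_sections(
    last_session_next: Optional[str], blockers: list[str]
) -> str:
    """Assemble the carried-over context; empty inputs produce no section."""
    sections = []
    if last_session_next:
        sections.append("LAST_SESSION_NEXT\n" + last_session_next.strip())
    if blockers:
        bullet_list = "\n".join(f"- {b}" for b in blockers)
        sections.append("QUALITY_BLOCKERS\n" + bullet_list)
    return "\n\n".join(sections)


# Illustrative inputs only; real sessions would supply their own text.
context = build_context_sections(
    "  Continue with the next subsystem.\n",
    ["unassignedDoc = 118"],
)
```

Omitting a section when it has nothing to say keeps the prompt from growing further, which matters given the 48KB prompt size noted above.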
Version Manifest
| Component | Before | After |
|---|---|---|
| Agent architecture | Single model (sonnet) | Per-flow roles (opus/sonnet by flow) |
| Concept → complete pipeline | 6 flows | 7 flows with roles |
| Session config | Global | Per-flow overrides |
| State mapping | complete → idle bypass | Direct complete state for red-team |
| Session context | None between sessions | LAST_SESSION_NEXT + QUALITY_BLOCKERS |
| Projects completed | 1 (surgical robot) | 2 (+fusion reactor) |
| Active project | Fusion reactor | Industrial Elevator Control System |
| Git commits | 21 | 24 |