Independent Research · Est. 2025


CIRWEL Systems

— A research preface, with running code

Runtime governance
for heterogeneous
AI-agent fleets.

Most AI-safety work reasons about agents in two windows: pre-deployment evaluation and post-incident forensics. Between them, while agents are actually running, event traces are abundant but there is no canonical reading of an agent's own state. CIRWEL Systems supplies the missing reading: continuous, class-calibrated self-state telemetry for every agent, with signed provenance behind every intervention.


— The receipts

Paper
UNITARES: Information-Theoretic Governance of Heterogeneous Agent Fleets · Wang, 2026 · CC-BY 4.0
DOI
10.5281/zenodo.19647159 (concept · resolves to latest)
Author
Kenny Wang · Independent Researcher · ORCID 0009-0006-7544-2374
Production
In continuous operation since November 2025 · governing CIRWEL's own development fleet
For
fleet operators · AI-safety teams · underwriters of agent-driven systems
Code
github.com/CIRWEL/unitares · server, Apache 2.0
github.com/CIRWEL/unitares-governance-plugin · Claude Code / Codex client, Apache 2.0
github.com/CIRWEL/unitares-paper-v6 · paper source (LaTeX)
huggingface.co/hikewa · datasets & distilled models

— Production snapshot

Governance events
93K+
Last 7 days
6,856
Active agents (7d)
17,682
KG discoveries
982

Self-traffic from a single-operator workstation, not external adoption — a stress test of the pipeline, not a market signal. Snapshot as of 2026-06-03, matched to the public README production snapshot.


§01 — The thesis

Runtime self-state is the missing layer.

Operating a heterogeneous agent fleet in continuous production surfaces a class of failure that pre-deployment evaluation and post-incident forensics don't catch: the slow, silent drift of agent state across hours and days of normal running. The traces show what the agent did, abundantly. They don't show what state it was in while doing it — because nothing on the agent or alongside it is producing that reading.

Logs are what an agent did.
Self-state is what it was while doing it.

CIRWEL's response is a runtime layer that gives each agent a continuous reading of its own state — a four-dimensional vector summarizing capacity, signal integrity, uncertainty, and the imbalance among them, updated from every check-in. The reading is calibrated against agents of the same class (a coding session, a research conversation, a resident cron, an embodied service, an ephemeral parser), because a long-running coding assistant does not behave like an ephemeral parser, and neither behaves like an embodied service. Drift is then detected against the right reference, not an averaged one.
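To make the calibration argument concrete, here is a minimal sketch of scoring one reading against a per-class baseline and against a fleet-wide one. Everything in it is an assumption for illustration: the `SelfState` field names, the toy histories, the z-score test, and the threshold of 3.0 are not taken from the paper or the UNITARES server.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class SelfState:
    """Hypothetical four-dimensional reading; field names are illustrative."""
    capacity: float
    integrity: float
    uncertainty: float
    imbalance: float

    def as_tuple(self):
        return (self.capacity, self.integrity, self.uncertainty, self.imbalance)


def baseline(states):
    """Per-dimension (mean, population std) over past readings."""
    columns = zip(*(s.as_tuple() for s in states))
    return [(mean(col), pstdev(col)) for col in columns]


def max_abs_z(state, base, eps=1e-9):
    """Largest per-dimension z-score of a reading against a baseline."""
    return max(
        abs(x - m) / max(sd, eps) for x, (m, sd) in zip(state.as_tuple(), base)
    )


def is_drifting(state, base, threshold=3.0):
    # Assumption: a plain z-score cutoff stands in for the real drift test.
    return max_abs_z(state, base) > threshold


def readings(capacities):
    # Toy data: only capacity varies; the other dimensions are held fixed.
    return [SelfState(c, 0.9, 0.1, 0.05) for c in capacities]


history = {
    "coding_session": readings([0.78, 0.80, 0.82, 0.80]),
    "ephemeral_parser": readings([0.28, 0.30, 0.32, 0.30]),
}
per_class = {name: baseline(states) for name, states in history.items()}
fleet_wide = baseline([s for states in history.values() for s in states])

# A coding session whose capacity has slipped from about 0.80 to 0.60.
reading = SelfState(0.60, 0.9, 0.1, 0.05)
print(is_drifting(reading, per_class["coding_session"]))  # True
print(is_drifting(reading, fleet_wide))                   # False
```

In this toy fleet the averaged baseline has a capacity mean near 0.55 and a wide spread, so the slipped reading looks ordinary. Against its own class it sits many standard deviations from the reference.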

The framework is described in the UNITARES paper (Wang, 2026) and has been governing CIRWEL's own development fleet without interruption since November 2025.


— A measurement on our own fleet

Per-class and fleet-wide baselines disagree on about 29% of verdicts.

On a 30-day slice of our own production data (13,310 governance observations across the fleet), replaying each decision with per-class baselines instead of one fleet-wide baseline produces a different verdict 28.9% of the time. The disagreement skews systematically: state vectors that the fleet-wide baseline classifies as healthy or borderline are usually flagged as drifting under per-class baselines. Per-class flip rates range from 15% to 33%.

This is an internal measurement, not external validation. But the gap is large enough to show that averaging dissimilar agents into one distribution is not a benign default; class-conditional calibration is the response. (See §11.6 of the paper.)
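The flip rate itself is simple arithmetic: replay each observation under both baselines and count how often the two verdicts differ, overall and per class. The sketch below assumes a hypothetical record shape of `(agent_class, fleet_verdict, class_verdict)` and uses made-up toy records; it is not the paper's replay harness and does not reproduce the 28.9% figure.

```python
from collections import defaultdict


def flip_rates(records):
    """Return (overall_rate, per_class_rates) for replayed verdict pairs.

    records: iterable of (agent_class, fleet_verdict, class_verdict) tuples,
    a hypothetical shape chosen for this illustration.
    """
    totals = defaultdict(int)
    flips = defaultdict(int)
    for agent_class, fleet_verdict, class_verdict in records:
        totals[agent_class] += 1
        if fleet_verdict != class_verdict:
            flips[agent_class] += 1

    n = sum(totals.values())
    if n == 0:
        raise ValueError("no observations to replay")

    per_class = {c: flips[c] / totals[c] for c in totals}
    overall = sum(flips.values()) / n
    return overall, per_class


# Toy records, not production data.
toy = [
    ("coding_session", "healthy", "drifting"),
    ("coding_session", "healthy", "healthy"),
    ("coding_session", "borderline", "borderline"),
    ("coding_session", "healthy", "healthy"),
    ("ephemeral_parser", "borderline", "drifting"),
    ("ephemeral_parser", "healthy", "healthy"),
]
overall, per_class = flip_rates(toy)
print(per_class)  # {'coding_session': 0.25, 'ephemeral_parser': 0.5}
```

The overall rate weights each class by its observation count, which is why a fleet-level figure can sit anywhere inside the per-class range.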


§02 — Three pillars

i

Class-conditional calibration

A coding agent and a research agent are not held to the same statistics. UNITARES learns separate baselines per agent class from production telemetry, so drift in one class is not masked by noise from another.

→ § how the layers fit together

ii

Drift detection at runtime

Continuous state observation catches behavioral drift while it is happening, not in the post-incident review. Verdicts arrive early enough to intervene, late enough to be evidence-based.

→ § verdicts in motion

iii

Auditable provenance

Every intervention carries a signed lineage back to the observation that triggered it. Regulators, underwriters, and the next-shift human can replay the chain — not just read a verdict.

→ § identity & lineage

§03 — In production

Governing its own development.

The system you read about on this page also wrote, tested, and shipped a meaningful fraction of itself. CIRWEL's development fleet — a heterogeneous mix of long-running resident agents, short-lived coding sessions, an embodied edge service, and a Discord bridge — has been governed continuously by UNITARES since November 2025.

Each check-in produces one of four governance verdicts — proceed, guide, pause, reject — together with a signed lineage back to the observation that triggered it.
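One way to picture a verdict with a replayable lineage is a chain of records, each signed over its observation, its verdict, and the signature of the record before it. The sketch below is an assumption throughout: it uses HMAC-SHA256 from the Python standard library as a stand-in for whatever signing scheme UNITARES actually uses, and the record fields, the `demo-key` secret, and the function names are hypothetical.

```python
import hashlib
import hmac
import json
from enum import Enum


class Verdict(Enum):
    PROCEED = "proceed"
    GUIDE = "guide"
    PAUSE = "pause"
    REJECT = "reject"


def _payload(observation, verdict, parent):
    # Canonical JSON so the same content always signs to the same bytes.
    body = {"observation": observation, "verdict": verdict, "parent": parent}
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode()


def sign_record(key, observation, verdict, parent=""):
    """Build a record whose signature covers the parent record's signature."""
    sig = hmac.new(
        key, _payload(observation, verdict.value, parent), hashlib.sha256
    ).hexdigest()
    return {
        "observation": observation,
        "verdict": verdict.value,
        "parent": parent,
        "signature": sig,
    }


def verify_record(key, record):
    expected = hmac.new(
        key,
        _payload(record["observation"], record["verdict"], record["parent"]),
        hashlib.sha256,
    ).hexdigest()
    return hmac.compare_digest(expected, record["signature"])


key = b"demo-key"  # hypothetical secret, for illustration only
first = sign_record(key, {"agent": "a1", "capacity": 0.60}, Verdict.PAUSE)
second = sign_record(
    key, {"agent": "a1", "capacity": 0.79}, Verdict.PROCEED, parent=first["signature"]
)

tampered = {**first, "verdict": Verdict.PROCEED.value}
print(verify_record(key, first), verify_record(key, tampered))  # True False
```

Because each signature covers its parent's signature, a reviewer can walk from any intervention back to the triggering observation and detect an altered link along the way. A shared-secret HMAC only proves integrity to holders of the key; third-party audit would need an asymmetric signature instead.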

Living under one's own framework is the cheapest credibility a research operator can offer. We treat it as the floor, not the ceiling.

This page is part of the loop. The colophon below shows the exact commit and build time that produced what you are reading.


§04 — Engage

Three ways in.