Runtime governance for heterogeneous AI-agent fleets.
CIRWEL Systems supplies the missing reading: continuous,
class-calibrated self-state telemetry for every agent, with signed
provenance behind every intervention. Most AI-safety work reasons
about agents in two windows — pre-deployment evaluation and
post-incident forensics. Between them, while agents are actually
running, event traces are abundant but there is no canonical
reading of an agent's own state.
All traffic reported here is self-traffic from a single-operator
workstation, not external adoption: a stress test of the pipeline, not a
market signal. Snapshot as of 2026-06-03, matched to the public README
production snapshot.
§01 — The thesis
Runtime self-state is the missing layer.
Operating a heterogeneous agent fleet in continuous production surfaces
a class of failure that pre-deployment evaluation and post-incident
forensics don't catch: the slow, silent drift of agent state across
hours and days of normal running. The traces show what the
agent did, abundantly. They don't show what state it was in while
doing it — because nothing on the agent or alongside it is
producing that reading.
Logs are what an agent did. Self-state is what it was while doing it.
CIRWEL's response is a runtime layer that gives each agent a continuous
reading of its own state — a four-dimensional vector summarizing
capacity, signal integrity, uncertainty, and the imbalance among them,
updated from every check-in. The reading is calibrated against agents
of the same class (a coding session, a research conversation, a
resident cron, an embodied service, an ephemeral parser), because a
long-running coding assistant does not behave like an ephemeral parser,
and neither behaves like an embodied service. Drift is then detected
against the right reference, not an averaged one.
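A minimal sketch of what class-conditional drift scoring could look like. The field names, the `ClassBaseline` helper, and the z-score rule are illustrative assumptions, not the method described in the paper; the toy histories are invented for the example.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class SelfState:
    """Four-dimensional reading; field names are hypothetical."""
    capacity: float
    integrity: float
    uncertainty: float
    imbalance: float

    def as_tuple(self):
        return (self.capacity, self.integrity, self.uncertainty, self.imbalance)


class ClassBaseline:
    """Per-dimension mean and spread learned from one group of readings."""

    def __init__(self, history):
        columns = list(zip(*(s.as_tuple() for s in history)))
        self.means = [mean(c) for c in columns]
        # Floor the spread so a constant dimension cannot divide by zero.
        self.stds = [max(pstdev(c), 1e-9) for c in columns]

    def drift_score(self, state):
        """Largest absolute z-score across the four dimensions."""
        return max(
            abs(x - m) / s
            for x, m, s in zip(state.as_tuple(), self.means, self.stds)
        )


# Toy histories for two classes (invented numbers).
coding_history = [
    SelfState(0.80, 0.90, 0.10, 0.05),
    SelfState(0.82, 0.88, 0.12, 0.06),
    SelfState(0.78, 0.92, 0.08, 0.04),
]
parser_history = [
    SelfState(0.50, 0.70, 0.30, 0.20),
    SelfState(0.55, 0.65, 0.35, 0.25),
    SelfState(0.45, 0.75, 0.25, 0.15),
]

coding_baseline = ClassBaseline(coding_history)
fleet_baseline = ClassBaseline(coding_history + parser_history)

# A reading from a coding agent that sits near the fleet average.
reading = SelfState(0.60, 0.80, 0.20, 0.12)
per_class_score = coding_baseline.drift_score(reading)
fleet_wide_score = fleet_baseline.drift_score(reading)
```

In this toy setup the reading looks unremarkable against the pooled fleet baseline but lies far outside the coding class's own range, which is the masking effect an averaged reference can produce.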
The framework is described in a paper and has been governing CIRWEL's
own development fleet without interruption since November 2025.
— A measurement on our own fleet
Per-class and fleet-wide baselines disagree on 29% of verdicts.
On a 30-day slice of our own production data — 13,310 governance
observations across the fleet — replaying each decision with
per-class baselines instead of one fleet-wide baseline produces
a different verdict 28.9% of the time. The disagreement skews
systematically: state vectors
the fleet-wide baseline classifies as healthy or borderline are
usually flagged as drifting under per-class baselines.
Per-class flip rates range from 15% to 33%.
This is internal measurement, not external validation. But the
gap is large enough to show that averaging dissimilar agents
into one distribution is not a benign default; class-conditional
calibration is the response.
(§11.6 of the paper.)
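The replay measurement can be sketched as a flip-rate computation: evaluate every observation under both baselines and count how often the verdicts differ. The function name, the scalar states, and the threshold rules below are hypothetical stand-ins, and the sample data is invented rather than taken from the production slice.

```python
from collections import defaultdict


def flip_rate(observations, fleet_verdict, class_verdict):
    """Share of observations whose verdict differs between the two baselines.

    observations: iterable of (agent_class, state) pairs.
    Returns (overall_rate, {agent_class: rate}).
    """
    totals = defaultdict(int)
    flips = defaultdict(int)
    for agent_class, state in observations:
        totals[agent_class] += 1
        if fleet_verdict(state) != class_verdict(agent_class, state):
            flips[agent_class] += 1
    n = sum(totals.values())
    if n == 0:
        raise ValueError("no observations to replay")
    per_class = {c: flips[c] / totals[c] for c in totals}
    return sum(flips.values()) / n, per_class


# Hypothetical cut-offs: one pooled floor versus one floor per class.
FLEET_FLOOR = 0.4
CLASS_FLOOR = {"coding": 0.7, "parser": 0.4}


def fleet_verdict(state):
    return "healthy" if state >= FLEET_FLOOR else "drifting"


def class_verdict(agent_class, state):
    return "healthy" if state >= CLASS_FLOOR[agent_class] else "drifting"


# Invented scalar states, one per check-in.
observations = [
    ("coding", 0.62),  # healthy fleet-wide, drifting for its class
    ("coding", 0.81),
    ("parser", 0.48),
    ("parser", 0.30),
]

overall, per_class = flip_rate(observations, fleet_verdict, class_verdict)
```

Reporting the rate per class as well as overall mirrors the text, which gives both a fleet-level figure and a per-class range.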
§02 — Three pillars
i
Class-conditional calibration
A coding agent and a research agent are not held to the same statistics.
UNITARES learns separate baselines per agent class from production
telemetry, so drift in one class is not masked by noise from another.
ii
Continuous state observation
Continuous state observation catches behavioral drift while it is
happening, not in the post-incident review. Verdicts arrive early enough
to intervene, late enough to be evidence-based.
iii
Signed lineage
Every intervention carries a signed lineage back to the observation that
triggered it. Regulators, underwriters, and the next-shift human can
replay the chain, not just read a verdict.
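One way to picture a replayable signed lineage is a chain of records in which each record's signature covers its own content plus the signature of its parent. The sketch below uses HMAC-SHA256 from the Python standard library as a stand-in; the source does not say which signature scheme or record layout is used, so the record fields and function names are assumptions.

```python
import hashlib
import hmac
import json


def _digest(key, payload):
    # Canonical JSON so the same payload always hashes to the same bytes.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, body, hashlib.sha256).hexdigest()


def sign_record(key, kind, data, parent_sig=None):
    """Create a record whose signature also covers its parent's signature."""
    payload = {"kind": kind, "data": data, "parent": parent_sig}
    return {**payload, "sig": _digest(key, payload)}


def verify_chain(key, records):
    """Replay records ordered from the first observation to the intervention."""
    parent = None
    for record in records:
        if record["parent"] != parent:
            return False  # link to the previous record is broken
        payload = {k: record[k] for k in ("kind", "data", "parent")}
        if not hmac.compare_digest(record["sig"], _digest(key, payload)):
            return False  # content was altered after signing
        parent = record["sig"]
    return True


key = b"demo-key-not-for-production"
observation = sign_record(key, "observation", {"agent": "coding-1", "drift": 3.4})
intervention = sign_record(
    key, "intervention", {"verdict": "pause"}, parent_sig=observation["sig"]
)

chain_ok = verify_chain(key, [observation, intervention])

# Editing the observation after the fact invalidates the chain.
tampered = {**observation, "data": {"agent": "coding-1", "drift": 0.1}}
tampered_ok = verify_chain(key, [tampered, intervention])
```

A shared-secret HMAC keeps the example self-contained; a setting with external auditors would more plausibly use asymmetric signatures so that verifiers do not hold the signing key.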
The system you read about on this page also wrote, tested, and shipped a
meaningful fraction of itself. CIRWEL's development fleet — a heterogeneous
mix of long-running resident agents, short-lived coding sessions, an
embodied edge service, and a Discord bridge — has been governed
continuously by UNITARES since November 2025.
Each check-in produces one of four governance verdicts (proceed, guide,
pause, or reject), together with a signed lineage back to the observation
that triggered it.
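The source names the four verdicts but does not say how a reading maps to one. Purely as an illustration, the sketch below assumes the verdicts are ordered by severity and chosen from a single drift score; the cut-off values and the `verdict_for` helper are hypothetical.

```python
from enum import Enum


class Verdict(Enum):
    PROCEED = "proceed"
    GUIDE = "guide"
    PAUSE = "pause"
    REJECT = "reject"


# Hypothetical upper bounds on a drift score, in increasing severity.
THRESHOLDS = (
    (1.0, Verdict.PROCEED),
    (2.0, Verdict.GUIDE),
    (3.0, Verdict.PAUSE),
)


def verdict_for(drift_score):
    """Return the first verdict whose upper bound exceeds the score."""
    if drift_score < 0:
        raise ValueError("drift score must be non-negative")
    for upper, verdict in THRESHOLDS:
        if drift_score < upper:
            return verdict
    return Verdict.REJECT


verdicts = [verdict_for(s).value for s in (0.3, 1.5, 2.9, 12.2)]
```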
Living under one's own framework is the cheapest credibility a research
operator can offer. We treat it as the floor, not the ceiling.
This page is part of the loop. The colophon below shows the exact
commit and build time that produced what you are reading.