Orion

3/9/2026 seed

Preamble

Orion is an external metacognition layer for agent work: it watches agents plan, route, act, fail, and repair each other, then asks whether those traces can govern the next run without replacing human judgment.


A Mind Made Of Run Logs

I am asking one machine-shaped layer to decide which other machine-shaped layer should act, then asking a telemetry layer to decide whether that choice should shape the next run. The system begins to form beliefs about its own work: which agent fits, which route failed, which handoff lost context, which score earned trust, which repair deserves promotion.

Factory owns venture outcomes. 2nd Brain owns cultivated context. Orion owns the agent operating model: capability maps, routing, escalation, confidence, failure patterns, recursive improvement, and the protocol that lets a machine challenge me while I keep the final call.

Judgment Moves Upstream

Every layer moves my work farther from the artifact. I stop choosing a sentence and start choosing a writing agent. I stop fixing a run and start choosing the policy that will route the next run. I stop reading every failure and start deciding which failure class deserves a gate.

When an agent chooses which agent should think, the first act of judgment moves out of the answer and into the route. Orion has to make that movement visible: who chose the route, what evidence supported it, what risk was accepted, when the operator overrode it, and whether a mistake belongs to the machine or to the person who set it loose.
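A minimal sketch of what a visible route could look like as a record, assuming Python. The class name, the field names, and the rule that an override moves responsibility to the operator are all hypothetical illustrations, not Orion's actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RouteDecision:
    """One routing choice, recorded so responsibility stays traceable.

    Hypothetical shape: every field name here is an assumption.
    """
    chosen_by: str                       # agent or person that picked the route
    route: str                           # agent the work was sent to
    evidence: tuple[str, ...]            # what supported the choice
    accepted_risk: str                   # risk knowingly taken
    overridden_by: Optional[str] = None  # operator who overrode it, if any

    def responsible_party(self) -> str:
        """Assumed rule: an override moves responsibility to the operator."""
        return self.overridden_by or self.chosen_by


decision = RouteDecision(
    chosen_by="router-agent",
    route="writing-agent",
    evidence=("handled similar work items before",),
    accepted_risk="thin context in the handoff",
)
print(decision.responsible_party())
```

The record is frozen so a decision cannot be quietly rewritten after the run; a correction would be a new record.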

A Trace Is A Shadow

Orion sees prompts, tool calls, model choices, files touched, scores, run state, errors, overrides, and handoffs. It sees the shadow of intent after intent has been pressed into tokens and logs.

A trace can reward the work that leaves clean evidence: fast routes, tidy handoffs, passing checks, compact summaries. The part of the work that changed judgment may leave weaker marks. Observatory has to keep the shadow from becoming authority. That requires append-only events, attribution, replay, quarantine, repair receipts, public-safe exports, and eval gates before a run can become evidence.
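A minimal sketch of three of those requirements working together, assuming Python: an append-only log, attribution on every event, and a gate a run must pass before it counts as evidence. The names `Event`, `EventLog`, and `run_status`, and the three status strings, are hypothetical; replay, repair receipts, and exports are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    run_id: str
    kind: str   # e.g. "tool", "error", "override", "handoff"
    actor: str  # attribution: who produced the event; "" if unknown


class EventLog:
    """Append-only: events can be added and read, never edited or removed."""

    def __init__(self) -> None:
        self._events: list[Event] = []

    def append(self, event: Event) -> None:
        self._events.append(event)

    def for_run(self, run_id: str) -> tuple[Event, ...]:
        # Return an immutable copy so callers cannot mutate the log.
        return tuple(e for e in self._events if e.run_id == run_id)


def run_status(log: EventLog, run_id: str, eval_passed: bool) -> str:
    """Decide whether a run may be treated as evidence (assumed policy)."""
    events = log.for_run(run_id)
    if not events or any(not e.actor for e in events):
        return "quarantined"  # no trace, or a trace nobody can be held to
    if not eval_passed:
        return "pending"      # attributed, but the eval gate has not passed
    return "evidence"


log = EventLog()
log.append(Event("run-1", "tool", "agent-a"))
print(run_status(log, "run-1", eval_passed=True))
```

In this sketch a clean-looking run with one unattributed event is quarantined even if its eval passed, which is one way to keep the shadow from becoming authority.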

Self-Improvement Needs A Scar

Self-improving systems can learn the work, and they can learn the test. They can make better routes, and they can make better explanations for weak routes. They can catch repeated failure, and they can promote a lesson because the evaluator liked its own reflection.

Each Orion surface holds one brake. Constellation maps capability with receipts. Navigation records why a route was chosen. Mission Control interrupts while the run is still cheap to save. Observatory keeps the trace and the gap. Recursion Lab replays repairs before promotion. Shadow Protocol pushes against my shortcuts. Orion’s Belt turns the whole premise into rails: work item contracts, runtime state, skills, modes, gates, and eval paths.

The First Real Proof

Orion has scaffolding, named surfaces, and a telemetry substrate with manual proof paths. The current launch blocker is ambient capture. Normal Codex and OMX work still needs to emit session, prompt, tool, stop, attribution, and failure events by default.
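A minimal sketch of how the ambient-capture gap could be measured, assuming Python. The six event kinds come from the paragraph above; the function name and the idea of checking a batch of sessions, rather than one, are assumptions, since a single healthy session may never emit a failure event.

```python
from typing import Iterable

# The six event kinds named above that normal work should emit by default.
REQUIRED_KINDS = frozenset(
    {"session", "prompt", "tool", "stop", "attribution", "failure"}
)


def uncaptured_kinds(sessions: Iterable[Iterable[str]]) -> list[str]:
    """Return required event kinds never seen across a batch of sessions."""
    seen: set[str] = set()
    for kinds in sessions:
        seen.update(kinds)
    return sorted(REQUIRED_KINDS - seen)


batch = [
    ["session", "prompt", "tool", "stop"],
    ["session", "prompt", "failure", "stop"],
]
print(uncaptured_kinds(batch))
```

An empty result over a representative batch would suggest capture is ambient; anything left in the list is a kind that still depends on a manual proof path.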

The proof is a later run changed by an earlier failure. A wrong route becomes a routing rule. A weak handoff becomes a protocol. A repeated failure becomes an eval. A clean-looking success gets demoted because the evidence is too thin. There are two paths from here. One turns isolated agent sessions into a governed cognitive system. The other builds a bureaucracy of agents that can explain its own motion while hiding the moment judgment left the human.
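A minimal sketch of the step where a repeated failure becomes an eval, assuming Python. The function name, the failure-class labels, and the threshold of two occurrences are hypothetical; the sketch only proposes candidates and leaves the promotion decision to the operator.

```python
from collections import Counter
from typing import Iterable


def failures_to_promote(
    failure_classes: Iterable[str], threshold: int = 2
) -> list[str]:
    """Return failure classes seen at least `threshold` times.

    The result is a list of candidates for a new eval or gate,
    not an automatic promotion.
    """
    counts = Counter(failure_classes)
    return sorted(name for name, n in counts.items() if n >= threshold)


observed = ["lost-context", "wrong-route", "lost-context"]
print(failures_to_promote(observed))
```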