Open research from the Covenant Foundation

Covenant Research

AI agents are in production. The infrastructure to govern them is not. We are building it in the open and publishing everything we learn.

What’s here

Two things, freely available.

Whitepaper · 10 pages

The Covenant Framework

A governance layer for autonomous AI agents. The problem, the architecture, a concrete walkthrough, honest limitations, and why now. Written for operators, investors, and policymakers.

Read online → or download PDF

Open source · 7 agent configs

Benchmarking Kit

Everything needed to replicate our Terminal-Bench results or test governance rules on new benchmarks. Harbor agent adapters for Claude and GPT, the 6 rules, cost estimates, and the full experimental roadmap.

View the kit →

Preliminary result

67.4%

on Terminal-Bench 2.0 (89 tasks, single attempt, no retries). Early results suggest that governance rules improve coding-agent performance over the vanilla baseline of 58.0%.

Caveat: the governed run used Opus 4.7, while the vanilla baseline used Opus 4.6. This model confound is not yet resolved. See the whitepaper for the full methodology.

Governed (6 rules, Opus 4.7): 67.4%

Vanilla Claude Code (Opus 4.6): 58.0%

Ad-hoc rules (Opus 4.7, 10 tasks): 42.0%

Defensible lift (vs vanilla): +9.4 pts*
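
The headline arithmetic can be checked with a short sketch. The percentages and task count are the ones quoted above. The function names are hypothetical, and the whole-task count is only inferred from the rounded pass rate, not a figure published here.

```python
# Figures quoted on this page; names are illustrative, not from the kit.
GOVERNED_PCT = 67.4   # governed run (6 rules, Opus 4.7)
VANILLA_PCT = 58.0    # vanilla Claude Code (Opus 4.6)
TOTAL_TASKS = 89      # Terminal-Bench 2.0 task count


def lift_points(governed_pct: float, baseline_pct: float) -> float:
    """Percentage-point difference, rounded to one decimal place."""
    return round(governed_pct - baseline_pct, 1)


def implied_passes(pct: float, total: int) -> int:
    """Nearest whole number of tasks consistent with a pass rate (an inference)."""
    return round(pct / 100 * total)


lift = lift_points(GOVERNED_PCT, VANILLA_PCT)
passes = implied_passes(GOVERNED_PCT, TOTAL_TASKS)

print(f"lift: +{lift} pts")
print(f"implied governed passes: {passes}/{TOTAL_TASKS}")
```

The lift is a difference in percentage points between runs on different models, so it inherits the model confound described in the caveat.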

Built in the open.

The framework, the research, and the benchmarks are all public. If you operate agents and governance keeps you up at night, this work is for you.

See the research → View on GitHub →
