Open-source Python SDK

Your agents do the work.
We make sure they follow the rules.

Five lines to a pre-execution enforcement gate that writes to your existing Postgres. Every action routed through the SDK passes through enforcement gates: budget caps, scope checks, loop detection, human-in-the-loop approvals, agent presence monitoring, and a tamper-evident audit trail that compliance can verify themselves. See the threat model for what the SDK does and does not protect against.

Current version — v0.6.0

v0.6.0 ships Ed25519 per-row audit signing, HMAC chain key rotation, and a signed Article 12 evidence bundle export your compliance officer can hand to a regulator. To upgrade, run pip install --upgrade "code-atelier-governance[migrations]>=0.6.0", then run governance migrate (which applies the base DDL and alembic upgrade head in one step). See the changelog for the full upgrade runbook, including the kill → halt audit-event-kind rename that affects downstream monitoring queries.

MIT licensed. Free and open source.

quickstart.py
# Run inside an async function or the asyncio REPL (python -m asyncio).
import os
from codeatelier_governance import GovernanceSDK, AuditEvent

async with GovernanceSDK(database_url=os.environ["DATABASE_URL"]) as sdk:
    await sdk.audit.log(AuditEvent(agent_id="my-agent", kind="hello"))
Full example with scope & budget enforcement
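from codeatelier_governance import ScopePolicy, BudgetPolicy

# Continues inside the `async with GovernanceSDK(...) as sdk:` block from the quickstart.
session_id = "session-123"  # illustrative value; use your own session identifier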

sdk.scope.register(ScopePolicy(
    agent_id="billing-agent",
    allowed_tools=frozenset({"read_invoice", "send_email"}),
    hidden_tools=frozenset({"delete_all"}),  # hidden from LLM context
))

sdk.cost.register(BudgetPolicy(
    agent_id="billing-agent",
    per_session_usd=0.50,
    per_session_seconds=300,  # 5-minute session time limit
    per_agent_usd_daily=10.00,
))

# Auto-compute cost from model name — no manual usd= needed:
await sdk.cost.track_usage("billing-agent", session_id,
    model="gpt-4o", input_tokens=1000, output_tokens=500)
Sync wrapper for Flask / Django
import os
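from codeatelier_governance import GovernanceSDKSync, AuditEvent

session_id = "session-123"  # illustrative value; use your own session identifier

with GovernanceSDKSync(database_url=os.environ["DATABASE_URL"]) as sdk:
    sdk.audit.log(AuditEvent(agent_id="my-agent", kind="tool.call"))  # record the action
    sdk.scope.check("my-agent", tool="send_email")  # verify the tool is in scope
    sdk.cost.check_or_raise("my-agent", session_id)  # budget gate for this session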
The problem

Most tools tell you what your agent did. After the damage is done.

LangSmith, Langfuse, and Helicone are observability platforms — they log what happened, after the fact. That does not stop a runaway agent from burning $600 overnight, calling a tool it should never touch, or modifying a patient record without approval. The Governance SDK is different: it gates decisions before the LLM call fires.

What you get

Enforcement modules. One substrate.

Core Enforcement

Decision Audit Trail

shipped

Every agent action is an HMAC-chained, append-only row in your Postgres. Tamper with any past row and every subsequent row's verification fails. Chain fork detection catches parallel branch insertion. Compliance can verify the chain themselves.
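
To illustrate why editing a past row breaks an HMAC chain, here is a minimal sketch using only the Python standard library. It is not the SDK's schema or algorithm: the row layout, the `GENESIS` anchor, and the helpers `append_row` and `verify_chain` are assumptions made for this example.

```python
import hashlib
import hmac
import json
from typing import Optional

GENESIS = b"\x00" * 32  # hypothetical anchor value for the first row


def _mac(key: bytes, prev_mac: bytes, payload: dict) -> bytes:
    # Bind each row to its predecessor: MAC over prev_mac + canonical payload.
    body = json.dumps(payload, sort_keys=True).encode()
    return hmac.new(key, prev_mac + body, hashlib.sha256).digest()


def append_row(chain: list, key: bytes, payload: dict) -> None:
    prev_mac = chain[-1]["mac"] if chain else GENESIS
    chain.append({"payload": payload, "mac": _mac(key, prev_mac, payload)})


def verify_chain(chain: list, key: bytes) -> Optional[int]:
    """Return the index of the first row that fails verification, or None."""
    prev_mac = GENESIS
    for i, row in enumerate(chain):
        expected = _mac(key, prev_mac, row["payload"])
        if not hmac.compare_digest(expected, row["mac"]):
            return i
        prev_mac = row["mac"]
    return None
```

Without the key, an attacker who edits a payload cannot recompute a matching MAC, so verification stops at the first altered row.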

Action Scope Enforcement

shipped

Whitelist which tools and APIs each agent can call. Exact-match or explicit prefix. No regex, no eval. Violations are blocked and audit-logged automatically. Hidden tool policies remove tools from the LLM's context entirely.
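
Exact-match and explicit-prefix matching can be sketched with plain string comparison. The `ToolAllowlist` class and `visible_tools` helper below are hypothetical names for illustration and are not part of the SDK's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolAllowlist:
    exact: frozenset = frozenset()
    prefixes: tuple = ()  # explicit prefixes such as "read_"

    def allows(self, tool: str) -> bool:
        # Plain string comparison only: no regex, no eval.
        return tool in self.exact or any(tool.startswith(p) for p in self.prefixes)


def visible_tools(tools: list, hidden: frozenset) -> list:
    # Hidden tools are dropped before the tool list reaches the model.
    return [t for t in tools if t not in hidden]
```

In this sketch a tool that merely starts with an exact-match name (for example `send_email_bulk`) is still denied, because only declared prefixes are treated as prefixes.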

Spend Limits & Budget Gates

shipped

Token and USD caps per session and per agent/day, plus session time limits. Built-in pricing for 24 models auto-computes USD. The gate denies the call before the LLM fires if the budget is exceeded. Fail-closed by default.
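
A fail-closed budget gate might look roughly like the sketch below. The `SessionBudget` class is hypothetical, and the price table uses placeholder numbers for a made-up model name, not real vendor pricing or the SDK's built-in table.

```python
from dataclasses import dataclass

# Placeholder prices in USD per 1,000 tokens; NOT real vendor pricing.
PRICES = {"example-model": {"input": 0.01, "output": 0.03}}


class BudgetExceeded(Exception):
    pass


@dataclass
class SessionBudget:
    cap_usd: float
    spent_usd: float = 0.0

    def check_or_raise(self) -> None:
        # Called before the LLM call; denies once the cap is reached.
        if self.spent_usd >= self.cap_usd:
            raise BudgetExceeded(f"spent {self.spent_usd:.4f} of {self.cap_usd:.2f} USD")

    def track(self, model: str, input_tokens: int, output_tokens: int) -> float:
        price = PRICES.get(model)
        if price is None:
            # Fail closed: a model with no known price cannot be budgeted.
            raise BudgetExceeded(f"no price for model {model!r}")
        cost = (input_tokens * price["input"] + output_tokens * price["output"]) / 1000
        self.spent_usd += cost
        return cost
```

The design choice worth noting is the unknown-model branch: refusing the call is what makes the gate fail-closed rather than silently untracked.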

Human-in-the-Loop Gates

shipped

High-risk actions require a signed, single-use approval token from a human. Tokens are HMAC-bound to the specific action, time-limited, and replay-proof. Self-approval prevention blocks the operator who owns the agent from approving their own agent's requests (fail-closed).
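
The properties listed above (action-bound, time-limited, single-use, no self-approval) can be sketched with the standard library. This is an illustration under stated assumptions, not the SDK's token format: `ApprovalGate`, the tuple layout, and the in-memory `_used` set are all hypothetical, and a real deployment would persist used nonces durably.

```python
import hashlib
import hmac
import secrets
import time


class ApprovalError(Exception):
    pass


class ApprovalGate:
    def __init__(self, key: bytes, ttl_seconds: int = 300):
        self._key = key
        self._ttl = ttl_seconds
        self._used = set()  # in-memory stand-in for a durable store

    def _sign(self, action: str, approver: str, expires: int, nonce: str) -> str:
        msg = f"{action}|{approver}|{expires}|{nonce}".encode()
        return hmac.new(self._key, msg, hashlib.sha256).hexdigest()

    def issue(self, action: str, approver: str, agent_owner: str, now=None) -> tuple:
        if approver == agent_owner:
            raise ApprovalError("self-approval is not allowed")
        issued_at = int(now if now is not None else time.time())
        expires = issued_at + self._ttl
        nonce = secrets.token_hex(16)  # makes every token unique
        return (action, approver, expires, nonce,
                self._sign(action, approver, expires, nonce))

    def redeem(self, token: tuple, action: str, now=None) -> None:
        tok_action, approver, expires, nonce, sig = token
        expected = self._sign(tok_action, approver, expires, nonce)
        if not hmac.compare_digest(expected, sig):
            raise ApprovalError("bad signature")
        if tok_action != action:
            raise ApprovalError("token is bound to a different action")
        if (now if now is not None else time.time()) > expires:
            raise ApprovalError("token expired")
        if nonce in self._used:
            raise ApprovalError("token already used")
        self._used.add(nonce)  # single use: a replay hits the check above
```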

Extended Modules

Loop & Anomaly Detection

shipped

Detect when an agent calls the same tool repeatedly in a sliding window. Kill runaway loops automatically or log them for review. Configurable per-agent thresholds.
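
Sliding-window loop detection is commonly built on a queue of recent call timestamps. The `LoopDetector` class below is a hypothetical sketch of that idea, not the SDK's implementation; the threshold and window are whatever the caller configures.

```python
from collections import deque


class LoopDetector:
    def __init__(self, max_calls: int, window_seconds: float):
        self.max_calls = max_calls
        self.window = window_seconds
        self._calls = {}  # tool name -> deque of call timestamps

    def record(self, tool: str, now: float) -> bool:
        """Record a call; return True if the tool exceeded max_calls in the window."""
        stamps = self._calls.setdefault(tool, deque())
        stamps.append(now)
        # Drop timestamps that have slid out of the window.
        while stamps and now - stamps[0] > self.window:
            stamps.popleft()
        return len(stamps) > self.max_calls
```

A caller could treat a `True` result as a signal to halt the agent or simply to log the event for review.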

Agent Presence & Kill Switch

shipped

Real-time visibility into which agents are live, idle, or unresponsive. Operator-triggered kill switch (v0.5.4): clicking Halt in the console fail-closes the agent's next enforcement gate within 5 seconds. Heartbeat-based lifecycle with automatic stale detection and operator identity tracking. No background worker needed.
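
One way to get stale detection without a background worker is to derive status from heartbeat age at read time and to check a halt flag at each gate. The function names and thresholds below are assumptions for illustration only; the SDK's own definitions of live, idle, and unresponsive may differ.

```python
class AgentHalted(Exception):
    pass


def presence_status(last_heartbeat: float, now: float,
                    idle_after: float = 30.0, stale_after: float = 120.0) -> str:
    # Status is computed from heartbeat age when someone asks, so nothing
    # has to run in the background to mark an agent stale.
    age = now - last_heartbeat
    if age >= stale_after:
        return "unresponsive"
    if age >= idle_after:
        return "idle"
    return "live"


def enforce_not_halted(agent_id: str, halted_agents: set) -> None:
    # Checked at each gate: a halted agent is denied on its next action.
    if agent_id in halted_agents:
        raise AgentHalted(agent_id)
```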

Behavioral Contracts

shipped

Declarative pre/post conditions that wrap any tool call. Enforce budget, scope, and approval requirements in a single context manager.
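
The general shape of a pre/post-condition context manager can be sketched with `contextlib`. The `contract` function and `ContractViolation` exception here are hypothetical and only illustrate the pattern; they are not the SDK's contract API.

```python
from contextlib import contextmanager


class ContractViolation(Exception):
    pass


@contextmanager
def contract(pre=(), post=()):
    # Preconditions run before the wrapped block; a failure blocks it entirely.
    for check in pre:
        if not check():
            raise ContractViolation(f"precondition failed: {check.__name__}")
    yield
    # Postconditions run only if the block finished without raising.
    for check in post:
        if not check():
            raise ContractViolation(f"postcondition failed: {check.__name__}")
```

Usage would look like `with contract(pre=[under_budget], post=[tool_ran]): ...`, where each check is a zero-argument callable returning a boolean.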

EU AI Act Compliance

shipped

Automated Article 12 evidence reports from your audit data. Seven-section mapping with compliant/partial/non-compliant status per requirement.

Anthropic Adapter

shipped

Native wrap_anthropic() integration. Auto token tracking, USD cost estimation, and budget enforcement for Claude API calls.

Performance

Fast enough to gate every call.

Shared connection pool

Single SQLAlchemy engine with ~15 connections per SDK instance. No per-request connection overhead.

Concurrent audit writes

Pre-call audit events are backgrounded. Post-call audit, cost tracking, and presence updates run in parallel.
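
The pattern described here maps naturally onto `asyncio.create_task` and `asyncio.gather`. The sketch below uses a stand-in `write` coroutine in place of real database writes, and `governed_call` is a hypothetical name, so treat it as an illustration of the concurrency shape rather than the SDK's internals.

```python
import asyncio


async def write(log: list, name: str, delay: float = 0.01) -> None:
    # Stand-in for a database write.
    await asyncio.sleep(delay)
    log.append(name)


async def governed_call(log: list) -> str:
    # The pre-call audit write starts in the background so it does not delay the call.
    pre = asyncio.create_task(write(log, "audit.pre"))
    result = "llm-response"  # stand-in for the actual LLM call
    # Post-call bookkeeping runs concurrently rather than one after another.
    await asyncio.gather(
        write(log, "audit.post"),
        write(log, "cost.track"),
        write(log, "presence.update"),
    )
    await pre  # make sure the background write finished before returning
    return result
```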

Combined budget query

Session + daily counters checked in one DB round-trip. No separate queries per cap type.
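
Checking two counters in one round-trip is typically done with conditional aggregation. The sketch below uses an in-memory SQLite database as a stand-in for Postgres; the `usage_log` table, its columns, and the sample rows are hypothetical and do not reflect the SDK's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE usage_log (agent_id TEXT, session_id TEXT, day TEXT, usd REAL)"
)
conn.executemany(
    "INSERT INTO usage_log VALUES (?, ?, ?, ?)",
    [
        ("billing-agent", "s1", "2025-01-01", 0.25),
        ("billing-agent", "s2", "2025-01-01", 0.50),
        ("billing-agent", "s1", "2024-12-31", 1.00),
    ],
)


def spend(agent_id: str, session_id: str, day: str) -> tuple:
    # A single query returns both counters: session total and daily total.
    return conn.execute(
        """
        SELECT
            COALESCE(SUM(CASE WHEN session_id = ? THEN usd END), 0),
            COALESCE(SUM(CASE WHEN day = ? THEN usd END), 0)
        FROM usage_log
        WHERE agent_id = ?
        """,
        (session_id, day, agent_id),
    ).fetchone()
```

The same `SUM(CASE WHEN ...)` form is valid in Postgres, so the idea carries over even though the driver and schema would differ.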

300+ tests passing. MIT licensed.

Just Postgres. Nothing else.

The only infrastructure dependency is a Postgres connection string. No ClickHouse, no Redis, no Kafka, no S3, no sidecar, no background worker. We write to the database your application already has. Even the optional console GUI reads from the same Postgres — zero new infrastructure, ever.

How we compare

Enforcement, not just tracing.

Features compared across Code Atelier, LangSmith, Langfuse, Helicone, and MS AGT:
In-process enforcement (blocks before LLM call)
HMAC-chained tamper-evident audit
Just Postgres (no ClickHouse/Redis/Kafka)
Cost / budget caps with fail-closed gate
Pre-execution approval gates (block before action)
Tool / API scope enforcement
Self-hosted (open source)
Read-only governance console GUI
OpenAI / LangChain / Anthropic adapters
Framework-agnostic (decorator API)
Behavioral contracts (pre/post conditions)
EU AI Act Article 12 evidence reports
Sync wrapper (Flask / Django)
Self-approval prevention (fail-closed)

Get started in five minutes

Install the SDK, apply the schema, log your first event.