Your agents do the work.
We make sure they follow the rules.
Five lines to a pre-execution enforcement gate that writes to your existing Postgres. The SDK gates every action routed through it: budget caps, scope checks, loop detection, human-in-the-loop approvals, agent presence monitoring, and a tamper-evident audit trail that compliance can verify themselves. See the threat model for what the SDK does and does not protect against.
Current version — v0.6.0
v0.6.0 ships Ed25519 per-row audit signing, HMAC chain key rotation, and a signed Article 12 evidence bundle export your compliance officer can hand to a regulator. Upgrade: `pip install --upgrade "code-atelier-governance[migrations]>=0.6.0"`, then run `governance migrate` (applies base DDL + alembic upgrade head in one shot). See the changelog for the full upgrade runbook, including the kill → halt audit-event-kind rename that affects downstream monitoring queries.
MIT licensed. Free and open source.
import os
from codeatelier_governance import GovernanceSDK, AuditEvent

# Run this inside an async function; `async with` is not valid at module top level.
async with GovernanceSDK(database_url=os.environ["DATABASE_URL"]) as sdk:
    await sdk.audit.log(AuditEvent(agent_id="my-agent", kind="hello"))

Full example with scope & budget enforcement
from codeatelier_governance import ScopePolicy, BudgetPolicy

sdk.scope.register(ScopePolicy(
    agent_id="billing-agent",
    allowed_tools=frozenset({"read_invoice", "send_email"}),
    hidden_tools=frozenset({"delete_all"}),  # hidden from LLM context
))
sdk.cost.register(BudgetPolicy(
    agent_id="billing-agent",
    per_session_usd=0.50,
    per_session_seconds=300,  # 5-minute session time limit
    per_agent_usd_daily=10.00,
))
# Auto-compute cost from the model name; no manual usd= needed.
# session_id is supplied by the caller and identifies the current agent session.
await sdk.cost.track_usage("billing-agent", session_id,
                           model="gpt-4o", input_tokens=1000, output_tokens=500)

Sync wrapper for Flask / Django
import os
from codeatelier_governance import GovernanceSDKSync, AuditEvent

with GovernanceSDKSync(database_url=os.environ["DATABASE_URL"]) as sdk:
    sdk.audit.log(AuditEvent(agent_id="my-agent", kind="tool.call"))
    sdk.scope.check("my-agent", tool="send_email")
    sdk.cost.check_or_raise("my-agent", session_id)

Most tools tell you what your agent did. After the damage is done.
LangSmith, Langfuse, and Helicone are observability platforms — they log what happened, after the fact. That does not stop a runaway agent from burning $600 overnight, calling a tool it should never touch, or modifying a patient record without approval. The Governance SDK is different: it gates decisions before the LLM call fires.
Enforcement modules. One substrate.
Core Enforcement
Decision Audit Trail
Shipped. Every agent action is an HMAC-chained, append-only row in your Postgres. Tamper with any past row and every subsequent row's verification fails. Chain fork detection catches parallel branch insertion. Compliance can verify the chain themselves.
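To make the chaining idea concrete, here is a minimal standard-library sketch of an HMAC-chained log. It is an illustration only, not the SDK's implementation: the row layout, the all-zero genesis value, and the helper names `row_mac`, `append`, and `first_invalid_row` are assumptions made for this example.

```python
import hashlib
import hmac
import json

GENESIS = b"\x00" * 32  # assumed starting value for the chain


def row_mac(key: bytes, prev_mac: bytes, payload: dict) -> bytes:
    """MAC over the previous row's MAC plus this row's canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, prev_mac + body, hashlib.sha256).digest()


def append(chain: list, key: bytes, payload: dict) -> None:
    """Append a row whose MAC is chained to the last row in the log."""
    prev = chain[-1]["mac"] if chain else GENESIS
    chain.append({"payload": payload, "mac": row_mac(key, prev, payload)})


def first_invalid_row(chain: list, key: bytes):
    """Return the index of the first row that fails verification, or None."""
    prev = GENESIS
    for i, row in enumerate(chain):
        expected = row_mac(key, prev, row["payload"])
        if not hmac.compare_digest(expected, row["mac"]):
            return i
        prev = row["mac"]
    return None


key = b"demo-key-not-for-production"
log = []
for kind in ("tool.call", "tool.result", "approval.granted"):
    append(log, key, {"agent_id": "my-agent", "kind": kind})

print(first_invalid_row(log, key))        # None: the chain verifies
log[1]["payload"]["kind"] = "tampered"    # edit a past row in place
print(first_invalid_row(log, key))        # 1: verification breaks at the edited row
```

Because each MAC covers the previous one, an attacker without the key cannot repair the chain after editing or deleting a row; verification stops at the first break.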
Action Scope Enforcement
Shipped. Whitelist which tools and APIs each agent can call. Exact-match or explicit prefix. No regex, no eval. Violations are blocked and audit-logged automatically. Hidden tool policies remove tools from the LLM's context entirely.
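The matching rule described above (exact names or explicit prefixes, nothing pattern-based) can be sketched in a few lines. `ToolAllowlist` and its fields are hypothetical names for this illustration; the SDK's own `ScopePolicy` may be shaped differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolAllowlist:
    """Hypothetical policy shape: exact tool names plus explicit prefixes."""

    exact: frozenset = frozenset()
    prefixes: tuple = ()

    def allows(self, tool: str) -> bool:
        # Plain string comparison only: no regex, no eval.
        return tool in self.exact or any(tool.startswith(p) for p in self.prefixes)


policy = ToolAllowlist(
    exact=frozenset({"read_invoice", "send_email"}),
    prefixes=("crm.read.",),
)

print(policy.allows("read_invoice"))       # True: exact match
print(policy.allows("crm.read.contact"))   # True: explicit prefix
print(policy.allows("read_invoice_v2"))    # False: near-misses are not matched
print(policy.allows("delete_all"))         # False: not listed
```

Keeping the check to set membership and `str.startswith` makes the policy easy to audit: what is allowed is exactly what is written down.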
Spend Limits & Budget Gates
Shipped. Token and USD caps per session and per agent/day, plus session time limits. Built-in pricing for 24 models auto-computes USD. The gate denies the call before the LLM fires if the budget is exceeded. Fail-closed by default.
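A rough sketch of what a fail-closed budget gate might look like follows. The price table, the model name `example-model`, and the functions `usage_cost` and `enforce_budget` are assumptions for illustration; they are not the SDK's pricing data or API.

```python
from decimal import Decimal

# Hypothetical price table: USD per one million (input, output) tokens.
PRICES = {"example-model": (Decimal("2.50"), Decimal("10.00"))}


class BudgetExceeded(Exception):
    """Raised before the model call when the gate denies it."""


def usage_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Convert token counts to USD using the price table."""
    in_price, out_price = PRICES[model]
    return (in_price * input_tokens + out_price * output_tokens) / Decimal(1_000_000)


def enforce_budget(spent_usd, cap_usd: Decimal) -> None:
    """Deny the call when spend has reached the cap or cannot be determined."""
    # Fail closed: an unknown spend (e.g. a failed counter lookup) is a denial.
    if spent_usd is None:
        raise BudgetExceeded("spend unknown; denying call")
    if spent_usd >= cap_usd:
        raise BudgetExceeded(f"spent {spent_usd} of {cap_usd} USD")


spent = usage_cost("example-model", input_tokens=1000, output_tokens=500)
print(spent)                                   # 0.0075
enforce_budget(spent, cap_usd=Decimal("0.50"))  # under the cap: no exception
```

`Decimal` is used here so that repeated small charges do not accumulate binary floating-point error; whether the SDK does the same is not stated in the source.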
Human-in-the-Loop Gates
Shipped. High-risk actions require a signed, single-use approval token from a human. Tokens are HMAC-bound to the specific action, time-limited, and replay-proof. Self-approval prevention blocks the operator who owns the agent from approving their own agent's requests (fail-closed).
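The four token properties listed above (action-bound, time-limited, single-use, no self-approval) can be illustrated with a small standard-library sketch. The token format, `issue_token`, and `ApprovalVerifier` are hypothetical; a real deployment would also persist used tokens in the database rather than in memory, as this example does.

```python
import hashlib
import hmac
import time


def issue_token(key: bytes, action: str, approver: str, expires_at: int) -> str:
    """Sign (action, approver, expiry); the action itself is not stored in the token."""
    msg = f"{action}|{approver}|{expires_at}".encode()
    sig = hmac.new(key, msg, hashlib.sha256).hexdigest()
    return f"{approver}|{expires_at}|{sig}"


class ApprovalVerifier:
    def __init__(self, key: bytes):
        self._key = key
        self._used = set()  # in-memory replay cache (assumption for the sketch)

    def verify(self, token: str, action: str, agent_owner: str, now=None) -> bool:
        now = time.time() if now is None else now
        try:
            approver, expires_raw, sig = token.split("|")
            expires_at = int(expires_raw)
        except ValueError:
            return False  # malformed token: fail closed
        msg = f"{action}|{approver}|{expires_at}".encode()
        expected = hmac.new(self._key, msg, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected.encode(), sig.encode()):
            return False  # forged, or issued for a different action
        if now > expires_at:
            return False  # expired
        if approver == agent_owner:
            return False  # the agent's owner may not approve their own agent
        if sig in self._used:
            return False  # replay of an already-consumed token
        self._used.add(sig)
        return True


key = b"demo-key-not-for-production"
token = issue_token(key, "refund:order-42", approver="alice", expires_at=1000)
verifier = ApprovalVerifier(key)
print(verifier.verify(token, "refund:order-42", agent_owner="bob", now=900))  # True
print(verifier.verify(token, "refund:order-42", agent_owner="bob", now=900))  # False: replay
```

Every failure path returns `False`, so an unparseable or unexpected token is treated the same as a denied one.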
Extended Modules
Loop & Anomaly Detection
Shipped. Detect when an agent calls the same tool repeatedly in a sliding window. Kill runaway loops automatically or log them for review. Configurable per-agent thresholds.
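As an illustration of sliding-window detection, the sketch below counts calls per (agent, tool) pair and flags the call that exceeds a threshold. `LoopDetector` and its parameters are hypothetical names; the SDK's actual thresholds and storage are not described in the source.

```python
from collections import deque


class LoopDetector:
    """Flags an agent that calls the same tool too often within a time window."""

    def __init__(self, max_calls: int, window_seconds: float):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self._calls = {}  # (agent_id, tool) -> deque of call timestamps

    def record(self, agent_id: str, tool: str, now: float) -> bool:
        """Record one call; return True if the window now holds more than max_calls."""
        timestamps = self._calls.setdefault((agent_id, tool), deque())
        timestamps.append(now)
        # Drop timestamps that have slid out of the window.
        while timestamps and timestamps[0] <= now - self.window_seconds:
            timestamps.popleft()
        return len(timestamps) > self.max_calls


detector = LoopDetector(max_calls=3, window_seconds=10.0)
flags = [detector.record("my-agent", "send_email", now=t) for t in (0.0, 1.0, 2.0, 3.0)]
print(flags)  # [False, False, False, True]: the fourth call within 10 s trips the detector
```

The caller decides what a `True` result means: halting the agent, or only writing an audit row for later review.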
Agent Presence & Kill Switch
Shipped. Real-time visibility into which agents are live, idle, or unresponsive. Operator-triggered kill switch (v0.5.4): clicking Halt in the console fail-closes the agent's next enforcement gate within 5 seconds. Heartbeat-based lifecycle with automatic stale detection and operator identity tracking. No background worker needed.
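One way stale detection can work without a background worker is to derive status from the last heartbeat timestamp at read time. The sketch below assumes that approach; the function name, the age-based definition of "idle", and both thresholds are illustrative assumptions rather than the SDK's documented behavior.

```python
import time


def presence_status(last_heartbeat: float, now=None,
                    idle_after: float = 30.0, stale_after: float = 120.0) -> str:
    """Classify an agent from the age of its last heartbeat, computed on read."""
    now = time.time() if now is None else now
    age = now - last_heartbeat
    if age >= stale_after:
        return "unresponsive"
    if age >= idle_after:
        return "idle"
    return "live"


print(presence_status(last_heartbeat=100.0, now=110.0))  # live
print(presence_status(last_heartbeat=100.0, now=140.0))  # idle
print(presence_status(last_heartbeat=100.0, now=300.0))  # unresponsive
```

Since nothing has to sweep the table on a schedule, the only write is the heartbeat itself; the classification costs one subtraction per read.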
Behavioral Contracts
Shipped. Declarative pre/post conditions that wrap any tool call. Enforce budget, scope, and approval requirements in a single context manager.
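The pre/post-condition pattern can be sketched with `contextlib.contextmanager`. The `contract` helper and `ContractViolation` below are hypothetical names for this illustration and do not reflect the SDK's own contract API.

```python
from contextlib import contextmanager


class ContractViolation(Exception):
    """Raised when a pre- or postcondition does not hold."""


@contextmanager
def contract(pre=(), post=()):
    """Check named preconditions before the body and postconditions after it."""
    for name, check in pre:
        if not check():
            raise ContractViolation(f"precondition failed: {name}")
    result = {}  # the body records its outcome here for the postconditions
    yield result
    for name, check in post:
        if not check(result):
            raise ContractViolation(f"postcondition failed: {name}")


budget_left = 0.40
with contract(
    pre=[("budget remaining", lambda: budget_left > 0)],
    post=[("recipient is internal", lambda r: r["to"].endswith("@example.com"))],
) as outcome:
    outcome["to"] = "ops@example.com"  # stand-in for the real tool call

print(outcome)  # {'to': 'ops@example.com'}
```

A failed precondition raises before the body runs, so the tool call never happens; a failed postcondition raises after it, which is the point at which a caller could roll back or escalate.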
EU AI Act Compliance
Shipped. Automated Article 12 evidence reports from your audit data. Seven-section mapping with compliant/partial/non-compliant status per requirement.
Anthropic Adapter
Shipped. Native wrap_anthropic() integration. Auto token tracking, USD cost estimation, and budget enforcement for Claude API calls.
Fast enough to gate every call.
Shared connection pool
Single SQLAlchemy engine with ~15 connections per SDK instance. No per-request connection overhead.
Concurrent audit writes
Pre-call audit events are backgrounded. Post-call audit, cost tracking, and presence updates run in parallel.
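The scheduling pattern described above can be sketched with `asyncio`: start the pre-call write as a background task, then run the post-call writes together. The `write` coroutine is a stand-in for a database write and `governed_call` is a hypothetical wrapper, not the SDK's code.

```python
import asyncio


async def write(log: list, label: str, delay: float = 0.01) -> None:
    """Stand-in for one database write."""
    await asyncio.sleep(delay)
    log.append(label)


async def governed_call(log: list) -> str:
    # Pre-call audit write runs in the background while the call proceeds.
    pre = asyncio.create_task(write(log, "audit.pre"))

    result = "llm-response"  # stand-in for the actual LLM call

    # Post-call writes run concurrently instead of one after another.
    await asyncio.gather(
        write(log, "audit.post"),
        write(log, "cost.track"),
        write(log, "presence.touch"),
    )
    await pre  # make sure the backgrounded write finished before returning
    return result


events = []
response = asyncio.run(governed_call(events))
print(response, len(events))  # llm-response 4
```

Awaiting the background task before returning keeps the sketch from dropping the pre-call row if the event loop shuts down early.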
Combined budget query
Session + daily counters checked in one DB round-trip. No separate queries per cap type.
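A single round-trip for both counters can be done with conditional aggregation. The sketch below uses the standard-library `sqlite3` module only so it runs without a server; the `spend` table and its columns are a hypothetical schema, and the same `SUM(CASE ...)` form is valid in Postgres.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE spend (agent_id TEXT, session_id TEXT, day TEXT, usd REAL)")
conn.executemany(
    "INSERT INTO spend VALUES (?, ?, ?, ?)",
    [
        ("billing-agent", "s1", "2025-01-01", 0.25),
        ("billing-agent", "s1", "2025-01-01", 0.125),
        ("billing-agent", "s2", "2025-01-01", 0.5),
        ("other-agent", "s9", "2025-01-01", 4.0),
    ],
)


def spend_counters(conn, agent_id: str, session_id: str, day: str):
    """Return (session_usd, daily_usd) for an agent from one query."""
    return conn.execute(
        """
        SELECT
            COALESCE(SUM(CASE WHEN session_id = ? THEN usd END), 0.0),
            COALESCE(SUM(usd), 0.0)
        FROM spend
        WHERE agent_id = ? AND day = ?
        """,
        (session_id, agent_id, day),
    ).fetchone()


session_usd, daily_usd = spend_counters(conn, "billing-agent", "s1", "2025-01-01")
print(session_usd, daily_usd)  # 0.375 0.875
```

Both caps can then be compared against the returned pair in application code, with no second query per cap type.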
300+ tests passing. MIT licensed.
Just Postgres. Nothing else.
The only infrastructure dependency is a Postgres connection string. No ClickHouse, no Redis, no Kafka, no S3, no sidecar, no background worker. We write to the database your application already has. Even the optional console GUI reads from the same Postgres — zero new infrastructure, ever.
Enforcement, not just tracing.
| Feature | Code Atelier | LangSmith | Langfuse | Helicone | MS AGT |
|---|---|---|---|---|---|
| In-process enforcement (blocks before LLM call) | ✓ | ✕ | ✕ | ✕ | ✓ |
| HMAC-chained tamper-evident audit | ✓ | ✕ | ✕ | ✕ | ✕ |
| Just Postgres (no ClickHouse/Redis/Kafka) | ✓ | ✕ | ✕ | ✕ | ✕ |
| Cost / budget caps with fail-closed gate | ✓ | ✕ | ✕ | ✕ | ✕ |
| Pre-execution approval gates (block before action) | ✓ | ✕ | ✕ | ✕ | ✕ |
| Tool / API scope enforcement | ✓ | ✕ | ✕ | ✕ | ✓ |
| Self-hosted (open source) | ✓ | ✕ | ✓ | ✓ | ✓ |
| Read-only governance console GUI | ✓ | ✓ | ✓ | ✓ | ✕ |
| OpenAI / LangChain / Anthropic adapters | ✓ | ✓ | ✓ | ✕ | ✓ |
| Framework-agnostic (decorator API) | ✓ | ✕ | ✓ | ✕ | ✓ |
| Behavioral contracts (pre/post conditions) | ✓ | ✕ | ✕ | ✕ | ✕ |
| EU AI Act Article 12 evidence reports | ✓ | ✕ | ✕ | ✕ | ✕ |
| Sync wrapper (Flask / Django) | ✓ | ✕ | ✕ | ✕ | ✕ |
| Self-approval prevention (fail-closed) | ✓ | ✕ | ✕ | ✕ | ✕ |
Get started in five minutes
Install the SDK, apply the schema, log your first event.