API Reference

Every public method on the SDK. All async-first, all type-hinted.

SDK Entry Point

GovernanceSDK(database_url=..., audit_secret=..., cost_fail_open=False, hot_reload=False, hot_reload_interval=30, enable_audit=True, enable_scope=True, enable_cost=True, enable_gates=True, enable_loop=True, enable_presence=True, enable_prompts=True, enable_routing=False, enable_coverage=False, verify_chain_on_read=False, warn_on_no_wrappers=True, default_max_tokens=None, cost_strict_unknown_models=True, cost_unknown_model_fallback_usd_per_million=None)

Main entry point. Connects to Postgres and initializes the enabled enforcement modules. Use it as an async context manager for automatic start/close. Every module is controlled by an enable_* flag: most default to True, while enable_routing and enable_coverage default to False. When a flag is False, the module is not constructed and accessing sdk.<module> raises AttributeError.

verify_chain_on_read (v0.5.3) verifies the HMAC chain on every get_events() call. warn_on_no_wrappers (v0.5.3) emits a structlog warning at start() when no wrap_openai / wrap_anthropic has been registered. default_max_tokens (v0.5.3) sets the ceiling used by the projected budget gate when callers do not declare max_tokens.

As of v0.6.2, halt enforcement spans scope, cost, gates, and the wrap_openai / wrap_anthropic wrappers; calling any of those against a halted agent raises AgentHaltedError. cost_strict_unknown_models (v0.6.2, default True) makes CostModule raise UnknownModelError on pricing lookups for unknown model IDs instead of silently zero-costing them. Set it to False (plus a fallback rate) to restore the v0.6.1 observe-only behaviour.

Returns: GovernanceSDK instance with the modules constructed for the enabled flags

Raises: ValueError if neither database_url nor api_key is provided, if the audit secret is too short / too weak, or if default_max_tokens is < 1

Note: v0.6.2 adds cost_strict_unknown_models, cost_unknown_model_fallback_usd_per_million, and enable_coverage. Halt enforcement now covers scope + cost + gates + wrappers (was scope-only through v0.5.4/v0.6.1). See the /governance/upgrade-to-0.6.2 guide for the four BREAKING-block defaults.

Examplepython
# Full enforcement (all modules, wrapper-coverage warning, projected budget ceiling)
async with GovernanceSDK(
    database_url=os.environ["DATABASE_URL"],
    default_max_tokens=4096,
    verify_chain_on_read=True,
) as sdk:
    await sdk.audit.log(AuditEvent(agent_id="x", kind="y"))

# Lightweight: loop detection and presence disabled; other modules keep their defaults
async with GovernanceSDK(
    database_url=os.environ["DATABASE_URL"],
    enable_loop=False,
    enable_presence=False,
    warn_on_no_wrappers=False,
) as sdk:
    ...  # sdk.loop / sdk.presence raise AttributeError on access

GovernanceSDKSync (Sync Wrapper, v0.5.0)

GovernanceSDKSync(database_url=..., audit_secret=..., **kwargs)

Synchronous wrapper around GovernanceSDK. Spins up a daemon thread with a dedicated asyncio event loop. All async module methods (audit.log, cost.track, scope.check, etc.) are exposed as blocking sync calls. Supports context-manager usage. The background thread is a daemon thread so it does not prevent interpreter shutdown.

Returns: GovernanceSDKSync instance with .audit, .scope, .cost, .gates, .loop, .presence, .contracts modules (all sync)

Raises: RuntimeError if the background event loop stops unexpectedly

Examplepython
import os
from uuid import uuid4

from codeatelier_governance import GovernanceSDKSync, AuditEvent

session_id = uuid4()  # one ID per agent session

with GovernanceSDKSync(database_url=os.environ["DATABASE_URL"]) as sdk:
    sdk.audit.log(AuditEvent(agent_id="my-agent", kind="tool.call"))
    sdk.scope.check("my-agent", tool="send_email")
    sdk.cost.check_or_raise("my-agent", session_id)

GovernanceSDKSync.start() / .close()

Explicit lifecycle management. start() initializes the underlying async SDK (audit flusher, hot-reload, etc.). close() drains in-flight events and stops the background loop. Both are called automatically when using the context manager.
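
The daemon-thread pattern described for the sync wrapper can be sketched with the standard library alone. This is a minimal illustration under assumptions, not the SDK's implementation: BlockingLoopRunner and fake_log are hypothetical names that stand in for the wrapper and for an async module method.

```python
import asyncio
import threading


class BlockingLoopRunner:
    """Hypothetical helper: runs coroutines on a private loop in a daemon thread."""

    def __init__(self) -> None:
        self._loop = asyncio.new_event_loop()
        # daemon=True so the thread never blocks interpreter shutdown
        self._thread = threading.Thread(target=self._loop.run_forever, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def call(self, coro):
        """Submit a coroutine to the background loop and block for its result."""
        if not self._thread.is_alive():
            raise RuntimeError("background event loop is not running")
        return asyncio.run_coroutine_threadsafe(coro, self._loop).result()

    def close(self) -> None:
        # Ask the loop to stop from its own thread, then wait for the thread
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join()
        self._loop.close()

    def __enter__(self) -> "BlockingLoopRunner":
        self.start()
        return self

    def __exit__(self, *exc_info) -> None:
        self.close()


async def fake_log(kind: str) -> str:
    # Stand-in for an async SDK method such as audit.log
    await asyncio.sleep(0)
    return f"logged:{kind}"


if __name__ == "__main__":
    with BlockingLoopRunner() as runner:
        print(runner.call(fake_log("tool.call")))
```

The blocking call goes through asyncio.run_coroutine_threadsafe, which is the standard way to hand a coroutine to a loop owned by another thread.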

Top-level Exports (v0.5.0)

These classes are now exported from the top-level codeatelier_governance package for convenience:

AuditEvent — previously only from codeatelier_governance.audit

ScopePolicy — previously only from codeatelier_governance.scope

BudgetPolicy — previously only from codeatelier_governance.cost

LoopPolicy, LoopDetected — from codeatelier_governance.loop

Contract, PreCondition, PostCondition, ContractViolation — from contracts

AgentStatus, GovernanceConfig, GovernanceSDKSync

AgentHaltedError (v0.6.0, primary) — raised by sdk.presence.assert_alive(), sdk.scope.check(), sdk.cost.check_or_raise(), sdk.gates.request(), and both LLM wrappers (wrap_openai / wrap_anthropic) when an operator has halted the agent via the console. Import via from codeatelier_governance.presence import AgentHaltedError. The AgentKilledError alias is still importable for v0.6.x and is removed in v0.7 — update your except clauses now.

UnknownModelError (v0.6.2) — raised by sdk.cost.track_usage() when pricing lookup misses and cost_strict_unknown_models=True (the new default). Carries the offending model name; fix by adding a pricing entry or passing cost_strict_unknown_models=False with cost_unknown_model_fallback_usd_per_million.

TokenVersionTooOldError (v0.6.2) — raised by sdk.gates.grant() / deny() after accept_v1_until expires when a v1-format token is presented. Rotate reviewers to the v2 token format (opt in via GatesModule(enable_v2_tokens=True)) before the cutoff, or extend the window via accept_v1_until.

Error Hierarchy (v0.5.0)

GovernanceError — base class for all SDK exceptions. Every subclass carries a recovery_hint field with an actionable next step.

ScopeViolation — tool or API not in the whitelist. Error message now includes the allowed tools list for debugging.

BudgetExceeded — session or daily cap hit. Includes recovery_hint with the specific cap that was exceeded.

LoopDetected — agent exceeded max calls in sliding window.

ContractViolation — pre or post condition failed.

ChainIntegrityError — HMAC verification or fork detection failed.

PolicyNotRegistered — scope check on an agent with no policy.

UnknownModelError (v0.6.2) - cost.track_usage() lookup missed and cost_strict_unknown_models=True (new v0.6.2 default). Recovery hint lists: (1) add the model to cost/pricing.py::MODEL_PRICING, (2) set cost_strict_unknown_models=False plus cost_unknown_model_fallback_usd_per_million, or (3) pass an explicit usd= via cost.track(). Docs: upgrade-to-0.6.2.

TokenVersionTooOldError (v0.6.2) — v1 gate token presented past accept_v1_until (default ~2026-07-17). Recovery hint: mint a fresh token with GatesModule(enable_v2_tokens=True) or extend the window via GatesModule(accept_v1_until=...) / GOVERNANCE_GATES_ACCEPT_V1_UNTIL. Pre-cutoff v1 parses emit DeprecationWarning and gates.legacy_token_parsed INFO events so you see the deprecation before it turns into a raise.

AgentHaltedError (v0.6.0, RuntimeError subclass — NOT under GovernanceError) — raised when an operator has halted the agent via the console. As of v0.6.2 this fires across scope.check, cost.check_or_raise, gates.request, wrap_openai, and wrap_anthropic (was scope-only through v0.5.4/v0.6.1). Important: this is a RuntimeError, not a GovernanceError subclass — a bare except GovernanceError will NOT catch it. The v0.5.x AgentKilledError name is preserved as a back-compat alias in v0.6.x and removed in v0.7. Operator-facing gates.grant() / gates.deny() on tokens minted BEFORE the halt still resolve (the reviewer, not the agent, is the principal on resolution). Recovery hint tells you which enforcement path fired and suggests restoring the agent or short-circuiting the loop.
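
Because AgentHaltedError sits outside the GovernanceError tree, handlers need a separate except clause for it. The sketch below uses local stand-in classes with the same inheritance shape as documented above (they are not imported from the SDK) to show which clause catches which error.

```python
class GovernanceError(Exception):
    """Stand-in for the SDK base class."""


class ScopeViolation(GovernanceError):
    """Stand-in for a GovernanceError subclass."""


class AgentHaltedError(RuntimeError):
    """Stand-in: a RuntimeError, deliberately outside the GovernanceError tree."""


def classify(exc: Exception) -> str:
    """Return the name of the except clause that catches `exc`."""
    try:
        raise exc
    except GovernanceError:
        return "governance"
    except AgentHaltedError:
        # Needs its own clause: `except GovernanceError` does not match it
        return "halted"


if __name__ == "__main__":
    print(classify(ScopeViolation("tool not allowed")))
    print(classify(AgentHaltedError("halted by operator")))
```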

Audit Module

AuditEvent.model: str | None (v0.2.1)

Optional model name (e.g. "gpt-4o-mini"). First-class field included in the HMAC computation, so changing it after the fact breaks the chain. Max 128 characters.

AuditEventRecord.is_placeholder: bool (property, v0.2.1)

Returns True when the record was synthesized because both primary and fallback storage were unreachable. A placeholder has hmac == "0" * 64 and metadata["audit.unavailable"] == True. Operators should alert on these.

await sdk.audit.log(event: AuditEvent)

Log an audit event. Computes the HMAC chain atomically with the insert (advisory lock per session). Non-breaking: never raises on internal failure — returns a placeholder record instead.

Returns: AuditEventRecord (check record.is_placeholder to detect degraded state)

Note: Observation surface — never breaks the host call.

Examplepython
from codeatelier_governance.audit import AuditEvent

record = await sdk.audit.log(AuditEvent(
    agent_id="billing-agent",
    kind="llm.call",
    model="gpt-4o-mini",
    metadata={"prompt_tokens": 150},
))

@sdk.audit.track(kind="...", agent_id="...")

Decorator that logs entry/exit events for an async function. Computes input_hash from args and output_hash from the return value. On exception, logs a .error event and re-raises.

Returns: The wrapped function's return value (unchanged)

Note: Async functions only. Observation surface.

Examplepython
@sdk.audit.track(kind="charge_card", agent_id="billing-agent")
async def charge(amount: int) -> str:
    return await stripe.charge(amount)

await sdk.audit.trace_session_chain(session_id: UUID)

Walk the HMAC chain for an entire session. Re-verifies every row's HMAC against the secret. The compliance officer's one-click verification.

Returns: list[AuditEventRecord] in chain order

Raises: ChainIntegrityError at the first tampered row

Examplepython
chain = await sdk.audit.trace_session_chain(session_id)
for event in chain:
    print(f"{event.kind} — {event.agent_id} — hmac verified")

await sdk.audit.verify_chain(session_id: UUID | None = None, from_seq: int | None = None, to_seq: int | None = None)

On-demand HMAC chain verification. Stable public API for compliance attestation and post-incident review. Supports partial-range verification via from_seq and to_seq — useful for verifying only the window relevant to a specific incident without paying the O(n) cost on the whole chain. Two-pass verification: recomputes each row's HMAC and checks prev_hash linkage (fork detection). Never returns False — clean chain returns True, any failure raises.

Returns: True on a clean chain

Raises: ChainIntegrityError with the offending sequence number on the first broken HMAC or prev_hash linkage

Note: Added in v0.5.3. Set verify_chain_on_read=True on GovernanceSDK to verify the chain automatically on every get_events() call.

Examplepython
# Full-chain verification for a compliance attestation run
await sdk.audit.verify_chain()

# Window verification — only events [100..200] in one session
await sdk.audit.verify_chain(session_id=sid, from_seq=100, to_seq=200)

# Automatic on-read verification
sdk = GovernanceSDK(database_url=..., verify_chain_on_read=True)
events = await sdk.audit.get_events(session_id=sid)  # raises on tamper
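
The two-pass idea behind verify_chain (recompute each row's HMAC, then check prev_hash linkage) can be sketched with hashlib and hmac. This is an illustration under assumptions: the Row type, the "seq|payload|prev_hash" message layout, and the "genesis" sentinel are hypothetical, and the SDK's real HMAC input is not documented in this reference.

```python
import hashlib
import hmac
from dataclasses import dataclass


class ChainIntegrityError(Exception):
    """Stand-in for the SDK exception of the same name."""


@dataclass(frozen=True)
class Row:
    seq: int
    payload: str
    prev_hash: str
    hmac: str


def sign(secret: bytes, seq: int, payload: str, prev_hash: str) -> str:
    # Assumed message layout, chosen only for this sketch
    msg = f"{seq}|{payload}|{prev_hash}".encode()
    return hmac.new(secret, msg, hashlib.sha256).hexdigest()


def build_chain(secret: bytes, payloads: list[str]) -> list[Row]:
    rows: list[Row] = []
    prev = "genesis"  # hypothetical sentinel for the first row
    for seq, payload in enumerate(payloads):
        digest = sign(secret, seq, payload, prev)
        rows.append(Row(seq, payload, prev, digest))
        prev = digest
    return rows


def verify_chain(secret: bytes, rows: list[Row]) -> bool:
    # Pass 1: recompute every row's HMAC from its own fields
    for row in rows:
        expected = sign(secret, row.seq, row.payload, row.prev_hash)
        if not hmac.compare_digest(expected, row.hmac):
            raise ChainIntegrityError(f"bad HMAC at seq {row.seq}")
    # Pass 2: every row must point at its predecessor's HMAC (fork detection)
    for prev, row in zip(rows, rows[1:]):
        if row.prev_hash != prev.hmac:
            raise ChainIntegrityError(f"broken linkage at seq {row.seq}")
    return True
```

Passing a slice such as rows[100:201] verifies only that window, which mirrors the from_seq / to_seq idea: linkage is checked between rows inside the slice only.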

sdk.audit.subscribe(callback: Callable[[AuditEventRecord], Awaitable[None]]) (v0.2.1)

Register a callback invoked after every successful log. Subscribers run after the record is enqueued for write — a failing subscriber never breaks the audit chain or the host application. Used by the OTel exporter and any custom consumer that wants to fan audit events out to a secondary system.

Returns: None

Note: Exceptions raised by subscribers are caught, logged, and swallowed.
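
The "caught, logged, and swallowed" contract can be pictured as a fan-out loop that isolates each subscriber. The sketch below is an assumption about the shape of that loop, not the SDK's code; notify and the dict-based record are hypothetical.

```python
import asyncio
import logging
from typing import Awaitable, Callable

logger = logging.getLogger("audit.subscribers")
Subscriber = Callable[[dict], Awaitable[None]]


async def notify(subscribers: list[Subscriber], record: dict) -> int:
    """Call every subscriber; log and swallow failures. Returns the success count."""
    delivered = 0
    for callback in subscribers:
        try:
            await callback(record)
            delivered += 1
        except Exception:
            # One failing consumer must not affect the others or the caller
            logger.exception("audit subscriber failed; continuing")
    return delivered
```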

Scope Module

sdk.scope.register(policy: ScopePolicy)

Register a scope policy for an agent. Call at app startup. Policies are also persisted to Postgres (best-effort) so the console GUI can display them.

Examplepython
from codeatelier_governance.scope import ScopePolicy

sdk.scope.register(ScopePolicy(
    agent_id="billing-agent",
    allowed_tools=frozenset({"read_invoice", "send_email"}),
    allowed_apis=frozenset({"GET https://api.stripe.com/v1/*"}),
    hidden_tools=frozenset({"delete_all", "admin_reset"}),
))

await sdk.scope.check(agent_id, tool=..., api=...)

Check if the agent is allowed to call the given tool or API. Default-deny: an agent with no registered policy fails every check.

Raises: ScopeViolation (auto audit-logged) or PolicyNotRegistered

Note: Enforcement surface — raises by contract.

Examplepython
await sdk.scope.check(agent_id="billing-agent", tool="read_invoice")  # passes
await sdk.scope.check(agent_id="billing-agent", tool="delete_user")   # raises ScopeViolation

sdk.scope.filter_tools(agent_id: str, tools: list[str]) -> list[str] (v0.2.2)

Remove hidden tools from a tool list before passing it to the LLM. The agent never sees the hidden tools in its context — they are removed entirely, not just blocked at call time. If no policy is registered for the agent, the full list is returned unchanged.

Returns: Filtered list[str] with hidden tools removed

Note: Synchronous method — no await needed.

ScopePolicy.hidden_tools: frozenset[str] (v0.2.2)

Set of tool names to remove from the LLM's context entirely. Unlike allowed_tools (which blocks calls at execution time), hidden tools are filtered out before the LLM sees the tool list — the agent cannot even attempt to call them.
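
The filtering behaviour described above fits in a few lines. This standalone sketch assumes a plain dict from agent ID to hidden-tool set (hidden_by_agent is a hypothetical stand-in for the registered policies); it shows the documented rule that an agent without a policy gets the full list back.

```python
def filter_tools(
    hidden_by_agent: dict[str, frozenset[str]],
    agent_id: str,
    tools: list[str],
) -> list[str]:
    """Drop hidden tools before the list is handed to the LLM."""
    hidden = hidden_by_agent.get(agent_id)
    if hidden is None:
        # No policy registered: return the list unchanged (as a copy)
        return list(tools)
    return [tool for tool in tools if tool not in hidden]


if __name__ == "__main__":
    policies = {"billing-agent": frozenset({"delete_all", "admin_reset"})}
    print(filter_tools(policies, "billing-agent", ["read_invoice", "delete_all"]))
```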

Cost Module

CostModule(store, *, strict_unknown_models=True, unknown_model_fallback_usd_per_million=None) (v0.6.2)

Direct constructor for advanced embedding scenarios. Under the new default (strict_unknown_models=True), a pricing lookup miss inside track_usage() raises UnknownModelError. To keep the v0.6.1 silent-zero behaviour, set both knobs: strict_unknown_models=False and an unknown_model_fallback_usd_per_million (positive float) for the fallback rate. The flag is mirrored at the top-level constructor as GovernanceSDK(cost_strict_unknown_models=..., cost_unknown_model_fallback_usd_per_million=...) and via env vars GOVERNANCE_COST_STRICT_UNKNOWN_MODELS and GOVERNANCE_COST_UNKNOWN_MODEL_FALLBACK_USD_PER_MILLION.

sdk.cost.register(policy: BudgetPolicy)

Register a budget policy for an agent. Caps: per_session_usd, per_session_tokens, per_agent_usd_daily, per_agent_tokens_daily, per_session_seconds. At least one cap required.

Examplepython
from codeatelier_governance.cost import BudgetPolicy

sdk.cost.register(BudgetPolicy(
    agent_id="billing-agent",
    per_session_usd=0.50,
    per_session_seconds=300,  # 5-minute time limit
    per_agent_usd_daily=10.00,
))

await sdk.cost.check_or_raise(agent_id, session_id)

Pre-call enforcement: raises BudgetExceeded if any cap is hit, including session time limits. Fail-closed by default — if the cost store is unreachable, the call is denied.

Raises: BudgetExceeded (auto audit-logged)

Note: Enforcement surface — raises by contract. Pass cost_fail_open=True at SDK init for availability over safety.

Examplepython
await sdk.cost.check_or_raise("billing-agent", session_id)
# If this line runs, the budget is not exceeded — proceed with the LLM call

await sdk.cost.track_usage(agent_id, session_id, *, model, input_tokens, output_tokens) (v0.2.2)

Post-call: record actual usage with automatic USD computation from model name. Built-in pricing covers OpenAI, Anthropic (claude-opus-4-7, claude-sonnet-4-6, claude-haiku-4-5), Google, Meta, and Mistral families. Prefix matching with longest-match wins: versioned model IDs like claude-sonnet-4-6-20260401 or gpt-4o-2024-05-13 resolve to their base entry, so you can upgrade to a newer snapshot of the same model family without bumping the SDK.

v0.6.2 strict-by-default (BREAKING). When pricing lookup misses entirely, track_usage() now raises UnknownModelError instead of returning $0.00. This closes a silent-zero budget-bypass vector where a misspelled or unreleased model ID would accrue spend for free. Opt-out knobs:

  • CostModule(strict_unknown_models=False, unknown_model_fallback_usd_per_million=...) — observe-only mode, uses the fallback rate.
  • GovernanceSDK(cost_strict_unknown_models=False, cost_unknown_model_fallback_usd_per_million=...) — same knob, surfaced at the top-level constructor.
  • Env vars GOVERNANCE_COST_STRICT_UNKNOWN_MODELS and GOVERNANCE_COST_UNKNOWN_MODEL_FALLBACK_USD_PER_MILLION for 12-factor deploys.

The bundled LangChain handler pins strict_unknown_models=False internally so observe-only workloads never crash host code — the raise is scoped to direct SDK callers.

Returns: None

Note: Enforcement surface as of v0.6.2 — raises UnknownModelError on unknown-model lookup under the new default. Convenience wrapper around track() that auto-computes usd.
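
Longest-prefix matching with a strict default can be sketched as follows. The PRICING table and its rates are made-up illustrative numbers, resolve_rate is a hypothetical helper, and raising ValueError when non-strict mode lacks a fallback is an assumption of this sketch rather than documented SDK behaviour.

```python
from typing import Optional


class UnknownModelError(Exception):
    """Stand-in for the SDK exception of the same name."""


# Hypothetical rates in USD per million tokens (illustrative numbers only)
PRICING = {"gpt-4o": 5.0, "gpt-4o-mini": 0.5}


def resolve_rate(
    model: str, *, strict: bool = True, fallback: Optional[float] = None
) -> float:
    """Resolve a model ID to a rate; the longest matching prefix wins."""
    matches = [prefix for prefix in PRICING if model.startswith(prefix)]
    if matches:
        return PRICING[max(matches, key=len)]
    if strict:
        raise UnknownModelError(model)
    if fallback is None:
        # Assumption: observe-only mode needs an explicit fallback rate
        raise ValueError("non-strict mode requires a fallback rate")
    return fallback
```

With this table, "gpt-4o-mini-2024-07-18" matches both prefixes and resolves to the longer one, "gpt-4o-mini", so a dated snapshot never falls through to the wrong family.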

await sdk.cost.reconcile(agent_id, session_id, *, model, input_tokens, output_tokens, reason=...) (v0.6.2)

End-of-stream reconciliation entry point. Called by the OpenAI and Anthropic wrappers whenever a streaming response terminates, whether by normal completion, a mid-stream exception, asyncio.CancelledError, or a proxy abandoned without iteration. It reconciles the projected max_tokens charge with actual usage (from the API's usage payload when available) or, on cancellation or abandonment, with a conservative estimate of chunks × 8 tokens. Idempotent: a double reconcile is a no-op. You rarely call this directly; the wrappers do.

Returns: None
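
The reconciliation rule can be illustrated with a small ledger. StreamLedger is a hypothetical class written for this sketch; only the "chunks × 8" estimate and the idempotency come from the description above, and whether the SDK caps the estimate at the projected charge is not stated, so no cap is applied here.

```python
from typing import Optional

TOKENS_PER_CHUNK_ESTIMATE = 8  # the "chunks × 8" rule described above


class StreamLedger:
    """Hypothetical ledger: charge max_tokens up front, settle once at stream end."""

    def __init__(self, projected_tokens: int) -> None:
        self.charged_tokens = projected_tokens
        self.reconciled = False

    def reconcile(self, *, actual_tokens: Optional[int], chunks_seen: int) -> int:
        if self.reconciled:
            # Idempotent: a second reconcile changes nothing
            return self.charged_tokens
        if actual_tokens is not None:
            final = actual_tokens  # usage payload was available
        else:
            final = chunks_seen * TOKENS_PER_CHUNK_ESTIMATE  # cancelled or abandoned
        self.charged_tokens = final
        self.reconciled = True
        return final
```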

await sdk.cost.track(agent_id, session_id, tokens=..., usd=...)

Post-call: record actual usage with explicit token count and USD. Counters are monotonic (negative deltas rejected). Multi-process correct via Postgres UPSERT. Use track_usage() instead if you want auto USD computation.

Note: Observation surface — never raises.

Examplepython
await sdk.cost.track("billing-agent", session_id, tokens=150, usd=0.003)

BudgetPolicy.per_session_seconds: int | None (v0.2.2)

Maximum session duration in seconds (0-86400, max 24 hours). Enforced by check_or_raise() — raises BudgetExceeded when the session has been running longer than the limit. Session start time is tracked automatically in Postgres.

await sdk.cost.model_breakdown(agent_id: str) (v0.3.0)

Returns per-model cost aggregation for an agent. Each key is a model name with USD and token totals. Useful for understanding which models drive the most spend.

Returns: dict[str, {"usd": float, "tokens": int}] — e.g. {"gpt-4o": {"usd": 0.0075, "tokens": 1500}}

Note: Observation surface — never raises. Returns empty dict if no usage recorded.

await sdk.cost.snapshot(agent_id: str, session_id: UUID) (v0.2.1)

Read-only snapshot of current usage and remaining budget for an agent/session pair. Returns a BudgetSnapshot with fields for session and daily usage (USD + tokens) and the remaining budget under each registered cap.

Returns: BudgetSnapshot (session_usd_used, session_tokens_used, agent_daily_usd_used, agent_daily_tokens_used, plus *_remaining counterparts)

Note: Defensive — if storage fails, returns zeros and logs the error. Never raises.

Loop Detection Module (v0.3.0)

sdk.loop.register(policy: LoopPolicy)

Register a loop detection policy for an agent. Defines the sliding window size, maximum calls allowed, and action to take on detection.

Examplepython
from codeatelier_governance.loop import LoopPolicy

sdk.loop.register(LoopPolicy(
    agent_id="my-agent",
    window_seconds=60,
    max_calls=5,
    action="raise",  # or "log"
))

await sdk.loop.record_call(agent_id: str, session_id: UUID, tool_name: str)

Record a tool call and check for loops. If the same tool has been called more than max_calls times within the sliding window for this session, the configured action fires. Tool names are normalized to lowercase for case-insensitive detection.

Raises: LoopDetected (when action="raise" and threshold exceeded)

Note: Enforcement surface when action="raise". Emits a loop.detected audit event on detection.
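
A sliding-window counter of the kind described can be sketched with a deque per (session, tool) key. SlidingWindowCounter is a hypothetical in-memory stand-in; the SDK's storage and exact boundary handling are not documented here, and the explicit now argument exists only to keep the sketch deterministic.

```python
from collections import defaultdict, deque


class LoopDetected(Exception):
    """Stand-in for the SDK exception of the same name."""


class SlidingWindowCounter:
    def __init__(self, window_seconds: float, max_calls: int) -> None:
        self.window_seconds = window_seconds
        self.max_calls = max_calls
        self._calls: dict[tuple[str, str], deque] = defaultdict(deque)

    def record_call(self, session_id: str, tool_name: str, now: float) -> None:
        # Lowercase the tool name so detection is case-insensitive
        calls = self._calls[(session_id, tool_name.lower())]
        calls.append(now)
        # Drop timestamps that have left the window
        while calls and calls[0] <= now - self.window_seconds:
            calls.popleft()
        if len(calls) > self.max_calls:
            raise LoopDetected(
                f"{tool_name} called {len(calls)} times in {self.window_seconds}s"
            )
```

With window_seconds=60 and max_calls=5, the sixth call inside one minute raises, while five calls followed by a call two minutes later do not.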

Examplepython
from codeatelier_governance.loop import LoopDetected

for query in pending_queries:  # pending_queries: hypothetical work queue
    try:
        await sdk.loop.record_call("my-agent", session_id, "search_web")
    except LoopDetected:
        # search_web exceeded max_calls (5) within the 60-second window
        break

await sdk.loop.check(agent_id: str, session_id: UUID)

Read-only check: returns whether the agent is currently in a detected loop state for the session. Does not record a new call.

Returns: bool — True if a loop is currently detected

Note: Observation surface — never raises.

Examplepython
if await sdk.loop.check("my-agent", session_id):
    # Loop detected — take corrective action
    await escalate_to_human(session_id)

Presence Module (v0.3.0)

await sdk.presence.heartbeat(agent_id: str, metadata: dict | None = None, operator_id: str | None = None)

Mark an agent as live. Call periodically (e.g. every 30s) to maintain the "Live" status. Each heartbeat updates the last_seen timestamp in Postgres. Pass operator_id to bind the agent to a human operator — used by the console for self-approval prevention on HITL gates. Pass metadata for arbitrary JSON (max 64KB).

Note: Observation surface — never raises.

Examplepython
# In your agent's main loop:
await sdk.presence.heartbeat(
    "billing-agent",
    operator_id="alice@company.com",
    metadata={"version": "1.2.0"},
)

await sdk.presence.mark_idle(agent_id: str)

Explicitly transition an agent to "Idle" status. Use when the agent is waiting for work but still running.

Note: Observation surface — never raises.

Examplepython
# Agent finished processing, waiting for next task:
await sdk.presence.mark_idle("billing-agent")

await sdk.presence.close_agent(agent_id: str)

Remove an agent from the presence table. Call during graceful shutdown. The agent will no longer appear in list_agents().

Note: Observation surface — never raises.

Examplepython
# Graceful shutdown:
await sdk.presence.close_agent("billing-agent")

await sdk.presence.list_agents()

Return all agents with their current status (Live, Idle, or Unresponsive) and last_seen timestamp.

Returns: list[AgentPresence] with .agent_id, .status, .last_seen

Note: Observation surface — never raises.

Examplepython
agents = await sdk.presence.list_agents()
for agent in agents:
    print(f"{agent.agent_id}: {agent.status} (last seen {agent.last_seen})")

await sdk.presence.check_stale(timeout_seconds: int = 300)

Mark agents as "Unresponsive" if their last heartbeat is older than timeout_seconds. Call periodically from a health-check endpoint or cron job.

Returns: list[str] — agent IDs that were marked unresponsive

Note: Observation surface — never raises.

Examplepython
# Mark agents unresponsive if no heartbeat in 5 minutes:
stale = await sdk.presence.check_stale(timeout_seconds=300)
if stale:
    alert(f"Unresponsive agents: {stale}")

await sdk.presence.is_halted(agent_id: str)

Return True if an operator has written a halt marker for this agent (via the console halt button or a direct write to the presence row). Reads from a 5-second TTL in-memory cache backed by governance_agent_presence.metadata_json — the hot path is one dict lookup, and the worst-case delay between an operator clicking halt and enforcement is bounded by the TTL. If the governance database is unreachable, the cache holds the last known state and a structlog warning is logged instead of raising. This is OBSERVATION ONLY — it does not raise or block anything. For actual enforcement, call assert_alive() instead (or let scope / cost / gates / wrappers call it for you — all four enforcement paths are wired in v0.6.2).

Returns: bool

Note: Primary name as of v0.6.0. is_killed is preserved as a back-compat alias for v0.6.x and REMOVED in v0.7. Does not raise on DB outage — preserves CLAUDE.md Invariant #1 (host app keeps running when governance DB is unreachable). Observation-only: use assert_alive() for enforcement.

Examplepython
# Observation only — use assert_alive() for actual enforcement.
# This pattern is correct when you want to HANDLE the halt explicitly
# (e.g. flush state, notify upstream) before aborting.
if await sdk.presence.is_halted("billing-agent"):
    await flush_partial_results()
    await notify_upstream("halted by operator")
    raise RuntimeError("billing-agent halted — aborting task")

# For enforcement, prefer assert_alive():
await sdk.presence.assert_alive("billing-agent")  # raises AgentHaltedError on halt
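
The TTL-cache behaviour behind is_halted (cheap reads, bounded staleness, last known state on outage) can be sketched like this. HaltCache, its load callable, and the injectable clock are hypothetical; the real cache reads from governance_agent_presence.metadata_json.

```python
import logging
import time
from typing import Callable

logger = logging.getLogger("presence.halt_cache")


class HaltCache:
    """Hypothetical TTL cache over a loader that returns the halted agent IDs."""

    def __init__(
        self,
        load: Callable[[], set[str]],
        ttl_seconds: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load = load
        self._ttl = ttl_seconds
        self._clock = clock
        self._halted: set[str] = set()
        self._expires_at = float("-inf")  # force a load on first use

    def is_halted(self, agent_id: str) -> bool:
        now = self._clock()
        if now >= self._expires_at:
            try:
                self._halted = set(self._load())
            except Exception:
                # Backing store unreachable: keep the last known state, never raise
                logger.warning("halt lookup failed; serving last known state")
            self._expires_at = now + self._ttl
        return agent_id in self._halted
```

The worst-case delay between a halt being written and is_halted returning True equals ttl_seconds, which matches the bound described above.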

await sdk.presence.assert_alive(agent_id: str)

Raise AgentHaltedError if the agent has been halted, with the operator, timestamp, and reason from the halt marker. As of v0.6.2 this is the fail-closed gate called from sdk.scope.check(), sdk.cost.check_or_raise(), sdk.gates.request(), wrap_openai, and wrap_anthropic. Can also be called directly if you want to gate your own enforcement path against the halt switch.

Returns: None (raises AgentHaltedError on halt)

Note: Primary surface. Wired automatically into scope + cost + gates + wrappers when enable_presence=True (the default). The AgentKilledError name is still importable as a back-compat alias through v0.6.x; update to AgentHaltedError before v0.7.

Examplepython
from codeatelier_governance.presence import AgentHaltedError

try:
    await sdk.presence.assert_alive("billing-agent")
except AgentHaltedError as e:
    logger.error(
        "agent halted",
        agent_id=e.agent_id,
        halted_by=e.halted_by,
        halted_at=e.halted_at,
        reason=e.reason,
    )
    raise

await sdk.presence.force_refresh_halted_cache()

Bypass the 5-second TTL and refresh the halted-agent cache immediately from the database. Used by tests and by push-based invalidation paths.

Returns: None

Note: Primary name as of v0.6.0. force_refresh_killed_cache is preserved as a back-compat alias through v0.6.x and REMOVED in v0.7. You should rarely need this in application code — the TTL cache is designed to be transparent. Primary use is tests.

Examplepython
# In a test after simulating a console halt:
await sdk.presence.force_refresh_halted_cache()
assert await sdk.presence.is_halted("test-agent")

Gates Module — v2 token format (v0.6.2)

GatesModule(..., enable_v2_tokens=False, accept_v1_until=None)

Opt-in v2 gate-token format. enable_v2_tokens=True (or env GOVERNANCE_GATES_ENABLE_V2_TOKENS=true) makes new tokens mint in the v2:<key_prefix>:… format, which binds the token to the HMAC chain key version active at issue time — so pre-rotation tokens keep verifying after rotate-chain-key. Both v1 and v2 formats verify in v0.6.2. Default is False in v0.6.2 and flips to True in v0.6.3; rolling deploys that straddle the boundary should set the flag explicitly.

accept_v1_until (default ~2026-07-17, 90 days post-release) controls when v1 tokens stop verifying. After that timestamp, presenting a v1 token to gates.grant() or gates.deny() raises TokenVersionTooOldError. Override via accept_v1_until=datetime(...) or env GOVERNANCE_GATES_ACCEPT_V1_UNTIL. Pre-cutoff v1 parses emit a DeprecationWarning and a gates.legacy_token_parsed INFO audit row so the rollout is visible.
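
The cutoff behaviour can be sketched as a small version gate. check_token_version is a hypothetical helper; it treats any token that does not start with "v2:" as v1, which is an assumption of this sketch, and it does not parse or verify the token body.

```python
import warnings
from datetime import datetime, timezone


class TokenVersionTooOldError(Exception):
    """Stand-in for the SDK exception of the same name."""


def check_token_version(token: str, accept_v1_until: datetime, now: datetime) -> str:
    """Return "v1" or "v2"; raise once a v1 token arrives after the cutoff."""
    if token.startswith("v2:"):
        return "v2"
    if now > accept_v1_until:
        raise TokenVersionTooOldError(
            f"v1 token presented after {accept_v1_until.isoformat()}"
        )
    # Before the cutoff: accept, but make the deprecation visible
    warnings.warn(
        "v1 gate token parsed; migrate to v2 before the cutoff",
        DeprecationWarning,
        stacklevel=2,
    )
    return "v1"


if __name__ == "__main__":
    cutoff = datetime(2026, 7, 17, tzinfo=timezone.utc)
    print(check_token_version("v2:abc:rest", cutoff, datetime.now(timezone.utc)))
```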

Identity — RevocationStore (v0.6.2)

RevocationStore(..., strict_chain=True)

Strict-by-default in v0.6.2. When the audit write inside revoke_with_chain_event() fails, the revocation now raises instead of writing the row with chain_event_id=None. The previous silent-degrade behaviour produced rows that would later fail Article 12 reconstruction checks. Pass strict_chain=False to restore degraded-mode behaviour; a WARN audit row identity.revocation_without_chain_event is emitted so the fallback path stays visible to auditors.

In-Memory Audit Store (v0.6.2)

InMemoryAuditStore(max_events=..., on_full="raise")

Strict-by-default in v0.6.2. Hitting max_events now raises StoreUnavailableError instead of silently evicting the oldest row — silent eviction was a latent chain-continuity hole (evicted rows could never be verified again). Callers that genuinely want ring-buffer semantics must opt in with on_full="evict". The internal BatchingWriter fallback sets on_full="evict" automatically so host apps never block on a full fallback buffer.

New helpers: .stats() returns current size, max, and eviction count; .verify_not_truncated() raises if on_full="evict" has actually evicted any rows since startup — useful at the end of a test run or at the head of a compliance report generation to fail loud on silent truncation.
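
The difference between on_full="raise" and on_full="evict" can be sketched with a bounded deque. BoundedEventStore is a hypothetical stand-in; the stats() keys and the RuntimeError raised by verify_not_truncated() are assumptions of this sketch, since the reference does not specify them.

```python
from collections import deque


class StoreUnavailableError(Exception):
    """Stand-in for the SDK exception of the same name."""


class BoundedEventStore:
    def __init__(self, max_events: int, on_full: str = "raise") -> None:
        if on_full not in ("raise", "evict"):
            raise ValueError("on_full must be 'raise' or 'evict'")
        self._events: deque = deque()
        self._max = max_events
        self._on_full = on_full
        self._evicted = 0

    def append(self, event) -> None:
        if len(self._events) >= self._max:
            if self._on_full == "raise":
                raise StoreUnavailableError("store is full")
            # Ring-buffer mode: drop the oldest row and count the eviction
            self._events.popleft()
            self._evicted += 1
        self._events.append(event)

    def stats(self) -> dict:
        return {"size": len(self._events), "max": self._max, "evicted": self._evicted}

    def verify_not_truncated(self) -> None:
        if self._evicted:
            raise RuntimeError(f"{self._evicted} rows were evicted")
```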

Contracts Module (v0.4.0)

Contract(agent_id: str, tool: str, pre: list[PreCondition], post: list[PostCondition])

Pydantic model binding pre/post conditions to an (agent_id, tool) pair. Frozen after construction - the agent cannot mutate its own contract at runtime. Maximum 20 pre-conditions and 20 post-conditions per contract.

PreCondition(check: str, message: str, params: dict = {}) / PostCondition(check: str, message: str, params: dict = {})

Declarative condition models. Built-in pre-condition checks: hitl_approved, budget_available, scope_allowed, custom. Built-in post-condition checks: audit_logged, custom. Unknown checks fail closed.

sdk.contracts.register(contract: Contract)

Register a behavioral contract for an (agent_id, tool) pair. Call at app startup. One contract per (agent_id, tool) pair - registering again overwrites the previous contract.

Examplepython
from codeatelier_governance.contracts.models import Contract, PreCondition, PostCondition

sdk.contracts.register(Contract(
    agent_id="billing-agent",
    tool="charge_customer",
    pre=[
        PreCondition(check="hitl_approved", message="Charges require human approval"),
        PreCondition(check="budget_available", message="Budget must not be exceeded"),
    ],
    post=[
        PostCondition(check="audit_logged", message="Charge must be audit-logged"),
    ],
))

async with sdk.contracts.enforce(agent_id: str, session_id: UUID, tool: str)

Async context manager that runs check_pre before the body and check_post after. Unifies scope, cost, HITL, and custom checks into one declarative surface. If any pre-condition fails, raises ContractViolation before your code runs. If any post-condition fails, raises ContractViolation after.

Raises: ContractViolation (auto audit-logged as contract.pre_violation or contract.post_violation)

Note: Enforcement surface - raises by contract. No-op if no contract registered for this (agent_id, tool).

Examplepython
async with sdk.contracts.enforce("billing-agent", session_id, "charge_customer"):
    # Pre-conditions verified - safe to proceed
    result = await stripe.charge(amount)
    # Post-conditions checked on exit
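
The pre-then-body-then-post flow of enforce can be sketched with contextlib.asynccontextmanager. This is an illustration under assumptions: enforce here takes plain lists of (name, async check) pairs, which is a hypothetical signature, and post-conditions run only when the body finishes without raising.

```python
import asyncio
from contextlib import asynccontextmanager


class ContractViolation(Exception):
    """Stand-in for the SDK exception of the same name."""


@asynccontextmanager
async def enforce(pre_checks, post_checks):
    # Pre-conditions: fail before the body runs
    for name, check in pre_checks:
        if not await check():
            raise ContractViolation(f"pre-condition failed: {name}")
    yield
    # Post-conditions: evaluated after the body completes normally
    for name, check in post_checks:
        if not await check():
            raise ContractViolation(f"post-condition failed: {name}")


async def always_true() -> bool:
    return True


async def demo() -> str:
    async with enforce([("budget_available", always_true)], []):
        return "body ran"


if __name__ == "__main__":
    print(asyncio.run(demo()))
```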

await sdk.contracts.check_pre(agent_id: str, session_id: UUID, tool: str)

Evaluate all pre-conditions for this (agent_id, tool) pair. If any pre-condition fails, emits a contract.pre_violation audit event and raises. No-op if no contract is registered.

Raises: ContractViolation

Note: Enforcement surface - raises by contract.

Examplepython
await sdk.contracts.check_pre("billing-agent", session_id, "charge_customer")
# All pre-conditions passed - safe to proceed

await sdk.contracts.check_post(agent_id: str, session_id: UUID, tool: str)

Evaluate all post-conditions for this (agent_id, tool) pair. If any post-condition fails, emits a contract.post_violation audit event and raises. No-op if no contract is registered.

Raises: ContractViolation

Note: Enforcement surface - raises by contract.

Examplepython
await sdk.contracts.check_post("billing-agent", session_id, "charge_customer")
# All post-conditions verified

sdk.contracts.register_check(name: str, fn: Callable[[str, UUID, str], Awaitable[bool]])

Register a custom async callable for use in custom pre/post conditions. The callable receives (agent_id, session_id, tool) and must return True (pass) or False (fail). Name must be 1-256 characters.

Returns: None

Raises: ValueError if name is empty or exceeds 256 characters
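
A custom check is just an async callable with the (agent_id, session_id, tool) signature that returns a bool. The sketch below shows one possible check plus a local mirror of the name-length rule; FROZEN_TOOLS, tool_not_frozen, and validate_check_name are hypothetical names, and the registration call is shown only as a comment because it needs a live SDK instance.

```python
import asyncio
from uuid import UUID, uuid4

# Hypothetical business rule: tools that are temporarily frozen
FROZEN_TOOLS = frozenset({"refund_customer"})


async def tool_not_frozen(agent_id: str, session_id: UUID, tool: str) -> bool:
    """Custom check: pass (True) unless the tool is on the freeze list."""
    return tool not in FROZEN_TOOLS


def validate_check_name(name: str) -> str:
    """Mirror the documented rule: names must be 1-256 characters."""
    if not 1 <= len(name) <= 256:
        raise ValueError("check name must be 1-256 characters")
    return name


# With a live SDK instance, registration would look like:
# sdk.contracts.register_check(validate_check_name("tool_not_frozen"), tool_not_frozen)

if __name__ == "__main__":
    print(asyncio.run(tool_not_frozen("billing-agent", uuid4(), "charge_customer")))
```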

Compliance Module (v0.4.0)

ComplianceReport(report_id, generated_at, format, agent_id, session_ids, date_range, sections, chain_integrity_status, coverage_caveat, coverage_pct) (v0.5.3)

Top-level compliance report model. Immutable once generated. Serializable to JSON for archival or internal compliance review. The format field indicates the report type (e.g. "article12" or "summary").

v0.5.3 fields: chain_integrity_status is "verified", "failed", or "unverified" (default when no chain verification was requested). coverage_caveat is a required field on the ComplianceReport model and ReportGenerator always populates it with the standard scoping language: "This report covers only actions routed through the SDK wrappers. Actions made via direct LLM client calls are not included." (Callers constructing ComplianceReport directly must supply it — the field has no default on the Pydantic model.) coverage_pct is a Pydantic-validated [0.0, 1.0] float or None; reserved for the v0.6 wrapper registry and always None in v0.5.x. The field is always present — its absence would imply 100% coverage.

ReportSection(title: str, description: str, data: list[dict], status: "compliant" | "partial" | "non_compliant")

A single section of a compliance report, mapping to a specific EU AI Act Article 12 requirement. The status field indicates whether the requirement is met based on available audit data.
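
A reviewer will often want only the sections that are not fully compliant. The sketch below shows one way to filter them. sections_needing_attention is a hypothetical helper, and the demo sections are stand-ins shaped like ReportSection with illustrative titles.

```python
from types import SimpleNamespace


def sections_needing_attention(sections) -> list:
    """Return (title, status) pairs for every section that is not fully compliant."""
    return [(s.title, s.status) for s in sections if s.status != "compliant"]


# Stand-in sections shaped like ReportSection (title, description, data, status).
demo_sections = [
    SimpleNamespace(title="Event registration", description="", data=[], status="compliant"),
    SimpleNamespace(title="Human oversight", description="", data=[], status="partial"),
    SimpleNamespace(title="Input data", description="", data=[], status="non_compliant"),
]

if __name__ == "__main__":
    for title, status in sections_needing_attention(demo_sections):
        print(f"{title}: {status}")
```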

ReportGenerator(database_url: str | None = None, *, audit_store: AuditStore | None = None, audit_module: AuditModule | None = None)

Constructor for the compliance report generator. Provide database_url for Postgres queries or audit_store for in-memory/test usage. Pass audit_module=sdk.audit (v0.5.3+) to enable chain verification inside reports — required when calling generate_article12(verify_chain=True) or generate_summary(verify_chain=True).

Note: audit_module parameter added in v0.5.3.

Example (python)
import os

from codeatelier_governance.compliance.report import ReportGenerator

# For CLI / archival reports
generator = ReportGenerator(database_url=os.environ["DATABASE_URL"])

# For chain-verified reports — wire the audit module
generator = ReportGenerator(
    database_url=os.environ["DATABASE_URL"],
    audit_module=sdk.audit,
)

await generator.generate_article12(session_ids=None, agent_id=None, date_from=None, date_to=None, *, verify_chain=False)

Generate an EU AI Act Article 12 evidence report. Queries audit events with the given filters and maps them to the seven Article 12 automatic logging requirements: event registration, duration of use, reference database, input data, functioning, human oversight, and post-market monitoring. The report provides evidence for actions the SDK observed; it does not assert compliance for the overall deployment.

Returns: ComplianceReport with seven ReportSection entries, each with a compliant/partial/non_compliant status, plus coverage_caveat, chain_integrity_status, and coverage_pct fields

Raises: ValueError if verify_chain=True is requested but audit_module was not passed to the ReportGenerator constructor. There is no silent degradation: construct ReportGenerator with audit_module=sdk.audit before calling this method with verify_chain=True (see the ReportGenerator constructor entry above).

Note: verify_chain keyword-only parameter added in v0.5.3. Available via both the CLI (governance compliance report) and direct import.

Example (python)
import os
from datetime import datetime, timezone

from codeatelier_governance.compliance.report import ReportGenerator

# Chain-verified Article 12 evidence report
generator = ReportGenerator(
    database_url=os.environ["DATABASE_URL"],
    audit_module=sdk.audit,
)

report = await generator.generate_article12(
    agent_id="billing-agent",
    date_from=datetime(2026, 1, 1, tzinfo=timezone.utc),
    date_to=datetime(2026, 4, 1, tzinfo=timezone.utc),
    verify_chain=True,  # runs HMAC chain verification
)
assert report.chain_integrity_status == "verified"
print(report.coverage_caveat)
for section in report.sections:
    print(f"{section.title}: {section.status}")

await generator.generate_summary(agent_id: str, date_from=None, date_to=None, *, verify_chain=False)

Generate a high-level summary compliance report for an agent. Includes three sections: agent overview (event counts, models, sessions), violation summary (scope/budget/loop violations), and human oversight (HITL gate activity).

Returns: ComplianceReport with three ReportSection entries

Raises: ValueError if verify_chain=True is requested but audit_module was not passed to the ReportGenerator constructor

Note: verify_chain keyword-only parameter added in v0.5.3.

Example (python)
report = await generator.generate_summary(
    agent_id="billing-agent",
    date_from=datetime(2026, 3, 1, tzinfo=timezone.utc),
    verify_chain=True,
)
for section in report.sections:
    print(f"{section.title}: {section.status}")

Anthropic Adapter (v0.4.0)

wrap_anthropic(client, sdk: GovernanceSDK, agent_id: str, session_id: UUID | None = None)

Patch an Anthropic client to emit governance audit events and track cost automatically. Supports both anthropic.Anthropic (sync) and anthropic.AsyncAnthropic (async). The client is monkey-patched in-place and returned so existing references continue to work. Each messages.create() call will: (1) run cost.check_or_raise before the LLM call, (2) audit-log the call, (3) extract token usage from the response, (4) compute USD cost via the built-in pricing table, and (5) track cost post-call.

Returns: The same client object, patched in-place

Raises: BudgetExceeded (enforcement surface, checked before every LLM call). As of v0.6.2, AgentHaltedError if the agent is halted.

Note: Audit logging and cost tracking are observation surfaces and never raise. The enforcement surfaces are the budget pre-check and, as of v0.6.2, the halt check. Calling wrap_anthropic twice on the same client is a no-op (logs a warning).

Example (python)
import os

import anthropic
from codeatelier_governance import GovernanceSDK
from codeatelier_governance.integrations.anthropic_wrap import wrap_anthropic

async with GovernanceSDK(database_url=os.environ["DATABASE_URL"]) as sdk:
    client = wrap_anthropic(anthropic.Anthropic(), sdk=sdk, agent_id="my-agent")

    # client.messages.create() now auto-audits + tracks cost
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Hello"}],
    )
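
The in-place, idempotent patching pattern described above can be sketched without the SDK or a real Anthropic client. This is not the SDK's implementation: wrap_client, DemoBudgetExceeded, make_fake_client, and the flat per-token price are hypothetical stand-ins. The sketch only separates an enforcement step (which may raise) from an observation step (which must not).

```python
import logging
from types import SimpleNamespace

log = logging.getLogger("wrap_sketch")


class DemoBudgetExceeded(Exception):
    """Stand-in for the SDK's BudgetExceeded, defined here so the sketch is self-contained."""


def wrap_client(client, budget_usd: float, price_per_token_usd: float, spent: list):
    """Patch client.messages.create in place and return the same client object."""
    if getattr(client, "_demo_wrapped", False):
        log.warning("client already wrapped; skipping")
        return client  # wrapping twice is a no-op

    original_create = client.messages.create

    def governed_create(**kwargs):
        # Enforcement: refuse the call before any tokens are spent.
        if sum(spent) >= budget_usd:
            raise DemoBudgetExceeded(f"spent {sum(spent):.4f} of {budget_usd:.4f} USD")
        response = original_create(**kwargs)
        try:
            # Observation: cost tracking must never break the caller.
            usage = response.usage
            spent.append((usage.input_tokens + usage.output_tokens) * price_per_token_usd)
        except Exception:
            log.exception("cost tracking failed; continuing")
        return response

    client.messages.create = governed_create
    client._demo_wrapped = True
    return client


def make_fake_client():
    """Build a tiny object shaped like a client exposing messages.create()."""
    def create(**kwargs):
        return SimpleNamespace(usage=SimpleNamespace(input_tokens=10, output_tokens=5))
    return SimpleNamespace(messages=SimpleNamespace(create=create))
```

Because the client is patched in place, code that already holds a reference to it gets the governed behaviour without being changed.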

Gates Module

await sdk.gates.request(kind, agent_id, payload=...)

Open an approval request. Returns a pending request with a signed token. Surface the request to a human (Slack, email, console GUI). The human calls grant(token) or deny(token).

Returns: ApprovalRequest with .request_id, .token, .expires_at

Note: Observation surface — never raises on internal failure.

Example (python)
req = await sdk.gates.request(
    kind="deploy.production",
    agent_id="deploy-agent",
    payload={"pr": 142, "environment": "production"},
)
print(f"Approval needed: {req.request_id}")

await sdk.gates.grant(token) / deny(token)

Resolve an approval request. Single-use: the same token cannot be used twice. Time-bound: expired tokens are rejected. Action-hash-bound: swapping the action invalidates the token.

Raises: ApprovalTokenError on forgery, replay, expiry, or action mismatch

Example (python)
await sdk.gates.grant(req.token)  # approval.granted audit row written
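
To make the three token properties concrete (single-use, time-bound, action-hash-bound), here is a minimal standard-library sketch of one way such a token could work. It is not the SDK's token format: the token layout, SECRET, DemoTokenError, and the in-memory replay set are all assumptions made for illustration.

```python
import hashlib
import hmac
import json
import secrets
import time
from typing import Optional

SECRET = b"demo-secret-for-illustration-only"  # hypothetical signing key
_redeemed = set()  # in-memory replay guard; a real store would be shared


class DemoTokenError(Exception):
    """Stand-in for the SDK's ApprovalTokenError."""


def action_hash(kind: str, payload: dict) -> str:
    """Hash the action so a token cannot be reused for a different action."""
    canonical = json.dumps({"kind": kind, "payload": payload}, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()


def _sign(body: str) -> str:
    return hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()


def issue_token(kind: str, payload: dict, ttl_seconds: int, now: Optional[float] = None) -> str:
    issued = time.time() if now is None else now
    body = f"{secrets.token_hex(8)}.{action_hash(kind, payload)}.{int(issued + ttl_seconds)}"
    return f"{body}.{_sign(body)}"


def redeem_token(token: str, kind: str, payload: dict, now: Optional[float] = None) -> None:
    try:
        nonce, bound_hash, expires_at, sig = token.split(".")
        expires = int(expires_at)
    except ValueError as exc:
        raise DemoTokenError("malformed token") from exc
    body = f"{nonce}.{bound_hash}.{expires_at}"
    if not hmac.compare_digest(sig.encode(), _sign(body).encode()):
        raise DemoTokenError("forged token")
    if (time.time() if now is None else now) > expires:
        raise DemoTokenError("expired token")
    if not hmac.compare_digest(bound_hash.encode(), action_hash(kind, payload).encode()):
        raise DemoTokenError("action mismatch")
    if token in _redeemed:
        raise DemoTokenError("token already used")
    _redeemed.add(token)  # mark as spent only after every check passes
```

Checking the signature first means the expiry and action hash are only trusted once the token is known to be authentic.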

await sdk.gates.wait_for(request_id, timeout=...)

Block until the request is resolved. Polls the store every 500ms (with jitter). Multi-process correct: the grant can come from any worker.

Raises: ApprovalTimeout if nobody responds, ApprovalDenied if denied

Note: Enforcement surface — raises by contract.

Example (python)
granted = await sdk.gates.wait_for(req.request_id, timeout=600)
# If this line runs, a human approved the action
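
The poll-with-jitter behaviour described for wait_for can be sketched as a generic asyncio loop. This is an illustration under stated assumptions, not the SDK's code: wait_for_resolution, the status strings, the jitter range, and the Demo* exceptions are hypothetical, and fetch_status stands in for a read against the shared store.

```python
import asyncio
import random


class DemoTimeout(Exception):
    """Stand-in for ApprovalTimeout."""


class DemoDenied(Exception):
    """Stand-in for ApprovalDenied."""


async def wait_for_resolution(fetch_status, timeout: float,
                              interval: float = 0.5, jitter: float = 0.1) -> bool:
    """Poll fetch_status() until it reports 'granted' or 'denied', or the timeout elapses."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while True:
        status = await fetch_status()
        if status == "granted":
            return True
        if status == "denied":
            raise DemoDenied("request was denied")
        remaining = deadline - loop.time()
        if remaining <= 0:
            raise DemoTimeout("no decision before the deadline")
        # Jitter spreads polls from many workers so they do not hit the store in lockstep.
        delay = interval + random.uniform(-jitter, jitter)
        await asyncio.sleep(min(max(delay, 0.0), remaining))


def scripted_store(statuses):
    """Return a fetch_status coroutine function that replays the given statuses in order."""
    remaining = list(statuses)

    async def fetch_status():
        # Keep returning the last status once the script runs out.
        return remaining.pop(0) if len(remaining) > 1 else remaining[0]

    return fetch_status
```

Because the loop only reads shared state, the decision can be written by any process, which is what makes this style of wait safe across workers.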

@sdk.gates.require_approval(kind=..., agent_id=..., timeout=600, blocking=True) (v0.2.1)

Decorator that opens an approval request before running the wrapped async function. If blocking=True (default), the call blocks until a human grants or denies. If blocking=False, raises ApprovalPending immediately and the caller resumes after resolution.

Returns: The wrapped function's return value (after approval)

Raises: ApprovalTimeout, ApprovalDenied, or ApprovalPending (non-blocking mode)
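
The difference between blocking and non-blocking mode is easiest to see in a stripped-down decorator. The sketch below is not the SDK's decorator: require_approval_sketch, DemoApprovalPending, and the injected open_request / wait_for callables are hypothetical stand-ins for the gate calls.

```python
import asyncio
import functools


class DemoApprovalPending(Exception):
    """Stand-in for ApprovalPending; carries the request id so the caller can resume later."""

    def __init__(self, request_id: str):
        super().__init__(f"approval pending: {request_id}")
        self.request_id = request_id


def require_approval_sketch(open_request, wait_for, *, blocking: bool = True):
    """Open an approval request before running the wrapped async function."""
    def decorator(fn):
        @functools.wraps(fn)
        async def wrapper(*args, **kwargs):
            request_id = await open_request()
            if not blocking:
                # Non-blocking: hand control back immediately; the function body does not run.
                raise DemoApprovalPending(request_id)
            # Blocking: wait_for is expected to raise on denial or timeout.
            await wait_for(request_id)
            return await fn(*args, **kwargs)
        return wrapper
    return decorator


async def demo_open_request() -> str:
    return "req-1"  # hypothetical request id


async def demo_wait_for(request_id: str) -> None:
    return None  # pretend a human approved immediately
```

In non-blocking mode the wrapped function never runs on that call, so the caller has to keep the request id and invoke the action again once the request is resolved.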
