Changelog
Version history for the Governance SDK. Each release lists every shipped change grouped by category.
v0.6.0
April 16, 2026
Ed25519 + HMAC rotation + Article 12 evidence export + fresh-install fix. Major release implementing F2–F9 of the v0.6 PRD plus a polish pass (Track A finish, LLM-theater tripwire closure, compliance export button, fresh-install wiring). Upgrade path for v0.5.x deployments: install the [migrations] extra and run governance migrate, which now applies the base DDL and runs alembic upgrade head in one shot. See the full release notes on PyPI.
Added
- Ed25519 agent identity with file, env, and ephemeral keystores. Per-row Ed25519 signature over the HMAC audit chain. Graceful degradation to signature_status='unsigned_local_failure' when a signer cannot load its private key; the host call path never raises
- HMAC chain key rotation with dual-signed rotation marker rows, salted fingerprint construction, bounded LRU resolution cache, and a governance rotate-chain-key CLI command
- Article 12 evidence export: POST /api/compliance/export packages the compliance report and chain verification into a single HMAC-signed JSON bundle (bundle_hash + bundle_signature) you can hand to a regulator. "Export Article 12 evidence" button in the v4 console with Windows-safe filename and screen-reader-announced download lifecycle
- Wrapper coverage registry: opt-in registry of every wrapped LLM client at import time, surfaced via GET /api/coverage and the new /health/governance endpoint
- GOVERNANCE_COMPLIANCE_RATE_LIMIT env var: overrides the default 1 req / 60s per-user ceiling on compliance endpoints
Changed
- governance migrate now applies the base DDL and runs alembic upgrade head automatically; one command takes a new database all the way to HEAD. Closes a latent P0 where fresh v0.6 installs that ran DDL-only silently dropped audit rows with StoreUnavailableError against the missing signature_status column
- kill → halt rename across SDK, console, and audit events. Backward-compat aliases preserved in v0.6, removed in v0.7. Historic agent.killed rows stay in the chain as-is; a SQL view governance_audit_events_halted unions both kinds
- Append-only grants on governance_audit_events enforced at the role level (previously only at the row-trigger level); closes a latent invariant-2 gap
Upgrade notes
- pip install "code-atelier-governance[migrations]>=0.6.0"
- governance migrate --database-url $DATABASE_URL (runs DDL + alembic; idempotent)
- Three new columns on governance_audit_events: signature, signing_key_fingerprint, signature_status. Strict-schema SIEM / BI pipelines must widen their schemas to accept them
- Queries filtering on kind='agent.killed' will silently stop matching new halt events. Either query the governance_audit_events_halted view or filter on kind IN ('agent.killed', 'agent.halted')
v0.5.4
April 14, 2026
Kill switch enforcement hotfix. Closes a v0.5.3 production bug where the console halt button updated presence metadata and emitted an audit event, but the SDK enforcement path never read the kill marker. Agents kept running after being "killed." v0.5.4 wires scope.check() through a new presence kill check that fails closed on operator-killed agents.
Kill switch enforcement (closes v0.5.3 production bug)
- PresenceModule.is_killed(agent_id) and PresenceModule.assert_alive(agent_id): read from a 5-second TTL in-memory cache backed by the operator-written kill marker on the presence row
- AgentKilledError: new exception raised by assert_alive() with full kill metadata (operator, timestamp, reason). Inherits from RuntimeError, not from GovernanceError; a bare except ScopeViolation will NOT catch a kill. Catch explicitly with except AgentKilledError if you need to handle the kill in application code
- ScopeModule.set_presence_module(presence): wired automatically by GovernanceSDK.__init__ when both enable_scope=True and enable_presence=True
- scope.check() calls presence.assert_alive(agent_id) first; it fires before PolicyNotRegistered and before the missing-tool-or-api ValueError. Defense-in-depth ordering verified by test
- PresenceModule.force_refresh_killed_cache(): bypass the 5-second TTL on demand. Used by tests and by future push-based invalidation in v0.6
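The inheritance note above decides which handlers see a kill. A minimal sketch with stand-in classes (none of them imported from the SDK; ScopeViolation deriving from GovernanceError, the constructor arguments, and both helper functions are assumptions for illustration) shows a governance-only handler letting the kill propagate:

```python
# Stand-in classes that mirror the hierarchy described above. They are
# illustrative only; the real SDK classes may carry different fields.
class GovernanceError(Exception):
    pass


class ScopeViolation(GovernanceError):  # assumed parent class
    pass


class AgentKilledError(RuntimeError):
    def __init__(self, agent_id, operator, reason):
        super().__init__(f"agent {agent_id} halted by {operator}: {reason}")
        self.agent_id = agent_id
        self.operator = operator
        self.reason = reason


def killed_check():
    """Simulates scope.check() hitting an operator-killed agent."""
    raise AgentKilledError("agent-1", operator="alice", reason="runaway spend")


def narrow_handler(check):
    """Handles governance errors only; a kill is NOT caught here."""
    try:
        check()
    except GovernanceError:
        return "governance error handled"
    return "allowed"


def explicit_handler(check):
    """Names AgentKilledError explicitly so the kill can be handled."""
    try:
        check()
    except ScopeViolation:
        return "scope violation handled"
    except AgentKilledError as exc:
        return f"kill handled ({exc.operator})"
    return "allowed"
```

Calling narrow_handler(killed_check) raises AgentKilledError to the caller, while explicit_handler(killed_check) returns a handled result.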
Reliability — Invariant #1 preserved under DB outage
- When the governance database is unreachable, the kill cache holds the last known state and a structlog warning is logged. Already-killed agents stay killed, live agents stay live, and the host application keeps running. The refresh path never raises out to the caller
- A failed refresh bumps the cache timestamp so the SDK does not hammer the database during a sustained outage
- Detection uses the operator-written kill marker on the presence row, NOT the status column. The status column is overloaded by stale-heartbeat detection, so gating on it would also block agents that simply went idle. The metadata marker is the unambiguous kill signal
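The outage behavior above can be sketched as a TTL cache that keeps its last known state when a refresh fails. This is a single-threaded illustration, not the SDK's implementation: the KillCache name, the fetch_killed callable, and the injectable clock are assumptions, and the locking and structlog warning are omitted.

```python
import time


class KillCache:
    """TTL cache that holds the last known state when a refresh fails."""

    def __init__(self, fetch_killed, ttl_seconds=5.0, clock=time.monotonic):
        # fetch_killed is a hypothetical callable returning killed agent IDs.
        self._fetch_killed = fetch_killed
        self._ttl = ttl_seconds
        self._clock = clock
        self._killed = set()
        self._refreshed_at = None  # None forces a refresh on first use

    def _refresh_if_stale(self):
        now = self._clock()
        if self._refreshed_at is not None and now - self._refreshed_at < self._ttl:
            return
        try:
            self._killed = set(self._fetch_killed())
        except Exception:
            # Store unreachable: keep the previous state, never raise to the caller.
            pass
        # Bump the timestamp on success AND failure so a sustained outage
        # costs one query per TTL window instead of one per call.
        self._refreshed_at = now

    def is_killed(self, agent_id):
        self._refresh_if_stale()
        return agent_id in self._killed
```

With this shape, an agent killed before the outage stays killed for its duration, and an agent that was live stays live until the store is reachable again.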
Tests
- 441 → 469 backend tests. New: tests/presence/test_kill_switch.py (28 tests across 7 categories: AgentKilledError construction, happy path, cache TTL behavior including 50-concurrent double-checked locking, Invariant #1 DB outage resilience, in-memory mode, edge cases (unicode IDs, 500-agent scaling, partial metadata, un-kill, idempotent re-kill), and ScopeModule integration with two ordering tests)
- Wheel build → clean-venv install → smoke test was run against real Postgres before publish (editable installs hide packaging bugs). The full scripts/live_test.py integration harness remains the source of truth for release gating
Scope deferred to v0.6
- Only scope.check() is patched in v0.5.4. cost.check_budget(), gates.check(), and the LLM wrappers (wrap_anthropic, wrap_openai) still need kill-check wiring; that lands in v0.6 as part of the broader enforcement-path expansion
- Cache invalidation is TTL-only: worst-case 5-second delay between an operator clicking halt and SDK enforcement. Sub-second invalidation via Postgres LISTEN/NOTIFY is on the v0.6 roadmap
- No restore endpoint yet. To restore a killed agent, clear the kill marker from the presence row directly. v0.6 ships an admin restore action
Migration
- Drop-in upgrade from v0.5.3. No schema migration required: kill detection reads from an existing JSONB metadata field that the v0.5.3 console already writes
- pip install --upgrade code-atelier-governance==0.5.4
v0.5.3
April 13, 2026
Enforcement integrity patch: scope check wired into wrappers, projected budget gate, streaming token tracking, startup wrapper-coverage warning, on-demand HMAC chain verification API, Article 12 report coverage caveat, and a written threat model. Every enforcement surface now gates actions at the wrapper layer.
Enforcement wiring
- wrap_openai and wrap_anthropic now call sdk.scope.check() with a chat.completions.create / messages.create sentinel before the LLM call; scope enforcement is active at the wrapper layer, not only at the tool-execution layer
- Projected budget gate: cost.check_or_raise() now projects forward using the call's declared max_tokens and denies the call before the stream opens if the projected total would exceed the cap
- Streaming token tracking: streaming calls record actual usage from the final usage object and fall back to the projected max_tokens ceiling if the API does not return a usage payload
- warn_on_no_wrappers: structlog warning at sdk.start() when no wrap_openai or wrap_anthropic has been registered. The only startup signal that enforcement is not covering your LLM calls. Default True
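The projected budget gate described above amounts to a worst-case cost check made before the call opens. A minimal sketch, assuming per-1k-token prices and a USD cap; the function name, the BudgetExceeded error, and the prices in the usage note are illustrative and not the SDK's API or pricing table:

```python
class BudgetExceeded(Exception):
    """Hypothetical error type used only in this sketch."""


def check_projected(spent_usd, cap_usd, input_tokens, max_tokens,
                    input_price_per_1k, output_price_per_1k):
    """Deny the call up front if its worst-case cost would cross the cap."""
    # Worst case: the model emits every token allowed by max_tokens.
    projected_call = (
        (input_tokens / 1000) * input_price_per_1k
        + (max_tokens / 1000) * output_price_per_1k
    )
    projected_total = spent_usd + projected_call
    if projected_total > cap_usd:
        raise BudgetExceeded(
            f"projected {projected_total:.4f} USD exceeds cap {cap_usd:.4f} USD"
        )
    return projected_total
```

For example, with 0.90 USD already spent against a 1.00 USD cap, a call declaring max_tokens=4000 at an assumed 0.03 USD per 1k output tokens is denied, while the same call with max_tokens=1000 passes.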
Audit — on-demand chain verification
- await sdk.audit.verify_chain(session_id=None, from_seq=None, to_seq=None): public API for on-demand HMAC chain verification. Supports partial-range verification. Returns True on a clean chain; raises ChainIntegrityError with the bad sequence number on tamper
- verify_chain_on_read SDK config option: when True, verifies the chain on every get_events() call. Default False (O(n) in returned events). Use for compliance reporting or post-incident review
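To illustrate what HMAC chain verification checks, here is a small self-contained sketch. The record layout (seq, prev_hash, payload, hash), the JSON canonicalization, and the synchronous signature are assumptions made for the example; the SDK's actual chain format is not described in this changelog.

```python
import hashlib
import hmac
import json


class ChainIntegrityError(Exception):
    """Stand-in error carrying the first bad sequence number."""

    def __init__(self, seq):
        super().__init__(f"chain broken at seq {seq}")
        self.seq = seq


def _mac(key, prev_hash, seq, payload):
    body = json.dumps({"seq": seq, "prev": prev_hash, "payload": payload},
                      sort_keys=True)
    return hmac.new(key, body.encode("utf-8"), hashlib.sha256).hexdigest()


def append_event(chain, key, payload):
    """Append an event whose MAC covers the previous event's hash."""
    prev_hash = chain[-1]["hash"] if chain else ""
    seq = len(chain)
    chain.append({"seq": seq, "prev_hash": prev_hash, "payload": payload,
                  "hash": _mac(key, prev_hash, seq, payload)})


def verify_chain(chain, key, from_seq=None, to_seq=None):
    """Verify links and MACs over [from_seq, to_seq]; True on a clean range."""
    lo = 0 if from_seq is None else from_seq
    hi = len(chain) - 1 if to_seq is None else to_seq
    for seq in range(lo, hi + 1):
        event = chain[seq]
        expected_prev = chain[seq - 1]["hash"] if seq > 0 else ""
        expected = _mac(key, event["prev_hash"], event["seq"], event["payload"])
        if event["prev_hash"] != expected_prev or not hmac.compare_digest(
                event["hash"], expected):
            raise ChainIntegrityError(seq)
    return True
```

In this sketch a partial range only vouches for the events inside it: editing the payload at seq 1 fails a full verification, but a range starting at seq 2 still passes.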
Compliance — Article 12 report integrity
- ReportGenerator(audit_module=sdk.audit): new constructor parameter for wiring the audit module to enable chain verification inside compliance reports
- generate_article12(verify_chain=True) and generate_summary(verify_chain=True): new keyword-only parameter runs HMAC chain verification and sets chain_integrity_status to "verified" or "failed" in the report. Raises ValueError immediately if verify_chain=True is requested without an audit_module; no silent degradation
- ComplianceReport.coverage_caveat: new field that always prints the scoping language: "This report covers only actions routed through the SDK wrappers. Actions made via direct LLM client calls are not included."
- ComplianceReport.coverage_pct: Pydantic-validated [0.0, 1.0] float or None. Reserved for v0.6 wrapper registry; always None in v0.5.x. The field is always present, because its absence would imply 100% coverage
Config — opt-in module flags (CLAUDE.md invariant restored)
- enable_loop (default True): when False, sdk.loop is not constructed; accessing it raises AttributeError. Emits a structlog sdk.loop_disabled warning at init
- enable_presence (default True): when False, sdk.presence is not constructed. Emits a structlog sdk.presence_disabled warning at init
- default_max_tokens: default ceiling used by the projected budget gate when the caller does not declare max_tokens on the API call. Suppresses the max_tokens_not_declared warning. Must be >= 1 (enforced by __post_init__)
- All module toggles now documented in the README "Configuration Reference" section with module toggles, audit options, wrapper options, and four deployment-pattern examples
Docs — threat model + scoping language
- New "Threat Model" section in the README documents what the SDK protects against (scope violations, budget overruns, HITL bypass, audit tampering) and what it does NOT protect against (direct client bypass, process-level bypass, streaming cost precision, on-demand tampering detection only, HITL non-blocking mode, tool invocations inside LLM responses)
- Positioning tightened across README and docs site: "enforcement gates for every action routed through the SDK" replaces unscoped claims. Article 12 compliance framed as evidence for actions the SDK observed
- Removed the "Prompt Versioning" claim from the shipping-feature list. enable_prompts remains as a forward-compat stub
Fixes
- GovernanceSDK.close() now guards self.loop.close() and self.presence.close() with hasattr; async with GovernanceSDK(enable_loop=False) no longer raises AttributeError on teardown
- _run_chain_verification return type narrowed to ChainIntegrityStatus literal; mypy strict clean across 65 source files
- scripts/live_test.py switched to AsyncOpenAI (the sync client crashed inside the running event loop)
Tests
- 436 → 441 backend tests. New: tests/test_v053_edge_cases.py (11 edge cases), partial-range chain verification boundary test, strengthened streaming cost-tracked assertion, 3 tests for close() with disabled modules, 2 tests for generate_article12 / generate_summary(verify_chain=True) without audit_module raising ValueError, full end-to-end wrap_anthropic + budget + audit flow
- 26/26 live integration tests passing against real Postgres and real OpenAI
v0.5.2
April 12, 2026
Console gate workflow hardening: reviewer tracking, batch approve with per-item self-approval blocking, deny rationale in the HMAC chain, halt endpoint, and SSE-backed presence broadcast.
Console - Gate Workflow
- POST /api/gates/{id}/deny records a rationale in audit metadata so denials are tamper-evident in the HMAC chain
- POST /api/gates/{id}/claim and /escalate track reviewer assignment and hand-off
- POST /api/gates/batch-approve with a 50-request hard cap and per-item self-approval enforcement (one blocked item does not abort the batch)
- POST /api/agents/{id}/kill halt endpoint for operators
- GET /api/events/stats aggregate counters for the console dashboard
- SSE broadcast layer with session revocation on logout
Database
- Migration a1b2c3d4e5f6_gates_reviewer_columns adds claimed_by, claimed_at, escalated_to, escalated_at to governance_gates_pending
- New triggers.sql enforces append-only semantics on the audit log at the DB layer
Security
- Deny rationale is stored in application/json responses only; the console never renders user-supplied text as HTML, locked by regression test
- Batch approve has a belt-and-suspenders runtime length guard in addition to the Pydantic field constraint
- Self-approval check called inside the batch loop, so an operator cannot approve their own actions in bulk
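The batch-approve rules above (hard cap, per-item self-approval check, no abort on a blocked item) can be sketched roughly as follows. The gate dictionary fields (id, requested_by) and the status strings are hypothetical and chosen only for this example:

```python
MAX_BATCH = 50  # hard cap taken from the notes above


def batch_approve(operator_id, gates):
    """Approve each gate unless the operator is the one who requested it."""
    # Runtime length guard, independent of any schema-level validation.
    if len(gates) > MAX_BATCH:
        raise ValueError(f"batch of {len(gates)} exceeds the {MAX_BATCH}-item cap")
    results = []
    for gate in gates:
        # Checked per item, inside the loop: one blocked item is reported
        # and the remaining items are still processed.
        if gate["requested_by"] == operator_id:
            results.append((gate["id"], "blocked_self_approval"))
            continue
        results.append((gate["id"], "approved"))
    return results
```

Returning a per-item result list rather than raising on the first blocked item is what keeps the rest of the batch moving.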
Frontend (v4, opt-in via CONSOLE_UI_VERSION)
- Parallel-route drill panel for the agents list
- mapAgentStatus liveness heuristic with a 15-second idle threshold so dormant agents no longer render as pulsing green
- Stream page pause indicator reflects paused state (disconnected > connecting > paused > connected precedence)
- Two-pass error-message sanitizer strips Bearer, token, secret, authorization values before display
- vitest harness with 34 frontend unit tests
Tests
- 369 backend tests (up from 356), plus 34 vitest cases on the console frontend
- New multi-agent live integration test (scripts/live_test_multi_agent.py) exercises 6 concurrent agents against real OpenAI, covering scope violation, budget exceeded, HITL contract, and loop detection
- Posture endpoint status-literal contract locked textually, so frontend and backend break loudly on drift
v0.5.1
April 12, 2026
Hotfix covering four findings from a DX audit: three activation-consistency bugs and one silent HITL failure on Postgres. No audit data was lost or corrupted.
Security
- HITL gates silently broken on Postgres. ContractsModule._check_hitl_approved returned False unconditionally for PostgresGatesStore, causing HITL-gated actions to be blocked even after human approval (over-blocking, not bypass). Fix: new GatesStore.has_granted_approval() abstract method with strict agent_id + expiry filtering
- ScopeModule.filter_tools silently returned the full tool list when no policy was registered, bypassing hidden_tools and contradicting the module's documented default-deny contract. Now raises PolicyNotRegistered; the LangChain handler catches it and fails closed
Breaking changes
- enable_audit, enable_scope, enable_cost, enable_gates, enable_prompts flags are now honored. In v0.2–v0.5.0 these flags were accepted by GovernanceConfig but never read. Customers setting any of these flags expecting them to disable the corresponding module should review their deployment immediately
- ScopeModule.filter_tools("unknown_agent", ...) now raises PolicyNotRegistered instead of returning the full tool list
New module - Routing (model selection policy)
- Advisory sdk.routing.suggest() remaps the requested model based on remaining budget (cost_aware) or explicit rewrite rules
- Off by default: both enable_routing=True AND a registered RoutingPolicy are required
- Honors ScopePolicy.allowed_models as a hard constraint
- Emits routing.policy_changed and routing.suggestion audit events
- Wraps wrap_openai and wrap_anthropic transparently
Fixes
- asyncio.run() no longer called from the sync registration path in scope, cost, and routing modules; removes a hidden sync-over-async deadlock risk (invariant #3)
- Background policy-upsert tasks hold strong references via per-module _pending_upsert_tasks sets
- ScopePolicy.allowed_models frozen set: a hard ceiling routing cannot exceed
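The strong-reference fix above follows a common asyncio pattern: the event loop keeps only weak references to tasks, so a fire-and-forget task needs an owner until it finishes. A minimal sketch, assuming a hypothetical PolicyRegistry class and a no-op upsert in place of the real database write:

```python
import asyncio


class PolicyRegistry:
    """Illustrative stand-in for a module that schedules background upserts."""

    def __init__(self):
        self._pending_upsert_tasks = set()
        self.persisted = []

    async def _upsert(self, name):
        await asyncio.sleep(0)  # placeholder for the real database write
        self.persisted.append(name)

    def register(self, name):
        """Sync entry point; must be called while an event loop is running."""
        task = asyncio.get_running_loop().create_task(self._upsert(name))
        # Hold a strong reference so the task cannot be garbage-collected
        # mid-flight, then drop it automatically once the task completes.
        self._pending_upsert_tasks.add(task)
        task.add_done_callback(self._pending_upsert_tasks.discard)
        return task

    async def drain(self):
        """Wait for every upsert that is still in flight."""
        if self._pending_upsert_tasks:
            await asyncio.gather(*self._pending_upsert_tasks)
```

Scheduling on the running loop this way also avoids calling asyncio.run() from a sync path that may already be inside a loop.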
Tests
- 322 → 356 tests (+21 routing, +13 hotfix regression pins in tests/test_hotfix_v0_5_1.py)
v0.5.0
April 12, 2026
Self-approval prevention, chain fork detection, a sync facade for Flask/Django, and a shared-pool performance sweep that cut Postgres connection count from ~74 to ~15 per SDK.
Security
- Self-approval prevention (fail-closed): HITL gates compare the granting operator_id against the session's user_id; an agent cannot approve its own action. DDL adds an operator_id column to the gates table
- Chain fork detection: audit.trace_session_chain raises ChainIntegrityError when two events share the same prev_hash, surfacing tamper attempts or concurrent-write corruption
- New coverage: budget race at the cap boundary, SQL injection payloads on every user-controllable field, case-sensitivity scope bypass, production error leakage via sanitize_db_error, weak-secret entropy rejection, account enumeration parity
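Fork detection as described above reduces to finding a prev_hash claimed by more than one event. A minimal sketch, assuming events are dictionaries with id and prev_hash keys (the field names and both helper functions are illustrative, not the SDK's internals):

```python
from collections import defaultdict


class ChainIntegrityError(Exception):
    """Stand-in for the SDK error of the same name."""


def find_forks(events):
    """Return {prev_hash: [event ids]} for each prev_hash with several children."""
    children = defaultdict(list)
    for event in events:
        children[event["prev_hash"]].append(event["id"])
    return {prev: ids for prev, ids in children.items() if len(ids) > 1}


def assert_no_forks(events):
    """Raise if any two events in the session chain share a parent hash."""
    forks = find_forks(events)
    if forks:
        raise ChainIntegrityError(
            f"fork detected at prev_hash values: {sorted(forks)}"
        )
```

In a linear chain every hash has at most one successor, so a second event pointing at the same parent means either a rewritten history or two writers racing on the same tail.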
New
- GovernanceSDKSync: sync facade for Flask/Django and any non-async host. Runs an asyncio loop on a background thread and dispatches via run_coroutine_threadsafe. Matches the async SDK surface one-to-one
- SSE endpoint GET /api/stream/events delivers live audit events to the console (session auth, keepalive frames)
- Halt agent UI: renamed from "kill" because the SDK blocks gates, it does not terminate the host process
- Multi-agent OpenAI integration test script exercising delegation workloads end to end
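The sync facade technique mentioned above (an asyncio loop on a background thread, with calls dispatched through run_coroutine_threadsafe) can be sketched in a few lines. SyncFacade and check_budget are hypothetical names for this example, not the GovernanceSDKSync API:

```python
import asyncio
import threading


class SyncFacade:
    """Runs coroutines on a private event loop owned by a background thread."""

    def __init__(self):
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._loop.run_forever, daemon=True)
        self._thread.start()

    def run(self, coro, timeout=None):
        """Submit a coroutine from sync code and block until it finishes."""
        future = asyncio.run_coroutine_threadsafe(coro, self._loop)
        return future.result(timeout)

    def close(self):
        """Stop the loop from the calling thread and release its resources."""
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join()
        self._loop.close()


async def check_budget(agent_id):
    """Hypothetical async SDK call used only to exercise the facade."""
    await asyncio.sleep(0)
    return f"{agent_id}: ok"
```

A Flask or Django view could then call facade.run(check_budget("agent-1")) without owning an event loop itself; the calling thread blocks only for the duration of that one coroutine.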
Performance
- Shared engine pool: consolidated seven separate AsyncEngine instances into one shared pool per SDK. Dropped Postgres max connections per SDK from ~74 to ~15
- Concurrent audit writes: pre-call audit log backgrounded, post-call audit + cost tracking run under asyncio.gather, saving 4–12 ms per LLM call on the critical path
- Combined budget query: session + daily counter reads merged into one round-trip in PostgresCostStore, halving pre-call enforcement latency
Fixes
- Streaming cost-tracking bypass now detected and logged (users must call sdk.cost.track() manually after consuming the stream)
- Serverless cold-start policy preload in sdk.start() eliminates the 30-second gap where _policies was empty on first request (AWS Lambda)
- JSONL audit fallback tolerates read-only filesystems and rotates at 50 MB
- Session time budget uses Postgres-side elapsed computation to avoid mixed-clock skew
- Sync wrapper coroutine-leak fix in the Anthropic/OpenAI integrations
- Policy upsert SQL cast corrected (::jsonb → CAST AS jsonb) so scope and budget policies persist across restarts
- Top-level __init__.py exports ScopePolicy, BudgetPolicy, AuditEvent
- command_timeout=5 on the shared engine prevents pool exhaustion under slow-query storms
Tests
- 258 → 322 tests (+64). New suites: streaming detection, JSONL fallback, cold start, sync wrapper, console endpoints, SSE, error handling, SQL injection, case-sensitivity bypass, chain fork detection, end-to-end enforcement. Test suite runs in ~5 s (was ~12 s)
v0.4.0
April 10, 2026
Behavioral contracts for pre/post conditions on tool calls, EU AI Act Article 12 compliance reports, and a native Anthropic SDK adapter with automatic cost tracking.
SDK - Behavioral Contracts (new module)
- sdk.contracts module with declarative pre/post conditions on tool calls
- PreConditions enforce that budget is available, scope is allowed, or HITL is approved before any tool fires
- PostConditions verify audit was logged after tool execution
- Context manager API: async with sdk.contracts.enforce(agent_id, session_id, tool):
- 18 test cases
SDK - EU AI Act Article 12 Compliance Reports (new module)
- Automated compliance evidence reports mapping audit data to the seven sections of Article 12 (binding August 2, 2026)
- Three-state status model: compliant, partial, non_compliant
- Privacy-preserving by default: hashes, not raw content
- Immutable frozen reports for internal compliance review
- CLI: governance compliance report
- 10 test cases
Integrations - Anthropic SDK Adapter (new)
- wrap_anthropic() patches message calls with automatic token tracking
- USD cost estimation using Anthropic pricing
- Audit event generation and budget enforcement
- Async and sync support
- Idempotent: safe to patch multiple times
- Install via pip install "code-atelier-governance[anthropic]"
- 11 test cases
v0.3.0
April 10, 2026
Loop/anomaly detection, agent presence tracking, policy hot-reload, per-model cost aggregation, console GUI overhaul with Code Atelier branding, and security hardening.
SDK — Loop Detection (new module)
- sdk.loop module with LoopPolicy and LoopDetected
- await sdk.loop.record_call(agent_id, session_id, tool_name): records tool call and checks for loops
- await sdk.loop.check(agent_id, session_id): read-only loop status check
- Sliding-window detection on (session_id, tool_name) pairs
- Tool names normalized to lowercase (case-insensitive detection)
- Configurable action: raise (kill the loop) or log (observe only)
- Emits loop.detected audit event on detection
- DDL: governance_loop_tracking table
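Sliding-window detection on (session_id, tool_name) pairs can be sketched with a deque of timestamps per pair. The thresholds, the in-memory store, the synchronous API, and the injectable clock below are assumptions for illustration; the SDK's LoopPolicy fields are not listed in this changelog.

```python
import time
from collections import defaultdict, deque


class LoopDetected(Exception):
    """Stand-in for the SDK exception of the same name."""


class LoopDetector:
    def __init__(self, max_calls=5, window_seconds=60.0, action="raise",
                 clock=time.monotonic):
        self._max_calls = max_calls
        self._window_seconds = window_seconds
        self._action = action  # "raise" stops the loop, "log" only reports it
        self._clock = clock
        self._calls = defaultdict(deque)

    def record_call(self, session_id, tool_name):
        """Record one call; return True (or raise) when the pair is looping."""
        key = (session_id, tool_name.lower())  # case-insensitive tool names
        now = self._clock()
        window = self._calls[key]
        window.append(now)
        # Drop timestamps that have slid out of the window.
        while window and now - window[0] > self._window_seconds:
            window.popleft()
        looping = len(window) > self._max_calls
        if looping and self._action == "raise":
            raise LoopDetected(
                f"{key[1]} called {len(window)} times in session {session_id}"
            )
        return looping
```

With action="log" the caller receives True and can emit an audit event while letting the call proceed, which matches the observe-only mode described above.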
SDK — Agent Presence (new module)
- sdk.presence module with heartbeat-based lifecycle
- await sdk.presence.heartbeat(agent_id): mark agent as Live
- await sdk.presence.mark_idle(agent_id): transition to Idle
- await sdk.presence.close_agent(agent_id): remove from presence table
- await sdk.presence.list_agents(): all agents with status
- await sdk.presence.check_stale(timeout_seconds=300): mark unresponsive agents
- Three states: Live, Idle, Unresponsive
- DDL: governance_agent_presence table
SDK — Policy Hot-Reload
- GovernanceSDK(hot_reload=True, hot_reload_interval=30): opt-in policy polling
- Polls governance_policies table at configurable interval
- Atomically replaces scope and cost policies in memory
- In-process asyncio task; no background worker
SDK — Cost Module
- track_usage() now accepts model= for per-model cost tracking
- await sdk.cost.model_breakdown(agent_id): per-model cost aggregation
Console
- Code Atelier brand alignment (violet accent, Inter/JetBrains Mono fonts)
- Login page with RBAC authentication
- User management page (admin only)
- Loading, empty, and error states on all pages
- NavBar with admin-only tabs
- GET /api/agents/presence endpoint: agent status
- GET /api/cost/models endpoint: per-model cost breakdown
Security
- Login rate limiting: 5 attempts per IP per 60 seconds
- Gate audit events now included in HMAC chain
- Metadata size cap: 64KB maximum
- Request model size constraints
CLI
- governance migrate now includes loop and presence DDL
v0.2.2
April 10, 2026
Session time limits, built-in model pricing for 24 models, hidden tool policies, console RBAC authentication, and LangChain hidden tool filtering.
SDK — Cost Module
- per_session_seconds field on BudgetPolicy: session time limits enforced in check_or_raise()
- Built-in pricing table for 24 models (OpenAI, Anthropic, Google, Meta, Mistral)
- await sdk.cost.track_usage(agent_id, session_id, model="gpt-4o", input_tokens=N, output_tokens=N): auto-computes USD
- Prefix matching for versioned model names (gpt-4o-2024-05-13 matches gpt-4o)
- Unknown models return $0 (non-breaking)
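Prefix matching with a zero-cost fallback might look like the sketch below. The prices are made-up placeholders, not the SDK's pricing table, and preferring the longest matching prefix is a design choice assumed here so that a versioned gpt-4o-mini name does not resolve to gpt-4o.

```python
# Illustrative (input, output) USD prices per 1k tokens. Placeholder values.
PRICING_PER_1K = {
    "gpt-4o": (0.005, 0.015),
    "gpt-4o-mini": (0.00015, 0.0006),
}


def resolve_model(model):
    """Map a versioned model name onto a pricing key, or None if unknown."""
    if model in PRICING_PER_1K:
        return model
    matches = [name for name in PRICING_PER_1K if model.startswith(name)]
    # Longest prefix wins so the most specific entry is used.
    return max(matches, key=len) if matches else None


def estimate_usd(model, input_tokens, output_tokens):
    """Estimate cost in USD; unknown models cost 0.0 instead of raising."""
    key = resolve_model(model)
    if key is None:
        return 0.0
    input_price, output_price = PRICING_PER_1K[key]
    return (input_tokens / 1000) * input_price + (output_tokens / 1000) * output_price
```

Returning 0.0 for unknown models keeps tracking non-breaking, at the cost of under-reporting spend for models missing from the table.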
SDK — Scope Module
- hidden_tools field on ScopePolicy: tools removed from LLM context entirely
- sdk.scope.filter_tools(agent_id, tool_list): removes hidden tools before passing to LLM
Console
- RBAC authentication with PBKDF2-HMAC-SHA256 password hashing (600k iterations, OWASP recommendation)
- Postgres-backed sessions with httpOnly cookies (8h TTL, revocation)
- Two roles: viewer (read-only) and admin (full access + user management)
- API endpoints: POST /api/auth/login, POST /api/auth/logout, GET /api/auth/me
- Admin endpoints: GET/POST /api/auth/users, PATCH /api/auth/users/{id}, DELETE /api/auth/sessions/{id}
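For readers unfamiliar with the password hashing scheme named above, PBKDF2-HMAC-SHA256 at 600k iterations is available in the Python standard library. The sketch below uses hashlib.pbkdf2_hmac; the 16-byte salt and the helper names are assumptions for the example, not the console's storage format.

```python
import hashlib
import hmac
import os

ITERATIONS = 600_000  # iteration count quoted in the notes above


def hash_password(password, salt=None):
    """Derive a 32-byte PBKDF2-HMAC-SHA256 digest; returns (salt, digest)."""
    salt = os.urandom(16) if salt is None else salt
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)
```

The salt is stored next to the digest so verification can repeat the derivation, and the constant-time comparison avoids leaking how many leading bytes matched.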
Integrations
- LangChain handler: hidden tool filtering in on_llm_start (enforcement mode)
- OpenAI wrapper: auto USD via built-in pricing table
- OpenAI wrapper: double-wrap sentinel warns and returns if client already wrapped
CLI
- governance console add-user / list-users / disable-user / reset-password commands
- governance migrate now includes console DDL (auth tables)
v0.2.1
April 10, 2026
Enforcement mode for LangChain, fail-closed cost gates by default, durable JSONL fallback, and a major Console upgrade with Grant/Deny buttons, event detail panel, and inline violation details.
SDK
- LangChain handler enforce=True mode: scope enforcement through callbacks
- wrap_openai double-wrap sentinel: calling it twice is now idempotent
- model as first-class field on AuditEvent (included in the HMAC chain)
- Policies persisted to governance_policies Postgres table
- JSONL durable fallback: audit events survive process crashes
- Weak secret rejection (minimum 8 unique bytes)
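One plausible reading of the weak-secret rule above is a count of distinct byte values in the secret. The sketch below implements that interpretation; the function name and the exact rule are assumptions, since the changelog states only the threshold.

```python
MIN_UNIQUE_BYTES = 8  # threshold quoted in the notes above


def validate_secret(secret: bytes) -> bytes:
    """Reject secrets with fewer than MIN_UNIQUE_BYTES distinct byte values."""
    unique = len(set(secret))
    if unique < MIN_UNIQUE_BYTES:
        raise ValueError(
            f"secret has only {unique} unique bytes; need at least {MIN_UNIQUE_BYTES}"
        )
    return secret
```

A check like this catches placeholder values such as a repeated character, but it is a floor on variety and not a measure of real entropy.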
Console
- Grant/Deny buttons on pending approvals
- Event detail slide-out panel (full HMAC, prev_hash links, verify button)
- Inline violation details on posture cards
- Real Live/Idle status badge (no longer hardcoded)
- Recursive metadata redaction (nested PII stripped)
- Events page pagination (25/50/100 per page)
- CORS wildcard rejection
- Posture query performance bounds (LIMIT on all sub-queries)
CLI
- governance migrate --dry-run flag
- Improved error messages for invalid UUIDs
Security
- Fail-closed cost gate by default
- Console auth warning for dev mode
Docs
- Fixed "5 lines" hero claim
- Blue to violet design token consistency on sub-pages
- Quickstart split into SDK-only + Console sections
- HMAC diagram field-coverage clarification
v0.2.0
April 10, 2026
Initial public release of the Governance SDK with all four enforcement modules and the read-only Governance Console.
SDK
- LangChain BaseCallbackHandler adapter
- OpenAI SDK wrap_openai adapter
- OpenTelemetry GenAI exporter
- Multi-process correctness via Postgres advisory locks
- HMAC-chained tamper-evident audit trail
- Scope enforcement (tool/API whitelisting)
- Spend limits with fail-closed budget gates
- Human-in-the-loop signed approval tokens
Console
- Initial release of the Governance Console (read-only dashboard)
CLI
- governance CLI with migrate, verify, tail, and budget commands