Cost Intelligence Layer
Cost governance is unified across two dimensions: design-time infrastructure cost estimation before any resource is committed, and runtime token cost tracking in production via the RFC-0001 SpanForge standard. Cost is not a report; it is a governed signal at every stage of the AI lifecycle.
Cost governed at both ends of the lifecycle
AI cost overruns happen at two distinct points. At Design time: infrastructure and model spend is committed without a cost estimate, and the first bill arrives after architecture decisions have already locked in. At production scale: token consumption accumulates across thousands of agent runs with no per-call attribution, no budget enforcement, and no audit trail regulators can verify.
The SpanForge Cost Intelligence Layer closes both gaps. Design-time cost estimation uses infrastructure configuration inputs to produce scenario-compared estimates before any resource is provisioned. Runtime cost tracking uses the SpanForge llm.cost.* namespace to record every token, every call, and every session cost — as a cryptographically signed, HMAC-chained event — alongside the rest of the compliance audit trail.
“Cost is a compliance signal, not a finance report. Every token consumed is evidence of a decision made — and the Cost Intelligence Layer ensures it is recorded, attributed, and auditable.”
Six capabilities across the full lifecycle
From Design-time infrastructure estimates through production token attribution — cost intelligence embedded as a first-class compliance signal.
01 Design-time infrastructure cost estimation
Infrastructure configuration or model selection produces a cost estimate before any resource is created. Cost is known at the decision point — not discovered on the first bill. Includes scenario comparison across cheapest, balanced, and performance profiles.
02 Runtime token cost tracking (SpanForge)
Every LLM call emits a structured cost event via the SpanForge llm.cost.* namespace — input tokens, output tokens, cached tokens, and reasoning tokens tracked per-call and aggregated across sessions. All included in the HMAC-signed audit trail.
03 Cross-provider unified cost model
A single pricing surface across OpenAI, Anthropic, Google Gemini, AWS Bedrock, Groq, and Together AI. Pricing snapshots are recorded in every CostBreakdown event for auditability. One cost model regardless of which providers your architecture uses.
04 Cost attribution by user, team, and dimension
Costs are attributed to org_id, team_id, and actor_id fields on every SpanForge event — enabling chargeback, budget enforcement, and per-initiative cost accountability across complex multi-agent workflows.
05 Multi-agent cost rollup
Child agent run costs propagate automatically to the parent AgentRunPayload.total_cost, including all nested child costs. Hierarchical cost accountability across orchestrated agent workflows — no manual aggregation required.
06 Gate Readiness Score™: Cost Readiness
Cost intelligence evidence feeds directly into the sixth dimension of the Gate Readiness Score™. A Design Exit Gate evidence package is not complete without a documented infrastructure cost estimate and scenario comparison.
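The multi-agent rollup in capability 05 can be pictured as a recursive sum over the agent tree. The sketch below is only an illustration of that idea: the `AgentRun` class, its fields, and the cost figures are hypothetical stand-ins, not the SpanForge `AgentRunPayload` schema.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import List


@dataclass
class AgentRun:
    """Hypothetical stand-in for an agent run (not the SpanForge payload)."""
    name: str
    own_cost: Decimal  # cost of LLM calls made directly by this run
    children: List["AgentRun"] = field(default_factory=list)

    def total_cost(self) -> Decimal:
        # Own cost plus the total of every nested child, recursively.
        child_total = sum((c.total_cost() for c in self.children), Decimal("0"))
        return self.own_cost + child_total


# Illustrative tree: an orchestrator with two children, one of which has a child.
retriever = AgentRun("retriever", Decimal("0.004"))
critic = AgentRun("critic", Decimal("0.003"))
summarizer = AgentRun("summarizer", Decimal("0.010"), [critic])
orchestrator = AgentRun("orchestrator", Decimal("0.002"), [retriever, summarizer])

print(orchestrator.total_cost())  # 0.019
```

Using `Decimal` rather than `float` keeps small per-call amounts exact when they are summed across many runs.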
Cost as a compliance event
Under the RFC-0001 SpanForge standard, every cost record is structured as a typed event with a schema-versioned payload, ULID event ID, UTC timestamp, HMAC signature, and audit chain linkage via prev_id. Cost events are emitted automatically when using SpanForge auto-instrumentation for OpenAI, Gemini, and Bedrock — or explicitly via the SpanForge tracing API.
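To make the idea of an HMAC-chained event concrete, here is a minimal sketch using only the Python standard library. It assumes that each signature covers the event body plus the previous event's signature; the exact signing and canonicalization rules of RFC-0001 are not given here, so the field layout, the `sign_event` and `verify_chain` helpers, the key, and the placeholder event IDs (plain strings rather than real ULIDs) are all assumptions.

```python
import hashlib
import hmac
import json
from datetime import datetime, timezone

SECRET = b"demo-signing-key"  # assumption: a real key would come from a secret store


def canonical(event: dict) -> bytes:
    # Deterministic serialization so signer and verifier hash identical bytes.
    return json.dumps(event, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_event(event_id, event_type, payload, prev_id, prev_signature):
    event = {
        "event_id": event_id,        # placeholder string, not a real ULID
        "event_type": event_type,
        "schema_version": "1.0",     # hypothetical version label
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "prev_id": prev_id,
        "payload": payload,
    }
    # The MAC covers this event and the previous signature, forming the chain.
    message = canonical(event) + (prev_signature or "").encode("utf-8")
    event["signature"] = hmac.new(SECRET, message, hashlib.sha256).hexdigest()
    return event


def verify_chain(events) -> bool:
    prev_id, prev_sig = None, None
    for event in events:
        body = {k: v for k, v in event.items() if k != "signature"}
        message = canonical(body) + (prev_sig or "").encode("utf-8")
        expected = hmac.new(SECRET, message, hashlib.sha256).hexdigest()
        linked = body["prev_id"] == prev_id
        if not linked or not hmac.compare_digest(expected, event["signature"]):
            return False
        prev_id, prev_sig = event["event_id"], event["signature"]
    return True


first = sign_event("evt-0001", "llm.cost.token_recorded",
                   {"total_cost": "0.0042", "currency": "USD"}, None, None)
second = sign_event("evt-0002", "llm.cost.token_recorded",
                    {"total_cost": "0.0017", "currency": "USD"},
                    first["event_id"], first["signature"])
chain = [first, second]
print(verify_chain(chain))  # True
```

Because every signature depends on the one before it, changing any earlier cost record invalidates that record and every event that follows it.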
| Event Type | Payload | Description |
|---|---|---|
| llm.cost.token_recorded | CostTokenRecordedPayload | Per-LLM-call cost record: input_cost, output_cost, cached_input_cost, reasoning_cost, total_cost, currency, provider, model, pricing_snapshot_date. |
| llm.cost.session_recorded | CostSessionRecordedPayload | Session-aggregate cost: cumulative cost across all LLM calls in an agent run or user session, with session_id grouping. |
| llm.cost.attributed | CostAttributedPayload | Cost attribution record: cost allocated to org_id, team_id, or actor_id for chargeback and budget accountability. |
All cost events include a pricing_snapshot_date field for auditability. Pricing tables are updated continuously and versioned. Events emitted with stale pricing are flagged by the SpanForge linter (AO004).
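As a rough illustration of how the fields of a per-call cost record relate to token counts, the sketch below builds a dictionary with the field names listed for llm.cost.token_recorded. The rates are placeholder values, the token buckets are assumed not to overlap, and the 30-day staleness threshold in `is_stale` is an assumption; the actual rule behind the AO004 linter check is not described here.

```python
from datetime import date
from decimal import Decimal

# Placeholder rates in USD per one million tokens; illustrative, not real pricing.
RATES = {
    "input": Decimal("2.00"),
    "cached_input": Decimal("1.00"),
    "output": Decimal("8.00"),
    "reasoning": Decimal("8.00"),
}
PER_MILLION = Decimal("1000000")


def token_cost_record(provider, model, input_tokens, cached_input_tokens,
                      output_tokens, reasoning_tokens, pricing_snapshot_date):
    def cost(tokens, kind):
        return Decimal(tokens) * RATES[kind] / PER_MILLION

    # Assumption: the four token counts are separate, non-overlapping buckets.
    input_cost = cost(input_tokens, "input")
    cached_input_cost = cost(cached_input_tokens, "cached_input")
    output_cost = cost(output_tokens, "output")
    reasoning_cost = cost(reasoning_tokens, "reasoning")
    return {
        "provider": provider,
        "model": model,
        "currency": "USD",
        "input_cost": input_cost,
        "cached_input_cost": cached_input_cost,
        "output_cost": output_cost,
        "reasoning_cost": reasoning_cost,
        "total_cost": input_cost + cached_input_cost + output_cost + reasoning_cost,
        "pricing_snapshot_date": pricing_snapshot_date,
    }


def is_stale(record, today, max_age_days=30):
    # Hypothetical staleness rule: snapshot older than max_age_days.
    snapshot = date.fromisoformat(record["pricing_snapshot_date"])
    return (today - snapshot).days > max_age_days


record = token_cost_record("openai", "gpt-4o", 1000, 500, 200, 100, "2025-01-01")
print(record["total_cost"])                 # 0.0049
print(is_stale(record, date(2025, 3, 1)))   # True
```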
Unified pricing across all major providers
The spanforge.integrations._pricing module searches all provider tables automatically via get_pricing(model_id). One call returns input, output, cached-input, and reasoning token rates for any supported model.
| Provider | Models | Status |
|---|---|---|
| OpenAI | gpt-4o, gpt-4-turbo, gpt-3.5, o1 reasoning models | Live |
| Anthropic | Claude 3 Opus, Sonnet, Haiku; Claude 2.1 | Live |
| Google Gemini | gemini-2.0-flash, gemini-1.5-pro, gemini-1.5-flash | Live (2.1.0+) |
| AWS Bedrock | Claude 3 series, Llama, Mistral, Titan, Cohere | Live (2.1.0+) |
| Groq | Mixtral, Llama series | Live |
| Together AI | Open-source model portfolio | Live |
| LLM inference cost estimation | Full API cost projection at Design time | Roadmap 2027 |
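The "search every provider table for a model ID" behaviour described for get_pricing(model_id) can be approximated with a small stand-in. The function below is deliberately named `lookup_pricing` to make clear it is not the SpanForge implementation; the table layout, the return shape, and all rates are placeholders chosen for illustration.

```python
from typing import Dict, Optional

# Illustrative tables keyed by provider, then model ID.
# Rates are placeholders (USD per one million tokens), not real prices.
PROVIDER_TABLES: Dict[str, Dict[str, Dict[str, float]]] = {
    "openai": {
        "gpt-4o": {"input": 1.0, "output": 2.0, "cached_input": 0.5, "reasoning": 0.0},
    },
    "google": {
        "gemini-1.5-pro": {"input": 1.5, "output": 3.0, "cached_input": 0.75, "reasoning": 0.0},
    },
}


def lookup_pricing(model_id: str) -> Optional[dict]:
    """Return the rates for model_id from the first table that lists it."""
    for provider, table in PROVIDER_TABLES.items():
        if model_id in table:
            return {"provider": provider, "model": model_id, **table[model_id]}
    return None  # unknown model: the caller decides how to handle missing pricing


print(lookup_pricing("gemini-1.5-pro"))
print(lookup_pricing("not-a-real-model"))  # None
```

Returning `None` for an unknown model, rather than a zero rate, avoids silently recording a cost of zero for calls that could not be priced.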
Cost accountability is a governance obligation
The T.R.U.S.T. Framework's Responsibility dimension requires that named, accountable owners understand the financial implications of their AI systems before committing to them. The Cost Intelligence Layer is the technical mechanism that makes this possible — at Design time via infrastructure cost estimation, and in production via continuous token cost attribution against the responsible owner's identity in every SpanForge event.
Design phase — Gate Readiness Score™ Cost Readiness
The Cost Intelligence Layer is the required evidence source for the sixth Gate Readiness Score™ dimension. No Design Exit Gate evidence package is complete without a documented infrastructure cost estimate with scenario comparison — demonstrating that the architecture decision was made with cost visibility, not cost blindness.
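A scenario comparison of the kind described here can be sketched as a small calculation over infrastructure inputs. Everything in the example is hypothetical: the `Scenario` fields, the instance counts, and the hourly and storage rates are invented for illustration and do not reflect how the Design-phase estimator models cost.

```python
from dataclasses import dataclass
from decimal import Decimal

HOURS_PER_MONTH = 730  # common approximation of hours in a month


@dataclass(frozen=True)
class Scenario:
    """Hypothetical infrastructure profile with placeholder rates."""
    name: str
    instance_count: int
    hourly_rate_usd: Decimal
    storage_gb: int
    storage_rate_usd_per_gb: Decimal

    def monthly_cost(self) -> Decimal:
        compute = self.instance_count * self.hourly_rate_usd * HOURS_PER_MONTH
        storage = self.storage_gb * self.storage_rate_usd_per_gb
        return compute + storage


scenarios = [
    Scenario("performance", 4, Decimal("0.80"), 500, Decimal("0.02")),
    Scenario("cheapest", 1, Decimal("0.10"), 100, Decimal("0.02")),
    Scenario("balanced", 2, Decimal("0.25"), 200, Decimal("0.02")),
]

# Rank the profiles by estimated monthly cost, lowest first.
comparison = sorted(((s.name, s.monthly_cost()) for s in scenarios), key=lambda row: row[1])
for name, cost in comparison:
    print(f"{name:<12} ${cost:>9.2f}/month")
```

The resulting table is the kind of artifact that could accompany an evidence package: each profile, its inputs, and the estimate side by side before anything is provisioned.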
Scale phase — continuous cost audit trail
In production, every token consumed is recorded as a cost event with actor_id attribution, session aggregation, and HMAC chain linkage. The cost audit trail is inseparable from the compliance audit trail — both live in the same RFC-0001 event stream, cryptographically sealed together.
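Attribution and budget enforcement over such an event stream reduce to grouping costs by an identity field and comparing the totals against limits. The sketch below assumes each event carries org_id, team_id, actor_id, and a total_cost string; the sample events, the `cost_by` and `over_budget` helpers, and the budget figures are hypothetical.

```python
from collections import defaultdict
from decimal import Decimal

# Illustrative events; real events would carry many more fields.
events = [
    {"org_id": "org-1", "team_id": "team-a", "actor_id": "alice", "total_cost": "0.0120"},
    {"org_id": "org-1", "team_id": "team-a", "actor_id": "bob", "total_cost": "0.0300"},
    {"org_id": "org-1", "team_id": "team-b", "actor_id": "carol", "total_cost": "0.0050"},
    {"org_id": "org-1", "team_id": "team-a", "actor_id": "alice", "total_cost": "0.0080"},
]


def cost_by(events, dimension):
    """Sum total_cost per value of the given attribution field."""
    totals = defaultdict(Decimal)  # Decimal() starts each bucket at zero
    for event in events:
        totals[event[dimension]] += Decimal(event["total_cost"])
    return dict(totals)


def over_budget(totals, budgets):
    """Return the keys whose spend exceeds their configured budget."""
    return sorted(k for k, spent in totals.items() if k in budgets and spent > budgets[k])


team_totals = cost_by(events, "team_id")
actor_totals = cost_by(events, "actor_id")
team_budgets = {"team-a": Decimal("0.04"), "team-b": Decimal("0.01")}  # hypothetical limits

print(team_totals)
print(over_budget(team_totals, team_budgets))  # ['team-a']
```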
Cost-aware from Design. Accountable in production.
Start with the Design phase to get your infrastructure cost estimate — or connect to the Scale phase for runtime cost tracking via SpanForge.