SpanForge Platform — Cost Intelligence

Cost Intelligence Layer

The Cost Intelligence Layer unifies cost governance across two dimensions: design-time infrastructure cost estimation before any resource is committed, and runtime token cost tracking in production via the RFC-0001 SpanForge standard. Cost is not a report; it is a governed signal at every stage of the AI lifecycle.

The problem it solves

Cost governed at both ends of the lifecycle

AI cost overruns happen at two distinct points. At Design time, infrastructure and model spend is committed without a cost estimate, and the first bill arrives after architecture decisions are already locked in. At production scale, token consumption accumulates across thousands of agent runs with no per-call attribution, no budget enforcement, and no audit trail that regulators can verify.

The SpanForge Cost Intelligence Layer closes both gaps. Design-time cost estimation uses infrastructure configuration inputs to produce scenario-compared estimates before any resource is provisioned. Runtime cost tracking uses the SpanForge llm.cost.* namespace to record every token, every call, and every session cost — as a cryptographically signed, HMAC-chained event — alongside the rest of the compliance audit trail.

“Cost is a compliance signal, not a finance report. Every token consumed is evidence of a decision made — and the Cost Intelligence Layer ensures it is recorded, attributed, and auditable.”

Core capabilities

Six capabilities across the full lifecycle

From Design-time infrastructure estimates through production token attribution — cost intelligence embedded as a first-class compliance signal.

Design-time infrastructure cost estimation

An infrastructure configuration or model selection produces a cost estimate before any resource is created, so cost is known at the decision point rather than discovered on the first bill. Each estimate includes a scenario comparison across cheapest, balanced, and performance profiles.
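
As a rough illustration of scenario comparison, the sketch below estimates a monthly cost for three profiles and sorts them from cheapest to most expensive. The `Profile` class, the hourly rates, the instance counts, and the 730-hour month are all assumptions made for this example; they are not SpanForge inputs or real prices.

```python
from dataclasses import dataclass
from decimal import Decimal

HOURS_PER_MONTH = 730  # common billing approximation, assumed here


@dataclass(frozen=True)
class Profile:
    """Hypothetical infrastructure profile; all rates are placeholders."""
    name: str
    hourly_rate: Decimal  # price per instance-hour
    instances: int        # number of instances provisioned


def monthly_estimate(profile: Profile) -> Decimal:
    """Estimated monthly cost for one profile."""
    return profile.hourly_rate * profile.instances * HOURS_PER_MONTH


def compare_scenarios(profiles: list[Profile]) -> list[tuple[str, Decimal]]:
    """Return (name, monthly cost) pairs sorted from cheapest to most expensive."""
    pairs = ((p.name, monthly_estimate(p)) for p in profiles)
    return sorted(pairs, key=lambda pair: pair[1])


profiles = [
    Profile("performance", Decimal("1.20"), 4),
    Profile("cheapest", Decimal("0.10"), 2),
    Profile("balanced", Decimal("0.40"), 3),
]

for name, cost in compare_scenarios(profiles):
    print(f"{name:<12} {cost:>10.2f} per month")
```

Using `Decimal` rather than floats keeps the estimates exact, which matters when the figures end up in an evidence package.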

Runtime token cost tracking (SpanForge)

Every LLM call emits a structured cost event via the SpanForge llm.cost.* namespace — input tokens, output tokens, cached tokens, and reasoning tokens tracked per-call and aggregated across sessions. All included in the HMAC-signed audit trail.
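
The per-call and per-session arithmetic can be sketched as follows. This is a minimal example, not the SpanForge implementation: the `Rates` and `Usage` classes are hypothetical, the prices are placeholders, and it assumes cached tokens are billed as a separate class rather than as a subset of input tokens.

```python
from dataclasses import dataclass
from decimal import Decimal

PER_MILLION = Decimal(1_000_000)


@dataclass(frozen=True)
class Rates:
    """Price per one million tokens. Values used below are placeholders."""
    input: Decimal
    output: Decimal
    cached_input: Decimal
    reasoning: Decimal


@dataclass(frozen=True)
class Usage:
    input_tokens: int
    output_tokens: int
    cached_tokens: int = 0
    reasoning_tokens: int = 0


def call_cost(usage: Usage, rates: Rates) -> dict[str, Decimal]:
    """Break one call's cost into the four token classes plus a total."""
    breakdown = {
        "input_cost": usage.input_tokens * rates.input / PER_MILLION,
        "output_cost": usage.output_tokens * rates.output / PER_MILLION,
        "cached_input_cost": usage.cached_tokens * rates.cached_input / PER_MILLION,
        "reasoning_cost": usage.reasoning_tokens * rates.reasoning / PER_MILLION,
    }
    breakdown["total_cost"] = sum(breakdown.values(), Decimal(0))
    return breakdown


def session_cost(calls: list[dict[str, Decimal]]) -> Decimal:
    """Aggregate the total cost of every call in a session."""
    return sum((c["total_cost"] for c in calls), Decimal(0))


rates = Rates(Decimal("2.00"), Decimal("8.00"), Decimal("0.50"), Decimal("8.00"))
first = call_cost(Usage(1000, 500, cached_tokens=2000, reasoning_tokens=250), rates)
second = call_cost(Usage(4000, 1000), rates)
print(first["total_cost"], session_cost([first, second]))
```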

Cross-provider unified cost model

A single pricing surface across OpenAI, Anthropic, Google Gemini, AWS Bedrock, Groq, and Together AI. Pricing snapshots are recorded in every CostBreakdown event for auditability. One cost model regardless of which providers your architecture uses.

Cost attribution by user, team, and dimension

Costs are attributed to org_id, team_id, and actor_id fields on every SpanForge event — enabling chargeback, budget enforcement, and per-initiative cost accountability across complex multi-agent workflows.
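
To illustrate how attribution fields support chargeback and budget checks, the sketch below groups event costs by one dimension and lists the keys that exceed a budget. The event dictionaries, the `attribute` and `over_budget` helpers, and the budget figures are hypothetical; only the field names `org_id`, `team_id`, and `actor_id` come from the description above.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical, simplified cost events carrying the attribution fields.
events = [
    {"org_id": "org-1", "team_id": "search", "actor_id": "alice", "total_cost": Decimal("0.012")},
    {"org_id": "org-1", "team_id": "search", "actor_id": "bob", "total_cost": Decimal("0.030")},
    {"org_id": "org-1", "team_id": "support", "actor_id": "carol", "total_cost": Decimal("0.005")},
    {"org_id": "org-1", "team_id": "search", "actor_id": "alice", "total_cost": Decimal("0.008")},
]


def attribute(events: list[dict], dimension: str) -> dict[str, Decimal]:
    """Sum total_cost per value of the chosen attribution field."""
    totals: dict[str, Decimal] = defaultdict(Decimal)
    for event in events:
        totals[event[dimension]] += event["total_cost"]
    return dict(totals)


def over_budget(totals: dict[str, Decimal], budgets: dict[str, Decimal]) -> list[str]:
    """Return the keys whose attributed cost exceeds their budget."""
    return sorted(k for k, v in totals.items() if k in budgets and v > budgets[k])


by_team = attribute(events, "team_id")
by_actor = attribute(events, "actor_id")
budgets = {"search": Decimal("0.040"), "support": Decimal("0.040")}
print(by_team, over_budget(by_team, budgets))
```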

Multi-agent cost rollup

Child agent run costs propagate automatically to the parent AgentRunPayload.total_cost, including all nested child costs. Hierarchical cost accountability across orchestrated agent workflows — no manual aggregation required.
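
The rollup behaviour can be pictured as a recursive sum over the agent tree. The `AgentRun` class below is a hypothetical stand-in written for this example, not the actual `AgentRunPayload` schema, and the costs are placeholders.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class AgentRun:
    """Hypothetical stand-in for an agent run with nested child runs."""
    name: str
    own_cost: Decimal
    children: list[AgentRun] = field(default_factory=list)

    @property
    def total_cost(self) -> Decimal:
        # A run's total is its own cost plus the totals of all descendants.
        return self.own_cost + sum((c.total_cost for c in self.children), Decimal(0))


reranker = AgentRun("reranker", Decimal("0.001"))
retriever = AgentRun("retriever", Decimal("0.004"), [reranker])
writer = AgentRun("writer", Decimal("0.020"))
planner = AgentRun("planner", Decimal("0.010"), [retriever, writer])

print(planner.total_cost)
```

Computing the total on demand, rather than storing it, is a simplification for the sketch; a real event pipeline would more likely write the rolled-up figure into the parent event when the run completes.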

Gate Readiness Score™ — Cost Readiness

Cost intelligence evidence feeds directly into the sixth dimension of the Gate Readiness Score™. A Design Exit Gate evidence package is not complete without a documented infrastructure cost estimate and scenario comparison.

RFC-0001 SpanForge — llm.cost.* namespace

Cost as a compliance event

Under the RFC-0001 SpanForge standard, every cost record is structured as a typed event with a schema-versioned payload, ULID event ID, UTC timestamp, HMAC signature, and audit chain linkage via prev_id. Cost events are emitted automatically when using SpanForge auto-instrumentation for OpenAI, Gemini, and Bedrock — or explicitly via the SpanForge tracing API.
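
To make the event envelope concrete, here is a minimal sketch of a signed, chained cost event built with the Python standard library. It is an illustration under stated assumptions, not the RFC-0001 algorithm: the canonical JSON encoding, the way the previous signature is mixed into the HMAC, the `schema_version` value, and the random hex ID used in place of a ULID are all choices made for this example.

```python
import hashlib
import hmac
import json
import secrets
from datetime import datetime, timezone
from typing import Optional

SECRET = b"demo-signing-key"  # placeholder key for the example only


def canonical(event: dict) -> bytes:
    """Serialize the event deterministically, excluding its signature."""
    body = {k: v for k, v in event.items() if k != "signature"}
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode()


def sign_event(event_type: str, payload: dict, prev_event: Optional[dict], key: bytes) -> dict:
    """Build an event linked to prev_event and sign it with HMAC-SHA256."""
    event = {
        "event_id": secrets.token_hex(16),  # stand-in for a ULID
        "event_type": event_type,
        "schema_version": "1.0",            # hypothetical value
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "prev_id": prev_event["event_id"] if prev_event else None,
        "payload": payload,
    }
    # Assumed chaining rule: the previous signature is part of the signed message.
    prev_signature = prev_event["signature"] if prev_event else ""
    message = prev_signature.encode() + canonical(event)
    event["signature"] = hmac.new(key, message, hashlib.sha256).hexdigest()
    return event


first = sign_event("llm.cost.token_recorded", {"total_cost": "0.009", "currency": "USD"}, None, SECRET)
second = sign_event("llm.cost.session_recorded", {"total_cost": "0.025", "currency": "USD"}, first, SECRET)
print(second["prev_id"] == first["event_id"])
```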

Event Type	Payload	Description
`llm.cost.token_recorded`	`CostTokenRecordedPayload`	Per-LLM-call cost record: input_cost, output_cost, cached_input_cost, reasoning_cost, total_cost, currency, provider, model, pricing_snapshot_date.
`llm.cost.session_recorded`	`CostSessionRecordedPayload`	Session-aggregate cost: cumulative cost across all LLM calls in an agent run or user session, with session_id grouping.
`llm.cost.attributed`	`CostAttributedPayload`	Cost attribution record: cost allocated to org_id, team_id, or actor_id for chargeback and budget accountability.

All cost events include a pricing_snapshot_date field for auditability. Pricing tables are updated continuously and versioned. Events emitted with stale pricing are flagged by the SpanForge linter (AO004).
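
A stale-pricing check of the kind described above could look roughly like this. The 30-day threshold, the `is_stale` helper, and the example event are assumptions for illustration; the actual rule applied by the linter is not specified here.

```python
from datetime import date, timedelta

MAX_SNAPSHOT_AGE = timedelta(days=30)  # assumed threshold, not taken from the linter


def is_stale(event: dict, emitted_on: date, max_age: timedelta = MAX_SNAPSHOT_AGE) -> bool:
    """Return True when the pricing snapshot is older than max_age at emission time."""
    snapshot = date.fromisoformat(event["pricing_snapshot_date"])
    return emitted_on - snapshot > max_age


# Hypothetical, trimmed-down cost event.
event = {
    "event_type": "llm.cost.token_recorded",
    "total_cost": "0.009",
    "currency": "USD",
    "pricing_snapshot_date": "2025-01-01",
}

print(is_stale(event, emitted_on=date(2025, 3, 1)))
```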

Provider coverage

Unified pricing across all major providers

The spanforge.integrations._pricing module searches all provider tables automatically via get_pricing(model_id). One call returns input, output, cached-input, and reasoning token rates for any supported model.
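
The lookup pattern can be sketched as a search across per-provider tables keyed by model ID. The code below is a self-contained stand-in and does not import or reproduce the real module: `lookup_pricing`, `ModelRates`, the table layout, and every rate shown are hypothetical placeholders.

```python
from typing import NamedTuple, Optional


class ModelRates(NamedTuple):
    """Rates per one million tokens; all values below are placeholders."""
    provider: str
    input: float
    output: float
    cached_input: float
    reasoning: float


# Hypothetical provider tables: model_id -> (input, output, cached_input, reasoning).
PROVIDER_TABLES: dict[str, dict[str, tuple[float, float, float, float]]] = {
    "openai": {"gpt-4o": (1.0, 2.0, 0.5, 0.0)},
    "google": {"gemini-1.5-pro": (1.0, 2.0, 0.25, 0.0)},
}


def lookup_pricing(model_id: str) -> Optional[ModelRates]:
    """Search every provider table and return the first match, or None."""
    for provider, table in PROVIDER_TABLES.items():
        if model_id in table:
            return ModelRates(provider, *table[model_id])
    return None


rates = lookup_pricing("gemini-1.5-pro")
print(rates.provider if rates else "unknown model")
```

Returning `None` for an unknown model, rather than raising, is a design choice for the sketch; callers then decide whether a missing price should block the event or only flag it.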

Provider	Models	Status
OpenAI	gpt-4o, gpt-4-turbo, gpt-3.5, o1 reasoning models	Live
Anthropic	Claude 3 Opus, Sonnet, Haiku; Claude 2.1	Live
Google Gemini	gemini-2.0-flash, gemini-1.5-pro, gemini-1.5-flash	Live (2.1.0+)
AWS Bedrock	Claude 3 series, Llama, Mistral, Titan, Cohere	Live (2.1.0+)
Groq	Mixtral, Llama series	Live
Together AI	Open-source model portfolio	Live
LLM inference cost estimation	Full API cost projection at Design time	Roadmap 2027

T.R.U.S.T. Framework — Responsibility

Cost accountability is a governance obligation

The T.R.U.S.T. Framework's Responsibility dimension requires that named, accountable owners understand the financial implications of their AI systems before committing to them. The Cost Intelligence Layer is the technical mechanism that makes this possible — at Design time via infrastructure cost estimation, and in production via continuous token cost attribution against the responsible owner's identity in every SpanForge event.

Design phase — Gate Readiness Score™ Cost Readiness

The Cost Intelligence Layer is the required evidence source for the sixth Gate Readiness Score™ dimension. No Design Exit Gate evidence package is complete without a documented infrastructure cost estimate with scenario comparison — demonstrating that the architecture decision was made with cost visibility, not cost blindness.

Scale phase — continuous cost audit trail

In production, every token consumed is recorded as a cost event with actor_id attribution, session aggregation, and HMAC chain linkage. The cost audit trail is inseparable from the compliance audit trail — both live in the same RFC-0001 event stream, cryptographically sealed together.
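
What chain linkage buys an auditor is tamper evidence: changing any recorded cost invalidates the signatures from that point on. The sketch below shows the idea with a toy chain. The `seal` and `verify` helpers, the event IDs, and the chaining rule (previous signature prepended to the signed message) are assumptions for illustration, not the RFC-0001 procedure.

```python
import hashlib
import hmac
import json

KEY = b"demo-signing-key"  # placeholder key for the example only


def _digest(key: bytes, prev_signature: str, body: dict) -> str:
    """HMAC-SHA256 over the previous signature plus the sorted JSON body."""
    message = prev_signature.encode() + json.dumps(body, sort_keys=True).encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def seal(key: bytes, bodies: list[dict]) -> list[dict]:
    """Link and sign a list of event bodies in order."""
    chain, prev_id, prev_sig = [], None, ""
    for index, body in enumerate(bodies):
        record = {"event_id": f"evt-{index}", "prev_id": prev_id, **body}
        signature = _digest(key, prev_sig, record)
        chain.append({**record, "signature": signature})
        prev_id, prev_sig = record["event_id"], signature
    return chain


def verify(key: bytes, chain: list[dict]) -> bool:
    """Check prev_id linkage and recompute every signature in order."""
    prev_id, prev_sig = None, ""
    for event in chain:
        body = {k: v for k, v in event.items() if k != "signature"}
        if body["prev_id"] != prev_id:
            return False
        if not hmac.compare_digest(_digest(key, prev_sig, body), event["signature"]):
            return False
        prev_id, prev_sig = body["event_id"], event["signature"]
    return True


chain = seal(KEY, [
    {"actor_id": "alice", "total_cost": "0.009"},
    {"actor_id": "alice", "total_cost": "0.016"},
])
tampered = [dict(event) for event in chain]
tampered[0]["total_cost"] = "9.99"  # simulate an after-the-fact edit

print(verify(KEY, chain), verify(KEY, tampered))
```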

Cost-aware from Design. Accountable in production.

Start with the Design phase to get your infrastructure cost estimate — or connect to the Scale phase for runtime cost tracking via SpanForge.
