CLI Reference
Full reference for the spanforge-secrets command-line interface.
Usage
spanforge-secrets COMMAND [OPTIONS] [ARGS]
Commands
| Command | Description |
|---|---|
scan | Scan files or stdin for PII and exposed API keys |
verify-chain | Verify the HMAC audit chain of a JSONL event log |
Exit Codes
| Code | Meaning |
|---|---|
0 | All inputs are clean — no violations found |
1 | At least one violation detected (PII or exposed API key); or the audit chain is invalid / tampered |
2 | Usage or argument error (e.g. no paths supplied, unknown flag) |
3 | I/O or format error (unreadable file, invalid JSON, git not found) |
scan
Scan one or more files, directories, or stdin for PII and API key leakage.
spanforge-secrets scan [OPTIONS] [PATH ...]
Positional arguments
| Argument | Description |
|---|---|
PATH ... | Files or directories to scan. Directories are walked recursively. Omit when using --stdin or --diff. |
Options
--stdin
Read from standard input instead of files. Treats the entire input as a plain UTF-8 text blob.
echo "alice@example.com" | spanforge-secrets scan --stdin
cat prompts.txt | spanforge-secrets scan --stdin
Mutually exclusive with PATH and --diff.
--diff
Scan only the lines added in git diff --staged. Designed for use as a pre-commit hook. Only lines beginning with + (excluding +++ diff headers) are scanned.
spanforge-secrets scan --diff
Requires git to be installed and the working directory to be inside a git repository. Exits 3 if git is not found or returns a non-zero status.
--format {json,sarif}
Output format. Default: json.
| Value | Description |
|---|---|
json | CI-Gate-01 JSON summary (default) — see JSON Output Format |
sarif | SARIF 2.1.0 document — see SARIF Output Format |
spanforge-secrets scan data/ --format sarif > results.sarif
--ignore-file FILE
Path to a file containing fnmatch glob patterns to ignore (one pattern per line). Lines starting with # are treated as comments.
spanforge-secrets scan data/ --ignore-file ci/secrets-ignore.txt
If this flag is omitted, the scanner auto-detects .spanforge-secretsignore in the current working directory. See Ignore Patterns.
--no-scan-raw
Disable raw string regex scanning. Returns a clean (empty) result for all inputs. Default is enabled (--scan-raw).
Note: this flag exists for API compatibility with
spanforge.redact.contains_pii(scan_raw=False). In normal usage, leave raw scanning enabled (the default).
--scan-raw
Explicitly enable raw string regex scanning (this is the default and does not need to be specified).
Supported file types
| Extension | Handling |
|---|---|
.json | Parsed as a JSON object; entire structure is walked recursively |
.jsonl, .ndjson | Parsed line-by-line; each line is a separate JSON object |
| Anything else | Treated as UTF-8 plain text |
Files with known binary extensions (.png, .pdf, .zip, .exe, etc.) are automatically skipped. Files larger than 50 MB are also skipped with a warning on stderr.
Non-UTF-8 files produce a warning on stderr and are skipped (not an error).
verify-chain
Verify the HMAC integrity of a JSONL audit-log file.
spanforge-secrets verify-chain AUDIT_LOG --secret HMAC_SECRET
Positional arguments
| Argument | Description |
|---|---|
AUDIT_LOG | Path to the JSONL audit log. Each line must be a valid JSON object representing a spanforge.event.Event. Blank lines are skipped. |
Options
--secret HMAC_SECRET
The HMAC signing secret that was used when the audit chain was created. This is passed directly to spanforge.signing.verify_chain(org_secret=...). If omitted, the SPANFORGE_HMAC_SECRET environment variable is used. Exits 2 if neither is provided.
Output
Prints a JSON object to stdout:
{
"valid": true,
"first_tampered": null,
"gaps": [],
"tampered_count": 0,
"tombstone_count": 0
}
| Field | Type | Description |
|---|---|---|
valid | bool | true if the chain is intact |
first_tampered | int | null | 0-based index of the first tampered event, or null |
gaps | list[int] | List of positions where chain linkage breaks |
tampered_count | int | Number of events with invalid signatures |
tombstone_count | int | Number of tombstone events in the chain |
Exit code is 0 when valid is true, 1 otherwise.
JSON Output Format
The scan sub-command emits a single JSON object to stdout.
{
"gate": "CI-Gate-01",
"clean": false,
"total_violations": 3,
"results": [
{
"source": "data/training.jsonl",
"clean": false,
"violation_count": 2,
"scanned_strings": 120,
"hits": [
{
"entity_type": "email",
"path": "messages[3].content",
"match_count": 1,
"sensitivity": "medium",
"category": "pii"
},
{
"entity_type": "ssn",
"path": "messages[7].content",
"match_count": 1,
"sensitivity": "high",
"category": "pii"
}
]
},
{
"source": "prompts/system.txt",
"clean": false,
"violation_count": 1,
"scanned_strings": 1,
"hits": [
{
"entity_type": "openai_api_key",
"path": "<text>",
"match_count": 1,
"sensitivity": "high",
"category": "api_key"
}
]
}
]
}
Top-level fields
| Field | Type | Description |
|---|---|---|
gate | string | Always "CI-Gate-01" |
clean | bool | true when total_violations == 0 |
total_violations | int | Sum of all per-file violation counts |
results | array | One entry per scanned source |
Per-result fields
| Field | Type | Description |
|---|---|---|
source | string | File path, "<stdin>", or "diff:path/to/file" |
clean | bool | true when this source has no violations |
violation_count | int | Number of hits for this source |
scanned_strings | int | Number of string values inspected |
hits | array | Detection hits — see below |
Hit fields
| Field | Type | Values |
|---|---|---|
entity_type | string | email, phone, ssn, credit_card, ip_address, uk_national_insurance, aadhaar, pan, date_of_birth, address, openai_api_key, anthropic_api_key, aws_access_key_id, aws_secret_access_key, gcp_service_account_key |
path | string | Dot/bracket JSON path, or "<text>" for raw strings |
match_count | int | Number of distinct regex matches |
sensitivity | string | "high", "medium", or "low" |
category | string | "pii" or "api_key" |
Privacy: matched values are never included in the output — only type, path, count, and sensitivity level.
SARIF Output Format
The SARIF 2.1.0 output is compatible with GitHub Advanced Security / Code Scanning. When uploaded via actions/upload-sarif, findings appear as pull-request annotations.
spanforge-secrets scan data/ --format sarif > results.sarif
The SARIF document maps sensitivity levels to SARIF severities:
| Sensitivity | SARIF level |
|---|---|
high | error |
medium | warning |
low | note |
See CI Integration for a complete GitHub Actions workflow that uploads SARIF results.
Stderr messages
The CLI writes informational messages to stderr (not captured in the JSON output):
| Message | Cause |
|---|---|
skipping binary file: <path> | File has a binary extension |
skipping <path> (X MB > 50 MB limit) | File exceeds the 50 MB size guard |
skipping non-UTF-8 file: <path> | File cannot be decoded as UTF-8 |
ignoring <path> | File matches an ignore pattern |
Errors (fatal) are written to stderr and cause a non-zero exit:
spanforge-secrets: error: <message>