
How Stave Works

Stave is a pipeline with four stages: inputs, schema validation, evaluation, and structured output. Each stage is shown below.

1. Inputs

Two inputs feed the pipeline — observation snapshots captured from your infrastructure, and control definitions that encode your safety policies.

flowchart TD
OBS["Observations"]
CTL["Controls"]

OBS --> NEXT["Schema Validation"]
CTL --> NEXT

Each observation file is a flat JSON snapshot of your infrastructure at a specific point in time. Each control is a YAML file defining a safety property that must hold true. Stave ships with 43 S3 controls, and you can write your own.

2. Schema Validation

Both inputs are validated against embedded JSON Schema (Draft 2020-12) before evaluation begins. If validation fails, Stave exits with code 2 and no evaluation runs.

Valid inputs proceed to evaluation:

flowchart TD
OBS["Observations"] --> OV{"Observation Schema"}
CTL["Controls"] --> CV{"Control Schema"}

OV -->|Pass| ENGINE["Evaluation Engine"]
CV -->|Pass| ENGINE

Invalid inputs halt the pipeline:

flowchart TD
OBS["Observations"] --> OV{"Observation Schema"}
CTL["Controls"] --> CV{"Control Schema"}

OV -->|Fail| STOP["No Evaluation"]
CV -->|Fail| STOP

# Validate inputs before evaluation
stave validate --controls ./controls --observations ./observations
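
For CI use, the validation step can be wrapped so that a schema failure is reported separately from other errors. The following is a minimal sketch, not part of Stave: the helper names are hypothetical, the flags are the ones shown above, exit code 2 is taken from the text above, and treating 0 as success is an assumed CLI convention.

```python
import subprocess

# Exit code 2 signals a schema validation failure (per the text above).
# Treating 0 as success is the usual CLI convention, assumed here.
SCHEMA_INVALID = 2


def build_validate_cmd(controls_dir, observations_dir):
    """Return the argv list for `stave validate` with the flags shown above."""
    return [
        "stave", "validate",
        "--controls", controls_dir,
        "--observations", observations_dir,
    ]


def describe_exit(code):
    """Map an exit code to a short label for CI logs."""
    if code == 0:
        return "valid"
    if code == SCHEMA_INVALID:
        return "schema-invalid"
    # Other codes are not documented in this section, so keep them opaque.
    return f"unexpected-exit-{code}"


def validate(controls_dir, observations_dir, run=subprocess.run):
    """Run validation and return (label, stderr text).

    `run` is injectable so the wrapper can be exercised without the binary.
    """
    result = run(
        build_validate_cmd(controls_dir, observations_dir),
        capture_output=True,
        text=True,
    )
    return describe_exit(result.returncode), result.stderr
```

Calling validate("./controls", "./observations") requires the stave binary on PATH; a pipeline would typically stop when the label is anything other than "valid".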

3. Evaluation

After validation, the engine builds a timeline for each resource across all snapshots, then evaluates every control against every resource.
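
The two steps in that sentence, building per-resource timelines and then evaluating every control against every resource, can be sketched as follows. This is an illustration under stated assumptions, not Stave's engine: the snapshot field names (captured_at, resources, id, unsafe) are hypothetical rather than the obs.v0.1 schema, and a control is modeled as a plain function over a timeline.

```python
from collections import defaultdict
from datetime import datetime


def build_timelines(snapshots):
    """Group per-resource states across snapshots, ordered by capture time.

    Each snapshot is assumed to look like
    {"captured_at": "<ISO 8601>", "resources": [{"id": ..., "unsafe": bool}]}.
    These field names are hypothetical, not the obs.v0.1 schema.
    """
    timelines = defaultdict(list)
    for snap in snapshots:
        # Normalize a trailing "Z" so older Python versions can parse it.
        when = datetime.fromisoformat(snap["captured_at"].replace("Z", "+00:00"))
        for res in snap["resources"]:
            timelines[res["id"]].append((when, res["unsafe"]))
    for points in timelines.values():
        points.sort(key=lambda point: point[0])  # oldest first
    return dict(timelines)


def evaluate(timelines, controls):
    """Apply every control to every resource timeline.

    `controls` maps a control ID to a predicate over a timeline; the result
    is a sorted list of (control_id, resource_id) pairs that violated.
    """
    findings = []
    for control_id, check in controls.items():
        for resource_id, points in timelines.items():
            if check(points):
                findings.append((control_id, resource_id))
    return sorted(findings)
```

Sorting both the timeline points and the findings keeps the result independent of the order in which snapshot files are read.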

flowchart TD
SNAP["Snapshots"] --> TL["Timeline Builder"]
TL --> EV["Control Evaluator"]
CTL["Controls"] --> EV

EV --> SAFE["Safe"]
EV --> UNSAFE["Violation"]

Type                Behavior
unsafe_state        Violation if the resource is part of the attack surface
unsafe_duration     Violation if the resource has been continuously unsafe longer than --max-unsafe
unsafe_recurrence   Violation if the resource has toggled unsafe repeatedly
prefix_exposure     Violation if protected S3 prefixes are publicly readable
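
The two time-based types can be illustrated with a short sketch. It assumes a timeline is a list of (timestamp, unsafe) pairs sorted oldest first, that the unsafe run is measured from its first unsafe snapshot up to the --now timestamp, and that recurrence is a count of transitions into the unsafe state compared against a hypothetical min_entries threshold. Stave's actual rules may differ in these details.

```python
from datetime import datetime, timedelta


def unsafe_since(points):
    """Return when the current uninterrupted unsafe run began, or None."""
    start = None
    for when, unsafe in points:  # points are sorted oldest first
        if unsafe:
            if start is None:
                start = when
        else:
            start = None  # a safe observation resets the run
    return start


def violates_unsafe_duration(points, now, max_unsafe):
    """True if the resource has been continuously unsafe longer than max_unsafe."""
    start = unsafe_since(points)
    return start is not None and now - start > max_unsafe


def count_unsafe_entries(points):
    """Count transitions into the unsafe state (an unsafe first point counts)."""
    entries = 0
    previous = False
    for _, unsafe in points:
        if unsafe and not previous:
            entries += 1
        previous = unsafe
    return entries


def violates_unsafe_recurrence(points, min_entries):
    """True if the resource entered the unsafe state at least min_entries times."""
    return count_unsafe_entries(points) >= min_entries


# Example timeline: unsafe, fixed, then unsafe again from 5 January onward.
points = [
    (datetime(2026, 1, 1), True),
    (datetime(2026, 1, 3), False),
    (datetime(2026, 1, 5), True),
    (datetime(2026, 1, 10), True),
]
now = datetime(2026, 1, 15)
print(violates_unsafe_duration(points, now, timedelta(days=7)))  # True
print(violates_unsafe_recurrence(points, min_entries=2))         # True
```

In the example the current unsafe run starts on 5 January, so with --now set to 15 January it has lasted 10 days and exceeds a 7-day limit. The earlier unsafe period does not count toward the duration because the safe snapshot on 3 January interrupts it.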

4. Output

Stave's output conforms to the out.v0.1 schema, enforced at the Go type level. The output is structured JSON designed for machine consumption by downstream systems — CI/CD pipelines, dashboards, ticketing integrations, and audit tools.

flowchart TD
EV["Evaluation Engine"] --> OUT["Findings"]
OUT --> OS{"Output Schema"}

OS --> CI["CI/CD"]
OS --> DASH["Dashboards"]
OS --> TICKET["Ticketing"]
OS --> AUDIT["Audit"]

The output schema guarantees:

  • A run object with tool version, --now timestamp, snapshot count, and deterministic input hashes
  • A summary with resources_evaluated, attack_surface, and violations counts
  • A findings array where each finding includes control_id, resource_id, evidence, and mitigation
  • Deterministic output: the same inputs and the same --now value always produce byte-for-byte identical JSON

# Evaluate and pipe to downstream tools
stave apply \
  --controls controls/s3 \
  --observations ./observations \
  --max-unsafe 7d \
  --now 2026-01-15T00:00:00Z \
  | jq '.findings[] | select(.severity == "critical")'
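
A downstream consumer can also read the JSON directly instead of going through jq. The sketch below is a hypothetical consumer, not part of Stave: the key names come from the guarantees listed above, but the exact nesting, the value shapes, and the sample values (control ID, bucket names, mitigation text) are assumptions, and the run object is omitted. The digest helper shows one way to check the determinism guarantee by hashing the raw bytes of two runs.

```python
import hashlib
import json
from collections import Counter

# Hypothetical sample: key names follow the guarantees listed above, but the
# nesting and value shapes of out.v0.1 are assumptions made for illustration.
SAMPLE = """
{
  "summary": {"resources_evaluated": 3, "attack_surface": 2, "violations": 2},
  "findings": [
    {"control_id": "CTL.EXAMPLE.001", "resource_id": "bucket-a",
     "evidence": {}, "mitigation": "example mitigation text"},
    {"control_id": "CTL.EXAMPLE.001", "resource_id": "bucket-b",
     "evidence": {}, "mitigation": "example mitigation text"}
  ]
}
"""


def violations_by_control(report):
    """Count findings per control_id."""
    return Counter(finding["control_id"] for finding in report["findings"])


def ci_exit_code(report):
    """Wrapper policy (not Stave's own exit code): fail the job on any violation."""
    return 1 if report["summary"]["violations"] > 0 else 0


def digest(raw_bytes):
    """SHA-256 of the raw output, for comparing two runs byte for byte."""
    return hashlib.sha256(raw_bytes).hexdigest()


report = json.loads(SAMPLE)
print(violations_by_control(report))
print(ci_exit_code(report))
```

To check determinism, capture the output of two runs with the same inputs and the same --now value and compare digest(first_run) with digest(second_run); hash the bytes as written rather than re-serialized JSON, since the guarantee is about the bytes Stave emits.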

Schema Locations

Schema     Format                      Location
obs.v0.1   JSON Schema Draft 2020-12   schema_canonical/obs.v0.1.schema.json
ctrl.v1    JSON Schema Draft 2020-12   schema_canonical/ctrl.v1.schema.json
out.v0.1   Go struct contract          Enforced by internal/adapters/output/writer.go

Observations and controls are validated at runtime against the embedded JSON Schema files. The output schema is defined by Go struct types — downstream consumers can rely on the out.v0.1 field contract being stable across patch releases.