
Create Observation Snapshots

Stave evaluates observation snapshots — JSON files that describe your infrastructure at a point in time. This guide is a set of recipes for producing obs.v0.1 snapshots from a live AWS account, from Terraform, or by hand.

For the field-by-field spec of the S3 property groups these recipes populate, see Observation Export Schema. For the schema envelope (top-level structure, assets, identities, validation rules), see Observation Schema (obs.v0.1).

From a Live AWS Environment

Use the AWS CLI to query your S3 configuration, then convert the output to observation format using stave ingest --profile mvp1-s3.

Step 1: Export raw AWS data

#!/bin/bash
# Export raw S3 configuration for every bucket visible to the current credentials.
SNAPSHOT_DIR="./aws-snapshot-$(date +%Y-%m-%d)"
mkdir -p "$SNAPSHOT_DIR"/{get-bucket-policy,get-bucket-acl,get-public-access-block,get-bucket-tagging}

aws s3api list-buckets > "$SNAPSHOT_DIR/list-buckets.json"

# A failed call (for example, a bucket with no policy or no tags) is ignored
# so the loop continues with the remaining buckets.
for bucket in $(jq -r '.Buckets[].Name' "$SNAPSHOT_DIR/list-buckets.json"); do
  aws s3api get-bucket-policy --bucket "$bucket" \
    > "$SNAPSHOT_DIR/get-bucket-policy/$bucket.json" 2>/dev/null || true
  aws s3api get-bucket-acl --bucket "$bucket" \
    > "$SNAPSHOT_DIR/get-bucket-acl/$bucket.json" 2>/dev/null || true
  aws s3api get-public-access-block --bucket "$bucket" \
    > "$SNAPSHOT_DIR/get-public-access-block/$bucket.json" 2>/dev/null || true
  aws s3api get-bucket-tagging --bucket "$bucket" \
    > "$SNAPSHOT_DIR/get-bucket-tagging/$bucket.json" 2>/dev/null || true
done

Required IAM permissions:

{
  "Effect": "Allow",
  "Action": [
    "s3:ListAllMyBuckets",
    "s3:GetBucketPolicy",
    "s3:GetBucketAcl",
    "s3:GetBucketPublicAccessBlock",
    "s3:GetBucketTagging"
  ],
  "Resource": "*"
}

The expected directory structure after export:

aws-snapshot-2026-01-15/
  list-buckets.json
  get-bucket-policy/<bucket-name>.json
  get-bucket-acl/<bucket-name>.json
  get-public-access-block/<bucket-name>.json
  get-bucket-tagging/<bucket-name>.json

Step 2: Convert to observation format

stave ingest --profile mvp1-s3 \
  --input ./aws-snapshot-2026-01-15 \
  --output observations/2026-01-15.json \
  --include-all \
  --now 2026-01-15T00:00:00Z

ingest --profile mvp1-s3 reads the raw AWS CLI JSON and produces a single observation file that conforms to the obs.v0.1 schema. It maps each AWS API response to the property groups described in the Observation Export Schema.

By default, ingest --profile mvp1-s3 filters to buckets tagged with DataDomain=health or containsPHI=true. To change this:

# Extract all buckets (no filtering)
stave ingest --profile mvp1-s3 --input ./aws-snapshot-2026-01-15 --out obs.json --include-all

# Extract specific buckets by name
stave ingest --profile mvp1-s3 --input ./aws-snapshot-2026-01-15 --out obs.json \
  --bucket-allowlist acme-patient-records \
  --bucket-allowlist acme-audit-logs
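
To preview which buckets the default tag filter would keep before running ingest, a short script can apply the same rule to the exported get-bucket-tagging files. The sketch below is illustrative only: the function names are hypothetical, and the exact-match, case-sensitive comparison is an assumption about how the filter behaves, not a description of Stave's implementation.

```python
def tags_from_tagging(tagging: dict) -> dict:
    """Flatten a get-bucket-tagging response ({"TagSet": [...]}) into a dict."""
    return {t["Key"]: t["Value"] for t in tagging.get("TagSet", [])}


def matches_default_filter(tags: dict) -> bool:
    """Assumed reading of the default filter: DataDomain=health OR containsPHI=true."""
    return tags.get("DataDomain") == "health" or tags.get("containsPHI") == "true"


# Example: one bucket tagged as health data, one with unrelated tags.
health = tags_from_tagging({"TagSet": [{"Key": "DataDomain", "Value": "health"}]})
other = tags_from_tagging({"TagSet": [{"Key": "team", "Value": "web"}]})
print(matches_default_filter(health), matches_default_filter(other))
```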

How AWS API responses map to properties

| AWS API call | Properties populated |
| --- | --- |
| get-bucket-policy | visibility.public_read_via_policy, visibility.public_list_via_policy, access.has_external_access, access.external_accounts, access.has_wildcard_policy, policy.*, encryption.in_transit_enforced |
| get-bucket-acl | visibility.public_read_via_acl, visibility.public_write (via ACL) |
| get-public-access-block | controls.public_access_block.*, controls.public_access_fully_blocked |
| get-bucket-tagging | tags |
| Derived from above | visibility.public_read, visibility.public_list (union of policy and ACL signals) |
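
For a custom exporter, the ACL and public-access-block rows can be sketched roughly as follows. The input shapes are the standard AWS CLI responses; everything else is an assumption for illustration: the function names are hypothetical, treating FULL_CONTROL as both read and write is a simplification, and reading public_access_fully_blocked as "all four flags are true" is a guess at the rule, not Stave's confirmed logic.

```python
ALL_USERS = "http://acs.amazonaws.com/groups/global/AllUsers"
PAB_FLAGS = ("BlockPublicAcls", "IgnorePublicAcls",
             "BlockPublicPolicy", "RestrictPublicBuckets")


def acl_signals(acl: dict) -> dict:
    """Derive ACL-based visibility from a get-bucket-acl response."""
    # Collect the permissions granted to the anonymous AllUsers group.
    perms = {
        g.get("Permission")
        for g in acl.get("Grants", [])
        if g.get("Grantee", {}).get("URI") == ALL_USERS
    }
    return {
        "public_read_via_acl": bool(perms & {"READ", "FULL_CONTROL"}),
        "public_write": bool(perms & {"WRITE", "FULL_CONTROL"}),
    }


def fully_blocked(pab: dict) -> bool:
    """Assumed rule: fully blocked only when all four flags are true."""
    cfg = pab.get("PublicAccessBlockConfiguration", {})
    return all(cfg.get(flag, False) for flag in PAB_FLAGS)


def public_read(via_policy: bool, via_acl: bool) -> bool:
    """Union of the policy and ACL signals, as in the 'Derived' row."""
    return via_policy or via_acl
```

A missing or empty get-public-access-block file is treated here as "not blocked", which errs on the side of reporting exposure.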

Additional AWS API calls for complete coverage (not required by ingest --profile mvp1-s3 but can be included in custom exporters):

| AWS API call | Properties populated |
| --- | --- |
| get-bucket-encryption | encryption.at_rest_enabled, encryption.algorithm, encryption.kms_key_id |
| get-bucket-versioning | versioning.enabled, versioning.mfa_delete_enabled |
| get-bucket-logging | logging.enabled, logging.target_bucket, logging.target_prefix |
| get-bucket-lifecycle-configuration | lifecycle.* |
| get-object-lock-configuration | object_lock.* |
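
As a rough illustration of what a custom exporter might do with two of these calls, the sketch below maps get-bucket-encryption and get-bucket-versioning responses onto the encryption and versioning groups. The function names are hypothetical, and using only the first encryption rule is a simplifying assumption.

```python
def encryption_props(enc: dict) -> dict:
    """Map a get-bucket-encryption response to the encryption group."""
    rules = enc.get("ServerSideEncryptionConfiguration", {}).get("Rules", [])
    # Simplification: only the first rule's default is considered.
    default = rules[0].get("ApplyServerSideEncryptionByDefault", {}) if rules else {}
    props = {
        "at_rest_enabled": bool(default.get("SSEAlgorithm")),
        "algorithm": default.get("SSEAlgorithm", ""),
    }
    if "KMSMasterKeyID" in default:
        props["kms_key_id"] = default["KMSMasterKeyID"]
    return props


def versioning_props(ver: dict) -> dict:
    """Map a get-bucket-versioning response to the versioning group."""
    # A bucket that never had versioning configured returns an empty body.
    return {
        "enabled": ver.get("Status") == "Enabled",
        "mfa_delete_enabled": ver.get("MFADelete") == "Enabled",
    }
```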

From Terraform

If your S3 buckets are managed by Terraform, you can extract observation data from Terraform state or plan output without querying AWS APIs directly.

From Terraform state

terraform show -json > tf-state.json

The state JSON contains the current configuration of every managed resource. Extract S3 bucket attributes and map them to the observation schema:

| Terraform resource attribute | Observation property |
| --- | --- |
| aws_s3_bucket.bucket | storage.name |
| aws_s3_bucket_public_access_block.block_public_acls | controls.public_access_block.block_public_acls |
| aws_s3_bucket_public_access_block.ignore_public_acls | controls.public_access_block.ignore_public_acls |
| aws_s3_bucket_public_access_block.block_public_policy | controls.public_access_block.block_public_policy |
| aws_s3_bucket_public_access_block.restrict_public_buckets | controls.public_access_block.restrict_public_buckets |
| aws_s3_bucket_server_side_encryption_configuration.rule.apply_server_side_encryption_by_default.sse_algorithm | encryption.algorithm |
| aws_s3_bucket_server_side_encryption_configuration.rule.apply_server_side_encryption_by_default.kms_master_key_id | encryption.kms_key_id |
| aws_s3_bucket_versioning.versioning_configuration.status | versioning.enabled ("Enabled" = true) |
| aws_s3_bucket_logging.target_bucket | logging.target_bucket |
| aws_s3_bucket_logging.target_prefix | logging.target_prefix |
| aws_s3_bucket_lifecycle_configuration.rule | lifecycle.* |
| aws_s3_bucket_object_lock_configuration.rule.default_retention | object_lock.* |
| aws_s3_bucket_policy.policy | Parse JSON to derive visibility.*, access.*, policy.* |
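
The last row is the least mechanical, because the policy attribute is a JSON string that has to be parsed and interpreted. The following is a minimal sketch of two such derivations, under stated assumptions: the function names are hypothetical, a wildcard Allow is reported regardless of any conditions attached to it, and in-transit enforcement is detected only through the common Deny-when-aws:SecureTransport-is-false pattern. Stave's own derivation may be stricter or broader.

```python
import json


def _as_list(value):
    """IAM policy fields may hold a single item or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def _is_wildcard_principal(principal) -> bool:
    if principal == "*":
        return True
    if isinstance(principal, dict):
        return "*" in _as_list(principal.get("AWS", []))
    return False


def policy_signals(policy_json: str) -> dict:
    """Derive two illustrative signals from a bucket policy JSON string."""
    doc = json.loads(policy_json)
    has_wildcard = False
    tls_enforced = False
    for st in _as_list(doc.get("Statement", [])):
        wildcard = _is_wildcard_principal(st.get("Principal"))
        if st.get("Effect") == "Allow" and wildcard:
            has_wildcard = True
        secure = st.get("Condition", {}).get("Bool", {}).get("aws:SecureTransport")
        denies_plain_http = any(str(v).lower() == "false" for v in _as_list(secure))
        if st.get("Effect") == "Deny" and wildcard and denies_plain_http:
            tls_enforced = True
    return {"has_wildcard_policy": has_wildcard, "in_transit_enforced": tls_enforced}
```

The same function applies to a get-bucket-policy response from the AWS CLI, where the policy string is found under the "Policy" key.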

From Terraform plan

To evaluate what Terraform is about to deploy (before apply):

terraform plan -out=plan.tfplan
terraform show -json plan.tfplan > observations/2026-01-15.json

Evaluate Terraform plan output the same way as any other observation:

stave apply \
  --controls controls/s3 \
  --observations ./observations \
  --max-unsafe 7d
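
If you process plan output with your own script, note that Terraform's JSON layouts differ: state JSON places resources under values.root_module, while plan JSON places the post-apply view under planned_values.root_module, and both can nest further resources in child_modules. A small helper such as the sketch below (the function names are hypothetical) lets one converter read either file.

```python
def root_module(doc: dict) -> dict:
    """Return the root module from either plan JSON or state JSON."""
    if "planned_values" in doc:
        return doc["planned_values"].get("root_module", {})
    return doc.get("values", {}).get("root_module", {})


def iter_resources(module: dict):
    """Yield resources in this module and, recursively, in its child modules."""
    yield from module.get("resources", [])
    for child in module.get("child_modules", []):
        yield from iter_resources(child)


def bucket_names(doc: dict) -> list:
    """List the names of all aws_s3_bucket resources in the document."""
    return [
        r["values"]["bucket"]
        for r in iter_resources(root_module(doc))
        if r.get("type") == "aws_s3_bucket" and r.get("values", {}).get("bucket")
    ]
```

Attributes that are only known after apply are absent from planned_values, so the sketch skips buckets whose name is not yet known.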

Writing a Terraform-to-observation converter

For full control, write a script that reads terraform show -json output and produces obs.v0.1 JSON. A minimal approach using jq:

terraform show -json | jq '{
  schema_version: "obs.v0.1",
  captured_at: (now | todate),
  generated_by: {
    source_type: "terraform.state_json",
    tool: "terraform",
    tool_version: "1.9.8"
  },
  resources: [
    .values.root_module.resources[]
    | select(.type == "aws_s3_bucket")
    | {
        id: ("res:aws:s3:bucket:" + .values.bucket),
        type: "storage_bucket",
        vendor: "aws",
        properties: {
          storage: {
            kind: "bucket",
            name: .values.bucket,
            visibility: { public_read: false, public_list: false },
            controls: { public_access_fully_blocked: false },
            encryption: {
              at_rest_enabled: false,
              algorithm: "",
              in_transit_enforced: false
            },
            versioning: { enabled: false },
            logging: { enabled: false, target_bucket: "", target_prefix: "" }
          }
        }
      }
  ]
}' > observations/2026-01-15.json

This produces a baseline observation. Enrich it by joining data from related Terraform resources (aws_s3_bucket_public_access_block, aws_s3_bucket_server_side_encryption_configuration, etc.) to populate additional property groups.
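
One way to do that join is to key every related resource on its bucket attribute, which in state JSON holds the bucket name. The sketch below covers the public access block and versioning resources only. It assumes the input is the flat resource list from terraform show -json, that nested blocks such as versioning_configuration appear as one-element lists, and that public_access_fully_blocked means all four flags are true; the function name is hypothetical.

```python
PAB_FLAGS = ("block_public_acls", "ignore_public_acls",
             "block_public_policy", "restrict_public_buckets")


def join_bucket_resources(resources: list) -> dict:
    """Build storage properties per bucket name from related Terraform resources."""
    buckets = {}
    # First pass: one baseline entry per aws_s3_bucket.
    for res in resources:
        if res.get("type") == "aws_s3_bucket":
            name = res["values"]["bucket"]
            buckets[name] = {
                "kind": "bucket",
                "name": name,
                "controls": {"public_access_fully_blocked": False},
                "versioning": {"enabled": False},
            }
    # Second pass: attach related resources that reference a known bucket.
    for res in resources:
        values = res.get("values", {})
        storage = buckets.get(values.get("bucket"))
        if storage is None:
            continue
        if res.get("type") == "aws_s3_bucket_public_access_block":
            flags = {f: bool(values.get(f)) for f in PAB_FLAGS}
            storage["controls"] = {
                "public_access_block": flags,
                "public_access_fully_blocked": all(flags.values()),
            }
        elif res.get("type") == "aws_s3_bucket_versioning":
            cfg = (values.get("versioning_configuration") or [{}])[0]
            storage["versioning"]["enabled"] = cfg.get("status") == "Enabled"
    return buckets
```

Buckets without a matching related resource keep the conservative baseline values, mirroring the defaults in the jq example.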

By Hand

You can create observation JSON manually. This is useful for testing custom controls or evaluating hypothetical configurations. Here is an example with the most commonly used property groups populated (visibility, controls, encryption, versioning, logging, and tags):

{
  "schema_version": "obs.v0.1",
  "generated_by": {
    "source_type": "manual",
    "tool": "hand-written"
  },
  "captured_at": "2026-01-15T00:00:00Z",
  "resources": [
    {
      "id": "res:aws:s3:bucket:acme-patient-records",
      "type": "storage_bucket",
      "vendor": "aws",
      "properties": {
        "storage": {
          "kind": "bucket",
          "name": "acme-patient-records",
          "visibility": {
            "public_read": false,
            "public_list": false,
            "public_write": false
          },
          "controls": {
            "public_access_fully_blocked": true
          },
          "encryption": {
            "at_rest_enabled": true,
            "algorithm": "aws:kms",
            "kms_key_id": "arn:aws:kms:us-east-1:123456789012:key/mrk-abc123",
            "in_transit_enforced": true
          },
          "versioning": {
            "enabled": true,
            "mfa_delete_enabled": false
          },
          "logging": {
            "enabled": true,
            "target_bucket": "acme-access-logs",
            "target_prefix": "patient-records/"
          },
          "tags": {
            "data-classification": "phi",
            "data-retention": "7years"
          }
        }
      }
    }
  ]
}

As long as the file conforms to the obs.v0.1 schema, Stave will accept it. Use stave validate to check your files before evaluation:

stave validate --observations ./observations --controls controls/s3

Snapshot Cadence

For unsafe_duration controls, Stave needs multiple snapshots to calculate how long a resource has been unsafe:

  • Minimum: 2 snapshots at different times
  • Recommended: Daily snapshots (via cron or CI) for accurate duration tracking
  • Gaps: Observation gaps larger than 12 hours reduce confidence in findings

For unsafe_state controls (no duration tracking), a single snapshot is sufficient.

observations/
  2026-01-13.json
  2026-01-14.json
  2026-01-15.json

Point Stave at the directory and it evaluates all snapshots together, building resource timelines automatically.
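
To spot cadence problems before evaluation, you can compare the captured_at timestamps of consecutive snapshots against the 12-hour threshold mentioned above. This is a minimal sketch with hypothetical function names; it assumes every file in the directory is an obs.v0.1 snapshot with an ISO 8601 captured_at value, and it is not a substitute for whatever gap handling Stave performs internally.

```python
import json
from datetime import datetime, timedelta
from pathlib import Path


def parse_ts(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def find_gaps(captured_at: list, max_gap: timedelta = timedelta(hours=12)) -> list:
    """Return (earlier, later) pairs of consecutive snapshots further apart than max_gap."""
    times = sorted(parse_ts(t) for t in captured_at)
    return [(a, b) for a, b in zip(times, times[1:]) if b - a > max_gap]


def captured_times(directory: str) -> list:
    """Read captured_at from every .json snapshot in a directory."""
    return [
        json.loads(p.read_text())["captured_at"]
        for p in sorted(Path(directory).glob("*.json"))
    ]


# Example with inline timestamps instead of files:
for earlier, later in find_gaps(["2026-01-13T00:00:00Z",
                                 "2026-01-13T06:00:00Z",
                                 "2026-01-14T00:00:00Z"]):
    print(f"gap of {later - earlier} between {earlier} and {later}")
```

With real files, pass captured_times("./observations") to find_gaps.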

See also