The bucket that called itself confidential and then served itself to everyone

Metadata

Title: The bucket that called itself confidential and then served itself to everyone
Source of the case: Real HackerOne report #361438 (Uber)
AWS service(s): S3
Risk archetype: Self-contradicting state — the org's own classification tag disagrees with the live policy
One-line hook: Can you prove a bucket tagged confidential is actually private?

0. The challenge (what the reader does first)

Scenario given to the reader:

An S3 bucket named ubergreece holds operational data for Uber's Greece market. Someone on the data team tagged it data-classification: confidential, so on paper this is locked-down internal data. The bucket also carries a bucket policy and has no Public Access Block configured. You have the export in front of you and nothing else.

Evidence they're handed (and nothing else):

{
  "bucket": "ubergreece",
  "tags": {"data-classification": "confidential"},
  "bucket_policy": {"Statement": [{"Effect": "Allow", "Principal": "*", "Action": "s3:GetObject", "Resource": "arn:aws:s3:::ubergreece/*"}]},
  "public_access_block": null
}

The JSON export above. No AWS credentials. No live account. No scripts.

The questions they must answer from the evidence alone:

Is ubergreece readable by anonymous callers right now, given the policy and the absent Public Access Block?
The tag says confidential. Is that tag a control, or just a label that nothing enforces — and could the gap between intent and config get worse silently?
Which exposure path is live here? Is it the bucket policy granting Principal: *?
Is there a second path (ACL, object-level grant) that could also open this bucket, or is the policy the only door?
What single rule, applied account-wide, would have made the confidential tag and the live access agree?

1. The manual problem

To answer by hand you have to read the policy like an IAM evaluator: Effect: Allow, Principal: *, Action: s3:GetObject, Resource: .../*. That is an anonymous read grant on every object. Then you have to notice what is missing — public_access_block is null, so nothing overrides the policy. Then you have to hold the tag in your head and realize the organization already declared this data should not be public. Three facts, three different parts of the export, and the dangerous conclusion only appears when you combine them. The tag is not a security boundary; it is a sticky note. Nothing in the config makes the sticky note true.
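The three-fact combination described above can be sketched in a few lines of Python. This is a deliberately simplified reading, not a full IAM policy evaluator: it ignores Deny statements, action wildcards such as s3:Get*, and any account-level settings the export does not show. The function names, and the assumption that a configured Public Access Block would carry the AWS field name RestrictPublicBuckets, are illustrative and not part of the original export.

```python
# Simplified reading of the export; not a full IAM policy evaluator.
EXPORT = {
    "bucket": "ubergreece",
    "tags": {"data-classification": "confidential"},
    "bucket_policy": {"Statement": [{
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::ubergreece/*",
    }]},
    "public_access_block": None,
}

READ_ACTIONS = {"s3:GetObject", "s3:*", "*"}


def as_list(value):
    """Policy fields may be a single value or a list; always return a list."""
    return value if isinstance(value, list) else [value]


def is_anonymous(principal):
    """True for Principal "*" or {"AWS": "*"}."""
    if principal == "*":
        return True
    if isinstance(principal, dict):
        return "*" in as_list(principal.get("AWS", []))
    return False


def policy_grants_anonymous_read(policy):
    """Fact 1: an unconditional Allow of a read action to everyone."""
    for stmt in as_list((policy or {}).get("Statement", [])):
        if (stmt.get("Effect") == "Allow"
                and is_anonymous(stmt.get("Principal"))
                and READ_ACTIONS & set(as_list(stmt.get("Action", [])))
                and "Condition" not in stmt):
            return True
    return False


def read_by_hand(export):
    grant = policy_grants_anonymous_read(export.get("bucket_policy"))
    # Fact 2: the absence. Assumes a configured PAB uses the AWS field name.
    pab = export.get("public_access_block") or {}
    overridden = bool(pab.get("RestrictPublicBuckets"))
    # Fact 3: the declared intent.
    confidential = export.get("tags", {}).get("data-classification") == "confidential"
    anonymous_read = grant and not overridden
    return {"anonymous_read": anonymous_read,
            "contradicts_tag": anonymous_read and confidential}


print(read_by_hand(EXPORT))
# {'anonymous_read': True, 'contradicts_tag': True}
```

Neither result comes from a single field: each answer only appears once the policy, the missing block, and the tag are read together.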

2. The reasoning wall

What they hit	What they said / would say
The tag reads `confidential`, so the eye relaxes before reading the policy	"I trusted the label. I assumed someone who wrote `confidential` also locked it down."
`public_access_block: null` is an absence, and absences are easy to skim past	"Nothing jumped out as wrong. The thing that made it public was the thing that wasn't there."
Knowing it's public is not the same as proving the org contradicted itself	"I can see it's open. What I actually need is: this violates a rule we set for ourselves."

The insight the reader should reach on their own:

A tag is a promise. The config is the truth. Nobody was checking that the promise and the truth still agreed.

3. Why scanners miss or flatten it

A per-setting scanner will happily flag "bucket policy allows Principal: *" — that part is easy. What it cannot do is connect that finding to the data-classification: confidential tag and report the contradiction: the organization's own declared intent is being violated. To the scanner the tag is unrelated metadata in a different field. It reports a generic "public bucket" the same way it would for an intentionally public marketing CDN, with no way to say this one was supposed to be confidential and the policy disagrees. The severity that matters here is not "public" — it is "public, and you told us it must not be." That requires reasoning across the tag and the policy together, which a setting-by-setting checklist structurally cannot do.
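To make the flattening concrete, here is a toy per-setting check in Python. It is a caricature written for this case, not the logic of any real scanner, and the marketing-cdn bucket and its public tag value are hypothetical. The check never reads the tags, so both buckets come back with the same findings.

```python
def per_setting_findings(bucket):
    """Check each setting in isolation; the tags are never consulted."""
    findings = []
    for stmt in bucket["bucket_policy"]["Statement"]:
        if stmt["Effect"] == "Allow" and stmt["Principal"] == "*":
            findings.append("bucket policy allows Principal: *")
    if bucket["public_access_block"] is None:
        findings.append("Public Access Block not configured")
    return findings


def public_bucket(name, classification):
    """Build an export-shaped dict with a public-read policy and no PAB."""
    return {
        "bucket": name,
        "tags": {"data-classification": classification},
        "bucket_policy": {"Statement": [{
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{name}/*",
        }]},
        "public_access_block": None,
    }


confidential = public_bucket("ubergreece", "confidential")
marketing = public_bucket("marketing-cdn", "public")  # hypothetical bucket

# Same findings for both: the declared intent never reaches the verdict.
print(per_setting_findings(confidential) == per_setting_findings(marketing))
```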

Pivot point. Everything above is the gap. Everything below is Stave filling it. The reader has now done the work and hit the wall. Only now does the tool appear.

4. The evidence Stave consumes

The same static export the reader had — no new privileges, no live cloud:

{
  "bucket": "ubergreece",
  "tags": {"data-classification": "confidential"},
  "bucket_policy": {"Statement": [{"Effect": "Allow", "Principal": "*", "Action": "s3:GetObject", "Resource": "arn:aws:s3:::ubergreece/*"}]},
  "public_access_block": null
}

Stave normalizes this into an observation snapshot: a bucket asset with its policy statements, classification tag, and the (absent) Public Access Block all as evaluable facts on one asset.
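One way to picture that normalization is the Python sketch below. The class and field names (BucketAsset, PolicyStatement, normalize) are hypothetical and chosen for illustration; the source does not document Stave's actual snapshot schema. The point is only that the policy, the tag, and the absent Public Access Block end up as fields on a single asset.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PolicyStatement:
    effect: str
    principal: object  # "*" or a dict such as {"AWS": "..."}
    actions: tuple
    resource: str


@dataclass(frozen=True)
class BucketAsset:
    asset_id: str
    classification: Optional[str]
    statements: tuple
    public_access_block: Optional[dict]  # None records the absence explicitly


def normalize(export):
    """Flatten the raw export into one asset with evaluable facts."""
    raw = (export.get("bucket_policy") or {}).get("Statement", [])
    statements = tuple(
        PolicyStatement(
            effect=s["Effect"],
            principal=s["Principal"],
            actions=tuple(s["Action"]) if isinstance(s["Action"], list) else (s["Action"],),
            resource=s["Resource"],
        )
        for s in raw
    )
    return BucketAsset(
        asset_id=f"s3://{export['bucket']}",
        classification=export.get("tags", {}).get("data-classification"),
        statements=statements,
        public_access_block=export.get("public_access_block"),
    )


EXPORT = {
    "bucket": "ubergreece",
    "tags": {"data-classification": "confidential"},
    "bucket_policy": {"Statement": [{
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::ubergreece/*",
    }]},
    "public_access_block": None,
}

asset = normalize(EXPORT)
print(asset.asset_id, asset.classification, asset.public_access_block)
```

Keeping public_access_block as an explicit None, rather than dropping the key, is what lets a later rule treat the absence as a fact instead of skimming past it.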

5. The reasoning Stave performs

Control / invariant: CTL.S3.PUBLIC.001 — no bucket may grant read to Principal: *. CTL.S3.PUBLIC.002 — a bucket classified confidential must not be publicly readable.
What it evaluates: CTL.S3.PUBLIC.001 inspects every policy statement for an Allow to Principal: * on a read action with no Public Access Block to neutralize it. CTL.S3.PUBLIC.002 joins that public-read fact to the data-classification tag and fires only when the data was declared confidential — encoding the contradiction the reader had to assemble by hand.
Verdict produced: Both controls fire and consolidate into one Issue. The classification tag is the org's own evidence against the policy.

Issue: ubergreece — public read on confidential data

CTL.S3.PUBLIC.001  NON_COMPLIANT
  asset:    s3://ubergreece
  evidence: bucket_policy statement Allow Principal:* s3:GetObject on /*;
            public_access_block = null (nothing overrides it)
  verdict:  bucket is anonymously readable

CTL.S3.PUBLIC.002  NON_COMPLIANT
  asset:    s3://ubergreece
  evidence: tag data-classification=confidential AND public read = true
  verdict:  declared-confidential data is publicly readable — intent violated

security_state: NON_COMPLIANT
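The two controls above can be sketched as plain Python functions. The control IDs and the NON_COMPLIANT verdict come from the output shown; the COMPLIANT label, the function names, and the simplified public-read test are assumptions made for illustration, not Stave's implementation.

```python
def is_public_read(asset):
    """Simplified: Allow + Principal "*" + s3:GetObject, not restricted by PAB."""
    pab = asset.get("public_access_block") or {}
    if pab.get("RestrictPublicBuckets"):
        return False
    return any(
        s["Effect"] == "Allow"
        and s["Principal"] == "*"
        and s["Action"] == "s3:GetObject"
        for s in asset["bucket_policy"]["Statement"]
    )


def ctl_s3_public_001(asset):
    return "NON_COMPLIANT" if is_public_read(asset) else "COMPLIANT"


def ctl_s3_public_002(asset):
    # The join: fires only when the public-read fact meets a confidential tag.
    confidential = asset["tags"].get("data-classification") == "confidential"
    return "NON_COMPLIANT" if confidential and is_public_read(asset) else "COMPLIANT"


def evaluate(asset):
    controls = {
        "CTL.S3.PUBLIC.001": ctl_s3_public_001(asset),
        "CTL.S3.PUBLIC.002": ctl_s3_public_002(asset),
    }
    state = "NON_COMPLIANT" if "NON_COMPLIANT" in controls.values() else "COMPLIANT"
    return {"asset": f"s3://{asset['bucket']}", "controls": controls,
            "security_state": state}


ubergreece = {
    "bucket": "ubergreece",
    "tags": {"data-classification": "confidential"},
    "bucket_policy": {"Statement": [{
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::ubergreece/*",
    }]},
    "public_access_block": None,
}

print(evaluate(ubergreece))
```

Retagging the same bucket as something other than confidential would leave CTL.S3.PUBLIC.001 firing and CTL.S3.PUBLIC.002 quiet, which is the distinction a generic "public bucket" finding cannot express.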

6. The prevention artifact Stave produces

Artifact: A bucket-policy Deny statement plus an account-level Public Access Block recommendation, generated from the violating state.
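What it forecloses: The latent gap from question 2, a confidential tag that nothing enforces. With PAB on, a later policy edit cannot re-open this bucket to the public: BlockPublicPolicy rejects new public policies, and RestrictPublicBuckets cuts off anonymous access under any public policy already in place. The tag and the access agree for as long as PAB stays enabled.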

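# Bucket policy guardrail (explicit deny wins over any Allow).
# o-myorg is a placeholder organization ID. Requests from outside that org,
# including anonymous requests that carry no org ID at all, are denied.
{
  "Sid": "DenyPublicReadOnConfidential",
  "Effect": "Deny",
  "Principal": "*",
  "Action": "s3:GetObject",
  "Resource": "arn:aws:s3:::ubergreece/*",
  "Condition": {"StringNotEquals": {"aws:PrincipalOrgID": "o-myorg"}}
}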

# Account-level Public Access Block (neutralizes any future public policy)
# Replace <acct> with the 12-digit account ID. The configuration must be a
# single shell word, so all four settings stay on one line.
aws s3control put-public-access-block \
  --account-id <acct> \
  --public-access-block-configuration \
    BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true
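As a rough illustration of "generated from the violating state", the Python sketch below derives a Deny statement from the export. The function name and the rule that only confidential buckets get a guardrail are assumptions for this example, and o-myorg stands in for a real organization ID. The org test uses StringNotEquals on aws:PrincipalOrgID, which also matches anonymous requests because they carry no org ID.

```python
import json


def guardrail_for(export, org_id):
    """Return a Deny statement for a confidential bucket, else None."""
    if export.get("tags", {}).get("data-classification") != "confidential":
        return None
    return {
        "Sid": "DenyPublicReadOnConfidential",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": f"arn:aws:s3:::{export['bucket']}/*",
        # Negated operator: true when the org ID differs or is absent.
        "Condition": {"StringNotEquals": {"aws:PrincipalOrgID": org_id}},
    }


EXPORT = {
    "bucket": "ubergreece",
    "tags": {"data-classification": "confidential"},
    "bucket_policy": {"Statement": [{
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::ubergreece/*",
    }]},
    "public_access_block": None,
}

statement = guardrail_for(EXPORT, "o-myorg")  # placeholder org ID
print(json.dumps(statement, indent=2))
```

The statement would still need to be merged into the bucket's existing policy document, a step this sketch leaves out.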

7. What the team no longer does manually

Before	After Stave
Read each policy statement as an IAM evaluator to decide if it grants anonymous read	One control proves public-read across policy and PAB deterministically
Hold the classification tag in your head and manually cross-check it against access	The tag-vs-access contradiction is encoded as an invariant and re-checked every run
Hope a `confidential` tag means someone also locked the bucket	A generated PAB guardrail makes the tag enforceable, not aspirational

Positioning line for this case

Stave proves that ubergreece is publicly readable, proves it contradicts the org's own confidential classification, and emits the Public Access Block that makes that contradiction impossible to recreate.

Reuse checklist

A reader could attempt section 0 with zero Stave knowledge
Stave is not named or shown before the pivot point
Section 2 quotes are real (or honestly plausible), not slogans
Section 3 names the specific thing per-setting tools can't see
Section 6 closes the exact latent state raised in section 0, question 2
The title names the failure, not the product
