The bucket that called itself confidential and then served itself to everyone
Metadata
- Title: The bucket that called itself confidential and then served itself to everyone
- Source of the case: Real HackerOne report #361438 (Uber)
- AWS service(s): S3
- Risk archetype: Self-contradicting state — the org's own classification tag disagrees with the live policy
- One-line hook: Can you prove a bucket tagged confidential is actually private?
0. The challenge (what the reader does first)
Scenario given to the reader:
An S3 bucket named ubergreece holds operational data for Uber's Greece
market. Someone on the data team tagged it data-classification: confidential,
so on paper this is locked-down internal data. The bucket also carries a
bucket policy and has no Public Access Block configured. You have the export
in front of you and nothing else.
Evidence they're handed (and nothing else):
{
"bucket": "ubergreece",
"tags": {"data-classification": "confidential"},
"bucket_policy": {"Statement": [{"Effect": "Allow", "Principal": "*", "Action": "s3:GetObject", "Resource": "arn:aws:s3:::ubergreece/*"}]},
"public_access_block": null
}
- The JSON export above. No AWS credentials. No live account. No scripts.
The questions they must answer from the evidence alone:
- Is ubergreece readable by anonymous callers right now, given the policy and the absent Public Access Block?
- The tag says confidential. Is that tag a control, or just a label that nothing enforces, and could the gap between intent and config get worse silently?
- Which exposure path is live here: the bucket policy granting Principal: *?
- Is there a second path (ACL, object-level grant) that could also open this bucket, or is the policy the only door?
- What single rule, applied account-wide, would have made the confidential tag and the live access agree?
1. The manual problem
To answer by hand you have to read the policy like an IAM evaluator: Effect: Allow, Principal: *, Action: s3:GetObject, Resource: .../*. That is an
anonymous read grant on every object. Then you have to notice what is missing
— public_access_block is null, so nothing overrides the policy. Then you
have to hold the tag in your head and realize the organization already declared
this data should not be public. Three facts, three different parts of the
export, and the dangerous conclusion only appears when you combine them. The
tag is not a security boundary; it is a sticky note. Nothing in the config
makes the sticky note true.
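The three-fact reading above can be sketched in a few lines of Python. This is a minimal illustration, not a full IAM evaluator: it ignores Deny statements, NotPrincipal, ACLs, and conditional grants, and it assumes that a non-null public_access_block would use AWS's RestrictPublicBuckets field name, which the export does not confirm. The function names are hypothetical.

```python
import json

# The export from section 0, copied verbatim.
EXPORT = json.loads("""
{
  "bucket": "ubergreece",
  "tags": {"data-classification": "confidential"},
  "bucket_policy": {"Statement": [{"Effect": "Allow", "Principal": "*",
    "Action": "s3:GetObject", "Resource": "arn:aws:s3:::ubergreece/*"}]},
  "public_access_block": null
}
""")

# Actions treated as "read" for this sketch (an assumption, not an exhaustive list).
READ_ACTIONS = {"s3:GetObject", "s3:*", "*"}


def as_list(value):
    """Policy fields may hold one value or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def is_wildcard(principal):
    """True for Principal "*" and for {"AWS": "*"}, which both mean anyone."""
    if principal == "*":
        return True
    if isinstance(principal, dict):
        return "*" in as_list(principal.get("AWS", []))
    return False


def anonymous_read_statements(export):
    """Fact 1: unconditional Allow statements that grant read to everyone."""
    policy = export.get("bucket_policy") or {}
    hits = []
    for stmt in as_list(policy.get("Statement", [])):
        unconditional_allow = stmt.get("Effect") == "Allow" and "Condition" not in stmt
        reads = READ_ACTIONS & set(as_list(stmt.get("Action", [])))
        if unconditional_allow and is_wildcard(stmt.get("Principal")) and reads:
            hits.append(stmt)
    return hits


def three_facts(export):
    """Collect the three facts that only matter once they are combined."""
    pab = export.get("public_access_block") or {}
    return {
        "anonymous_read_grant": bool(anonymous_read_statements(export)),
        "nothing_overrides_it": not pab.get("RestrictPublicBuckets", False),
        "declared_confidential":
            export.get("tags", {}).get("data-classification") == "confidential",
    }


facts = three_facts(EXPORT)
print(facts)
print("contradiction:", all(facts.values()))
```

Each fact on its own is unremarkable. Only the conjunction on the last line says the bucket contradicts its own label.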
2. The reasoning wall
| What they hit | What they said / would say |
|---|---|
| The tag reads confidential, so the eye relaxes before reading the policy | "I trusted the label. I assumed someone who wrote confidential also locked it down." |
| public_access_block: null is an absence, and absences are easy to skim past | "Nothing jumped out as wrong. The thing that made it public was the thing that wasn't there." |
| Knowing it's public is not the same as proving the org contradicted itself | "I can see it's open. What I actually need is: this violates a rule we set for ourselves." |
The insight the reader should reach on their own:
A tag is a promise. The config is the truth. Nobody was checking that the promise and the truth still agreed.
3. Why scanners miss or flatten it
A per-setting scanner will happily flag "bucket policy allows Principal: *"
— that part is easy. What it cannot do is connect that finding to the
data-classification: confidential tag and report the contradiction: the
organization's own declared intent is being violated. To the scanner the tag is
unrelated metadata in a different field. It reports a generic "public bucket"
the same way it would for an intentionally public marketing CDN, with no way to
say this one was supposed to be confidential and the policy disagrees. The
severity that matters here is not "public" — it is "public, and you told us it
must not be." That requires reasoning across the tag and the policy together,
which a setting-by-setting checklist structurally cannot do.
Pivot point. Everything above is the gap. Everything below is Stave filling it. The reader has now done the work and hit the wall. Only now does the tool appear.
4. The evidence Stave consumes
The same static export the reader had — no new privileges, no live cloud:
{
"bucket": "ubergreece",
"tags": {"data-classification": "confidential"},
"bucket_policy": {"Statement": [{"Effect": "Allow", "Principal": "*", "Action": "s3:GetObject", "Resource": "arn:aws:s3:::ubergreece/*"}]},
"public_access_block": null
}
Stave normalizes this into an observation snapshot: a bucket asset with its policy statements, classification tag, and the (absent) Public Access Block all as evaluable facts on one asset.
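One way to picture that normalization is the sketch below. The source does not show Stave's internal schema, so the BucketSnapshot and PolicyStatement types and the normalize function are hypothetical. The only point is that policy, tag, and the absent Public Access Block end up as fields on a single asset.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# The export from section 4 as a Python literal (JSON null becomes None).
EXPORT = {
    "bucket": "ubergreece",
    "tags": {"data-classification": "confidential"},
    "bucket_policy": {"Statement": [{
        "Effect": "Allow", "Principal": "*", "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::ubergreece/*"}]},
    "public_access_block": None,
}


@dataclass(frozen=True)
class PolicyStatement:
    # Hypothetical shape: one evaluable fact per policy statement.
    effect: str
    principal: str
    actions: Tuple[str, ...]
    resources: Tuple[str, ...]


@dataclass(frozen=True)
class BucketSnapshot:
    # Hypothetical shape: every fact about the bucket lives on one asset.
    name: str
    classification: Optional[str]
    statements: Tuple[PolicyStatement, ...]
    public_access_block: Optional[dict]  # None records the absence explicitly


def as_tuple(value):
    """Policy fields may hold one value or a list; normalize to a tuple."""
    if value is None:
        return ()
    return tuple(value) if isinstance(value, list) else (value,)


def normalize(export):
    """Flatten the raw export into a single snapshot object."""
    policy = export.get("bucket_policy") or {}
    statements = tuple(
        PolicyStatement(
            effect=s.get("Effect", ""),
            principal="*" if s.get("Principal") in ("*", {"AWS": "*"})
            else str(s.get("Principal")),
            actions=as_tuple(s.get("Action")),
            resources=as_tuple(s.get("Resource")),
        )
        for s in as_tuple(policy.get("Statement"))
    )
    return BucketSnapshot(
        name=export["bucket"],
        classification=export.get("tags", {}).get("data-classification"),
        statements=statements,
        public_access_block=export.get("public_access_block"),
    )


snapshot = normalize(EXPORT)
print(snapshot)
```

Keeping public_access_block as an explicit None, rather than dropping the key, turns the absence into something a control can test for.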
5. The reasoning Stave performs
- Control / invariant: CTL.S3.PUBLIC.001 (no bucket may grant read to Principal: *) and CTL.S3.PUBLIC.002 (a bucket classified confidential must not be publicly readable).
- What it evaluates: CTL.S3.PUBLIC.001 inspects every policy statement for an Allow to Principal: * on a read action with no Public Access Block to neutralize it. CTL.S3.PUBLIC.002 joins that public-read fact to the data-classification tag and fires only when the data was declared confidential, encoding the contradiction the reader had to assemble by hand.
- Verdict produced: Both controls fire and consolidate into one Issue. The classification tag is the org's own evidence against the policy.
Issue: ubergreece — public read on confidential data
CTL.S3.PUBLIC.001 NON_COMPLIANT
asset: s3://ubergreece
evidence: bucket_policy statement Allow Principal:* s3:GetObject on /*;
public_access_block = null (nothing overrides it)
verdict: bucket is anonymously readable
CTL.S3.PUBLIC.002 NON_COMPLIANT
asset: s3://ubergreece
evidence: tag data-classification=confidential AND public read = true
verdict: declared-confidential data is publicly readable — intent violated
security_state: NON_COMPLIANT
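The two-control logic can be approximated as follows. This is a sketch of the idea, not Stave's implementation. The control IDs and the NON_COMPLIANT state come from the output above, while the evaluate function, the COMPLIANT state name, the marketing-cdn bucket, and the RestrictPublicBuckets field name are assumptions made for illustration.

```python
def public_read(bucket):
    """Simplified CTL.S3.PUBLIC.001 fact: Allow to * on s3:GetObject, no PAB override."""
    pab = bucket.get("public_access_block") or {}
    if pab.get("RestrictPublicBuckets"):
        return False
    for stmt in (bucket.get("bucket_policy") or {}).get("Statement", []):
        actions = stmt.get("Action", [])
        actions = [actions] if isinstance(actions, str) else actions
        if (stmt.get("Effect") == "Allow" and stmt.get("Principal") == "*"
                and "s3:GetObject" in actions):
            return True
    return False


def evaluate(bucket):
    """Run both controls and consolidate the findings into one issue per asset."""
    controls = []
    is_public = public_read(bucket)
    if is_public:
        controls.append(("CTL.S3.PUBLIC.001", "bucket is anonymously readable"))
    # The second control joins the public-read fact to the classification tag.
    classification = bucket.get("tags", {}).get("data-classification")
    if is_public and classification == "confidential":
        controls.append(("CTL.S3.PUBLIC.002",
                         "declared-confidential data is publicly readable"))
    return {
        "asset": "s3://" + bucket["bucket"],
        "controls": controls,
        "security_state": "NON_COMPLIANT" if controls else "COMPLIANT",
    }


def open_policy(name):
    """Helper for the examples: the public-read policy from the export."""
    return {"Statement": [{"Effect": "Allow", "Principal": "*",
                           "Action": "s3:GetObject",
                           "Resource": "arn:aws:s3:::" + name + "/*"}]}


ubergreece = {"bucket": "ubergreece",
              "tags": {"data-classification": "confidential"},
              "bucket_policy": open_policy("ubergreece"),
              "public_access_block": None}

# Hypothetical contrast case: an intentionally public bucket with a different tag.
marketing = {"bucket": "marketing-cdn",
             "tags": {"data-classification": "public"},
             "bucket_policy": open_policy("marketing-cdn"),
             "public_access_block": None}

for bucket in (ubergreece, marketing):
    print(evaluate(bucket))
```

The contrast case is the point: both buckets trip the first control, but only the one that was declared confidential trips the second.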
6. The prevention artifact Stave produces
- Artifact: A bucket-policy Deny statement plus an account-level Public Access Block recommendation, generated from the violating state.
- What it forecloses: The latent gap from question 2, a confidential tag that nothing enforces. With PAB on, a future policy edit cannot re-open this bucket, so the tag and the access keep agreeing for as long as the block stays enabled.
# Bucket policy guardrail (explicit deny wins over any Allow)
{
"Sid": "DenyPublicReadOnConfidential",
"Effect": "Deny",
"Principal": "*",
"Action": "s3:GetObject",
"Resource": "arn:aws:s3:::ubergreece/*",
"Condition": {"StringNotEquals": {"aws:PrincipalOrgID": "o-myorg"}}
}
# Account-level Public Access Block (neutralizes any future public policy)
aws s3control put-public-access-block \
--account-id <acct> \
--public-access-block-configuration \
BlockPublicPolicy=true,RestrictPublicBuckets=true,\
BlockPublicAcls=true,IgnorePublicAcls=true
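Generating the deny statement from the violating state might look like the sketch below. The build_deny_statement and deny_applies functions are hypothetical, and o-myorg is a placeholder organization ID. The sketch uses StringNotEquals so that the deny covers every caller outside the organization. A negated condition operator also matches when the key is missing from the request, which is the case for anonymous callers.

```python
import json


def build_deny_statement(export, org_id):
    """Emit a Deny for confidential buckets; return None for anything else."""
    if export.get("tags", {}).get("data-classification") != "confidential":
        return None
    return {
        "Sid": "DenyPublicReadOnConfidential",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::" + export["bucket"] + "/*",
        "Condition": {"StringNotEquals": {"aws:PrincipalOrgID": org_id}},
    }


def deny_applies(statement, caller_org_id):
    """Model the condition only: does the deny hit a caller with this org ID?

    caller_org_id is None for anonymous callers, who carry no
    aws:PrincipalOrgID key, so the negated comparison counts as a match.
    """
    expected = statement["Condition"]["StringNotEquals"]["aws:PrincipalOrgID"]
    return caller_org_id is None or caller_org_id != expected


export = {"bucket": "ubergreece",
          "tags": {"data-classification": "confidential"}}

statement = build_deny_statement(export, "o-myorg")  # placeholder org ID
print(json.dumps(statement, indent=2))
print("anonymous caller denied:", deny_applies(statement, None))
print("in-org caller denied:", deny_applies(statement, "o-myorg"))
```

The deny statement and the account-level Public Access Block are complementary: the first narrows who may read this one bucket, the second stops public policies from taking effect anywhere in the account.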
7. What the team no longer does manually
| Before | After Stave |
|---|---|
| Read each policy statement as an IAM evaluator to decide if it grants anonymous read | One control proves public-read across policy and PAB deterministically |
| Hold the classification tag in your head and manually cross-check it against access | The tag-vs-access contradiction is encoded as an invariant and re-checked every run |
| Hope a confidential tag means someone also locked the bucket | A generated PAB guardrail makes the tag enforceable, not aspirational |
Positioning line for this case
Stave proves that ubergreece is publicly readable, proves it contradicts the org's own confidential classification, and emits the Public Access Block that makes that contradiction impossible to recreate.
Reuse checklist
- A reader could attempt section 0 with zero Stave knowledge
- Stave is not named or shown before the pivot point
- Section 2 quotes are real (or honestly plausible), not slogans
- Section 3 names the specific thing per-setting tools can't see
- Section 6 closes the exact latent state raised in section 0, question 2
- The title names the failure, not the product