S3 Patterns: Multipart, Versioning, Lifecycle Done Right

S3 is the most boring service that breaks the most production systems. Get these six patterns right and it disappears.

March 15, 2026 · 9 min read · AWS · Storage

S3 is the closest thing to infinite, durable storage we have. It is also the closest thing to a footgun-shaped bucket of options. Six patterns separate teams that pay S3 attention quarterly from teams that pay it attention at 3am.

1. Multipart upload — for everything over 100 MiB

A single PUT over 5 GiB is impossible by spec; over 100 MiB it is brittle. Multipart splits the object into parts (5 MiB to 5 GiB each), uploads them in parallel, and assembles them server-side. The benefits are not just speed:

  • Parts retry independently. A flaky connection retries one 8 MiB part, not your 4 GiB file.
  • Throughput scales with concurrency. Thirty parallel parts on a fast link can saturate the pipe.
  • You can resume an interrupted upload by listing the parts already uploaded.
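
# The CLI switches to multipart automatically for large files.
# --expected-size is only required when streaming a very large object from stdin.
# --cli-read-timeout 0 disables the socket read timeout (there is no write-timeout flag).
aws s3 cp ./backup.tar.gz s3://bucket/key \
  --expected-size 8589934592 \
  --cli-read-timeout 0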
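
To make the part limits concrete, here is a minimal sketch of how a part size could be chosen. The 5 MiB and 5 GiB bounds come from the text above; the 10,000-part cap is S3's documented limit; the function name `plan_parts` and the 8 MiB default are my own assumptions, not what any SDK does internally.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

MIN_PART = 5 * MIB    # smallest allowed part (only the last part may be smaller)
MAX_PART = 5 * GIB    # largest allowed part
MAX_PARTS = 10_000    # S3 caps one multipart upload at 10,000 parts


def plan_parts(object_size: int, preferred_part: int = 8 * MIB) -> tuple[int, int]:
    """Return (part_size, part_count) for a multipart upload of object_size bytes.

    Hypothetical helper: keeps the preferred part size when it fits,
    otherwise grows the part size until the object fits in 10,000 parts.
    """
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    # Integer ceiling division: the smallest part size that fits in MAX_PARTS.
    needed = -(-object_size // MAX_PARTS)
    part_size = max(preferred_part, needed, MIN_PART)
    if part_size > MAX_PART:
        raise ValueError("object too large for a single multipart upload")
    part_count = -(-object_size // part_size)
    return part_size, part_count


if __name__ == "__main__":
    size, count = plan_parts(8 * GIB)
    print(f"{count} parts of {size // MIB} MiB")
```

The useful property is that small part sizes stop working once the object grows: at some point the part size has to increase or the upload runs out of part numbers.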

The SDKs do this transparently above a threshold. The gotcha: incomplete multipart uploads still cost money. If your job dies mid-upload, the parts stay in the bucket and accrue storage indefinitely. Add a lifecycle rule.
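
Before the lifecycle rule is in place, it can help to see what is already lying around. A rough sketch, assuming `client` is a boto3 S3 client (it is passed in rather than created here); the function name and the 7-day default are illustrative choices of mine.

```python
from datetime import datetime, timedelta, timezone


def stale_multipart_uploads(client, bucket: str, older_than_days: int = 7, now=None):
    """Yield (key, upload_id) for multipart uploads started before the cutoff.

    `client` is assumed to behave like a boto3 S3 client, i.e. to offer
    get_paginator("list_multipart_uploads").
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=older_than_days)
    paginator = client.get_paginator("list_multipart_uploads")
    for page in paginator.paginate(Bucket=bucket):
        # "Uploads" is missing from a page that has no in-progress uploads.
        for upload in page.get("Uploads", []):
            if upload["Initiated"] < cutoff:
                yield upload["Key"], upload["UploadId"]
```

Each pair it yields can be passed to `client.abort_multipart_upload(Bucket=..., Key=..., UploadId=...)` to release the stored parts. This is a one-off cleanup; the lifecycle rule is still the durable fix.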

2. Lifecycle policies — set them on day one

Every bucket needs at minimum:

  1. Abort incomplete multipart uploads after 7 days. Cleans up the previous mess.
  2. Transition old objects to a cheaper tier. S3 Standard → Standard-IA at 30 days, Standard-IA → Glacier Instant Retrieval at 90 days for logs/backups.
  3. Expire ancient objects if regulation allows. Or transition to Glacier Deep Archive at a year.

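Lifecycle rules themselves are free, and so are expirations. Transitions are billed per request, which is worth checking before you tier down millions of tiny objects. Skipping lifecycle entirely costs real money over years.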
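
The three rules above can be sketched as a single configuration. This is a minimal example under my own assumptions: the rule IDs and the `baseline_lifecycle` helper are made up, and the day counts are the ones from the list. The dictionary follows the shape that boto3's `put_bucket_lifecycle_configuration` accepts.

```python
import json


def baseline_lifecycle(prefix: str = "") -> dict:
    """Build a baseline lifecycle configuration (hypothetical helper).

    `prefix` limits the tiering rule, e.g. "logs/"; an empty prefix
    applies it to every object in the bucket.
    """
    return {
        "Rules": [
            {
                "ID": "abort-incomplete-multipart",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # whole bucket
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
            },
            {
                "ID": "tier-down-cold-data",
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER_IR"},
                    {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
                ],
            },
        ]
    }


if __name__ == "__main__":
    print(json.dumps(baseline_lifecycle("logs/"), indent=2))
```

With a boto3 client `s3`, this would be applied as `s3.put_bucket_lifecycle_configuration(Bucket="my-bucket", LifecycleConfiguration=baseline_lifecycle("logs/"))`. That call replaces the bucket's whole lifecycle configuration, so merge in any rules that already exist first.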

3. Versioning — useful, expensive if mishandled

With versioning on, deleting an object writes a delete marker; the previous versions are retained. Great for accidental-delete recovery. Without a lifecycle policy to expire old versions, you keep paying for every version of every object that has ever existed.

{
  "Rules": [{
    "ID": "expire-old-versions",
    "Status": "Enabled",
    "Filter": { "Prefix": "" },
    "NoncurrentVersionExpiration": { "NoncurrentDays": 30 }
  }]
}

This is the single most common "why is my S3 bill 5× last quarter" cause. Versioned bucket, no expiry, somebody loops a job that rewrites a million objects daily — every one of those rewrites silently retains the previous version forever.
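
The arithmetic behind that bill is simple enough to sketch. This is a deliberately simplified model of my own: one overwrite per object per day, and lifecycle expiry treated as exact even though S3 applies it asynchronously.

```python
from typing import Optional


def versions_per_object(days_elapsed: int, noncurrent_days: Optional[int] = None) -> int:
    """Versions retained per object when it is overwritten once a day.

    noncurrent_days=None models a versioned bucket with no expiry rule.
    """
    noncurrent = days_elapsed
    if noncurrent_days is not None:
        # Noncurrent versions older than the rule allows have been expired.
        noncurrent = min(days_elapsed, noncurrent_days)
    return 1 + noncurrent  # the current version plus retained noncurrent ones


def stored_gib(objects: int, avg_size_bytes: int, days_elapsed: int,
               noncurrent_days: Optional[int] = None) -> float:
    """Total stored GiB across all retained versions."""
    versions = versions_per_object(days_elapsed, noncurrent_days)
    return objects * avg_size_bytes * versions / 1024 ** 3


if __name__ == "__main__":
    for rule in (None, 30):
        gib = stored_gib(1_000_000, 100 * 1024, days_elapsed=90, noncurrent_days=rule)
        print(f"expiry={rule}: {gib:,.0f} GiB")
```

Without an expiry rule the version count grows linearly with time; with `NoncurrentDays` set it plateaus, which is the whole point of the rule.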

4. Strong consistency is real now (since 2020)

S3 used to be eventually consistent for overwrites, deletes, and listings. Since December 2020 it is strongly consistent for reads after writes: new objects, overwrites, deletes, and list operations all reflect the latest successful write. You no longer need DynamoDB-backed indexing libraries for "did my write land yet"; a successful write is visible immediately to anyone who can read.

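What is still not strongly consistent: cross-region replication. Replication is asynchronous; treat the destination bucket as a follower whose lag is usually seconds but can stretch to minutes. If you need a bound, S3 Replication Time Control targets 15 minutes.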
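
If a job genuinely has to wait for the follower, one option is to poll the source object's replication status. A minimal sketch, assuming `client` is a boto3 S3 client pointed at the source bucket; the function name, timeout, and poll interval are arbitrary choices of mine.

```python
import time


def wait_for_replication(client, bucket: str, key: str,
                         timeout_s: float = 900.0, poll_s: float = 5.0,
                         sleep=time.sleep, clock=time.monotonic) -> bool:
    """Poll HEAD on the source object until replication finishes.

    Returns True once replicated, False on timeout. `sleep` and `clock`
    are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        status = client.head_object(Bucket=bucket, Key=key).get("ReplicationStatus")
        if status in ("COMPLETED", "COMPLETE"):
            return True
        if status == "FAILED":
            raise RuntimeError(f"replication failed for s3://{bucket}/{key}")
        if status is None:
            # No status usually means no replication rule matched this object.
            raise ValueError(f"s3://{bucket}/{key} is not covered by replication")
        if clock() >= deadline:
            return False
        sleep(poll_s)
```

Every poll is a billed HEAD request, so this belongs in batch jobs and cutover scripts, not in a hot path.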

5. Access patterns — the one that actually matters

S3 charges for three things: storage, requests, and egress. Requests are cheap individually and expensive in aggregate. A million HEAD calls a day adds up.
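
A back-of-envelope helper makes "adds up" measurable. The per-1,000-request rates below are placeholder assumptions that resemble typical S3 Standard list prices; check the current price list for your region before trusting the output.

```python
def monthly_request_cost(requests_per_day: int, price_per_1000: float, days: int = 30) -> float:
    """Estimated request charges over `days`, given a price per 1,000 requests."""
    return requests_per_day * days / 1000 * price_per_1000


# Assumed illustrative rates, not authoritative pricing.
ASSUMED_GET_HEAD_PER_1000 = 0.0004
ASSUMED_PUT_LIST_PER_1000 = 0.005

if __name__ == "__main__":
    per_day = 1_000_000
    print("HEAD:", round(monthly_request_cost(per_day, ASSUMED_GET_HEAD_PER_1000), 2))
    print("LIST:", round(monthly_request_cost(per_day, ASSUMED_PUT_LIST_PER_1000), 2))
```

Under these assumed rates the same call volume costs more than ten times as much when it is LIST or PUT rather than HEAD or GET, so which request type a hot path issues matters as much as how many.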

Mistakes I have paid for:

  • Listing a prefix on every request to "see if a file exists" — use a database for that instead.
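  • Tiny objects in their millions. Every object costs a request to write and a request to read, and Standard-IA and Glacier Instant Retrieval bill each object as at least 128 KiB. Pack small records into larger files (Parquet, gzipped JSONL).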
  • Cross-region reads in a hot path — pay the cross-region transfer every time. Replicate the bucket or move the compute.
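
For the tiny-objects case, packing can be as plain as gzipped JSONL. A minimal sketch using only the standard library; the function names are mine, and a real pipeline would also choose a batch size and an object key scheme.

```python
import gzip
import json
from typing import Iterable, Iterator


def pack_jsonl_gz(records: Iterable[dict]) -> bytes:
    """Serialize records as one gzipped JSONL blob, ready to upload as one object."""
    lines = (json.dumps(record, separators=(",", ":")) for record in records)
    return gzip.compress(("\n".join(lines) + "\n").encode("utf-8"))


def unpack_jsonl_gz(blob: bytes) -> Iterator[dict]:
    """Yield the records back from a blob produced by pack_jsonl_gz."""
    for line in gzip.decompress(blob).decode("utf-8").splitlines():
        if line:  # skip the trailing empty line
            yield json.loads(line)


if __name__ == "__main__":
    events = [{"id": i, "kind": "click"} for i in range(10_000)]
    blob = pack_jsonl_gz(events)
    print(f"{len(events)} records -> 1 object, {len(blob)} bytes")
```

Ten thousand PUTs become one, and readers pay for one GET instead of ten thousand. The trade-off is that fetching a single record means downloading and decompressing its whole batch.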

6. Security defaults to set on creation

  • Block Public Access at the bucket level. The default is correct; never relax it broadly.
  • Confirm default encryption. SSE-S3 has been applied automatically to new objects since January 2023; switch the bucket default to SSE-KMS if compliance requires it.
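  • Enable Object Lock in Compliance mode for backup buckets. Until the retention period ends nobody, including the root user, can delete or overwrite a locked version, so a compromised IAM user cannot destroy the data.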
  • Turn on Access Logs to a separate bucket. Cheap; pays for itself the first audit.
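
Three of these defaults can be scripted at bucket creation. A rough sketch, assuming `client` is a boto3 S3 client and that the log bucket already exists and permits the S3 logging service to write to it. The function name and the log prefix are my own choices; Object Lock is left out because retention settings are a per-bucket decision.

```python
def apply_security_defaults(client, bucket: str, log_bucket: str) -> None:
    """Apply Block Public Access, default encryption, and access logging.

    `client` is assumed to behave like a boto3 S3 client.
    """
    client.put_public_access_block(
        Bucket=bucket,
        PublicAccessBlockConfiguration={
            "BlockPublicAcls": True,
            "IgnorePublicAcls": True,
            "BlockPublicPolicy": True,
            "RestrictPublicBuckets": True,
        },
    )
    client.put_bucket_encryption(
        Bucket=bucket,
        ServerSideEncryptionConfiguration={
            # "AES256" selects SSE-S3; SSE-KMS would use "aws:kms" plus a key ID.
            "Rules": [{"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}]
        },
    )
    client.put_bucket_logging(
        Bucket=bucket,
        BucketLoggingStatus={
            "LoggingEnabled": {"TargetBucket": log_bucket, "TargetPrefix": f"{bucket}/"}
        },
    )
```

Running it as part of bucket provisioning, rather than as a later audit fix, is what keeps the defaults from drifting.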

The rules of thumb

  • Multipart everything over 100 MiB.
  • Lifecycle policies on every bucket, day one.
  • Versioning on, with expiry of non-current versions.
  • Block Public Access stays on.
  • If S3 is part of a hot path, measure requests/sec — the surprise is usually request cost, not storage cost.

Reader Discussion

  1. Rachel Gold · Staff SRE · Agrees

    the on-call framing throughout this piece is what makes it land. too many infra articles assume you never get paged. those are written by people who never got paged.

    Mar 18, 2026 · 3 days later
  2. Omar Khalil · Senior SWE · Kind words

    this is the third article from this blog I've sent to my team this month. you're cooking. don't switch to crypto.

    Mar 20, 2026 · 5 days later

Worked on something similar? Email ducminhldm@gmail.com — I read every one. The good ones become future posts.