AWS S3 (Simple Storage Service)

AWS S3 stands for Simple Storage Service. It is an object storage service that stores any type of file (documents, images, videos, backups, code, logs) in the cloud. S3 scales to virtually unlimited capacity, is highly durable, and is accessible from anywhere in the world.

S3 is not like a hard drive or a file system. It stores data as objects inside buckets. Think of a bucket as a folder and an object as a file inside that folder — except S3 buckets can hold trillions of objects and never run out of space.

S3 Core Concepts

Buckets

A bucket is a container for storing objects. Each bucket must have a globally unique name across all AWS accounts worldwide. A bucket is created in a specific AWS Region, and data stays in that Region unless explicitly configured to replicate elsewhere.

Naming rules for S3 buckets:

  • Must be between 3 and 63 characters long.
  • Can contain only lowercase letters, numbers, hyphens, and dots (dots are allowed but discouraged, since they break virtual-hosted-style HTTPS).
  • Must begin and end with a letter or number, never a hyphen.
  • Cannot be formatted as an IP address (e.g., 192.168.1.1).
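The rules above are mechanical enough to check in code. A minimal sketch of a validator (covering the rules listed here, not AWS's complete rule set; it also permits dots, which AWS allows but discourages):

```python
import ipaddress
import re

# 3-63 chars, lowercase letters/digits/hyphens/dots, must begin and end
# with a letter or digit.
BUCKET_RE = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")

def is_valid_bucket_name(name: str) -> bool:
    """Check a candidate S3 bucket name against the basic naming rules."""
    if not BUCKET_RE.fullmatch(name):
        return False
    try:
        ipaddress.ip_address(name)  # reject names like 192.168.1.1
        return False
    except ValueError:
        return True
```

For example, `is_valid_bucket_name("my-website-bucket")` passes, while `"ab"` (too short), `"My-Bucket"` (uppercase), and `"192.168.1.1"` (IP-formatted) all fail.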

Objects

An object is any file stored inside an S3 bucket. Each object consists of:

  • Key: The object's name/path within the bucket. Example: images/profile/user123.jpg
  • Value: The actual file data (content).
  • Metadata: Information about the file — content type, creation date, custom tags.
  • Version ID: When versioning is enabled, each upload creates a new version.

A single S3 object can be up to 5 TB in size, but a single PUT request accepts at most 5 GB. Files larger than 5 GB require Multipart Upload, and AWS recommends considering multipart for objects over about 100 MB.
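The size limits interact: a multipart upload may use at most 10,000 parts, and every part except the last must be at least 5 MB. A small sketch of the arithmetic (the 100 MB default part size is an arbitrary choice for illustration, not an AWS default):

```python
MAX_PUT = 5 * 1024**3      # 5 GB: the largest object a single PUT accepts
MIN_PART = 5 * 1024**2     # every part except the last must be at least 5 MB
MAX_PARTS = 10_000         # S3 caps each multipart upload at 10,000 parts

def plan_upload(object_size: int, part_size: int = 100 * 1024**2) -> int:
    """Return the number of parts needed; 1 means a single PUT suffices."""
    if object_size <= MAX_PUT:
        return 1   # multipart is still worth considering above ~100 MB
    if part_size < MIN_PART:
        raise ValueError("part_size must be at least 5 MB")
    parts = -(-object_size // part_size)   # ceiling division
    if parts > MAX_PARTS:
        raise ValueError("increase part_size: at most 10,000 parts per upload")
    return parts
```

A 10 GB object with 100 MB parts needs 103 parts; a 5 TB object at that part size would exceed the 10,000-part cap, so larger parts are required.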

S3 Storage Classes

S3 offers different storage classes based on how frequently data is accessed. Choosing the right class reduces costs significantly:

Storage Class                    Access Frequency      Retrieval Time      Use Case
S3 Standard                      Frequent              Milliseconds        Websites, apps, active data
S3 Intelligent-Tiering           Unknown or changing   Milliseconds        Unpredictable access patterns
S3 Standard-IA                   Infrequent            Milliseconds        Backups, disaster recovery files
S3 One Zone-IA                   Infrequent            Milliseconds        Recreatable data, lower-cost backups
S3 Glacier Instant Retrieval     Rare                  Milliseconds        Archives accessed occasionally
S3 Glacier Flexible Retrieval    Rare                  Minutes to hours    Long-term archives
S3 Glacier Deep Archive          Very rare             Up to 12 hours      Compliance records, rarely accessed data

S3 Durability and Availability

S3 Standard is designed for 99.999999999% (11 nines) durability. In practical terms: if you store 10 million objects, you can expect to lose a single object, on average, once every 10,000 years. Data is automatically replicated across multiple Availability Zones within a Region.

S3 Standard availability is 99.99%, which works out to roughly 53 minutes of expected downtime per year.

S3 Access Control

Bucket Policies

Bucket policies are JSON documents attached to a bucket that define who can access the bucket and what actions they can perform. They are similar to IAM policies but applied directly to the bucket.

Example — allow public read access to all objects (for a static website):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-website-bucket/*"
    }
  ]
}

Block Public Access

By default, new S3 buckets block all public access. This setting must be explicitly changed to make a bucket public. This protects against accidentally exposing sensitive data.

S3 Pre-Signed URLs

A pre-signed URL is a temporary link that grants access to a specific S3 object for a limited time, without making the object public. Useful, for example, for sharing a private file via a download link that expires after 1 hour.

S3 Versioning

Versioning keeps multiple versions of an object in the same bucket. When versioning is enabled and a file is overwritten or deleted, the previous version is preserved. This protects against accidental data deletion or corruption.

Bucket: my-documents
  |-- report.pdf (Version 1: original upload)
  |-- report.pdf (Version 2: after first edit)
  |-- report.pdf (Version 3: after second edit) ← current version

Restoring an older version is as simple as selecting it from the console.
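The behavior in the diagram can be modeled in a few lines. A toy in-memory sketch (real S3 version IDs are opaque strings, not sequential integers, and a delete adds a "delete marker" rather than removing data):

```python
from collections import defaultdict

class VersionedBucket:
    """Toy model of S3 versioning: overwrites stack up; deletes add a marker."""

    def __init__(self):
        self.versions = defaultdict(list)  # key -> list of (version_id, data)

    def put(self, key, data):
        vid = len(self.versions[key]) + 1
        self.versions[key].append((vid, data))
        return vid

    def delete(self, key):
        # A delete inserts a marker (data=None); older versions stay restorable.
        self.versions[key].append((len(self.versions[key]) + 1, None))

    def get(self, key, version_id=None):
        history = self.versions[key]
        if version_id is not None:
            return next(d for v, d in history if v == version_id)
        return history[-1][1] if history else None
```

After two uploads and a delete, a plain `get("report.pdf")` returns nothing (the delete marker is current), but `get("report.pdf", version_id=1)` still recovers the original, which is exactly what restoring from the console does.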

S3 Lifecycle Policies

Lifecycle policies automate the transition of objects between storage classes over time, reducing costs without manual effort.

Example lifecycle rule for a log archive bucket:

Day 0:   Uploaded to S3 Standard (active access)
         |
Day 30:  Automatically moved to S3 Standard-IA (less access)
         |
Day 90:  Automatically moved to S3 Glacier Flexible (archive)
         |
Day 365: Automatically deleted
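The rule above can be expressed as a lifecycle configuration document, as accepted by `aws s3api put-bucket-lifecycle-configuration`. The `logs/` prefix is an assumption for this example; `GLACIER` is the API name for the Flexible Retrieval class:

```json
{
  "Rules": [
    {
      "ID": "log-archive",
      "Status": "Enabled",
      "Filter": { "Prefix": "logs/" },
      "Transitions": [
        { "Days": 30, "StorageClass": "STANDARD_IA" },
        { "Days": 90, "StorageClass": "GLACIER" }
      ],
      "Expiration": { "Days": 365 }
    }
  ]
}
```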

Static Website Hosting with S3

S3 can host a static website — HTML, CSS, and JavaScript files served directly from a bucket. No web server is needed. Steps to enable:

  1. Create a bucket with the same name as the domain (e.g., www.mysite.com).
  2. Upload the website files (index.html, style.css, etc.).
  3. Enable static website hosting in bucket properties.
  4. Set a bucket policy to allow public read access.
  5. The website is accessible at the S3 endpoint URL or via a custom domain through Route 53.

S3 Event Notifications

S3 can trigger actions automatically when certain events happen — like a file being uploaded or deleted. These notifications integrate with Lambda, SQS, and SNS.

Example workflow — automatic image processing:

[User uploads image to S3 bucket]
           |
[S3 Event Notification triggered]
           |
[Lambda function invoked automatically]
           |
[Lambda resizes the image and saves thumbnail back to S3]
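The Lambda side of this workflow receives the event as JSON. A minimal handler sketch that extracts the bucket and key from the S3 notification payload (the actual resize step, e.g. with Pillow, is omitted; note that keys arrive URL-encoded in the event):

```python
import urllib.parse

def handler(event, context=None):
    """Lambda entry point for S3 events: extract each uploaded object's location."""
    processed = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # S3 URL-encodes keys in event payloads (spaces become '+').
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        processed.append(f"{bucket}/{key}")
        # ...resize the image here and upload the thumbnail back to S3...
    return processed
```

For an upload of `photos/cat+1.jpg` to a bucket named `uploads`, the handler resolves the location to `uploads/photos/cat 1.jpg` before any processing happens.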

Real-World Example — Media Storage Platform

A video sharing platform stores all uploaded videos in S3:

  • Newly uploaded videos go to S3 Standard for fast access during peak popularity.
  • After 30 days, older videos move to S3 Standard-IA automatically via a lifecycle rule.
  • Videos older than 1 year move to S3 Glacier for cheap long-term storage.
  • CloudFront serves the videos globally from Edge Locations for fast streaming.

Summary

  • S3 stores objects (files) inside buckets. Bucket names are globally unique.
  • S3 Standard offers 11 nines of durability by replicating data across multiple AZs.
  • Different storage classes (Standard, IA, Glacier) optimize cost based on access frequency.
  • Versioning protects against accidental deletion. Lifecycle policies automate cost optimization.
  • S3 can host static websites and trigger Lambda functions via event notifications.
