Every real cloud application needs to store files somewhere. Images, backups, logs, static assets, data exports — all of this needs a place to live that is reliable, scalable, and cheap. That is what Amazon S3 is for. In this part, we learn how to work with S3 directly from our EC2 Linux instance using the AWS CLI.
S3 stands for Simple Storage Service. It is AWS's object storage service: you store files (called objects) in containers (called buckets). S3 is designed for 99.999999999% (eleven nines) durability, meaning data loss is extremely unlikely. It scales to effectively unlimited storage, and you pay for how much you store and transfer, not for provisioned servers.
S3 is used for hosting static websites, storing media files, data lake storage, application backups, log archives, and distribution of large files via CloudFront (AWS's CDN).
The AWS Command Line Interface (CLI) lets you interact with all AWS services from your terminal. Amazon Linux 2023 usually comes with the AWS CLI pre-installed. Let us check and install if needed:
# Check if already installed
aws --version
# If not installed, on Amazon Linux:
sudo dnf install awscli -y
# For the latest version (v2):
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install
Before using the CLI, you need to authenticate it. The best practice for EC2 is to use an IAM Role (we cover that in Part 7). For now, let us configure it with access keys:
aws configure
# It will ask for:
# AWS Access Key ID: [your access key]
# AWS Secret Access Key: [your secret key]
# Default region name: ap-south-1
# Default output format: json
To get access keys, go to AWS Console → IAM → Users → your user → Security credentials → Create access key. Never share these keys or commit them to code repositories.
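As an alternative to `aws configure`, the CLI also reads credentials from environment variables, which is convenient in scripts and CI pipelines. A minimal sketch with placeholder values (substitute your own keys, and never hard-code real ones in files you commit):

```shell
# The AWS CLI picks these up automatically -- no config file needed.
# The values below are placeholders, not real credentials.
export AWS_ACCESS_KEY_ID="your-access-key-id"
export AWS_SECRET_ACCESS_KEY="your-secret-access-key"
export AWS_DEFAULT_REGION="ap-south-1"

echo "Region set to: $AWS_DEFAULT_REGION"
```

Environment variables take precedence over the `~/.aws/credentials` file, which is why they are handy for temporarily switching identities in a single shell session.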
A bucket is like a top-level folder in S3. Bucket names must be globally unique across all of AWS:
# Create a bucket (name must be globally unique)
aws s3 mb s3://my-srjahir-bucket-2026
# List all your buckets
aws s3 ls
# List contents of a specific bucket
aws s3 ls s3://my-srjahir-bucket-2026/
# Upload a file to S3
aws s3 cp myfile.txt s3://my-srjahir-bucket-2026/
# Upload a file to a specific folder in S3
aws s3 cp myfile.txt s3://my-srjahir-bucket-2026/backups/
# Download a file from S3
aws s3 cp s3://my-srjahir-bucket-2026/myfile.txt ./
# Upload entire directory
aws s3 cp /var/www/html/ s3://my-srjahir-bucket-2026/website/ --recursive
# Sync a directory (only uploads new and changed files; sync is recursive by default)
aws s3 sync /var/log/nginx/ s3://my-srjahir-bucket-2026/logs/
# Delete a file
aws s3 rm s3://my-srjahir-bucket-2026/myfile.txt
# Delete all files in a folder
aws s3 rm s3://my-srjahir-bucket-2026/backups/ --recursive
# Delete an empty bucket
aws s3 rb s3://my-srjahir-bucket-2026
# Force delete bucket and all its contents
aws s3 rb s3://my-srjahir-bucket-2026 --force
By default, S3 buckets and their contents are private. Only the AWS account that owns them can access them. To make files publicly accessible (for static website hosting, for example), you need to configure the bucket policy and disable "Block Public Access":
# An S3 object URL looks like:
# https://my-srjahir-bucket-2026.s3.ap-south-1.amazonaws.com/myfile.txt
# Generate a pre-signed URL (temporary access for 1 hour)
aws s3 presign s3://my-srjahir-bucket-2026/myfile.txt --expires-in 3600
Pre-signed URLs are very useful — they let you share a private file with someone for a limited time without making it permanently public.
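Since --expires-in takes a value in seconds, shell arithmetic keeps longer expiry windows readable. Note that pre-signed URLs generated with Signature Version 4 are capped at 7 days:

```shell
# Common expiry windows, computed rather than hard-coded:
ONE_HOUR=$((60 * 60))
TWELVE_HOURS=$((12 * 60 * 60))
ONE_WEEK=$((7 * 24 * 60 * 60))   # 604800 seconds -- the maximum allowed

echo "$ONE_HOUR $TWELVE_HOURS $ONE_WEEK"

# Example usage (needs credentials and an existing object):
# aws s3 presign s3://my-srjahir-bucket-2026/myfile.txt --expires-in $ONE_WEEK
```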
#!/bin/bash
# Simple backup script
DATE=$(date +%Y-%m-%d)
tar -czf /tmp/nginx-logs-${DATE}.tar.gz /var/log/nginx/
aws s3 cp /tmp/nginx-logs-${DATE}.tar.gz s3://my-srjahir-bucket-2026/backups/
rm /tmp/nginx-logs-${DATE}.tar.gz
echo "Backup completed: nginx-logs-${DATE}.tar.gz"
Save this as backup.sh, make it executable with chmod +x backup.sh, and run it to back up your Nginx logs to S3. In Part 8, we will see how to schedule this to run automatically using cron.
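One refinement worth adding to a script like this is local retention, so old archives do not pile up between runs. A small sketch of the pattern, demonstrated on a temporary directory rather than real logs (the old timestamp is faked with touch -d so the deletion is visible immediately):

```shell
# Simulate an archive directory with one stale and one fresh archive
BACKUP_DIR=$(mktemp -d)
touch -d "2020-01-01" "$BACKUP_DIR/nginx-logs-2020-01-01.tar.gz"   # stale
touch "$BACKUP_DIR/nginx-logs-$(date +%Y-%m-%d).tar.gz"            # fresh

# Delete archives older than 7 days (by modification time)
find "$BACKUP_DIR" -name 'nginx-logs-*.tar.gz' -mtime +7 -delete

ls "$BACKUP_DIR"   # only today's archive remains
```

Dropped into backup.sh after the upload step (pointed at /tmp instead of a temp directory), this keeps local disk usage bounded while S3 retains the full history.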
In Part 7, we cover IAM — the proper way to grant your EC2 instance permission to access S3 and other AWS services without using access keys.
S3 access control works through multiple mechanisms that interact: IAM policies on users and roles, S3 bucket policies attached to the bucket, and S3 ACLs on individual objects. Bucket policies are particularly useful for cross-account access and for making specific bucket paths publicly accessible for static website hosting. A bucket policy is a JSON document attached to the bucket:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicReadGetObject",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-website-bucket/*"
    }
  ]
}
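To attach a policy like this from the CLI, save it to a file and use `aws s3api put-bucket-policy`. The sketch below validates the JSON locally first; the final command is commented out because it needs credentials and an existing bucket (the bucket name is the hypothetical one from the policy):

```shell
# Write the policy to a file
cat > /tmp/public-read-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicReadGetObject",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-website-bucket/*"
    }
  ]
}
EOF

# Catch JSON typos locally before sending the policy to AWS
python3 -m json.tool /tmp/public-read-policy.json > /dev/null && echo "policy JSON is valid"

# Attach it to the bucket (Block Public Access must be disabled first):
# aws s3api put-bucket-policy --bucket my-website-bucket \
#     --policy file:///tmp/public-read-policy.json
```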
For production buckets containing sensitive data, the correct posture is the opposite: keep all S3 Block Public Access settings enabled so that no bucket policy can open the bucket to the public, and grant access only through IAM roles to the specific services or users that need it.
S3 lifecycle rules automatically transition objects to cheaper storage classes or delete them after a specified time. Infrequently accessed objects can move from S3 Standard to S3 Standard-IA after 30 days, then to Glacier for long-term archival after 90 days, and be deleted after 365 days. Defining lifecycle rules on buckets that accumulate data — log buckets, backup buckets, data lake buckets — prevents unbounded storage growth and keeps costs controlled automatically without manual intervention.
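Lifecycle rules can be defined as a JSON document and applied with the s3api commands. This sketch mirrors the 30/90/365-day schedule described above; the bucket name is the example one from earlier, and the apply step is commented out since it needs credentials:

```shell
# Transition logs/ objects to cheaper storage over time, then expire them
cat > /tmp/lifecycle.json <<'EOF'
{
  "Rules": [
    {
      "ID": "archive-then-expire-logs",
      "Status": "Enabled",
      "Filter": { "Prefix": "logs/" },
      "Transitions": [
        { "Days": 30, "StorageClass": "STANDARD_IA" },
        { "Days": 90, "StorageClass": "GLACIER" }
      ],
      "Expiration": { "Days": 365 }
    }
  ]
}
EOF

# Validate locally, then apply:
python3 -m json.tool /tmp/lifecycle.json > /dev/null && echo "lifecycle JSON is valid"
# aws s3api put-bucket-lifecycle-configuration --bucket my-srjahir-bucket-2026 \
#     --lifecycle-configuration file:///tmp/lifecycle.json
```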
Create an S3 bucket for static website hosting: enable static website hosting in the bucket properties, upload an index.html file, apply a bucket policy allowing public read access, and access the website using the S3 website endpoint URL. Then configure a lifecycle rule that would transition objects older than 30 days to S3 Standard-IA (you can set it up without waiting for it to trigger). Finally, practice using versioning: enable versioning, upload a modified version of the file, and use the AWS CLI to list all versions and retrieve a specific version.
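As a starting point for the versioning part of the exercise, here is a sketch of the relevant s3api commands (bucket name hypothetical; everything is commented out since each command needs valid credentials and an existing bucket):

```shell
# 1. Turn versioning on for the bucket
# aws s3api put-bucket-versioning --bucket my-website-bucket \
#     --versioning-configuration Status=Enabled

# 2. Upload index.html, edit it, and upload again -- the second upload
#    creates a new version instead of overwriting the first:
# aws s3 cp index.html s3://my-website-bucket/
# aws s3 cp index.html s3://my-website-bucket/

# 3. List every version of the object (note each VersionId in the output)
# aws s3api list-object-versions --bucket my-website-bucket --prefix index.html

# 4. Download a specific version by its VersionId (value comes from step 3)
# aws s3api get-object --bucket my-website-bucket --key index.html \
#     --version-id PASTE_VERSION_ID_HERE old-index.html
```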