DevOps Roadmap -- Part 11: Cloud Platforms for DevOps

By Suraj Ahir · 2025-11-24 · 11 min read


The cloud has fundamentally changed infrastructure. Instead of buying servers and waiting weeks for them to arrive, you provision them in seconds via API. Instead of managing physical networks, you define them in code. DevOps engineers need to understand cloud platforms deeply -- not just how to click through a console, but how to architect, automate, and operate cloud infrastructure at scale.

AWS Core Services for DevOps

Key AWS services every DevOps engineer needs
# Compute
EC2          - Virtual machines (instances)
ECS          - Container service (Docker at scale)
EKS          - Managed Kubernetes
Lambda       - Serverless functions
Fargate      - Serverless containers (no EC2 management)

# Storage
S3           - Object storage (static files, backups, artifacts)
EBS          - Block storage attached to EC2 instances
EFS          - Shared file system for multiple instances
ECR          - Docker image registry

# Database
RDS          - Managed relational databases (PostgreSQL, MySQL)
DynamoDB     - NoSQL key-value store
ElastiCache  - Managed Redis/Memcached

# Networking
VPC          - Virtual private cloud (your isolated network)
ALB/NLB      - Load balancers
Route 53     - DNS management
CloudFront   - CDN

# DevOps Services
CodePipeline - CI/CD
CodeBuild    - Build service
CloudWatch   - Monitoring and logging
IAM          - Identity and access management

IAM Best Practices

Secure AWS access management
# Never use root account for daily work
# Create IAM users with minimum required permissions
# Use IAM roles for EC2 instances and CI/CD (not access keys)

# Example IAM policy for S3 read-only access
{
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:GetObject", "s3:ListBucket"],
        "Resource": [
            "arn:aws:s3:::my-bucket",
            "arn:aws:s3:::my-bucket/*"
        ]
    }]
}

# AWS CLI -- use named profiles
aws configure --profile production
aws s3 ls --profile production

# Check current identity
aws sts get-caller-identity
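
The read-only policy above can also be generated per bucket rather than copied by hand. The following is a minimal sketch, not an official AWS helper: the function name `s3_read_only_policy` is hypothetical, and it splits the policy into two statements so that the bucket-level action (`s3:ListBucket`) and the object-level action (`s3:GetObject`) are each scoped to the matching ARN.

```python
import json


def s3_read_only_policy(bucket_name: str) -> dict:
    """Build a read-only policy document for a single S3 bucket.

    Hypothetical helper for illustration; the bucket name is supplied by the caller.
    """
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # s3:ListBucket is a bucket-level action, so it targets the bucket ARN
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [bucket_arn],
            },
            {
                # s3:GetObject is an object-level action, so it targets the objects
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"{bucket_arn}/*"],
            },
        ],
    }


if __name__ == "__main__":
    # Print the document so it can be saved and attached to a role or user
    print(json.dumps(s3_read_only_policy("my-bucket"), indent=4))
```

The combined single-statement form shown earlier grants the same access; the split version only makes it easier to see which resource each action applies to.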

Cost Optimisation Strategies

Controlling cloud spend
# EC2 cost savings
Reserved Instances  -- 1-3 year commitment, 30-60% savings
Spot Instances      -- 70-90% savings, can be interrupted
Auto Scaling        -- Scale down when load is low

# Right-sizing
# Monitor CPU utilisation (memory metrics are only available once the CloudWatch agent is installed)
# Downsize underutilised instances
aws cloudwatch get-metric-statistics \
    --metric-name CPUUtilization \
    --namespace AWS/EC2 \
    --statistics Average \
    --period 3600 \
    --start-time 2026-01-01T00:00:00Z \
    --end-time 2026-01-07T00:00:00Z \
    --dimensions Name=InstanceId,Value=i-1234567890abcdef0

# S3 cost savings
# Set lifecycle rules to move old data to Glacier
# Abort incomplete multipart uploads with a lifecycle rule
# Enable S3 Intelligent-Tiering when access patterns are uncertain
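
To turn the CloudWatch query into a right-sizing decision, the JSON it returns can be averaged and compared against a cut-off. This is a rough sketch under stated assumptions: the 20% threshold and the function names are illustrative choices, not AWS recommendations, and the input is assumed to be the JSON printed by `aws cloudwatch get-metric-statistics` with `--statistics Average`.

```python
import json
from statistics import mean

# Hypothetical cut-off: below this average CPU percentage, consider downsizing
CPU_THRESHOLD_PERCENT = 20.0


def average_cpu(cloudwatch_output: str) -> float:
    """Average the 'Average' field across all datapoints in the CLI's JSON output."""
    datapoints = json.loads(cloudwatch_output).get("Datapoints", [])
    if not datapoints:
        raise ValueError("no datapoints returned; check the instance ID and time range")
    return mean(point["Average"] for point in datapoints)


def is_underutilised(cloudwatch_output: str,
                     threshold: float = CPU_THRESHOLD_PERCENT) -> bool:
    """Return True when the mean CPU utilisation is below the threshold."""
    return average_cpu(cloudwatch_output) < threshold


if __name__ == "__main__":
    # Sample payload in the shape the CLI returns; the values are made up
    sample = json.dumps({
        "Label": "CPUUtilization",
        "Datapoints": [
            {"Timestamp": "2026-01-01T00:00:00Z", "Average": 4.0, "Unit": "Percent"},
            {"Timestamp": "2026-01-01T01:00:00Z", "Average": 8.0, "Unit": "Percent"},
        ],
    })
    print(average_cpu(sample), is_underutilised(sample))
```

A single weekly average hides short peaks, so a real decision would likely also look at the Maximum statistic before shrinking an instance.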

Frequently Asked Questions

AWS vs GCP vs Azure -- which should I learn first?

AWS has the largest market share (~32%) and the most job listings. Learn AWS first. GCP is strong for data/ML workloads. Azure dominates enterprise Microsoft environments. Once you know AWS deeply, GCP and Azure concepts transfer easily -- they solve the same problems with different interfaces.

What AWS certifications are worth getting?

AWS Certified Solutions Architect - Associate (exam code SAA-C03) is the most widely recognised starting point and is respected by hiring managers. AWS Certified DevOps Engineer - Professional is the certification aimed specifically at DevOps roles and is a sensible next step.

What is a VPC and why does it matter?

A VPC (Virtual Private Cloud) is your isolated network in AWS. Resources inside a VPC can communicate with each other but are isolated from other customers' resources. Use public subnets for resources that must be reachable from the internet, such as load balancers, and private subnets for databases and internal services. Resources in private subnets can still make outbound connections through a NAT gateway.
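
Planning the address space is the first practical step in VPC design. The sketch below uses Python's standard `ipaddress` module to carve a VPC CIDR into non-overlapping public and private subnets. The `10.0.0.0/16` range, the `/24` subnet size and the function name `plan_subnets` are assumptions chosen for illustration.

```python
import ipaddress
from itertools import islice


def plan_subnets(vpc_cidr: str, public_count: int, private_count: int,
                 new_prefix: int = 24) -> dict:
    """Split a VPC CIDR into non-overlapping public and private subnet CIDRs."""
    vpc = ipaddress.ip_network(vpc_cidr)
    needed = public_count + private_count
    # subnets() yields consecutive blocks of the requested prefix length
    chosen = list(islice(vpc.subnets(new_prefix=new_prefix), needed))
    if len(chosen) < needed:
        raise ValueError(f"{vpc_cidr} cannot hold {needed} /{new_prefix} subnets")
    return {
        "public": [str(net) for net in chosen[:public_count]],
        "private": [str(net) for net in chosen[public_count:]],
    }


if __name__ == "__main__":
    plan = plan_subnets("10.0.0.0/16", public_count=2, private_count=2)
    for tier, cidrs in plan.items():
        print(tier, cidrs)
```

In practice each public/private pair would be placed in a different Availability Zone; the function only handles the address arithmetic.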

What is the difference between ECS and EKS?

ECS is AWS's proprietary container service -- simpler and cheaper to operate. EKS is managed Kubernetes -- more complex but portable to other Kubernetes environments. Use ECS for simple containerised applications. Use EKS when you need Kubernetes features or portability.

How do I avoid accidental cloud cost spikes?

Set AWS Budgets alerts at 80% and 100% of your monthly budget. Enable Cost Anomaly Detection. Tag all resources with environment and project tags so spend can be attributed. Use Cost Explorer to analyse spending. Use IAM policies to restrict who can create expensive resources.
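
The 80% and 100% alert levels are simple ratio checks. The following sketch shows the arithmetic only; it does not call AWS Budgets, and the function name and default thresholds are illustrative.

```python
def crossed_thresholds(spend: float, budget: float,
                       thresholds=(0.8, 1.0)) -> list:
    """Return the alert thresholds that current spend has reached or passed."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    ratio = spend / budget
    return [level for level in thresholds if ratio >= level]


if __name__ == "__main__":
    # Example: 850 spent against a monthly budget of 1000
    for level in crossed_thresholds(spend=850.0, budget=1000.0):
        print(f"alert: spend has reached {level:.0%} of budget")
```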

In Part 12, we build a complete DevOps project bringing together everything from this series: CI/CD, Docker, Kubernetes, Terraform, and monitoring.



AWS Cost Optimisation for DevOps Teams

Practical cost reduction strategies
# 1. Use Spot Instances for non-critical workloads
# 70-90% cheaper than On-Demand
# Good for: CI/CD runners, batch processing, dev environments
resource "aws_spot_fleet_request" "ci_runners" {
  allocation_strategy = "diversified"
  spot_price          = "0.05"
  iam_fleet_role      = aws_iam_role.fleet.arn
  target_capacity     = 2

  launch_specification {
    instance_type = "t3.medium"
    ami           = data.aws_ami.ubuntu.id
  }
}

# 2. Reserved Instances for stable workloads
# 1-year commitment: 30-40% discount
# 3-year commitment: 50-60% discount
# Use Compute Savings Plans for flexibility

# 3. S3 lifecycle policies for storage cost
resource "aws_s3_bucket_lifecycle_configuration" "logs" {
  bucket = aws_s3_bucket.logs.id
  rule {
    id     = "archive-and-expire"
    status = "Enabled"
    filter {} # empty filter: the rule applies to every object in the bucket
    transition {
      days          = 30
      storage_class = "STANDARD_IA"
    }
    transition {
      days          = 90
      storage_class = "GLACIER"
    }
    expiration {
      days = 365
    }
  }
}
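
To reason about what the lifecycle rule above does to a given object, the transitions can be modelled as a function of object age. This is a simplified sketch: it assumes objects are uploaded to the STANDARD class, it ignores the fact that S3 applies lifecycle actions in periodic batches rather than at the exact day boundary, and the function name is hypothetical.

```python
from typing import Optional


def storage_class_for_age(age_days: int) -> Optional[str]:
    """Return the storage class an object should be in, or None once it has expired.

    Mirrors the rule above: STANDARD_IA at 30 days, GLACIER at 90, expiry at 365.
    """
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    if age_days >= 365:
        return None  # expired and deleted
    if age_days >= 90:
        return "GLACIER"
    if age_days >= 30:
        return "STANDARD_IA"
    return "STANDARD"  # assumed upload class


if __name__ == "__main__":
    for age in (0, 30, 90, 365):
        print(age, storage_class_for_age(age))
```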

Multi-Region Disaster Recovery

Active-passive DR architecture
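# RDS cross-Region read replica (promote it manually to make it the primary on failover)
# A cross-Region source must be given as an ARN; 123456789012 is a placeholder account ID
aws rds create-db-instance-read-replica \
    --db-instance-identifier mydb-dr \
    --source-db-instance-identifier arn:aws:rds:us-east-1:123456789012:db:mydb \
    --source-region us-east-1 \
    --region ap-south-1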

# S3 Cross-Region Replication (versioning must be enabled on both buckets)
aws s3api put-bucket-replication \
    --bucket source-bucket \
    --replication-configuration file://replication.json

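# Route 53 health check failover (zone, name and IP below are placeholders)
# A matching record with type = "SECONDARY" is needed for the DR region
resource "aws_route53_record" "primary" {
  zone_id = aws_route53_zone.main.zone_id
  name    = "app.example.com"
  type    = "A"
  ttl     = 60
  records = ["192.0.2.10"]
  failover_routing_policy { type = "PRIMARY" }
  health_check_id = aws_route53_health_check.primary.id
  set_identifier  = "primary"
}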
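
The decision that failover routing makes can be summarised in a few lines. This is an illustrative model only, not Route 53 code: the `Endpoint` class and `resolve` function are hypothetical, and the fallback to the primary when both endpoints are unhealthy reflects my understanding of Route 53's documented behaviour and should be verified against the official docs.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    name: str      # label for the record, e.g. "primary"
    healthy: bool  # result of the associated health check


def resolve(primary: Endpoint, secondary: Endpoint) -> str:
    """Pick which record answers a DNS query under active-passive failover."""
    if primary.healthy:
        return primary.name
    if secondary.healthy:
        return secondary.name
    # Assumption: with nothing healthy, the primary record is still returned
    return primary.name


if __name__ == "__main__":
    print(resolve(Endpoint("primary", False), Endpoint("secondary", True)))
```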