AWS SAA-C03: How to Design Cost-Optimized Compute Solutions

A practical guide to picking the right AWS compute service, pricing model, and scaling approach for SAA-C03 and for the kind of architecture decisions you actually make in the real world.

1. Cost-Optimized Compute: A decision method that actually holds up when you use it

For SAA-C03, the right compute answer usually isn’t the one that just looks cheapest at first glance. It is the lowest-cost option that still meets reliability, performance, security, and operational requirements. That distinction matters. Spot is cheap, but wrong for a single critical instance. Lambda is elegant, but wrong for jobs that run longer than 15 minutes per invocation. EKS is powerful, no question, but it’s often more platform than you actually need when ECS or Fargate can do the same job with less cost and a lot less operational overhead.

Use this exam and architecture method:

  • Identify workload shape: steady, bursty, event-driven, batch, or mostly idle.
  • Identify hard constraints: runtime limit, OS control, Kubernetes requirement, licensing, isolation, startup latency.
  • Eliminate invalid options: for example, Lambda for >15-minute work, Spot for non-interruptible single-instance workloads.
  • Choose the service model first: EC2, Lambda, ECS/Fargate, Batch, EKS, or managed platform.
  • Choose pricing model last: On-Demand, Savings Plans, Reserved Instances, Spot, or dedicated tenancy options.

Quick memory aid: Steady = commit. Bursty = scale. Interruptible = Spot. Event-driven = Lambda. Containers ≠ Kubernetes. Capacity guarantee ≠ discount.

2. Compute Service Decision Tree

Start with workload behavior, not service familiarity.

Workload pattern | Best-fit service | Key reason
24/7 predictable baseline | EC2 + Savings Plans or Reserved Instances | Committed usage reduces cost
Spiky stateless web tier | EC2 Auto Scaling Groups, often mixed On-Demand + Spot | Elasticity removes idle capacity
Short-lived event-driven processing | Lambda | Pay only when code runs
Queue-based parallel batch | AWS Batch with Spot | Built for retryable, interruption-tolerant jobs
Containers without Kubernetes requirement | ECS on Fargate or ECS on EC2 | Simpler and often cheaper than EKS operationally
Real Kubernetes requirement | EKS | Kubernetes API/ecosystem needed
Simple small web app | Lightsail or App Runner | Lower complexity and predictable deployment model

Elimination logic is exam gold: if the workload is event-driven and brief, Lambda moves up. If it needs full OS control or specific kernel behavior, Lambda drops out. If Kubernetes is not a stated requirement, deprioritize EKS. If the work is retryable and fault tolerant, Spot and Batch become strong answers.
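
The elimination logic above can be sketched as a small function. This is only an illustration: the `Workload` fields and the returned labels are hypothetical names made up for this example, and a real decision would weigh more constraints than these.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical workload description; field names are illustrative."""
    event_driven: bool = False
    max_runtime_minutes: float = 1.0
    needs_os_control: bool = False
    needs_kubernetes: bool = False
    containerized: bool = False
    retryable_batch: bool = False


def candidate_services(w: Workload) -> list[str]:
    """Return services that survive elimination, in rough order of preference."""
    candidates = []
    # Lambda survives only if work is event-driven, brief, and needs no OS control.
    if w.event_driven and w.max_runtime_minutes <= 15 and not w.needs_os_control:
        candidates.append("Lambda")
    # Retryable, fault-tolerant batch work makes Batch on Spot a strong answer.
    if w.retryable_batch:
        candidates.append("AWS Batch on Spot")
    if w.containerized:
        # EKS only when Kubernetes is a stated requirement.
        candidates.append("EKS" if w.needs_kubernetes else "ECS on Fargate or EC2")
    # EC2 stays as the fallback because it can host anything, including OS-level needs.
    candidates.append("EC2")
    return candidates
```

For example, `candidate_services(Workload(event_driven=True, max_runtime_minutes=40))` drops Lambda because of the 15-minute limit and leaves only the EC2 fallback.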

3. EC2 cost optimization: rightsizing, pricing, and scaling without paying for capacity you don’t actually need

EC2 is still the go-to for a lot of production workloads because it gives you real control over the operating system, networking, storage, and the exact instance type you want. It is also where waste shows up fastest.

Rightsizing playbook: collect CPUUtilization, network throughput, disk IOPS/throughput, EBS queue depth, and application latency. For memory, remember that CloudWatch does not publish guest memory utilization by default; install the CloudWatch agent or another in-guest telemetry tool. Review at least a representative business cycle, then compare with AWS Compute Optimizer recommendations. Compute Optimizer is useful, but it only works for supported resources and needs enough metric history to make good recommendations.

Symptom | Likely issue | Likely action
Low CPU and low memory for weeks | Oversized instance | Downsize or scale in
High CPU, normal memory | CPU-bound | Move to compute-optimized family
Memory pressure, swap, normal CPU | Memory-bound | Move to memory-optimized family
High EBS queue length or throughput bottleneck | Storage mismatch | Adjust EBS type, IOPS, throughput, or redesign storage path
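
One way to turn the symptom table into a first-pass triage is a small classifier over averaged metrics. The thresholds below (20%, 80%, queue length 4) are assumptions picked for illustration, not AWS guidance, and the function name is hypothetical.

```python
def rightsizing_hint(cpu_pct: float, mem_pct: float, ebs_queue: float) -> str:
    """Map averaged metrics to a table-style action.

    All thresholds are illustrative assumptions; tune them per workload.
    """
    low, high, queue_high = 20.0, 80.0, 4.0
    if ebs_queue >= queue_high:
        return "adjust EBS type, IOPS, or throughput"
    if cpu_pct < low and mem_pct < low:
        return "downsize or scale in"
    if cpu_pct >= high and mem_pct < high:
        return "consider compute-optimized family"
    if mem_pct >= high and cpu_pct < high:
        return "consider memory-optimized family"
    # Anything else (including both CPU and memory high) needs a human look.
    return "review manually"
```

Memory figures would have to come from the CloudWatch agent or another in-guest tool, since they are not published by default.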

T-family caution: burstable instances are good for low baseline CPU with occasional spikes. Many run in Unlimited mode, which can incur surplus CPU credit charges during sustained high usage. Watch CPUCreditBalance and CPUSurplusCreditCharged. If the workload is consistently busy, move off T instances.

Graviton: often a strong price/performance lever. Validate architecture support for binaries, agents, container images, and libraries. In practice, I’d test with multi-architecture builds, measure latency and throughput, and then roll out in stages with canaries and a clean rollback path.

Storage matters to compute cost: gp3 is often a better cost-optimization choice than older gp2 because performance can be tuned independently. EFS is not inherently cheaper than EBS; choose it for shared managed file access and elastic NFS semantics. Instance store can be excellent for scratch or cache data, but it is ephemeral and only available on some instance types.

EC2 purchasing models: what the exam expects you to know

Option | Best use | Important nuance
On-Demand | Unpredictable or short-term usage | No commitment, highest unit cost
Compute Savings Plans | Committed spend with broad flexibility | Billing discount across EC2, Fargate, and Lambda
EC2 Instance Savings Plans | Steady EC2 family usage in one Region | More restrictive than Compute Savings Plans, but a deeper discount in exchange for less flexibility
Standard Reserved Instances | Very stable EC2 usage | Usually deepest EC2 discount, least flexible
Convertible Reserved Instances | Need commitment with change flexibility | Lower discount than Standard RI, but exchange options
Spot Instances | Fault-tolerant workloads | Very low cost, but interruptible and capacity not guaranteed

Two exam distinctions matter a lot:

  • Savings Plans and most Reserved Instances are billing discounts, not capacity guarantees.
  • Zonal Reserved Instances can provide capacity reservation in a specific AZ; Regional RIs do not. If the question asks for guaranteed capacity, think Zonal RI or On-Demand Capacity Reservation, not just “cheaper pricing.”

A good practical pattern is to commit only to the known baseline, then let Amazon EC2 Auto Scaling handle burst capacity with On-Demand or Spot. That way, you’re not paying for peak traffic all day long when you’re really only hitting it for short windows.
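
The arithmetic behind "commit to the baseline, not the peak" is easy to check with a toy demand profile. The rates and the demand shape below are hypothetical numbers chosen for illustration, not real AWS prices.

```python
def profile_cost(demand, committed, committed_rate, on_demand_rate):
    """Total cost over a demand profile (instances needed per hour).

    Committed capacity is paid for every hour whether used or not;
    anything above it is bought On-Demand. Rates are hypothetical.
    """
    total = 0.0
    for needed in demand:
        overflow = max(0, needed - committed)
        total += committed * committed_rate + overflow * on_demand_rate
    return total


# Hypothetical day: 4 instances for 20 hours, 10 instances for a 4-hour peak.
demand = [4] * 20 + [10] * 4
ON_DEMAND = 0.10  # assumed $/instance-hour
COMMITTED = 0.06  # assumed discounted $/instance-hour

costs = {
    "no commitment": profile_cost(demand, 0, COMMITTED, ON_DEMAND),
    "commit to baseline (4)": profile_cost(demand, 4, COMMITTED, ON_DEMAND),
    "commit to peak (10)": profile_cost(demand, 10, COMMITTED, ON_DEMAND),
}
```

With these made-up numbers, committing to the baseline is the cheapest of the three, and committing to the peak costs more than having no commitment at all, because the discounted rate is paid on idle capacity for 20 hours a day.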

4. Spot Design Patterns and Auto Scaling That Save Money Safely

Spot is one of the best cost tools in AWS when the workload is interruption tolerant. AWS can reclaim Spot capacity, typically with a two-minute interruption notice. That means your design must tolerate replacement.

Best practices:

  • Use multiple instance types and multiple AZs.
  • Prefer price-capacity-optimized or capacity-optimized allocation strategies over a pure lowest-price strategy.
  • Use mixed instances policies in Auto Scaling Groups.
  • Enable Capacity Rebalancing so the group launches replacement capacity when rebalance recommendations appear.
  • Design for checkpointing, queue-based work, idempotency, and graceful draining from target groups.
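
As a rough sketch, the first four practices map onto an Auto Scaling group request like the one below. It only builds a dictionary shaped, as far as I know, like the keyword arguments of boto3's `create_auto_scaling_group`; nothing is sent to AWS, and the group name, launch template name, subnet IDs, and instance types are all hypothetical. Check the field names against the current boto3 documentation before using it.

```python
def mixed_instances_request(launch_template_name, instance_types, subnet_ids,
                            on_demand_base=2, min_size=2, max_size=10):
    """Build kwargs shaped for autoscaling.create_auto_scaling_group.

    Nothing is sent to AWS here; all names passed in are placeholders.
    """
    return {
        "AutoScalingGroupName": "web-asg",  # hypothetical name
        "MinSize": min_size,
        "MaxSize": max_size,
        # Comma-separated subnets; one subnet per AZ spreads the group across AZs.
        "VPCZoneIdentifier": ",".join(subnet_ids),
        # Launch replacements when a rebalance recommendation arrives.
        "CapacityRebalance": True,
        "MixedInstancesPolicy": {
            "LaunchTemplate": {
                "LaunchTemplateSpecification": {
                    "LaunchTemplateName": launch_template_name,
                    "Version": "$Latest",
                },
                # Several instance types give Spot more capacity pools to draw from.
                "Overrides": [{"InstanceType": t} for t in instance_types],
            },
            "InstancesDistribution": {
                "OnDemandBaseCapacity": on_demand_base,
                "OnDemandPercentageAboveBaseCapacity": 0,
                "SpotAllocationStrategy": "capacity-optimized",
            },
        },
    }


request = mixed_instances_request(
    "web-lt", ["m5.large", "m5a.large", "m6i.large"], ["subnet-a", "subnet-b"]
)
```

The On-Demand base keeps a small reliable floor, and everything above it comes from Spot because `OnDemandPercentageAboveBaseCapacity` is 0.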

For EC2 fleets, target tracking scaling is the most exam-friendly default: keep CPU, request count per target, or another metric near a target value. Scheduled scaling works for known office-hour or campaign patterns. Predictive scaling can help when demand is cyclical. Set health checks correctly, tune instance warmup, and avoid scaling flaps caused by noisy metrics.

Worked pattern: ALB in front of an Auto Scaling Group across two AZs. Keep a small On-Demand baseline for reliability, cover that baseline with Savings Plans, and let Spot handle the burst if the app is stateless. Store sessions and state outside the instances — in DynamoDB, ElastiCache, RDS, S3, or EFS, depending on what kind of state you’re dealing with. That is both a strong production pattern and a strong exam answer.

5. Lambda: Cheapest for Intermittent Work, Not for Everything

Lambda usually shines when the compute is intermittent, event-driven, and short-lived. Pricing is based on requests, duration in GB-seconds, memory setting, architecture, and optional features such as Provisioned Concurrency and ephemeral storage above the included allocation. And yes, there’s a free tier too, which can really matter for low-volume workloads.

Important architecture limits and tuning points:

  • Maximum execution time: 15 minutes per invocation.
  • Memory affects CPU allocation: more memory can reduce duration and sometimes lower total cost.
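  • Reserved concurrency: sets aside part of the account concurrency pool for one function and caps that function at the same number, so it cannot starve other functions or overwhelm downstream systems.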
  • Provisioned Concurrency: reduces cold starts for latency-sensitive paths, but costs money even when idle.
  • VPC attachment: can add startup overhead and networking complexity depending on design.
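
The memory point is worth a quick calculation. Duration charges scale with GB-seconds, so a higher memory setting only costs more if the run time does not drop enough to compensate. The measurements below are hypothetical; you would get real ones by testing the function at each setting.

```python
def gb_seconds(memory_mb: int, duration_ms: float, invocations: int) -> float:
    """Billed compute in GB-seconds: memory (GB) x duration (s) x invocations."""
    return (memory_mb / 1024) * (duration_ms / 1000) * invocations


# Hypothetical measurements for the same function at two memory settings.
small = gb_seconds(memory_mb=512, duration_ms=800, invocations=1_000_000)
large = gb_seconds(memory_mb=1024, duration_ms=350, invocations=1_000_000)
```

Here the 1024 MB setting bills 350,000 GB-seconds against 400,000 for 512 MB, so doubling memory is cheaper in this made-up case because duration fell by more than half.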

Typical good fits are S3 → Lambda for file processing, SQS → Lambda for asynchronous workers, and EventBridge → Lambda for schedules and automation. Typical bad fits are sustained 24/7 services, jobs that run longer than 15 minutes, or workloads that need deep OS-level control.

Here’s the basic break-even logic: if an API gets occasional bursts but sits idle most of the time, Lambda often wins because you’re not paying for always-on servers. If traffic turns into steady, high volume all day, always-on EC2 or containers with commitments can end up cheaper, especially if you need Provisioned Concurrency all the time.
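
To make the break-even idea concrete, here is a minimal cost comparison. The rates are illustrative assumptions (they resemble published x86 on-demand figures, but pricing varies by Region and architecture and changes over time), the always-on hourly rate is invented, and the model ignores the free tier, Provisioned Concurrency, and whether one server could actually handle the load.

```python
# Illustrative rates only; check current pricing for your Region and architecture.
PRICE_PER_MILLION_REQUESTS = 0.20
PRICE_PER_GB_SECOND = 0.0000166667
HOURS_PER_MONTH = 730


def lambda_monthly_cost(requests: int, avg_ms: float, memory_mb: int) -> float:
    """Request charge plus duration charge; ignores free tier and extras."""
    compute = requests * (avg_ms / 1000) * (memory_mb / 1024) * PRICE_PER_GB_SECOND
    return requests / 1_000_000 * PRICE_PER_MILLION_REQUESTS + compute


def always_on_monthly_cost(hourly_rate: float, instances: int = 1) -> float:
    """Cost of servers that run all month regardless of traffic."""
    return hourly_rate * instances * HOURS_PER_MONTH


def break_even_requests(hourly_rate: float, avg_ms: float, memory_mb: int) -> float:
    """Monthly request count at which Lambda cost equals one always-on server."""
    per_request = lambda_monthly_cost(1, avg_ms, memory_mb)
    return always_on_monthly_cost(hourly_rate) / per_request


# Assumed: a $0.04/hour server versus a 100 ms, 512 MB function.
be = break_even_requests(hourly_rate=0.04, avg_ms=100, memory_mb=512)
```

Under these assumptions the crossover sits at roughly 28 million requests a month. Below that, Lambda is cheaper; well above it, the always-on option wins, which matches the intuition in the paragraph above.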

6. Container Platform Cost Engineering: ECS, Fargate, and EKS

The exam loves this trap: “uses containers” does not mean “needs Kubernetes.”

Platform | Cost profile | Operational profile
ECS on EC2 | Often lowest direct cost at good density | You manage instances, patching, and cluster capacity
ECS on Fargate | Higher direct compute cost in many cases | No server management, simpler for small teams
EKS | Control plane fee plus worker/Fargate cost | Highest complexity; justify with real Kubernetes need

ECS on EC2 rewards good bin-packing and rightsized task reservations. ECS on Fargate trades some direct cost for reduced operational burden. Fargate Spot can cut cost for interruption-tolerant container tasks. EKS can run on EC2 or Fargate, but remember the extra per-cluster control plane charge and add-on costs such as ingress, observability, and often NAT or load balancers. Kubernetes can absolutely improve standardization, but portability across clouds doesn’t happen automatically: networking, IAM, storage, and observability still work a little differently on each platform.
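
To see why task reservations matter for ECS on EC2, a first-fit-decreasing packing estimate is enough. This is a rough sketch: the real ECS scheduler applies placement strategies and constraints that are not modeled here, and the instance capacity and task sizes are hypothetical.

```python
def instances_needed(task_reservations, instance_cpu, instance_mem):
    """First-fit decreasing over (cpu_units, mem_mib) task reservations.

    A rough estimate only; it ignores agent overhead and placement rules.
    """
    free = []  # remaining [cpu, mem] on each instance opened so far
    for cpu, mem in sorted(task_reservations, reverse=True):
        if cpu > instance_cpu or mem > instance_mem:
            raise ValueError("task does not fit on this instance size")
        for slot in free:
            if slot[0] >= cpu and slot[1] >= mem:
                slot[0] -= cpu
                slot[1] -= mem
                break
        else:
            # No existing instance had room, so open a new one.
            free.append([instance_cpu - cpu, instance_mem - mem])
    return len(free)


# Eight tasks on hypothetical instances with 2048 CPU units and 4096 MiB.
over_requested = instances_needed([(1024, 2048)] * 8, 2048, 4096)
rightsized = instances_needed([(512, 1024)] * 8, 2048, 4096)
```

Halving the reservation per task halves the instance count in this example (4 down to 2), which is the kind of saving that over-requested CPU and memory quietly hide.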

A lot of container cost creep comes from things teams miss in reviews — over-requested CPU and memory, growing CloudWatch Logs, NAT Gateway charges for private subnet egress, ALB or NLB charges, and inter-AZ data transfer.

7. Batch, Beanstalk, App Runner, and Lightsail

Service | Best fit | Cost note
AWS Batch | Queued, parallel, retryable jobs | Excellent with Spot compute environments
Elastic Beanstalk | Managed app deployment on top of EC2 and related AWS resources, with simpler deployment and less platform management | No additional charge for Beanstalk itself; you pay for EC2, ELB, EBS, and related resources
App Runner | Simple HTTP apps and APIs from code or containers | Can reduce ops cost, but compare against ECS, Fargate, or Lambda based on traffic profile
Lightsail | Simple small workloads with bundled pricing | Good for predictability, not usually for complex enterprise HA designs

Batch deserves special attention: define a job definition, submit to a queue, attach a compute environment, and use retry strategies and dependencies. It is a classic answer for rendering, simulations, analytics, and ETL where jobs can retry and checkpoint.
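
A submission with retries and a dependency might look roughly like this. The helper only builds a dictionary shaped, to my knowledge, like the keyword arguments of boto3's `batch.submit_job`; nothing is submitted, and the job, queue, and definition names are hypothetical. Verify the field names against the current API reference.

```python
def batch_job_request(name, queue, definition, attempts=3, depends_on=()):
    """Build kwargs shaped for batch.submit_job.

    Nothing is submitted here; all names passed in are placeholders.
    """
    request = {
        "jobName": name,
        "jobQueue": queue,
        "jobDefinition": definition,
        # Retries let a job survive a Spot interruption and run again.
        "retryStrategy": {"attempts": attempts},
    }
    if depends_on:
        # The job waits until each listed job ID has completed.
        request["dependsOn"] = [{"jobId": job_id} for job_id in depends_on]
    return request


extract = batch_job_request("nightly-extract", "spot-queue", "etl-extract:1")
transform = batch_job_request(
    "nightly-transform", "spot-queue", "etl-transform:1",
    depends_on=["job-id-of-extract"],  # placeholder for the ID returned on submit
)
```

In a real pipeline the dependency ID would come from the response to the first submission rather than a hard-coded string.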

8. Hidden Cost Drivers Beyond Compute

Compute decisions change total architecture cost. Watch for:

  • Load balancers: ALB/NLB pricing and per-usage dimensions.
  • NAT Gateway: often a surprise cost for private subnet outbound traffic.
  • Inter-AZ data transfer: can matter in chatty multi-tier designs.
  • EBS performance charges: gp3, io1/io2 IOPS and throughput choices.
  • Public IPv4 charges where applicable.
  • Logging and metrics: CloudWatch Logs, custom metrics, tracing.
  • Licensing and tenancy: Windows, commercial software, Dedicated Hosts for some BYOL cases.

Dedicated Hosts and Dedicated Instances are not generic cost savers. Dedicated Hosts give host-level visibility and control, useful for certain BYOL or compliance scenarios. Dedicated Instances provide instance-level tenancy isolation without host-level control. Choose them because you actually need licensing support or isolation, not just because they sound more “enterprise.”

9. Governance, security, and troubleshooting: the part that keeps cost optimization from turning into a mess

Use Cost Explorer, AWS Budgets, Cost Anomaly Detection, Compute Optimizer, and the Cost and Usage Report. For deeper analysis, I usually pull detailed cost and usage data into analytics tools and then put the results into dashboards so the trends are easier to see. Trusted Advisor can help identify some idle or underused resources, but checks vary by support plan and service scope.

Tagging should be mandatory: Environment, Application, Owner, and CostCenter. Enforce standards with Organizations tag policies and account governance.
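
A tiny compliance check shows what "mandatory" means in practice. This is a local sketch over plain dictionaries; in a real account the tags would come from an inventory or API call, and the sample resource is hypothetical.

```python
REQUIRED_TAGS = {"Environment", "Application", "Owner", "CostCenter"}


def missing_tags(resource_tags: dict) -> set:
    """Return required tag keys that are absent or empty on a resource."""
    present = {key for key, value in resource_tags.items() if str(value).strip()}
    return REQUIRED_TAGS - present


# Hypothetical resource with two of the four required tags.
gaps = missing_tags({"Environment": "prod", "Owner": "team-a"})
```

Here `gaps` contains `Application` and `CostCenter`, which is exactly the kind of resource that shows up as unallocated spend in cost reports.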

Security responsibility changes by compute model. With EC2 and ECS on EC2, you’re responsible for patching and hardening the operating system. With Fargate and Lambda, AWS takes care of more of the underlying infrastructure, but you still own IAM least privilege, secrets handling, network controls, and application security. Make sure you’re using instance profiles, task roles, and Lambda execution roles properly. Pull secrets from Secrets Manager or Parameter Store instead of baking them into images or user data. That’s one of those small changes that saves a lot of pain later.

Symptom | Likely cause | Fix
EC2 spend high, utilization low | Oversized or wrong family | Rightsize, scale in, review commitments after optimization
Lambda cost spike | Longer duration, too much memory, or idle Provisioned Concurrency | Tune memory, timeout, code path, and concurrency settings
Spot fleet unstable | Too few instance types/AZs, no rebalance handling | Diversify pools and enable Capacity Rebalancing
Fargate spend high | Oversized task CPU/memory | Rightsize task definitions and autoscaling thresholds
EKS cost creep | Idle clusters, add-ons, logging, ingress, NAT | Review full platform cost, not just worker nodes

10. Exam Trap Patterns and Best Answers

Steady ERP app, 24/7: EC2 with Savings Plans or Standard RIs for the baseline, plus Auto Scaling if needed. On-Demand is the tempting but weaker answer.

Spiky stateless marketing site: ALB + Auto Scaling Group with On-Demand baseline and Spot burst. Fixed fleets waste money.

S3-triggered image processing: Lambda. Fargate is possible, but usually less natural and less cost-efficient for short event-driven execution.

Nightly ETL or rendering queue: AWS Batch with Spot. Lambda may hit the 15-minute limit or become awkward to orchestrate.

Containers, small ops team, no Kubernetes requirement: ECS on Fargate. EKS is the classic distractor.

Need guaranteed EC2 capacity in one AZ: Zonal RI or On-Demand Capacity Reservation. Savings Plans are discounts, not capacity guarantees.

11. Final SAA-C03 Cheat Sheet

Keyword | Likely answer | Common wrong answer
Predictable baseline | Savings Plans or RIs | All On-Demand
Fault-tolerant, retryable | Spot | On-Demand only
Event-driven, short-lived | Lambda | Always-on EC2
Queue-based parallel jobs | AWS Batch | Single large EC2 server
Containers, no K8s need | ECS/Fargate | EKS
BYOL with host visibility | Dedicated Host | Dedicated Instance
Guaranteed AZ capacity | Zonal RI or Capacity Reservation | Compute Savings Plan

The exam does not reward memorizing prices. It rewards recognizing workload shape, eliminating invalid options, and then choosing the cheapest architecture that still satisfies the real requirements. That is also how good AWS architecture works in production.