AWS SAA-C03: How to Design High-Performing and Scalable Network Architectures

A practical guide to VPC design, load balancing, global routing, private access, and hybrid connectivity for AWS Certified Solutions Architect – Associate (SAA-C03).

Why this domain matters in SAA-C03

SAA-C03 networking questions are rarely about packet-level trivia. They’re really checking whether you can pick an architecture that scales cleanly, holds up under failure, stays secure, and doesn’t turn into a maintenance headache. Usually, the best answer is the simplest managed design that avoids single points of failure, keeps private traffic private, and lines up the protocol and routing requirement with the right AWS service.

For exam purposes, keep four ideas in mind: Multi-AZ is the production baseline, public exposure should be minimized, private AWS service access usually beats internet egress when possible, and wording matters. “Static IPs,” “path-based routing,” “many VPCs,” “predictable hybrid performance,” and “private access to S3” each point toward very different services.

VPC foundations: CIDR, subnets, and routing

Amazon VPC is your isolated network boundary. It defines IP space, subnets, route tables, gateways, and security controls. Good designs start with CIDR planning, because overlapping ranges and undersized subnets become painful later. VPC peering does not support overlapping CIDRs, and Transit Gateway designs are also much easier when address space is planned cleanly across accounts and Regions.

A few practical rules matter a lot:

  • AWS reserves 5 IP addresses in every subnet (the network address, the next three host addresses, and the broadcast address). Tiny subnets run out faster than many candidates expect.
  • Plan for growth. Auto Scaling, interface endpoints, containers in awsvpc mode, and failover capacity all consume IPs.
  • You can add secondary IPv4 CIDR blocks to a VPC later, but that does not erase poor original planning.
  • IPv6 subnets use /64 blocks, and dual-stack design is increasingly common.
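To see why tiny subnets bite, the arithmetic can be sketched with Python's standard ipaddress module. This is local math for illustration, not an AWS API call:

```python
import ipaddress

def usable_ipv4_addresses(cidr: str) -> int:
    """Usable IPv4 addresses in an AWS subnet.

    AWS reserves 5 addresses per subnet: the network address,
    the next three host addresses (router, DNS, future use),
    and the broadcast address.
    """
    return ipaddress.ip_network(cidr).num_addresses - 5

for cidr in ["10.0.0.0/28", "10.0.0.0/24", "10.0.0.0/20"]:
    print(cidr, "->", usable_ipv4_addresses(cidr), "usable addresses")
# 10.0.0.0/28 -> 11 usable addresses
# 10.0.0.0/24 -> 251 usable addresses
# 10.0.0.0/20 -> 4091 usable addresses
```

A /28 leaves only 11 usable addresses, which an Auto Scaling group plus a couple of interface endpoints can exhaust quickly.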

A subnet is public when its route table has a route to an Internet Gateway. But that does not mean every resource in it is internet-reachable. For IPv4 internet access, an instance also needs a public IPv4 address or Elastic IP, plus security group and NACL rules that allow the traffic. That distinction is a classic exam trap.

Route tables decide where traffic goes. AWS uses longest-prefix match, so a more specific route wins over a default route. And that matters a lot when you’re mixing local VPC routes, NAT, peering, Transit Gateway, and gateway endpoints in the same design.

Public subnet route table:
  10.0.0.0/16 -> local
  0.0.0.0/0 -> igw-1234

Private app subnet route table:
  10.0.0.0/16 -> local
  pl-s3prefix (S3 prefix list) -> vpce-gw-s3
  0.0.0.0/0 -> nat-az-a

TGW-attached subnet route table:
  10.0.0.0/16 -> local
  172.16.0.0/12 -> tgw-1234

That route example shows the logic pretty clearly: local traffic stays local, S3 can stay private through a gateway endpoint, outbound IPv4 internet traffic can go through NAT, and other private networks can be reached through Transit Gateway.
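Longest-prefix match can be sketched locally with the standard ipaddress module. The route targets below mirror the example route tables and are illustrative names, not real resource IDs:

```python
import ipaddress

# A toy route table: (destination CIDR, target). Targets are
# illustrative names in the style of the example above.
ROUTES = [
    ("10.0.0.0/16", "local"),
    ("0.0.0.0/0", "nat-az-a"),
    ("172.16.0.0/12", "tgw-1234"),
]

def next_hop(dest_ip: str, routes=ROUTES) -> str:
    """Return the target of the most specific matching route."""
    dest = ipaddress.ip_address(dest_ip)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in routes
        if dest in ipaddress.ip_network(cidr)
    ]
    # Longest prefix (most specific route) wins over the default route.
    _net, target = max(matches, key=lambda m: m[0].prefixlen)
    return target

print(next_hop("10.0.5.9"))       # -> local (stays inside the VPC)
print(next_hop("172.16.40.2"))    # -> tgw-1234 (other private networks)
print(next_hop("93.184.216.34"))  # -> nat-az-a (default route to NAT)
```

The same logic explains why adding a more specific route (a peering CIDR, a prefix list) quietly overrides the default route for that destination.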

Designing for Multi-AZ and getting the IPv6 basics right

For production workloads, I’d strongly recommend spreading subnets and targets across at least two Availability Zones. A very common pattern is to put load balancers and NAT Gateways in public subnets, application servers or containers in private app subnets, and databases in private data subnets. In most real-world designs, each AZ should have its own NAT Gateway so you don’t create a sneaky single-AZ dependency or rack up cross-AZ data charges.

With IPv6, there are two things you really want to keep straight. First, NAT Gateway is IPv4-only. Second, outbound-only IPv6 internet access from private subnets uses an egress-only Internet Gateway, not NAT. So in a dual-stack design, IPv4 and IPv6 may follow different outbound paths, and that’s completely normal.

A VPC with 10.0.0.0/16 and an IPv6 CIDR block
|
+-- AZ-a
|   +-- Public subnet  -> IGW
|   +-- Private app    -> NAT GW-a for IPv4, egress-only IGW for IPv6
|   +-- Private DB     -> no direct internet route
|
+-- AZ-b
    +-- Public subnet  -> IGW
    +-- Private app    -> NAT GW-b for IPv4, egress-only IGW for IPv6
    +-- Private DB     -> no direct internet route

This layout represents a resilient dual-stack network design. Public subnets handle internet-facing entry points, private application subnets use controlled outbound paths for IPv4 and IPv6, and database subnets stay isolated from direct internet exposure.

Stateless application tiers usually scale best when they sit behind load balancers and Auto Scaling groups. Stateful data should live in managed services where possible. That is the default SAA-C03 pattern because it improves resilience and reduces operational pain.

Private access, NAT, and VPC endpoints

If private instances need general outbound access to public endpoints, NAT Gateway is usually the right choice. It sits in a public subnet, uses an Elastic IP, and sends outbound IPv4 traffic out through the VPC’s Internet Gateway. It doesn’t allow random inbound connections back to those private instances.

But if the workload only needs supported AWS services, VPC endpoints are usually the better answer. They reduce exposure, often reduce cost, and keep traffic on AWS networking paths.

| Option | Best use | Key details |
| --- | --- | --- |
| NAT Gateway | Outbound IPv4 access to public endpoints | Managed, scalable, per-AZ; hourly and per-GB cost; cross-AZ routing adds cost and risk |
| Gateway Endpoint | Private access to S3 or DynamoDB | No hourly charge; route-table based using AWS-managed prefix lists |
| Interface Endpoint | Private access to many supported AWS or partner services | Uses ENIs in subnets, consumes IPs, needs security groups, supports private DNS, has hourly and data charges |

Gateway endpoints are only for S3 and DynamoDB. Many other services use interface endpoints through AWS PrivateLink, but not every AWS service supports them. Interface endpoints also matter for subnet sizing because each endpoint creates ENIs in selected subnets.

Endpoint policies can further restrict access for supported services. That is useful in exam scenarios where the requirement says “private access” and “least privilege.”
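As a sketch, a gateway endpoint policy that pins S3 access down to a single bucket might look like the following (the bucket name my-app-bucket is a placeholder, and real policies usually narrow Principal and Action further):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowOnlyOneBucket",
      "Effect": "Allow",
      "Principal": "*",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": [
        "arn:aws:s3:::my-app-bucket",
        "arn:aws:s3:::my-app-bucket/*"
      ]
    }
  ]
}
```

Because the policy is attached to the endpoint itself, it limits what any workload in the VPC can reach through that path, which is exactly the "private access plus least privilege" combination exam scenarios describe.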

Load balancer selection: ALB, NLB, and GWLB

Choose the load balancer by protocol and traffic behavior, not by habit.

| Load balancer | Best for | Important exam clues |
| --- | --- | --- |
| ALB | HTTP/HTTPS/gRPC applications | Host/path/header routing, redirects, WebSockets, WAF integration, internal or internet-facing |
| NLB | TCP/UDP/TLS at very high scale | Static IPs per AZ, optional TLS termination, low latency, commonly used when fixed addresses matter |
| GWLB | Transparent appliance insertion | For firewalls and inspection fleets; uses GENEVE; not a normal user-facing application balancer |

ALB is the right answer when the question mentions HTTP semantics like /api and /images, redirects, or host-based routing. NLB is better when the requirement says TCP, UDP, TLS, or static IP addresses. If the question asks for fixed global IPs rather than fixed regional IPs, Global Accelerator is usually stronger than NLB alone.

Both ALB and NLB can be internal or internet-facing. Internal load balancers are common for service-to-service traffic in private subnets. ALB also supports sticky sessions and Lambda targets in some use cases. NLB commonly preserves source IP in many deployment patterns, but do not treat that as an absolute in every target mode.

Health checks catch people out more often than they really should. If an ALB is returning 503s, I’d start by checking the health check path, whether the app is listening on the right port, whether the target security group allows traffic from the load balancer, and whether the targets are in the correct subnets.

How Route 53, CloudFront, and Global Accelerator fit into global traffic patterns

These services overlap in conversation more than in function.

| Service | Primary role | Best clue words |
| --- | --- | --- |
| Route 53 | DNS routing | Weighted, failover, latency, geolocation, alias records |
| CloudFront | CDN and edge acceleration for HTTP/HTTPS | Caching, origin offload, static content, dynamic web acceleration |
| Global Accelerator | Static anycast IPs and optimized global pathing | Fast failover, TCP/UDP, global entry point, fixed global IPs |

Route 53 answers DNS queries; it is not a proxy. Failover is affected by TTL and client resolver caching, so DNS failover is not instantaneous. CloudFront is not just for static files; it also improves dynamic HTTP/HTTPS delivery, adds edge presence, and commonly sits in front of ALB or S3. Global Accelerator improves entry onto the AWS global network and is excellent when you need static anycast IPs or faster failover characteristics than DNS-only approaches.

If a question says global TCP/UDP application with static IPs, think Global Accelerator in front of regional NLBs. If it says global website performance and caching, think CloudFront. If it says weighted or latency-based DNS steering, think Route 53.

Connecting VPCs and hybrid networks with peering, Transit Gateway, and hybrid connectivity

VPC peering is simple and useful for a small number of one-to-one connections. But it is non-transitive, does not allow overlapping CIDRs, and does not let you transit through a peer’s IGW, NAT Gateway, or VPN. That makes it poor for large meshes.

Transit Gateway is the scalable hub-and-spoke option. It provides centralized routing between attachments according to TGW route tables, associations, and propagations. In other words, it enables controlled transitive connectivity; it does not automatically connect everything to everything.

| Need | Better fit |
| --- | --- |
| Two or three VPCs, simple direct connectivity | VPC peering |
| Many VPCs, multiple accounts, on-prem integration, segmentation | Transit Gateway |

For hybrid connectivity, Site-to-Site VPN is the fast, encrypted, internet-based option. Direct Connect is a private dedicated connection with more predictable performance, but it is not encrypted by default. If encryption is required, use VPN over Direct Connect or MACsec where supported. BGP is used with both Direct Connect and dynamic VPN designs, even though deep protocol tuning is outside associate-level scope.

A common enterprise pattern is Transit Gateway plus Direct Connect as primary connectivity, with Site-to-Site VPN as backup. In larger environments, Direct Connect Gateway is often used to connect Direct Connect to multiple VPCs or a Transit Gateway design.

Security, segmentation, and inspection

Security groups are stateful and allow-only. They are the main least-privilege control on ENI-backed resources. NACLs are stateless, processed in numbered order, and support both allow and deny rules. Because they are stateless, return traffic must also be allowed explicitly. That makes NACLs useful for broad subnet-level guardrails, but security groups usually do the real work.

A clean three-tier design usually ends up looking something like this:

  • ALB security group: allow inbound 443 from the internet
  • App security group: allow 443 or 80 only from the ALB security group
  • DB security group: allow the database port only from the app security group
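This chaining can be modeled as a toy rule table. The group names and the PostgreSQL port are illustrative, and the point is that only the initiating direction needs a rule, because security groups are stateful:

```python
# A local model of three-tier security group chaining: each rule
# allows a port only from a specific source group, mirroring how
# SG-to-SG references avoid hard-coding CIDRs.
SG_RULES = {
    "sg-alb": [("internet", 443)],
    "sg-app": [("sg-alb", 443)],
    "sg-db":  [("sg-app", 5432)],  # 5432 assumes PostgreSQL
}

def inbound_allowed(src_group: str, dst_group: str, port: int) -> bool:
    """Check whether the initiating connection is permitted.

    Security groups are stateful: if the inbound request matches a
    rule, return traffic is allowed automatically, so no outbound
    rule is modeled here.
    """
    return (src_group, port) in SG_RULES.get(dst_group, [])

assert inbound_allowed("internet", "sg-alb", 443)
assert inbound_allowed("sg-alb", "sg-app", 443)
assert not inbound_allowed("internet", "sg-app", 443)  # can't skip the ALB
assert not inbound_allowed("sg-app", "sg-db", 3306)    # wrong port blocked
```

Contrast this with NACLs, which would need explicit rules in both directions, including the ephemeral return ports.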

For centralized inspection, Gateway Load Balancer can insert firewall appliances transparently, often alongside Transit Gateway in an inspection VPC. That is the scalable answer when a question asks for appliance-based traffic inspection without brittle manual routing everywhere.

Troubleshooting patterns and exam elimination strategy

When a network design looks fine on paper but still doesn’t work, I usually check routing first, then addressing, then security, then DNS, and finally health checks.

  • EC2 in a public subnet has no internet: verify IGW route, public IPv4 or Elastic IP, and outbound security rules.
  • Private EC2 cannot reach S3: check whether a gateway endpoint exists, whether the route table has the S3 prefix-list route, and whether an endpoint or bucket policy blocks access.
  • ALB unhealthy targets: verify target port, health check path, app listener, and security group rules from ALB to targets.
  • VPN is up but traffic fails: check route advertisement or propagation, attachment route tables, and asymmetric routing.

AWS-native tools worth remembering: VPC Flow Logs, Reachability Analyzer, Route 53 Resolver query logs, CloudWatch metrics, and ELB access logs.
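As a quick sketch of working with Flow Logs, default-format records can be split on whitespace into named fields and filtered for REJECT actions, which is a fast way to spot security group or NACL blocks. The sample records below are fabricated for illustration:

```python
# Field names of the default (version 2) VPC Flow Log record format.
FIELDS = (
    "version account_id interface_id srcaddr dstaddr srcport dstport "
    "protocol packets bytes start end action log_status"
).split()

# Fabricated sample records: one accepted app flow, one rejected SSH probe.
SAMPLE_LOG = """\
2 123456789012 eni-0a1b2c3d 10.0.1.5 10.0.2.8 44321 5432 6 10 840 1700000000 1700000060 ACCEPT OK
2 123456789012 eni-0a1b2c3d 203.0.113.7 10.0.1.5 55000 22 6 1 40 1700000000 1700000060 REJECT OK
"""

def rejected_flows(log_text: str) -> list[dict]:
    """Parse default-format records and return only REJECTed flows."""
    records = [dict(zip(FIELDS, line.split())) for line in log_text.splitlines()]
    return [r for r in records if r["action"] == "REJECT"]

for r in rejected_flows(SAMPLE_LOG):
    print(f"blocked: {r['srcaddr']} -> {r['dstaddr']}:{r['dstport']}")
# blocked: 203.0.113.7 -> 10.0.1.5:22
```

One caveat worth remembering for troubleshooting: a REJECT tells you traffic was blocked, but Flow Logs alone don't say whether a security group or a NACL did it; Reachability Analyzer is the tool that pinpoints the blocking component.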

For exam elimination, I’d use this order: identify the protocol, decide whether the traffic has to stay private, figure out the scope (AZ, Region, global, or hybrid), rule out answers with a hidden single point of failure, and then pick the managed option with the least exposure and the least operational overhead.

SAA-C03 rapid review: keyword-to-service map

| Requirement clue | Think first about |
| --- | --- |
| Path-based HTTP routing | ALB |
| TCP/UDP or static regional IPs | NLB |
| Static global anycast IPs | Global Accelerator |
| Private S3 or DynamoDB access | Gateway endpoint |
| Private access to supported AWS APIs | Interface endpoint |
| Outbound internet from private IPv4 subnets | NAT Gateway |
| Outbound-only IPv6 internet | Egress-only Internet Gateway |
| Many VPCs and centralized routing | Transit Gateway |
| Quick encrypted hybrid link | Site-to-Site VPN |
| Predictable private hybrid connectivity | Direct Connect |
| Global web acceleration and caching | CloudFront |
| DNS failover or weighted routing | Route 53 |

Final takeaways

The exam is really testing judgment. Public subnet does not mean public reachability. NAT Gateway is not for private AWS service access and does not handle IPv6. Gateway endpoints are only for S3 and DynamoDB. VPC peering is non-transitive. Direct Connect is private, not automatically encrypted. Route 53 failover isn’t instant, because DNS caching is always part of the story.

If you remember just one framework, make it this: identify the traffic type, identify the protocol, identify the scope, keep private traffic private, and eliminate single points of failure. That mindset turns most SAA-C03 networking questions from confusing to predictable.