AWS SAA-C03: How to Design Cost-Optimized Network Architectures

One of the fastest ways to overspend in AWS is to ignore packet paths. Networking charges rarely look dramatic on a whiteboard, but they pile up through NAT Gateway data processing, cross-AZ traffic, unnecessary interface endpoints, centralized inspection hairpinning, and public IPv4 usage that nobody meant to keep. On the SAA-C03 exam, AWS is not asking for the absolute cheapest design. It is asking for the lowest-cost design that still satisfies security, availability, and performance requirements.

1. Core Principle: Follow the Packet and the Billing Model

Cost-optimized network design in AWS means choosing the simplest compliant traffic path. For exam purposes, think in four billing buckets: hourly resource charges, per-GB processing charges, data transfer charges, and indirect operational cost.

Service: NAT Gateway
Main Cost Shape: Hourly plus per-GB processed
What to Remember: Convenient and scalable, but expensive when too much traffic goes through it

Service: Gateway Endpoint
Main Cost Shape: No hourly or per-GB endpoint charge
What to Remember: Only for S3 and DynamoDB; often a major cost win

Service: Interface Endpoint
Main Cost Shape: Hourly per AZ plus per-GB processed
What to Remember: PrivateLink is useful, but many endpoints across many AZs add up

Service: Transit Gateway
Main Cost Shape: Attachment hourly plus per-GB processed
What to Remember: Great for scale, not for tiny environments

Service: ALB / NLB
Main Cost Shape: Hourly plus LCU or NLCU usage
What to Remember: Load balancers are not free; pick based on protocol and features

Service: Site-to-Site VPN
Main Cost Shape: Hourly per connection plus transfer
What to Remember: Lower entry cost for hybrid, uses two tunnels by default

Service: Direct Connect
Main Cost Shape: Port hours plus transfer and possible provider fees
What to Remember: Justified by steady enterprise traffic, not by habit
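
Most of the cost shapes above reduce to one formula: a fixed hourly part plus a variable per-GB part. The sketch below is illustrative only. `monthly_cost` is a hypothetical helper, and the rates are placeholder assumptions, not quoted AWS prices.

```python
HOURS_PER_MONTH = 730  # common approximation for a monthly bill


def monthly_cost(hourly_rate: float, per_gb_rate: float, gb_processed: float,
                 units: int = 1, hours: int = HOURS_PER_MONTH) -> float:
    """Fixed hourly charge for each unit plus a per-GB processing charge."""
    return units * hourly_rate * hours + per_gb_rate * gb_processed


# Placeholder rates (assumptions for illustration only; check current pricing).
ASSUMED_NAT_HOURLY = 0.045
ASSUMED_NAT_PER_GB = 0.045

idle = monthly_cost(ASSUMED_NAT_HOURLY, ASSUMED_NAT_PER_GB, gb_processed=0)
busy = monthly_cost(ASSUMED_NAT_HOURLY, ASSUMED_NAT_PER_GB, gb_processed=10_000)
print(f"idle NAT Gateway:   ${idle:.2f}/month")
print(f"10 TB through NAT:  ${busy:.2f}/month")
```

Under these assumed rates the per-GB part quickly outweighs the hourly part, which is why the traffic path matters more than the resource count.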

High-yield cost traps for the exam: NAT for S3 or DynamoDB, cross-AZ NAT usage, chatty Multi-AZ designs, Transit Gateway for only a couple of VPCs, centralized egress without a governance requirement, and overusing interface endpoints for everything. Also remember that AWS now bills public IPv4 addresses by the hour, whether they are in use or idle, so minimizing public IPv4 usage matters.

2. NAT Gateway, NAT Instance, IPv6, and AZ-Local Egress

Private subnets do not always need internet access. Some workloads are fully private and use only VPC endpoints, private package mirrors, or hybrid private connectivity. But when private resources do need outbound internet access, the main options are NAT Gateway, a custom NAT instance, or IPv6 outbound access with an egress-only internet gateway.

A NAT Gateway is a managed, zonal service. It is resilient within one Availability Zone, but it is not a Multi-AZ resource. For production-grade Multi-AZ egress, deploy one NAT Gateway per AZ and route each private subnet to the NAT Gateway in the same AZ. Routing an instance in AZ-A to a NAT Gateway in AZ-B adds cross-AZ transfer cost and creates an AZ dependency.

Conceptual private subnet route table for IPv4 internet egress:

10.0.0.0/16 - local
0.0.0.0/0 - nat-gw-in-same-az

Dual-stack example:

10.0.0.0/16 - local
0.0.0.0/0 - nat-gw-in-same-az
::/0 - egress-only-igw
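
Route tables like the two above can be checked for the AZ-local rule with a few lines of code. This is a minimal sketch over a hand-written inventory. The `PrivateSubnet` type, the IDs, and the AZ names are all hypothetical, and a real check would read the same data from your infrastructure tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrivateSubnet:
    subnet_id: str
    az: str
    default_route_nat: str  # NAT Gateway targeted by the 0.0.0.0/0 route


def cross_az_nat_routes(subnets, nat_az):
    """Return subnets whose default route points at a NAT Gateway in another AZ."""
    return [s for s in subnets if nat_az[s.default_route_nat] != s.az]


# Hypothetical inventory; IDs and AZ names are made up for illustration.
nat_az = {"nat-a": "us-east-1a", "nat-b": "us-east-1b"}
subnets = [
    PrivateSubnet("subnet-1", "us-east-1a", "nat-a"),
    PrivateSubnet("subnet-2", "us-east-1b", "nat-a"),  # crosses AZs
]

for s in cross_az_nat_routes(subnets, nat_az):
    print(f"{s.subnet_id} in {s.az} egresses via {s.default_route_nat}")
```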

IPv6 matters because outbound IPv6 traffic can use an egress-only internet gateway, which allows outbound-only internet access for IPv6 without NAT. But it does not translate IPv4, and it helps only when destinations support IPv6. In dual-stack environments, you may still need NAT Gateway for IPv4-only dependencies.

NAT instances are still technically possible, but they are a custom EC2 pattern, not the modern AWS default. AWS strongly recommends NAT Gateway in most cases, and on SAA-C03 a NAT instance is rarely the best answer unless the question pushes hard on very low cost, low throughput, and acceptance of self-management. If you do use one, you must disable the source/destination check on the instance, enable IP forwarding, configure packet forwarding rules, attach an Elastic IP, and build your own monitoring, scaling, patching, and failover.

That operational burden is the real lesson: a NAT instance may lower direct AWS spend in a small lab, but it raises management cost and risk.

3. Gateway Endpoints vs Interface Endpoints vs NAT

This is one of the most testable areas on SAA-C03.

Gateway endpoints support only Amazon S3 and DynamoDB. They are associated with route tables, not placed as elastic network interfaces in your subnets. During creation, you select the route tables, and AWS adds endpoint routes for the AWS-managed prefix lists for S3 or DynamoDB. They are extremely cost-effective because there is generally no hourly or per-GB endpoint charge.

Interface endpoints are PrivateLink-powered elastic network interfaces with private IP addresses in selected subnets. They are protected by security groups, billed hourly per AZ plus per GB, and are zonal resources. If you want high availability and AZ-local access, you usually deploy them in each AZ where clients run. That improves resilience, but it can multiply hourly cost.

NAT is for general internet-bound traffic. It is not the best default for S3 or DynamoDB access from private subnets.

Option: Gateway Endpoint
Best Use: Private access to S3 or DynamoDB
Cost Model: Typically no hourly or per-GB endpoint charge
Key Exam Fact: First choice for S3 or DynamoDB from private subnets

Option: Interface Endpoint
Best Use: Private access to supported AWS or partner services
Cost Model: Hourly per AZ plus per-GB
Key Exam Fact: Useful for private access, not automatically cheaper than NAT

Option: NAT Gateway
Best Use: General outbound internet access
Cost Model: Hourly plus per-GB
Key Exam Fact: Use deliberately, not as a catch-all path
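
To see why an interface endpoint is "not automatically cheaper than NAT", it helps to compute a break-even volume. The sketch below assumes the NAT Gateway already exists, so only its per-GB charge is avoidable. The function name and all rates are illustrative assumptions, not quoted AWS prices.

```python
def interface_endpoint_break_even_gb(endpoint_hourly, az_count, endpoint_per_gb,
                                     nat_per_gb, hours=730):
    """GB/month above which an interface endpoint beats sending the same
    traffic through an already-deployed NAT Gateway.

    Returns None when the endpoint's per-GB rate is not lower than NAT's,
    in which case the endpoint never pays for itself on cost alone.
    """
    saving_per_gb = nat_per_gb - endpoint_per_gb
    if saving_per_gb <= 0:
        return None
    fixed = endpoint_hourly * az_count * hours  # hourly charge in every AZ
    return fixed / saving_per_gb


# Placeholder rates (assumptions, not quoted AWS prices).
gb = interface_endpoint_break_even_gb(
    endpoint_hourly=0.01, az_count=3, endpoint_per_gb=0.01, nat_per_gb=0.045)
print(f"break-even: about {gb:.0f} GB/month")
```

Below the break-even volume the endpoint is a security or privacy choice rather than a cost saving. A gateway endpoint has no such threshold because it has no endpoint charge to recover.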

Private DNS is a common exam clue. With private DNS enabled on an interface endpoint, the standard regional service hostname resolves to the endpoint’s private IPs from within the VPC. Without private DNS, clients must use the endpoint-specific DNS names or custom DNS. Otherwise they may resolve the public service endpoint and require internet or NAT reachability.
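
One rough way to confirm the private DNS behavior described above is to resolve the service hostname from inside the VPC and check whether every answer is a private address. The sketch below is an assumption-laden illustration: the helper names are hypothetical and the sample addresses are made up.

```python
import ipaddress
import socket


def all_private(addresses):
    """True when every address is in a private (non-global) range."""
    return all(ipaddress.ip_address(a).is_private for a in addresses)


def resolve(hostname, port=443):
    """Resolve a hostname to its unique IP addresses using the system resolver."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


# Offline illustration with made-up answers:
print(all_private(["10.0.1.25", "10.0.2.25"]))  # endpoint network interface addresses
print(all_private(["52.94.0.10"]))              # a public address
```

Run from an instance in the VPC, `all_private(resolve("sqs.us-east-1.amazonaws.com"))` returning True would suggest the regional hostname is resolving to the interface endpoint rather than the public service endpoint.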

Security matters here too. Security groups are stateful; network ACLs are stateless. For interface endpoints, allow inbound HTTPS from the application subnets or source security groups. For gateway endpoints, use endpoint policies and, for S3, bucket policies that restrict access to a specific VPC endpoint or VPC.
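
A bucket policy that restricts access to one VPC endpoint can be sketched as follows. It uses the `aws:SourceVpce` condition key; the bucket name and endpoint ID are hypothetical placeholders, and the helper function is only a convenience for building the JSON document.

```python
import json


def vpce_only_bucket_policy(bucket: str, vpce_id: str) -> dict:
    """Deny every S3 action on the bucket unless the request arrives
    through the named VPC endpoint (aws:SourceVpce condition key)."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "DenyUnlessFromVpce",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                f"arn:aws:s3:::{bucket}",
                f"arn:aws:s3:::{bucket}/*",
            ],
            "Condition": {"StringNotEquals": {"aws:SourceVpce": vpce_id}},
        }],
    }


# Hypothetical bucket name and endpoint ID.
policy = vpce_only_bucket_policy("example-app-data", "vpce-0123456789abcdef0")
print(json.dumps(policy, indent=2))
```

A blanket deny like this also blocks requests that do not come through the endpoint, including console access, so test it on a non-critical bucket first.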

Important hybrid nuance: gateway endpoints serve only traffic that originates inside the VPC, in the same Region. They cannot be used from on-premises networks over VPN or Direct Connect, from peered VPCs, or through a Transit Gateway. If the scenario is hybrid private access to S3 or another supported service, think interface endpoints or other private connectivity patterns, not gateway endpoints.

4. Hidden Transfer Costs: Cross-AZ, Cross-Region, and Hairpinning

Cross-AZ data transfer is commonly billable and should always be treated as a cost factor. The exact pricing varies by service and direction, but the architectural rule is simple: if traffic keeps crossing AZ boundaries, investigate it.

Common hidden cost paths:

  • EC2 in AZ-A using a NAT Gateway in AZ-B
  • Clients in one AZ hitting an interface endpoint only deployed in another AZ
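  • NLB with cross-zone load balancing enabled forwarding heavily to targets in another AZ (ALB cross-zone traffic does not incur inter-AZ transfer charges)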
  • Chatty app, cache, and database tiers spread across AZs without locality awareness
  • Centralized inspection VPC forcing spoke traffic through multiple extra hops

Multi-AZ is often required and correct. The exam usually wants you to preserve availability, then optimize inside that constraint: one NAT Gateway per AZ, local endpoints where needed, fewer synchronous east-west calls, and careful placement of tightly coupled services.
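
Finding the chatty cross-AZ paths usually starts with a traffic summary grouped by AZ pair. The sketch below assumes you already have per-flow totals tagged with source and destination AZ (for example, derived from flow logs). The data and AZ labels are made up for illustration.

```python
from collections import defaultdict


def cross_az_gb(flows):
    """Sum GB per (source AZ, destination AZ) pair, keeping only pairs that cross AZs."""
    totals = defaultdict(float)
    for src_az, dst_az, gb in flows:
        if src_az != dst_az:
            totals[(src_az, dst_az)] += gb
    return dict(totals)


# Made-up traffic summary: (source AZ, destination AZ, GB per month).
flows = [
    ("az-a", "az-a", 900.0),   # app to same-AZ cache: stays local
    ("az-a", "az-b", 400.0),   # app to cache in another AZ
    ("az-b", "az-a", 250.0),
    ("az-a", "az-b", 100.0),
]

hot = cross_az_gb(flows)
for pair, gb in sorted(hot.items(), key=lambda kv: kv[1], reverse=True):
    print(pair, gb)
```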

Cross-Region transfer is usually tied to replication, disaster recovery, backup copies, or global application communication. Do it only when the business requirement justifies it.

5. VPC Peering vs Transit Gateway

VPC peering is direct, non-transitive, and usually best for a small number of VPCs. It also has limitations that are classic exam eliminators: overlapping CIDR ranges prevent peering, and you cannot use a peer VPC’s internet gateway, NAT Gateway, or VPN as transit. Peering can still incur transfer charges depending on traffic pattern, AZ placement, and Region scope, so it is not universally cheaper from a traffic perspective.

Transit Gateway provides transitive routing and becomes attractive when you have many VPCs, shared services, centralized governance, or segmentation needs. It uses attachments and route tables, and you can control propagation and isolation with Transit Gateway route tables. At scale, it often lowers overall operational complexity compared with a mesh of peering connections, even though its direct charges are higher.

For advanced architectures, appliance mode and route segmentation matter when traffic must pass through centralized inspection. That is useful in real environments, but for the exam the main rule is easier: a few VPCs means peering is often enough; many VPCs or shared services usually point to Transit Gateway.
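
The "few versus many" rule follows from simple counting: a full mesh of peering connections grows quadratically, while a hub needs one attachment per VPC. A minimal sketch, with hypothetical function names:

```python
def full_mesh_peerings(vpcs: int) -> int:
    """Peering connections needed so every VPC reaches every other VPC directly."""
    return vpcs * (vpcs - 1) // 2


def transit_gateway_attachments(vpcs: int) -> int:
    """One attachment per VPC to a single Transit Gateway hub."""
    return vpcs


for n in (3, 10, 50):
    print(n, full_mesh_peerings(n), transit_gateway_attachments(n))
```

Three VPCs need three peering connections, which is easy to manage. Fifty VPCs would need 1,225, each with its own route table entries.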

6. Centralized vs Distributed Egress

Distributed egress means each VPC handles its own outbound internet path. Centralized egress means spoke VPCs route outbound traffic through a shared egress or inspection VPC, often using Transit Gateway, NAT Gateway, AWS Network Firewall, third-party firewalls, or Gateway Load Balancer insertion patterns.

Centralized egress improves governance, inspection consistency, and outbound policy control. It also increases cost through extra hops, Transit Gateway data processing, possible cross-AZ transfer, appliance charges, and hairpinning. It requires careful route design and return-path symmetry so traffic leaves and returns through the expected inspection path.

Distributed egress is often cheaper and simpler for small environments, but it spreads policy management across more places.

Exam logic: choose centralized egress when compliance, uniform inspection, or shared policy is explicit. Choose distributed egress when the environment is smaller and no strong central governance requirement exists.
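
The trade-off can be roughed out numerically. The sketch below is a simplified model under stated assumptions: every rate is a placeholder rather than a quoted AWS price, cross-AZ transfer and inspection appliances are ignored, and the function names are hypothetical.

```python
HOURS = 730  # approximate hours in a month


def distributed_egress(vpcs, azs, gb, nat_hourly, nat_per_gb):
    """Each VPC runs its own NAT Gateway in every AZ."""
    return vpcs * azs * nat_hourly * HOURS + gb * nat_per_gb


def centralized_egress(vpcs, azs, gb, nat_hourly, nat_per_gb,
                       tgw_attach_hourly, tgw_per_gb):
    """One egress VPC holds the NAT Gateways; spokes reach it via Transit Gateway.

    Attachments: one per spoke plus one for the egress VPC. Traffic is
    assumed to be processed once by the Transit Gateway and once by NAT.
    """
    nat_fixed = azs * nat_hourly * HOURS
    tgw_fixed = (vpcs + 1) * tgw_attach_hourly * HOURS
    return nat_fixed + tgw_fixed + gb * (nat_per_gb + tgw_per_gb)


# Placeholder rates (assumptions for illustration, not quoted AWS prices).
rates = dict(nat_hourly=0.045, nat_per_gb=0.045)
tgw = dict(tgw_attach_hourly=0.05, tgw_per_gb=0.02)

small_dist = distributed_egress(vpcs=2, azs=2, gb=1000, **rates)
small_cent = centralized_egress(vpcs=2, azs=2, gb=1000, **rates, **tgw)
print(f"distributed: ${small_dist:.2f}  centralized: ${small_cent:.2f}")
```

With these assumed numbers the small two-VPC case favors distributed egress. The balance depends on VPC count and traffic volume, so rerun it with figures from the current price list before drawing conclusions.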

7. Hybrid Connectivity: VPN vs Direct Connect

Site-to-Site VPN is usually the lower entry-cost hybrid answer. It uses the internet, has hourly connection charges, and creates two tunnels by default for redundancy. It can use static routing or BGP. Performance is variable because the public internet path is variable. For many branch, backup, or early hybrid scenarios, that is acceptable.

Direct Connect is dedicated connectivity with port-hour charges, data transfer considerations, and often provider or cross-connect fees. It comes in different port speeds and can use private, public, or transit virtual interfaces depending on the design. It is not encrypted by default, so if encryption is required you may pair it with VPN or use MACsec where supported.

For the exam, the decision tree is simple: VPN for faster deployment, lower entry cost, and modest traffic; Direct Connect for sustained, predictable enterprise traffic and more consistent performance. A common production pattern is Direct Connect primary with VPN backup.

8. Edge and Delivery: CloudFront, Route 53, ALB, NLB

CloudFront can reduce total cost when cache hit ratio is good, origin egress is expensive, or traffic is global. It is not automatically cheaper for every workload. Low-cache-hit traffic or small regional workloads may not save money. The exam point is that CloudFront is both a performance and cost tool when caching actually works.
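
How much caching helps depends directly on the hit ratio. The sketch below estimates the traffic the origin still has to serve, treating the hit ratio as a byte-level ratio. That is a simplifying assumption, and `origin_gb` is a hypothetical helper.

```python
def origin_gb(total_gb: float, cache_hit_ratio: float) -> float:
    """GB the origin still serves, treating hit ratio as a byte-level ratio."""
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    return total_gb * (1.0 - cache_hit_ratio)


for ratio in (0.95, 0.50, 0.05):
    print(f"hit ratio {ratio:.0%}: origin serves {origin_gb(10_000, ratio):,.0f} GB")
```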

For secure origin design, prefer CloudFront with a private S3 origin using Origin Access Control rather than exposing an S3 static website endpoint. S3 website endpoints serve HTTP only, so CloudFront cannot connect to them over HTTPS, and Origin Access Control does not apply to them.

Route 53 is a DNS service, not a packet-forwarding service. Its routing policies influence which endpoint clients resolve, not how packets are forwarded once a connection starts. Route 53 supports cost-aware architecture indirectly by steering users to the right endpoint or failover target, but it is usually not the primary network cost lever.

ALB is the default fit for HTTP and HTTPS applications needing Layer 7 features such as host-based or path-based routing and AWS WAF integration. NLB fits Layer 4 use cases, static IP requirements, very high throughput, or protocols that do not need Layer 7 logic. Choose based on protocol and feature fit, not habit.

9. Security Controls That Change Network Cost

Private connectivity often improves both security and cost, but only when you choose the right mechanism. Gateway endpoints can reduce NAT data processing for S3 and DynamoDB. Interface endpoints can reduce public exposure for supported services, but each endpoint in each AZ adds recurring cost. CloudFront plus AWS WAF plus a private origin can reduce attack surface and sometimes origin load. Centralized inspection with AWS Network Firewall, Gateway Load Balancer, or third-party appliances can be necessary in regulated environments, but it raises both direct spend and traffic-path complexity.

Logging also matters. VPC Flow Logs, Route 53 query logs, firewall logs, and packet mirroring improve visibility, but they create ingestion, storage, and analysis cost. Security is never free, but the exam expects you to avoid unnecessary public paths and use least-privilege controls such as endpoint policies, security groups, bucket policies, and, where relevant, an account design aligned with service control policies (SCPs).

10. Troubleshooting Unexpected AWS Network Spend

Use one compact workflow:

  • Check AWS Cost Explorer grouped by usage type: NAT Gateway, data transfer, Transit Gateway, public IPv4, load balancers, and CloudFront usage.
  • Review VPC Flow Logs to identify top destinations and confirm whether S3 or DynamoDB traffic is still going through NAT.
  • Inspect route tables: look for cross-AZ NAT, missing gateway endpoints, or centralized egress detours.
  • Check CloudWatch metrics for NAT Gateway bytes processed and connection counts.
  • Validate interface endpoint DNS behavior and security group rules.
  • Use VPC Reachability Analyzer to verify the intended path.
  • For multi-VPC environments, review Transit Gateway flow logs and route tables for unnecessary data processing hotspots.

If a NAT bill spikes, the first suspects are usually S3 traffic still using NAT, a new package repository pattern, or a route change sending traffic to a NAT in another AZ.
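
The flow-log step in the workflow above can be sketched as a small aggregation. This example assumes records in the default VPC Flow Logs format (14 space-separated fields, with destination address and bytes as the fifth and tenth fields). The sample records, account ID, and addresses are made up.

```python
from collections import Counter


def top_destinations(lines, n=3):
    """Sum bytes per destination address from default-format flow log records."""
    totals = Counter()
    for line in lines:
        fields = line.split()
        # Default format has 14 fields; skip headers, NODATA and SKIPDATA rows.
        if len(fields) != 14 or not fields[9].isdigit():
            continue
        dstaddr, byte_count = fields[4], int(fields[9])
        totals[dstaddr] += byte_count
    return totals.most_common(n)


# Made-up records for illustration.
sample = [
    "2 123456789012 eni-0a1b2c3d 10.0.1.10 52.216.8.5 44321 443 6 120 900000 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0a1b2c3d 10.0.1.11 52.216.8.5 44322 443 6 80 600000 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0a1b2c3d 10.0.1.10 151.101.1.1 44400 443 6 10 20000 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0a1b2c3d - - - - - - - 1700000000 1700000060 - NODATA",
]

for addr, total in top_destinations(sample):
    print(addr, total)
```

Deciding whether a top destination is S3 or DynamoDB would require comparing it against the published AWS IP ranges for those services, which this sketch does not attempt.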

11. Exam Decision Guide and Traps

If you see X, think Y:

  • Private S3 or DynamoDB access → Gateway Endpoint
  • Private access to a supported AWS service → Interface Endpoint
  • Few VPCs → VPC Peering
  • Many VPCs or shared services → Transit Gateway
  • Low-cost hybrid → Site-to-Site VPN
  • Predictable enterprise hybrid traffic → Direct Connect
  • Cacheable global content → CloudFront
  • Outbound-only IPv6 internet access → Egress-only internet gateway

Wrong-answer elimination:

  • Eliminate NAT-only answers when S3 or DynamoDB endpoint support is available.
  • Eliminate Transit Gateway for only two VPCs unless a transit or governance requirement exists.
  • Eliminate Direct Connect when the scenario is small, temporary, or budget-sensitive.
  • Eliminate single-AZ shortcuts when production high availability is required.
  • Eliminate public internet paths when the prompt explicitly demands private access.

Mini scenarios:

Private EC2 instances need S3 access at lowest cost and must stay off the internet. Answer: S3 Gateway Endpoint, not NAT-only.

Three VPCs need simple connectivity without transitive routing. Answer: VPC peering is usually enough.

A regulated company needs shared inspection and centralized outbound policy across many VPCs. Answer: Transit Gateway plus centralized inspection may be the right answer even if direct cost is higher.

12. Final Takeaways

For SAA-C03, the winning habit is simple: trace the packet path, then trace the billing path. Ask which hops are necessary, which are managed conveniences, which are crossing AZs, and which can be replaced by endpoints, locality, caching, or simpler connectivity.

Remember the most testable facts: gateway endpoints support only S3 and DynamoDB; interface endpoints are zonal and billed per AZ; VPC peering is non-transitive; Transit Gateway provides transitive routing; Site-to-Site VPN uses internet-based tunnels; Direct Connect is dedicated but not encrypted by default; egress-only internet gateways are for outbound IPv6 only; and public IPv4 usage has direct cost impact.

Cheapest is not the goal. Lowest-cost compliant architecture is. That distinction is what AWS tests, and it is what good architects practice in the real world.