When your AWS footprint outgrows a single account, the first thing that breaks is networking. You start with a handful of VPCs peered together, and within a year you're staring at a mesh of 30+ peering connections, overlapping CIDR ranges, and a security group ruleset that nobody dares touch. The fix isn't more peering — it's a deliberate VPC architecture that matches how your organization actually ships software.
This article walks through the three VPC topologies we deploy most often for mid-market and enterprise clients running AWS Organizations, along with the CIDR planning strategy that keeps each one sane over time.
Why Multi-Account in the First Place?
AWS Organizations isn't just a billing convenience. Separate accounts give you hard blast-radius boundaries for IAM, service quotas, and network isolation. A misconfigured security group in your staging account can't accidentally expose production databases if those databases live in a different account with its own VPC.
The most common account structure we see looks something like this: a management account for Organizations and billing, a shared-services account for DNS, logging, and CI/CD tooling, a networking account that owns the Transit Gateway and VPN connections, and then workload accounts per environment or team — dev, staging, prod, data-platform, and so on.
The challenge is connecting these accounts so workloads can reach shared services (like a private DNS resolver or artifact registry) without opening the blast-radius doors you just closed.
Pattern 1: Hub-and-Spoke with Transit Gateway
This is the pattern we recommend for most organizations with 5–50 accounts. A single AWS Transit Gateway (TGW) sits in the networking account. Every workload VPC attaches to the TGW, and routing is controlled centrally through TGW route tables.
The key design decisions are:
- One TGW per region, owned by the networking account. Share it to workload accounts via AWS Resource Access Manager (RAM). Workload teams create their own TGW attachments, but they can't modify the route tables — that stays with the networking team.
- Separate TGW route tables for isolation tiers. A "production" route table that only sees prod VPCs and shared-services. A "non-prod" route table that sees dev and staging but not prod. An "inspection" route table that hairpins traffic through a centralized firewall VPC.
- Blackhole routes for explicit deny. If dev should never talk to prod, add a blackhole route for the prod CIDR in the non-prod route table. Don't rely on the absence of a route — make the denial explicit.
Here's the Terraform skeleton for the TGW and RAM share:
# networking account — transit_gateway.tf
resource "aws_ec2_transit_gateway" "main" {
  description                     = "org-central-tgw"
  default_route_table_association = "disable"
  default_route_table_propagation = "disable"
  auto_accept_shared_attachments  = "enable"

  tags = { Name = "org-tgw-${var.region}" }
}

resource "aws_ram_resource_share" "tgw_share" {
  name                      = "tgw-org-share"
  allow_external_principals = false
}

resource "aws_ram_resource_association" "tgw" {
  resource_arn       = aws_ec2_transit_gateway.main.arn
  resource_share_arn = aws_ram_resource_share.tgw_share.arn
}

resource "aws_ram_principal_association" "org" {
  principal          = var.org_arn # share to entire org
  resource_share_arn = aws_ram_resource_share.tgw_share.arn
}
Key insight: Disable the default TGW route table association and propagation. If you leave them enabled, every new VPC attachment automatically joins a single flat route domain — exactly the blast-radius problem you're trying to avoid. Create explicit route tables per isolation tier instead.
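To make the per-tier layout concrete, here's a sketch of explicit route tables plus a blackhole route. The TGW resource matches the skeleton above; the attachment names and the prod supernet CIDR are illustrative assumptions:

```hcl
# One TGW route table per isolation tier, associated explicitly
resource "aws_ec2_transit_gateway_route_table" "prod" {
  transit_gateway_id = aws_ec2_transit_gateway.main.id
  tags               = { Name = "tgw-rt-prod" }
}

resource "aws_ec2_transit_gateway_route_table" "nonprod" {
  transit_gateway_id = aws_ec2_transit_gateway.main.id
  tags               = { Name = "tgw-rt-nonprod" }
}

# Explicitly deny non-prod → prod traffic with a blackhole route
resource "aws_ec2_transit_gateway_route" "nonprod_to_prod_deny" {
  transit_gateway_route_table_id = aws_ec2_transit_gateway_route_table.nonprod.id
  destination_cidr_block         = "10.0.32.0/19" # prod supernet (illustrative)
  blackhole                      = true
}
```

Note that a blackhole route takes no attachment ID; matching traffic is simply dropped at the TGW.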
Pattern 2: Shared Services VPC with PrivateLink
Not everything needs full network-layer routing. If your shared services are API-driven — an internal artifact registry, a secrets manager proxy, a centralized logging endpoint — you can expose them via VPC Endpoint Services (AWS PrivateLink) instead of routing through the TGW.
The advantage is granularity. Each consumer VPC gets an ENI in its own subnet that resolves to a private IP. Traffic never traverses the TGW, never crosses CIDR boundaries, and the service owner controls which accounts can connect via an allowlist on the endpoint service.
When to reach for PrivateLink over TGW routing:
- The shared service sits behind a Network Load Balancer (NLB) or Gateway Load Balancer.
- You want per-service access control rather than per-network access control.
- Consumer VPCs have overlapping CIDR ranges (PrivateLink doesn't care about CIDRs).
- You're exposing services to accounts outside your Organization (partners, acquisitions).
In practice, most organizations use both: TGW for general east-west routing and PrivateLink for high-value shared services that need tighter access control.
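A minimal PrivateLink pairing might look like the following sketch. The NLB, VPC, and subnet references are stand-ins for your actual resources, and the consumer account ID is illustrative:

```hcl
# Provider side: expose an NLB-fronted service with an account allowlist
resource "aws_vpc_endpoint_service" "artifacts" {
  acceptance_required        = true
  network_load_balancer_arns = [aws_lb.artifacts_nlb.arn]
  allowed_principals         = ["arn:aws:iam::111122223333:root"] # consumer account
}

# Consumer side (different account): an interface endpoint with a private IP.
# The service name is passed as a string, since the service lives elsewhere.
resource "aws_vpc_endpoint" "artifacts" {
  vpc_id            = aws_vpc.workload.id
  service_name      = var.artifacts_service_name
  vpc_endpoint_type = "Interface"
  subnet_ids        = [aws_subnet.private.id]
}
```

Because access control lives on the endpoint service's allowlist, overlapping consumer CIDRs are irrelevant here.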
Pattern 3: Inspection VPC with Centralized Egress
Once you have a TGW, the next question is usually: where does internet-bound traffic exit? The two options are distributed NAT Gateways (one per workload VPC) or a centralized egress VPC that all workload VPCs route through.
Centralized egress is almost always the right call for organizations that need to inspect or log outbound traffic. The pattern looks like this:
- Create a dedicated egress VPC in the networking account with public subnets and NAT Gateways.
- Attach the egress VPC to the TGW, and set the default route (0.0.0.0/0) in each workload-tier TGW route table to point to the egress VPC attachment.
- Optionally insert AWS Network Firewall or a third-party appliance (Palo Alto, Fortinet) in the egress VPC between the TGW attachment subnets and the NAT Gateway subnets.
- Configure the egress VPC route table so return traffic goes back through the TGW to reach the workload VPCs.
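The TGW side of that routing can be sketched as follows, assuming the egress VPC attachment and a per-tier route table exist under these illustrative names:

```hcl
# Send internet-bound traffic from the prod tier to the egress VPC
resource "aws_ec2_transit_gateway_route" "prod_default" {
  transit_gateway_route_table_id = aws_ec2_transit_gateway_route_table.prod.id
  destination_cidr_block         = "0.0.0.0/0"
  transit_gateway_attachment_id  = aws_ec2_transit_gateway_vpc_attachment.egress.id
}
```

Repeat per isolation tier; only tiers whose route table carries this default route can reach the internet at all.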
The cost argument here matters. A NAT Gateway costs ~$0.045/hour plus data processing. With 20 workload VPCs, that's 20 NAT Gateways (about $648/month in hourly charges alone, at 720 hours/month) versus two or three NAT Gateways in a centralized egress VPC (~$97/month for three). The TGW data processing cost ($0.02/GB) partially offsets the savings, but for most traffic profiles centralized egress is 40–60% cheaper.
# Egress VPC route table — return traffic to workloads
resource "aws_route" "tgw_return" {
  route_table_id         = aws_route_table.egress_public.id
  destination_cidr_block = "10.0.0.0/8" # supernet for all workload VPCs
  transit_gateway_id     = data.aws_ec2_transit_gateway.main.id
}

# Workload VPC route table — default route to egress via TGW
resource "aws_route" "default_egress" {
  route_table_id         = aws_route_table.workload_private.id
  destination_cidr_block = "0.0.0.0/0"
  transit_gateway_id     = data.aws_ec2_transit_gateway.main.id
}
CIDR Planning That Doesn't Collapse
Every multi-account VPC architecture eventually lives or dies by its CIDR plan. Here's the strategy we use with every client engagement:
Reserve a /16 supernet per AWS region. For example, 10.0.0.0/16 for us-east-1, 10.1.0.0/16 for us-west-2, 10.2.0.0/16 for eu-west-1. This gives you 65,536 IPs per region and makes supernet routing on the TGW trivial — one route entry per region instead of one per VPC.
Subdivide by account type. Within each /16, allocate /20 blocks to accounts. A /20 gives you 4,096 IPs — enough for most workloads with room to add subnets later. The allocation might look like:
# us-east-1 CIDR allocation (10.0.0.0/16)
# ─────────────────────────────────────────
# 10.0.0.0/20 → networking (TGW, VPN, DNS)
# 10.0.16.0/20 → shared-services (CI/CD, artifacts)
# 10.0.32.0/20 → production-app
# 10.0.48.0/20 → production-data
# 10.0.64.0/20 → staging
# 10.0.80.0/20 → development
# 10.0.96.0/20 → data-platform
# 10.0.112.0/20 → (reserved for growth)
# ...
# 10.0.240.0/20 → (reserved for growth)
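Rather than hand-maintaining this table, the /20s can be derived with Terraform's cidrsubnet function. A sketch, with the account list as an assumption:

```hcl
locals {
  region_supernet = "10.0.0.0/16" # us-east-1

  # Index order is the allocation order — never reorder once deployed
  accounts = [
    "networking", "shared-services", "production-app",
    "production-data", "staging", "development", "data-platform",
  ]

  # /16 + 4 new bits = one /20 per account
  account_cidrs = {
    for i, name in local.accounts : name => cidrsubnet(local.region_supernet, 4, i)
  }
  # account_cidrs["production-app"] yields "10.0.32.0/20"
}
```

Appending to the list allocates the next reserved /20; reordering it would silently renumber existing VPCs, so treat the list as append-only.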
Never reuse CIDRs, even in non-connected VPCs. Today's isolated sandbox is tomorrow's acquisition target that needs TGW connectivity. If its CIDR overlaps with production, you're looking at a painful re-IP project. Treat CIDR space as a finite organizational resource and track allocations in a shared IPAM — AWS VPC IPAM works well for this.
Pro tip: Enable AWS VPC IPAM at the Organization level from day one. It automatically tracks CIDR allocations across accounts and can enforce pool-based allocation policies so workload teams can't accidentally claim a CIDR block that's already in use elsewhere.
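A minimal IPAM setup looks roughly like this sketch; the pool layout mirrors the per-region /16 scheme above, and resource names are illustrative:

```hcl
resource "aws_vpc_ipam" "org" {
  operating_regions {
    region_name = "us-east-1"
  }
}

resource "aws_vpc_ipam_pool" "us_east_1" {
  address_family = "ipv4"
  ipam_scope_id  = aws_vpc_ipam.org.private_default_scope_id
  locale         = "us-east-1"
}

resource "aws_vpc_ipam_pool_cidr" "us_east_1" {
  ipam_pool_id = aws_vpc_ipam_pool.us_east_1.id
  cidr         = "10.0.0.0/16" # the region supernet
}

# Workload VPCs request a CIDR from the pool instead of hardcoding one
resource "aws_vpc" "workload" {
  ipv4_ipam_pool_id   = aws_vpc_ipam_pool.us_east_1.id
  ipv4_netmask_length = 20
}
```

Because the VPC asks IPAM for a netmask rather than declaring a CIDR, overlapping allocations become impossible by construction.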
DNS Resolution Across Accounts
The often-overlooked piece is DNS. When VPCs in different accounts need to resolve each other's private hosted zones, you need Route 53 Resolver endpoints in the shared-services or networking account, plus cross-account hosted zone associations. Note that private hosted zones themselves can't be shared via RAM; only Resolver forwarding rules can, and zones are instead attached to other accounts' VPCs through association authorizations.
The minimal setup is an inbound Route 53 Resolver endpoint in the networking VPC (so on-premises DNS can forward to AWS) and an outbound endpoint with forwarding rules, shared to workload accounts via RAM (so AWS workloads can resolve on-premises domains). Then authorize and associate each workload VPC with the relevant private hosted zones, so every workload VPC can resolve internal.yourcompany.com without running its own DNS infrastructure.
For service discovery, pair this with AWS Cloud Map or simply use predictable naming conventions in your private hosted zones. We typically set up a zone per account — prod.internal.yourcompany.com, staging.internal.yourcompany.com — and share them to the accounts that need visibility.
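Cross-account zone association takes two steps: an authorization created by the zone owner, then the association created by the VPC owner. A sketch with assumed provider aliases and variable names:

```hcl
# In the shared-services account (zone owner): authorize the workload VPC
resource "aws_route53_vpc_association_authorization" "prod" {
  provider = aws.shared_services
  zone_id  = aws_route53_zone.prod_internal.zone_id
  vpc_id   = var.prod_vpc_id
}

# In the workload account: complete the association
resource "aws_route53_zone_association" "prod" {
  provider = aws.prod
  zone_id  = aws_route53_vpc_association_authorization.prod.zone_id
  vpc_id   = var.prod_vpc_id
}
```

Referencing the authorization's zone_id from the association also gives Terraform the ordering dependency it needs between the two accounts.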
Putting It All Together
The architecture that covers 90% of organizations we work with combines all three patterns: a Transit Gateway for general east-west routing with isolation-tier route tables, PrivateLink for high-value shared services, and a centralized egress VPC for inspected outbound traffic. Layer on a disciplined CIDR plan tracked in VPC IPAM and cross-account DNS via Route 53 Resolver, and you have a network foundation that scales from 5 accounts to 50 without rearchitecting.
The mistake we see most often isn't choosing the wrong pattern — it's delaying the decision until the ad-hoc peering mesh is already too painful to untangle. If you're at 3–5 accounts and growing, now is the time to put the TGW and CIDR plan in place. The refactoring cost only goes up from here.