The $16,000 Cost Leak Hiding in Your Availability Zones
AWS charges for data transfer between Availability Zones. Within the same AZ, traffic between your resources over private IP addresses is free. Across AZs, it is billed per gigabyte, in each direction. This isn't a secret - it's right there in the pricing docs. And most teams know about it in theory.
But knowing about cross-AZ pricing and actually checking whether your resources are aligned are two different things.
One of my clients was paying over $20,000 a year in cross-AZ data transfer. A significant portion of that was traffic between EC2 instances and databases in different AZs. The setup was straightforward: each customer had one EC2 instance talking to one Aurora database. Both sides carried consistent tags - something like "Customer=ACME" on the EC2 instance and the same tag on its corresponding database.
The problem: almost half of all their databases were in a different AZ from the EC2 instance with the matching tag.
This is the kind of thing that happens gradually. You launch a database. AWS puts it in an AZ. You launch a compute instance. AWS puts it in an AZ. They might be the same AZ. They might not. Nobody checks. The application works either way. The latency difference is negligible. The cost difference is not.
I built a tool - number 19 in my toolkit - that finds all RDS databases in a different AZ from the corresponding EC2 instance with the same tag. The tool calculates the data transfer cost for each misaligned pair using network bandwidth metrics and outputs a sorted spreadsheet.
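The matching logic can be sketched in a few lines of Python. This is a minimal illustration, not the actual tool: the `Resource` record, the function names, and the `monthly_gb` field are hypothetical, and the rate of $0.01 per GB in each direction is an assumption you should check against current pricing for your region. In practice the AZ and tag data would come from the EC2 and RDS describe calls, and the traffic volume from network metrics.

```python
from dataclasses import dataclass

# Assumed rate in USD per GB, charged in each direction for cross-AZ traffic.
# Verify against the current pricing page for your region.
ASSUMED_RATE_PER_GB_EACH_WAY = 0.01


@dataclass
class Resource:
    """Hypothetical record for an EC2 instance or a database."""
    resource_id: str
    az: str
    tags: dict
    monthly_gb: float = 0.0  # network traffic in + out per month, in GB


def find_misaligned_pairs(instances, databases, tag_key="Customer"):
    """Pair each database with the instance sharing its tag value; keep AZ mismatches."""
    by_tag = {i.tags[tag_key]: i for i in instances if tag_key in i.tags}
    pairs = []
    for db in databases:
        inst = by_tag.get(db.tags.get(tag_key))
        if inst is not None and inst.az != db.az:
            pairs.append((inst, db))
    return pairs


def annual_cost(db, rate=ASSUMED_RATE_PER_GB_EACH_WAY):
    """Each GB is billed on both sides of the AZ boundary, hence the factor of 2."""
    return db.monthly_gb * rate * 2 * 12


def report(instances, databases, tag_key="Customer"):
    """Rows of (tag value, instance AZ, database AZ, annual USD), most expensive first."""
    rows = [
        (db.tags[tag_key], inst.az, db.az, round(annual_cost(db), 2))
        for inst, db in find_misaligned_pairs(instances, databases, tag_key)
    ]
    return sorted(rows, key=lambda r: r[3], reverse=True)
```

Sorting by annual cost puts the pairs worth fixing first, which matters when half the fleet is misaligned and you want to start with the biggest wins.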
The fix is clean. For Aurora, you create a read replica in the same AZ as the EC2 instance, then trigger a failover to it. The writer ends up in the right AZ. As long as the application connects through the cluster endpoint, it doesn't need to change. No code changes, no connection string updates.
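As a rough sketch, the replica-then-failover sequence looks like this with boto3. The function name and its parameters are hypothetical, and the default `engine` value is an assumption; the three calls it makes (`create_db_instance`, the `db_instance_available` waiter, and `failover_db_cluster`) are the standard RDS client operations. The client is passed in so the same function can be dry-run against a stub.

```python
def move_writer_to_az(rds, cluster_id, new_instance_id, target_az,
                      instance_class, engine="aurora-mysql"):
    """Add an Aurora instance in target_az, wait for it, then fail over to it.

    `rds` is expected to behave like boto3.client("rds").
    """
    # 1. The new instance joins the existing cluster as a reader in the desired AZ.
    rds.create_db_instance(
        DBInstanceIdentifier=new_instance_id,
        DBClusterIdentifier=cluster_id,
        DBInstanceClass=instance_class,
        Engine=engine,  # must match the cluster's engine
        AvailabilityZone=target_az,
    )

    # 2. Block until the reader is available, so the failover has a valid target.
    rds.get_waiter("db_instance_available").wait(
        DBInstanceIdentifier=new_instance_id
    )

    # 3. Promote the new reader to writer. The cluster endpoint follows the writer.
    rds.failover_db_cluster(
        DBClusterIdentifier=cluster_id,
        TargetDBInstanceIdentifier=new_instance_id,
    )
    return new_instance_id
```

A real run would pass `boto3.client("rds")` as the first argument. The failover itself causes a brief interruption, so it belongs in a maintenance window, and the old instance in the wrong AZ can be removed or kept as a reader afterwards.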
Projected savings: approximately $16,000 annualized. The customer wouldn't need to "lift a finger."
This cost leak is entirely invisible unless you're specifically looking for it. It doesn't show up as a warning in the console. It doesn't cause performance problems. The application works fine across AZs. The data transfer charges are buried in a line item that most teams never break down by traffic type.
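One way to surface that line item is to group the bill by usage type and look for the regional data transfer entries. The sketch below assumes rows shaped like the `Groups` returned by Cost Explorer's `get_cost_and_usage` when grouped by the `USAGE_TYPE` dimension, and assumes that cross-AZ traffic is reported under usage types ending in `DataTransfer-Regional-Bytes` (usually with a region prefix). The function name is hypothetical, and that usage type can also include other intra-region traffic, so treat the total as an upper bound rather than an exact figure.

```python
def regional_transfer_cost(groups, suffix="DataTransfer-Regional-Bytes"):
    """Sum the cost of all usage types that end with the given suffix.

    `groups` is a list of dicts shaped like Cost Explorer result groups:
    {"Keys": [usage_type], "Metrics": {"UnblendedCost": {"Amount": "12.34"}}}
    """
    total = 0.0
    for group in groups:
        usage_type = group["Keys"][0]
        if usage_type.endswith(suffix):
            # Cost Explorer returns amounts as strings.
            total += float(group["Metrics"]["UnblendedCost"]["Amount"])
    return round(total, 2)
```

If that number is large relative to the rest of the data transfer bill, that is the cue to go looking for misaligned pairs.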
And it's close to a one-time fix. Once the database writer is in the right AZ, it stays there unless a later failover moves it. Barring that, the cost leak is plugged for good.
I noted that the same issue might exist in this client's ElastiCache setup - cache clusters in different AZs from their compute. Left that for another day.
One thing that made this possible: the client had consistent tags linking compute and database resources. Without those tags, you'd need someone to manually map which instances talk to which databases.