
AWS FinOps for Cloud Ops Teams — Week 4
Rightsizing & Waste Reduction

Apr 12, 2026 • 6 min read • FinOps • AWS • Optimization • Series: 4 of 7

Last week was about paying less for compute you actually use. This week is about not running compute you don't use at all. In a multi-account environment, idle and oversized resources are the single biggest source of recurring waste — and the easiest place to find quick wins.

The sprawl problem

In any organization with more than a handful of AWS accounts, resource sprawl is inevitable:

None of these is a disaster. But across multiple accounts they compound. By the time someone notices, the bill is "just what it is".

Where to look first

Almost every multi-account environment has waste hiding under the same six rocks:

| Resource type | Common waste pattern |
| --- | --- |
| EC2 instances | Oversized; consistent <10% CPU and low memory pressure |
| RDS instances | Dev/test databases running 24/7 in non-prod accounts |
| EBS volumes | Unattached volumes left after instance termination |
| Elastic IPs | Allocated but not associated with a running resource |
| NAT Gateways | Over-engineered into every dev/test VPC |
| Load balancers | Provisioned for retired applications |

If you do nothing else, run a single multi-account sweep against this list every month.

The tools that earn their keep

Scheduled shutdowns: the highest-ROI lever in dev/test

Production needs to run 24/7. Dev and test environments almost never do. If your dev EC2/RDS fleet runs only during business hours (say, 7am–7pm Monday–Friday), you've cut its hours from 168/week to 60/week — about a 64% reduction in compute spend on those resources, with no architectural changes.
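The arithmetic behind that figure is easy to check for any other window. A minimal sketch, with function names that are illustrative rather than part of any AWS tooling:

```python
HOURS_PER_WEEK = 24 * 7  # 168


def scheduled_hours_per_week(start_hour: int, stop_hour: int, days_per_week: int) -> int:
    """Hours a resource runs each week on a fixed daily window."""
    if not 0 <= start_hour < stop_hour <= 24:
        raise ValueError("expected 0 <= start_hour < stop_hour <= 24")
    if not 1 <= days_per_week <= 7:
        raise ValueError("expected 1 <= days_per_week <= 7")
    return (stop_hour - start_hour) * days_per_week


def reduction_fraction(scheduled_hours: int) -> float:
    """Fraction of always-on hours that the schedule removes."""
    return 1 - scheduled_hours / HOURS_PER_WEEK


# 7am-7pm, Monday-Friday
hours = scheduled_hours_per_week(start_hour=7, stop_hour=19, days_per_week=5)
print(hours, round(reduction_fraction(hours) * 100, 1))  # 60 64.3
```

This only covers charges billed per running hour. Storage attached to a stopped instance or database keeps billing, so the saving on the total bill for those resources will be somewhat lower.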

Two ways to implement it:

Either way, drive it from a tag like Schedule=office-hours rather than hardcoded resource lists. New resources get scheduled automatically; opt-out is explicit.
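The tag-driven decision can be sketched as a small pure function that a scheduler would call for each resource. Everything beyond the Schedule=office-hours tag is an assumption made for illustration: the always-on opt-out value, the rule that untagged dev/test resources fall back to office hours, and the function name.

```python
from datetime import datetime
from typing import Dict, Optional

# Hypothetical schedule table; only "office-hours" comes from the article.
SCHEDULES = {
    "office-hours": {"days": {0, 1, 2, 3, 4}, "start_hour": 7, "stop_hour": 19},
}
DEFAULT_SCHEDULE = "office-hours"  # assumption: untagged dev/test resources are scheduled
OPT_OUT_VALUE = "always-on"        # assumption: the explicit opt-out tag value


def desired_state(tags: Dict[str, str], now: datetime) -> Optional[str]:
    """Return "running" or "stopped", or None for an unrecognized schedule name.

    `now` should already be in the team's local time zone.
    """
    name = tags.get("Schedule", DEFAULT_SCHEDULE)
    if name == OPT_OUT_VALUE:
        return "running"
    schedule = SCHEDULES.get(name)
    if schedule is None:
        return None  # unknown value: leave the resource alone rather than guess
    in_window = (
        now.weekday() in schedule["days"]  # Monday is 0
        and schedule["start_hour"] <= now.hour < schedule["stop_hour"]
    )
    return "running" if in_window else "stopped"
```

Keeping the decision separate from the start and stop API calls makes the policy easy to unit test before it touches a real account.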

EBS cleanup automation

Unattached EBS volumes are quiet money. They're easy to find (aws ec2 describe-volumes filtered on status=available, which is the state the API reports for a volume with no attachment) and easy to delete, but the right answer is graceful cleanup, not terraform destroy:

This is exactly the kind of policy Cloud Custodian was built for; we'll wire it up next week.
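Until then, the discovery step alone can be sketched with boto3. This assumes boto3 is installed and credentials are configured for a single account and region; a multi-account sweep would also need role assumption, which is not shown. The function names are illustrative.

```python
from typing import Dict, List


def list_unattached_volumes(region_name: str) -> List[dict]:
    """Fetch volumes in the "available" state (not attached to any instance)."""
    import boto3  # imported lazily so summarize_unattached() works without it

    ec2 = boto3.client("ec2", region_name=region_name)
    paginator = ec2.get_paginator("describe_volumes")
    pages = paginator.paginate(
        Filters=[{"Name": "status", "Values": ["available"]}]
    )
    return [volume for page in pages for volume in page["Volumes"]]


def summarize_unattached(volumes: List[dict]) -> Dict[str, object]:
    """Summarize describe_volumes entries; read-only, nothing is deleted."""
    unattached = [
        v for v in volumes
        if v["State"] == "available" and not v.get("Attachments")
    ]
    return {
        "count": len(unattached),
        "total_gib": sum(v["Size"] for v in unattached),  # Size is in GiB
        "volume_ids": sorted(v["VolumeId"] for v in unattached),
    }
```

A typical call would be summarize_unattached(list_unattached_volumes("us-east-1")). The summary is meant for a report to the account owner, not as input to an immediate delete.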

Rightsizing without breaking trust

Rightsizing is where FinOps programs lose their reputation if they're not careful. The failure mode is: cloud team unilaterally resizes an EC2 instance, the workload's p99 latency tanks, the owning team finds out from a customer ticket. Now nobody on that team trusts you with their infrastructure again.

Avoid that with a simple workflow:

The change rate is lower this way — but the changes that do happen don't blow up.
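One way to keep rightsizing advisory rather than unilateral is to generate recommendations addressed to the owning team instead of resize calls. A minimal sketch using the <10% CPU pattern from the table above; the 14-sample minimum, the field names, and the function name are assumptions. Memory pressure is not part of the default EC2 metrics in CloudWatch, so it is left as a manual check for the owner.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

CPU_THRESHOLD_PERCENT = 10.0  # the "<10% CPU" pattern from the waste table
MIN_SAMPLES = 14              # hypothetical: require two weeks of daily peaks


@dataclass
class Recommendation:
    instance_id: str
    owner: str
    peak_cpu_percent: float
    note: str


def recommend_downsize(
    instance_id: str, owner: str, daily_peak_cpu: Sequence[float]
) -> Optional[Recommendation]:
    """Return a recommendation for the owner, or None if there is no case."""
    if len(daily_peak_cpu) < MIN_SAMPLES:
        return None  # not enough history to call the usage "consistent"
    peak = max(daily_peak_cpu)
    if peak >= CPU_THRESHOLD_PERCENT:
        return None  # at least one busy day: leave it alone
    return Recommendation(
        instance_id=instance_id,
        owner=owner,
        peak_cpu_percent=peak,
        note="CPU stayed under threshold; owner to confirm memory headroom",
    )
```

Using the maximum of the daily peaks is deliberately conservative: a single busy day disqualifies the instance, which lowers the change rate but avoids the surprise described above.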

Cleaning up the truly dead

The category that should be cleanable unilaterally:

Wire these into automated cleanup with a 7-day mark-and-sweep cycle. Notify the account owner when something is queued for deletion. Most of the time, no one will object — and you'll reclaim a surprising chunk of monthly spend.
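The 7-day mark-and-sweep cycle reduces to a small state machine per resource. A sketch of the decision logic only, assuming the mark is stored as a tag holding an ISO 8601 timestamp; the tag key and the action names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict

GRACE_PERIOD = timedelta(days=7)
MARK_TAG = "cleanup:marked-at"  # hypothetical tag key holding an ISO 8601 timestamp


def next_action(tags: Dict[str, str], still_unused: bool, now: datetime) -> str:
    """Decide what the sweep should do with one resource.

    `now` must be timezone-aware, matching the timestamp stored in the tag.
    """
    marked_at = tags.get(MARK_TAG)
    if not still_unused:
        # The resource came back into use: clear any earlier mark.
        return "unmark" if marked_at else "keep"
    if marked_at is None:
        return "mark-and-notify"  # first sighting: tag it and tell the owner
    if now - datetime.fromisoformat(marked_at) >= GRACE_PERIOD:
        return "delete"
    return "wait"


now = datetime(2026, 4, 12, tzinfo=timezone.utc)
print(next_action({}, True, now))                                       # mark-and-notify
print(next_action({MARK_TAG: "2026-04-05T00:00:00+00:00"}, True, now))  # delete
```

Re-checking that the resource is still unused on every run means an owner can cancel a pending deletion simply by putting the resource back to work.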

Rule of thumb

Run a monthly rightsizing review. Even catching one oversized instance per account per month adds up to significant savings at multi-account scale. The win isn't any single resource; it's that nothing stays "free to forget" for long.

Next week we move from one-off cleanups to policy: keeping all of this enforced as code with Cloud Custodian.
