AWS FinOps for Cloud Ops Teams — Week 2: Cost Visibility & Tagging at Scale

Last week I argued that FinOps is fundamentally about making cloud cost legible at scale. This week we get to the foundation: cost visibility and tagging. Without this layer, every other FinOps activity — commitments, rightsizing, governance, dashboards — runs on guesswork.

If you remember one line from this post: you cannot manage what you cannot attribute.

The problem with many accounts

AWS Organizations and consolidated billing give you a single number per account, per service, per month. That's enough to know your bill went up. It is nowhere near enough to know why.

In the multi-account environments I've worked in, the same five questions come up every month:

Which team is responsible for the $15,000 NAT Gateway charge in shared-network?
What's the production-vs-non-prod split for our compute spend?
Which projects are growing fastest, and should we be worried?
How much are we spending on a specific business product across all its accounts?
Where is the long tail of small-but-untagged spend hiding?

Each of those is a tagging question.

A tagging strategy that survives contact with reality

The mistake I see most often is teams designing a 20-tag schema, then enforcing none of it. You end up with technically rich but practically useless metadata — half the resources have owner, a quarter have Owner, and the rest are blank.

Pick a small, mandatory baseline. Make every other tag optional. Here's the schema I'd start with:

Tag Key	Example Value	Why it exists
`Environment`	production, staging, dev	Drives nearly every cost question
`Team`	platform, data-engineering	Points at a human owner
`Project`	checkout-service, ml-pipeline	Connects spend to a business outcome
`CostCenter`	1042, 2310	Lets finance map spend to GL accounts

Four tags. That's it. Anything beyond this is bonus, not baseline.
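To make the owner-versus-Owner problem concrete, here is a minimal sketch of a tag audit for a single resource. It assumes the four-tag baseline above and the three Environment values from the table; the function name `audit_tags` and the shape of its result are my own illustration, not an AWS API.

```python
# Baseline schema from the table above. The allowed Environment values
# are taken from the example column.
REQUIRED_TAGS = ("Environment", "Team", "Project", "CostCenter")
ALLOWED_ENVIRONMENTS = {"production", "staging", "dev"}


def audit_tags(tags: dict[str, str]) -> dict[str, list[str]]:
    """Report missing, mis-cased, and invalid tags for one resource."""
    by_lower = {key.lower(): key for key in tags}
    missing, miscased, invalid = [], [], []

    for required in REQUIRED_TAGS:
        if required in tags:
            if not tags[required].strip():
                missing.append(required)  # present but blank counts as missing
            continue
        actual = by_lower.get(required.lower())
        if actual is not None:
            miscased.append(actual)  # e.g. "owner" where "Owner" was expected
        else:
            missing.append(required)

    env = tags.get("Environment", "").strip()
    if env and env not in ALLOWED_ENVIRONMENTS:
        invalid.append(f"Environment={env}")

    return {"missing": missing, "miscased": miscased, "invalid": invalid}


report = audit_tags(
    {"environment": "production", "Team": "platform", "Project": "checkout-service"}
)
print(report)
```

Tag keys are case-sensitive in AWS, so a mis-cased key is reported separately from a missing one: the fix is a rename, not a conversation with the owner.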

Enforcement: where most programs quietly fail

A schema in a wiki page is not enforcement. There are three layers I'd insist on for any serious tagging program at 40+ accounts:

AWS Organizations Tag Policies. Define allowed values for each required tag (e.g., Environment must be one of production|staging|dev). By default, Tag Policies don't block creation (enforcement is opt-in and limited to supported resource types), but they produce compliance reports you can act on.
Service Control Policies (SCPs). For the strictest cases, deny resource creation when required tags are missing. Start with one or two high-impact services (EC2, RDS) before going wider.
AWS Config rules. Continuous detection: every resource is evaluated against a required-tags rule, and non-compliant resources show up in a dashboard you can wire to Slack or Jira.

If you only do one of those, do AWS Config. The compliance report alone changes behavior, because suddenly engineers can see their score.
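For reference, the three layers can be sketched as plain Python builders that produce the documents each service expects. This is a rough outline under my own assumptions: the function names and the rule name are hypothetical, and the document shapes follow the Tag Policy syntax, the common `aws:RequestTag` deny pattern for SCPs, and the `REQUIRED_TAGS` managed Config rule as I understand them. Check them against the current AWS documentation before deploying anything.

```python
import json

ALLOWED_ENVIRONMENTS = ["production", "staging", "dev"]


def tag_policy(tag_key: str, allowed_values: list[str]) -> dict:
    """Tag Policy fixing the key's capitalization and its allowed values."""
    return {
        "tags": {
            tag_key: {
                "tag_key": {"@@assign": tag_key},
                "tag_value": {"@@assign": list(allowed_values)},
            }
        }
    }


def deny_untagged_run_instances_scp(required_tags: list[str]) -> dict:
    """SCP denying ec2:RunInstances when any required tag is absent.

    One statement per tag: keys inside a single condition block are ANDed,
    which would only deny when every tag is missing at once.
    """
    statements = [
        {
            "Sid": f"DenyRunInstancesWithout{tag}",
            "Effect": "Deny",
            "Action": "ec2:RunInstances",
            "Resource": "arn:aws:ec2:*:*:instance/*",
            "Condition": {"Null": {f"aws:RequestTag/{tag}": "true"}},
        }
        for tag in required_tags
    ]
    return {"Version": "2012-10-17", "Statement": statements}


def required_tags_rule(rule_name: str, required_tags: list[str]) -> dict:
    """Config rule definition using the AWS managed REQUIRED_TAGS rule."""
    if len(required_tags) > 6:
        raise ValueError("the managed rule accepts at most six tag keys")
    params = {f"tag{i}Key": tag for i, tag in enumerate(required_tags, start=1)}
    return {
        "ConfigRuleName": rule_name,  # hypothetical name, pick your own
        "Source": {"Owner": "AWS", "SourceIdentifier": "REQUIRED_TAGS"},
        "InputParameters": json.dumps(params),
    }


print(json.dumps(tag_policy("Environment", ALLOWED_ENVIRONMENTS), indent=2))
print(json.dumps(deny_untagged_run_instances_scp(["Environment", "Team"]), indent=2))
print(json.dumps(required_tags_rule("baseline-required-tags", ["Environment", "Team"]), indent=2))
```

Keeping these as builders rather than hand-edited JSON means the same list of required tags feeds all three layers, so the schema cannot drift between them.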

Activate the tags in Billing, or none of this matters

This is the easy step that's easy to miss: tags only show up in Cost Explorer and Cost & Usage Reports if they're activated as Cost Allocation Tags in the Billing console of the management account. Activation only applies going forward, so do this before you start running reports.

Activate every tag in your required schema, plus any optional ones you want to slice on (e.g., Service, Owner).
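If you would rather script the activation than click through the console, a sketch along these lines should work. It assumes boto3 is installed and that the code runs with credentials for the management account. The batch size of 20 is my assumption about the per-request limit, and `activation_batches` and `activate` are illustrative names.

```python
def activation_batches(tag_keys: list[str], batch_size: int = 20) -> list[list[dict]]:
    """Build request payloads that mark each tag key as an active cost allocation tag."""
    entries = [{"TagKey": key, "Status": "Active"} for key in tag_keys]
    return [entries[i:i + batch_size] for i in range(0, len(entries), batch_size)]


def activate(tag_keys: list[str]) -> None:
    """Send the batches to the Cost Explorer API (needs boto3 and credentials)."""
    import boto3  # third-party; imported here so the builder above runs without it

    ce = boto3.client("ce")
    for batch in activation_batches(tag_keys):
        response = ce.update_cost_allocation_tags_status(CostAllocationTagsStatus=batch)
        # The response lists any keys that could not be updated.
        for error in response.get("Errors", []):
            print("could not activate:", error)


print(activation_batches(["Environment", "Team", "Project", "CostCenter"]))
```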

Cost Explorer for ops engineers

Once tags are flowing, Cost Explorer becomes a different tool. A few queries I run constantly:

Spend by Team, last 90 days, monthly granularity. The "who's growing fastest" view. Filter to a single Environment tag (e.g., production) for clarity.
Spend by Service, grouped by linked account. Surfaces the account-level outliers — the dev account suddenly running a giant m5.24xlarge in us-west-2.
Untagged spend. Filter where a required tag is "No tag key". This is your tagging debt, in dollars.

Save these as Cost Explorer reports and share the URLs with team leads. People look at things you make easy to look at.

Don't ignore the long tail

One of the realities of multi-account environments is that most of the unattributed spend lives in shared services accounts: NAT Gateways, Transit Gateway, Route 53 resolver endpoints, central logging buckets. These can't be tagged in a way that maps cleanly to a product team.

You have two reasonable options:

Allocate shared spend proportionally based on traffic or compute usage. (More accurate, more work.)
Allocate shared spend evenly across product teams as an "infrastructure tax". (Less accurate, much easier to defend in conversations.)

Pick one and write it down. The worst outcome is "shared services are someone else's problem", which is how you end up with $40K/month NAT Gateway bills nobody touches.
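Here is what the two options look like as arithmetic, using the $40K NAT Gateway bill as the shared cost. The team names and traffic figures are made up for illustration, and the function names are my own.

```python
def allocate_proportionally(shared_cost: float, usage_by_team: dict[str, float]) -> dict[str, float]:
    """Split a shared cost in proportion to each team's usage (e.g. GB processed)."""
    total = sum(usage_by_team.values())
    if total <= 0:
        raise ValueError("total usage must be positive")
    return {
        team: round(shared_cost * usage / total, 2)
        for team, usage in usage_by_team.items()
    }


def allocate_evenly(shared_cost: float, teams: list[str]) -> dict[str, float]:
    """Split a shared cost into equal shares, the 'infrastructure tax' model."""
    if not teams:
        raise ValueError("need at least one team")
    share = round(shared_cost / len(teams), 2)
    return {team: share for team in teams}


# Hypothetical monthly NAT Gateway traffic per team, in GB.
traffic_gb = {"platform": 6000, "data-engineering": 3000, "checkout": 1000}

print(allocate_proportionally(40_000, traffic_gb))
print(allocate_evenly(40_000, list(traffic_gb)))
```

With these numbers the proportional split charges platform $24,000 and checkout $4,000, while the even split charges every team $13,333.33. Rounding each share to cents can leave a cent or two unallocated, which is worth mentioning to finance up front.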

Account vending: the real long-term fix

If your organization is growing accounts faster than humans can tag them, the right place to enforce tagging isn't on individual resources — it's on the accounts themselves, at creation time. AWS Control Tower, Account Factory for Terraform, or your own vending pipeline can guarantee:

Every account is tagged with Team, Environment, and CostCenter at the account level.
A baseline Tag Policy and Config rule deploy automatically.
The account name itself encodes the team and environment (e.g., platform-prod, data-eng-dev).

Resources don't inherit account-level tags automatically, but those tags give Cost Explorer something useful to group on even when individual resources are still messy.
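As a small illustration of the naming convention, a vending pipeline could derive the account-level tags from the account name itself. This sketch assumes names of the form team-environment with the suffixes prod, staging, and dev; the suffix mapping and the function names are hypothetical. The output uses the Key/Value list shape that AWS Organizations accepts for tags.

```python
# Assumed mapping from account-name suffix to the Environment tag value.
ENVIRONMENT_SUFFIXES = {"prod": "production", "staging": "staging", "dev": "dev"}


def parse_account_name(name: str) -> dict[str, str]:
    """Split an account name like 'data-eng-dev' into Team and Environment."""
    team, sep, suffix = name.rpartition("-")
    if not sep or not team or suffix not in ENVIRONMENT_SUFFIXES:
        raise ValueError(f"account name does not follow team-environment: {name!r}")
    return {"Team": team, "Environment": ENVIRONMENT_SUFFIXES[suffix]}


def account_tags(name: str, cost_center: str) -> list[dict[str, str]]:
    """Build the account-level tag list for a newly vended account."""
    tags = parse_account_name(name)
    tags["CostCenter"] = cost_center
    return [{"Key": key, "Value": value} for key, value in sorted(tags.items())]


print(account_tags("platform-prod", "1042"))
```

Rejecting names that don't parse is deliberate: an account that can't be mapped to a team at creation time is exactly the attribution gap this whole post is about.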

Rule of thumb

If a resource doesn't have at least an Environment and Team tag, you've already lost the ability to meaningfully attribute its cost. Anything more sophisticated — RIs, rightsizing, governance, reviews — is built on top of this.

Next week we move from "where is the money going?" to "how do we stop overpaying for it?" — Reserved Instances and Savings Plans across many accounts.