Why Prometheus Metrics Aren't Billing Data

Trying to calculate your cloud bill using PromQL? Here is why your numbers never match the AWS invoice, and why you need a reconciliation layer.

Jesus Paz

We built a CLI for FinOps because we wanted quick answers. But why couldn’t we just use our existing Prometheus dashboards?

We tried. It didn’t work.

Here is the dirty secret of Cloud FinOps: Usage != Cost.

1. The Sampling Problem

Prometheus is designed for trends, not transactions. It samples: a typical setup scrapes each target every 15 or 30 seconds. If a pod starts, does heavy work for 5 seconds, and dies between two scrapes, Prometheus may never record it.

AWS Billing never misses it. You pay for every second the underlying compute runs, whether or not a scrape saw the workload.

Over a month, these “ghost pods” can add up to a 10-15% discrepancy.
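To make the sampling gap concrete, here is a minimal sketch in Python. It assumes scrapes land at exactly t = 0, 15, 30, ... seconds and that a dashboard credits one full interval of runtime per sample seen. The pod names and lifetimes are hypothetical, and real scrape timing is less regular than this.

```python
import math

def scrapes_seen(start, end, interval=15):
    """Count scrape instants t = 0, interval, 2*interval, ... with start <= t < end."""
    first = math.ceil(start / interval)  # index of the first scrape at or after start
    stop = math.ceil(end / interval)     # index of the first scrape at or after end
    return max(0, stop - first)

def estimated_seconds(start, end, interval=15):
    """Runtime as a scrape-based view would infer it: one interval per sample."""
    return scrapes_seen(start, end, interval) * interval

# Hypothetical pod lifetimes as (start, end) in seconds.
pods = {"long-runner": (0, 60), "ghost": (16, 21), "lucky": (14, 16)}

for name, (start, end) in pods.items():
    billed = end - start
    estimated = estimated_seconds(start, end)
    print(f"{name}: billed {billed}s, estimated {estimated}s")
```

The "ghost" pod runs for 5 seconds between the scrapes at 15s and 30s, so it is estimated at 0 seconds. The "lucky" pod runs for only 2 seconds but happens to straddle a scrape, so it is credited a full 15 seconds. Sampling errors cut both ways, which is why the totals drift instead of matching the invoice.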

2. The Rate Problem

Prometheus knows you used 4 vCPUs. It does not know how much those vCPUs cost.

  • Was it a Spot Instance? (often around 60% cheaper, though the discount varies)
  • Was it covered by a Savings Plan? (around 30% cheaper, depending on the commitment)
  • Did you cross a data transfer tier?

To get the cost, you have to multiply Usage * Rate. But the Rate is dynamic: it depends on the purchase option, your commitments, and pricing tiers. It lives in the AWS Cost and Usage Report (CUR), not in Prometheus.
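As a rough illustration of Usage * Rate, the sketch below prices the same usage under three purchase options. The on-demand rate is a made-up number, and the discounts are the illustrative figures from the list above, not real AWS prices. In practice the effective rate has to come from the CUR.

```python
from decimal import Decimal

# Hypothetical on-demand price in USD per vCPU-hour (not a real AWS rate).
ON_DEMAND_RATE = Decimal("0.04")

# Illustrative discounts taken from the bullets above; real discounts vary.
DISCOUNTS = {
    "on_demand": Decimal("0"),
    "spot": Decimal("0.60"),
    "savings_plan": Decimal("0.30"),
}

def cost(vcpu_hours, pricing_model):
    """Cost = usage * effective rate for the given purchase option."""
    rate = ON_DEMAND_RATE * (1 - DISCOUNTS[pricing_model])
    return Decimal(vcpu_hours) * rate

# Prometheus can tell you this part: 4 vCPUs for a 730-hour month.
usage = 4 * 730

for model in DISCOUNTS:
    print(f"{model}: ${cost(usage, model):.2f}")
```

Under these assumptions the same 2,920 vCPU-hours cost $116.80 on demand, $81.76 under the Savings Plan, and $46.72 on Spot. The usage metric is identical in all three cases.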

3. The Retention Problem

Billing data needs to be kept for years (for audits and year-over-year analysis). Prometheus retains local data for 15 days by default, and high-cardinality metrics are rarely kept for more than a few weeks.

The Solution: Reconciliation

You cannot rely on metrics alone. You need a system that:

  1. Ingests real-time metrics (for speed).
  2. Ingests the AWS CUR (for accuracy).
  3. Reconciles the two.

This is what ClusterCost does. We use the metrics to allocate the bill, but we use the CUR to validate the total.
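The idea can be sketched in a few lines of Python. This is not ClusterCost's actual implementation, only an illustration of the principle: usage shares from metrics decide who pays, and the CUR total decides how much. The function names, team names, and numbers are all hypothetical.

```python
def allocate(cur_total, usage_by_team):
    """Split the invoiced total across teams in proportion to metered usage."""
    total_usage = sum(usage_by_team.values())
    if total_usage <= 0:
        raise ValueError("no usage to allocate against")
    return {
        team: cur_total * usage / total_usage
        for team, usage in usage_by_team.items()
    }

def discrepancy(metrics_estimate, cur_total):
    """Fraction of the invoice that a metrics-only estimate failed to explain."""
    return (cur_total - metrics_estimate) / cur_total

# Hypothetical inputs: vCPU-hours from metrics, dollars from the CUR.
usage_by_team = {"checkout": 300, "search": 100, "batch": 100}
cur_total = 1000.0
metrics_estimate = 880.0  # what usage * list price would have predicted

shares = allocate(cur_total, usage_by_team)
print(shares)
print(f"unexplained by metrics: {discrepancy(metrics_estimate, cur_total):.0%}")
```

Because every share is a fraction of the CUR total, the allocations always sum back to the invoice. The discrepancy figure shows how far a metrics-only estimate would have been from that total.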

Don’t use a ruler to measure weight. Don’t use Prometheus to measure dollars.


Jesus Paz

Founder & CEO
