Jesus Paz · 2 min read
The Real Cost of Idle Kubernetes Nodes and How to Eliminate Them
Quantify zombie capacity, catch misconfigured autoscalers, and automate remediation with ClusterCost.
Idle nodes hide in every cluster—blue/green leftovers, failed upgrades, or autoscalers that never scale down. They quietly burn thousands of dollars per month. Here’s how to find and remove them systematically.
Spot idle nodes quickly
ClusterCost tracks utilization per node group with the following signals:
- CPU/RAM usage vs. allocatable capacity.
- Pod count per node.
- Time since last scheduling event.
- Karpenter/Cluster Autoscaler events.
Flag nodes with <20% utilization for more than 24 hours.
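The flagging rule can be sketched in a few lines of Python. This is a minimal illustration, not ClusterCost's implementation: the `Sample` shape, the `idle_nodes` helper, and the choice to treat a node as busy when either CPU or RAM reaches the threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

IDLE_THRESHOLD = 0.20              # "<20% utilization"
IDLE_WINDOW = timedelta(hours=24)  # "more than 24 hours"


@dataclass
class Sample:
    """One utilization reading for a node (hypothetical shape)."""
    node: str
    taken_at: datetime
    cpu_fraction: float  # CPU usage / allocatable CPU, 0.0 to 1.0
    mem_fraction: float  # RAM usage / allocatable RAM, 0.0 to 1.0


def idle_nodes(samples, now):
    """Return sorted names of nodes that have stayed under the threshold for longer than the window."""
    first_seen = {}  # node -> earliest reading
    last_busy = {}   # node -> most recent reading at or above the threshold
    for s in samples:
        first_seen[s.node] = min(first_seen.get(s.node, s.taken_at), s.taken_at)
        # Assumption: a node is busy if either CPU or RAM reaches the threshold.
        if max(s.cpu_fraction, s.mem_fraction) >= IDLE_THRESHOLD:
            last_busy[s.node] = max(last_busy.get(s.node, s.taken_at), s.taken_at)

    flagged = []
    for node, seen in first_seen.items():
        # The idle clock starts at the last busy reading, or at the first reading if never busy.
        idle_since = last_busy.get(node, seen)
        if now - idle_since > IDLE_WINDOW:
            flagged.append(node)
    return sorted(flagged)
```

Feeding it samples pulled from whatever metrics source you use returns only the nodes that have been quiet for the full window, which keeps freshly added nodes off the list.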
Determine the cause
| Cause | How to confirm | Fix |
|---|---|---|
| Stuck DaemonSet | kubectl describe node shows taints preventing drains | Patch DaemonSet or adjust tolerations |
| PDB constraints | PodDisruptionBudget prevents eviction | Temporarily relax PDB or use surge deployments |
| Reserved node pool | Node pool pinned to min=3 but unused | Lower min nodes or delete pool |
| Failed scale-down | Autoscaler logs show “scale down disabled” | Update autoscaler flags / remove pod annotations |
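Two of these checks are easy to script against `kubectl ... -o json` output. The sketch below assumes you have already parsed that JSON into Python dicts; the helper names are hypothetical. It relies on the PodDisruptionBudget `status.disruptionsAllowed` field and on the Cluster Autoscaler annotations `cluster-autoscaler.kubernetes.io/safe-to-evict` (pods) and `cluster-autoscaler.kubernetes.io/scale-down-disabled` (nodes).

```python
SAFE_TO_EVICT = "cluster-autoscaler.kubernetes.io/safe-to-evict"
SCALE_DOWN_DISABLED = "cluster-autoscaler.kubernetes.io/scale-down-disabled"


def _qualified_name(item):
    """Return 'namespace/name' for namespaced objects, or just 'name' for nodes."""
    meta = item["metadata"]
    namespace = meta.get("namespace")
    return f"{namespace}/{meta['name']}" if namespace else meta["name"]


def _annotated(items, key, value):
    """Names of objects whose annotation `key` equals `value`."""
    found = []
    for item in items:
        annotations = item["metadata"].get("annotations") or {}
        if annotations.get(key) == value:
            found.append(_qualified_name(item))
    return found


def blocking_pdbs(pdb_list):
    """PDBs that currently allow zero voluntary disruptions."""
    return [
        _qualified_name(item)
        for item in pdb_list.get("items", [])
        if item.get("status", {}).get("disruptionsAllowed") == 0
    ]


def pinned_pods(pod_list):
    """Pods annotated so the autoscaler will not evict them."""
    return _annotated(pod_list.get("items", []), SAFE_TO_EVICT, "false")


def pinned_nodes(node_list):
    """Nodes annotated to opt out of scale-down."""
    return _annotated(node_list.get("items", []), SCALE_DOWN_DISABLED, "true")
```

In practice the inputs would come from something like `kubectl get pdb -A -o json`, `kubectl get pods -A -o json`, and `kubectl get nodes -o json`, each passed through `json.loads`.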
Automate cleanup
- Schedule nightly jobs that call ClusterCost’s API to list idle nodes.
- For each node:
  - Drain with `kubectl drain --ignore-daemonsets`.
  - Delete from the ASG/managed node group.
- Notify owners via Slack if the node belonged to a specific workload.
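The per-node steps might look like the sketch below. It assumes an AWS cluster where each node's `spec.providerID` ends in the EC2 instance ID and the instance belongs to an Auto Scaling group. The function names are hypothetical, and the script defaults to a dry run so nothing is drained by accident. The Slack notification is left out.

```python
import subprocess


def instance_id_from_provider_id(provider_id):
    """Extract the EC2 instance ID from a providerID such as 'aws:///us-east-1a/i-0abc'."""
    instance_id = provider_id.rsplit("/", 1)[-1]
    if not instance_id.startswith("i-"):
        raise ValueError(f"not an EC2 provider ID: {provider_id!r}")
    return instance_id


def cleanup_commands(node_name, provider_id, drain_timeout="10m"):
    """Build the drain and terminate commands for one idle node."""
    instance_id = instance_id_from_provider_id(provider_id)
    return [
        # Drain cordons the node first, then evicts pods while respecting PDBs.
        ["kubectl", "drain", node_name,
         "--ignore-daemonsets", "--delete-emptydir-data",
         f"--timeout={drain_timeout}"],
        # Terminate and shrink the group so the ASG does not launch a replacement.
        ["aws", "autoscaling", "terminate-instance-in-auto-scaling-group",
         "--instance-id", instance_id,
         "--should-decrement-desired-capacity"],
    ]


def run_cleanup(node_name, provider_id, dry_run=True):
    """Print the commands (dry run) or execute them, stopping at the first failure."""
    for cmd in cleanup_commands(node_name, provider_id):
        if dry_run:
            print("DRY RUN:", " ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

Because `check=True` raises on a non-zero exit code, a drain that times out on a PDB stops the script before the instance is terminated.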
Prevent idle nodes from returning
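- Set `--scale-down-utilization-threshold` to 0.5 (the Cluster Autoscaler default) or higher. A node becomes a scale-down candidate when its requested resources fall below this fraction of node capacity, so lowering the value leaves more half-empty nodes running.
- Keep `--scale-down-delay-after-add` at 10 minutes (the default) so scale-down evaluation resumes soon after a scale-up.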
- Use scheduled scaling to sleep dev/test environments at night.
- Adopt spot instances for bursty workloads so unused capacity is cheaper.
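To see how the utilization threshold behaves, here is a deliberately simplified model. It treats a node as a scale-down candidate when requested resources divided by node capacity fall below the threshold, and it takes the larger of the CPU and memory ratios; both choices are assumptions for illustration. The real Cluster Autoscaler applies further checks, such as PDBs, annotations, and whether the pods fit elsewhere.

```python
def node_utilization(cpu_requested_m, cpu_capacity_m, mem_requested_mi, mem_capacity_mi):
    """Requested / capacity, taking the larger of the CPU and memory ratios."""
    return max(cpu_requested_m / cpu_capacity_m, mem_requested_mi / mem_capacity_mi)


def is_scale_down_candidate(utilization, threshold=0.5):
    """A node qualifies only when its utilization is below the threshold."""
    return utilization < threshold


# Hypothetical node: 4 vCPU (4000m) and 16 GiB (16384Mi) of capacity.
quiet = node_utilization(500, 4000, 2048, 16384)   # 0.125
busy = node_utilization(3000, 4000, 2048, 16384)   # 0.75

print(is_scale_down_candidate(quiet))                # True
print(is_scale_down_candidate(busy))                 # False
print(is_scale_down_candidate(busy, threshold=0.8))  # True
```

Raising the threshold widens the set of nodes the autoscaler will try to consolidate; lowering it does the opposite.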
Report savings
- Before/after snapshots from ClusterCost show node hours reclaimed.
- Share monthly savings with leadership to justify continued automation work.
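The arithmetic behind a savings report is simple enough to sketch. The node counts and the hourly price below are hypothetical placeholders, not measured figures; substitute the numbers from your own snapshots and billing data.

```python
def monthly_savings(node_hours_before, node_hours_after, hourly_price_usd):
    """Return (node hours reclaimed, dollars saved) for one node group."""
    reclaimed = node_hours_before - node_hours_after
    return reclaimed, reclaimed * hourly_price_usd


HOURS_PER_MONTH = 730  # average hours in a month

# Hypothetical example: a pool shrinks from 10 nodes to 7 at $0.25 per node hour.
reclaimed, saved = monthly_savings(10 * HOURS_PER_MONTH, 7 * HOURS_PER_MONTH, 0.25)
print(reclaimed, saved)  # 2190 547.5
```

Reporting node hours alongside dollars helps when prices change mid-quarter, since the reclaimed hours stay comparable month to month.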
Once idle nodes are tracked and removed automatically, your clusters maintain healthy utilization without constant babysitting, and your AWS bill thanks you.