How to Detect and Drain Idle Kubernetes Nodes Automatically

A playbook for identifying underutilized nodes and safely removing them without downtime.

Linda Cuanca

An “Idle Node” isn’t always empty. Often, it’s a huge m5.2xlarge instance running a single tiny pod that refuses to move. This is the “fragmentation” problem, and it costs companies thousands.

Why Nodes Won’t Drain

Automated scalers (like Cluster Autoscaler) will NOT remove a node if:

  1. The pod uses local storage: emptyDir or hostPath volumes can't be moved with the pod, so the autoscaler skips the node by default.
  2. The pod carries a "do not evict" annotation: cluster-autoscaler.kubernetes.io/safe-to-evict: "false" (see the example pod after this list).
  3. The pod has no controller: it's a naked pod, not managed by a Deployment, ReplicaSet, StatefulSet, or Job, so nothing would recreate it elsewhere.
  4. No other node fits: the remaining nodes are full, or don't satisfy the pod's node selectors, taints, and tolerations.
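As an illustration, here is a minimal pod manifest that ticks the first three boxes. The pod name and image are hypothetical; the annotation and the emptyDir volume are the real pieces that keep Cluster Autoscaler away from the node.

```yaml
apiVersion: v1
kind: Pod                       # naked pod: no Deployment/ReplicaSet owns it
metadata:
  name: stubborn-pod            # hypothetical name
  annotations:
    # Tells Cluster Autoscaler it must never evict this pod,
    # which pins the whole node in place.
    cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
spec:
  containers:
    - name: app
      image: nginx:1.25         # placeholder image
      volumeMounts:
        - name: scratch
          mountPath: /scratch
  volumes:
    - name: scratch
      emptyDir: {}              # local storage: also blocks scale-down by default
```

Any one of these properties is enough to block scale-down; this pod has three, so whatever node it lands on stays up until someone intervenes.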

The Detection Script

You can start hunting for these money-burning nodes with a simple kubectl query that lists each node's CPU capacity and allocatable CPU:

```bash
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.capacity.cpu}{"\t"}{.status.allocatable.cpu}{"\n"}{end}'
```

To see how much of that allocatable CPU is actually requested, check the "Allocated resources" section of kubectl describe node. Or, use the kubectl cost plugin to see cost-weighted idle time.

The Solution: Aggressive Bin Packing

  1. Switch to Karpenter: its consolidation feature actively moves pods to pack nodes tighter. It constantly asks: "If I move these three pods, can I delete this node?" If the answer is yes, it acts (see the sketch after this list).
  2. Raise scale-down-utilization-threshold: increase this Cluster Autoscaler flag from the default 0.5 to 0.6 or 0.7, so any node with less than 60–70% of its capacity requested becomes a scale-down candidate.
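A rough sketch of both options, assuming the Karpenter v1 NodePool API and the standard Cluster Autoscaler flags; the names and values here are illustrative, not a drop-in config.

```yaml
# Karpenter: let the NodePool consolidate underutilized nodes.
# Assumes the v1 API; older versions use consolidationPolicy: WhenUnderutilized.
# A real NodePool also needs spec.template with instance requirements.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default                    # hypothetical NodePool name
spec:
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 1m           # how long to wait before consolidating
```

```yaml
# Cluster Autoscaler: raise the threshold via its container args.
# Snippet of the cluster-autoscaler Deployment, not a full manifest.
command:
  - ./cluster-autoscaler
  - --scale-down-utilization-threshold=0.7   # default is 0.5
  - --scale-down-unneeded-time=5m            # optional: shorten the wait before removal
```

Either change makes the scaler more willing to tear nodes down, so roll it out gradually and watch for eviction churn.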

[!NOTE] Read the Deep Dive. For a full guide on configuration flags, read our main article on Idle Kubernetes Nodes.
