How to solve 'no space left on device' when scaling pods in a Kubernetes cluster

1. The purpose of this post

Sometimes, when you start a new pod in Kubernetes, you may get an error like this:

(combined from similar events): Failed create pod sandbox: rpc error: code = Unknown desc = failed to start sandbox container for pod "coredns-5c59fd465f-nmq29": Error response from daemon: OCI runtime create failed: container_linux.go:346: starting container process caused "process_linux.go:297: applying cgroup configuration for process caused \"mkdir /sys/fs/cgroup/memory/kubepods/burstable/podfd1a65b2-72c3-4f99-b256-2d8ea33036e4/2c380a403bca5da576cc9feff8bd8d892297c6aeeac281ad88717ce6f6cab557: no space left on device\"": unknown
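Despite the wording, the node's disks are usually not full at all. Before going further, it is worth ruling out real disk or inode exhaustion, for example (assuming Docker's default data directory /var/lib/docker):

df -h /var/lib/docker    # real disk usage
df -i /var/lib/docker    # inode usage

If both look fine, the 'no space' refers to the memory cgroup hierarchy, not the filesystem.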

2. Environments

  • CentOS 7.6
  • Kubernetes v1.17
  • Docker 19.03.5

3. The solution

The node probably has a memory cgroup leak, a known issue on CentOS 7's 3.10 kernel: cgroups belonging to pods that no longer exist are never freed, and once the memory cgroup subsystem reaches its limit, creating a new one fails with 'no space left on device'. Check it like this:

  1. Check the cgroup counts the kernel reports:
cat /proc/cgroups | column -t
  2. Check how many cgroup directories actually exist (a combined check is sketched after this list):
find -L /sys/fs/cgroup/memory -type d | wc -l
  3. Drain the node from your cluster (a scripted version of steps 3-5 also follows the list):
kubectl drain <ip>
  4. Drop the kernel caches; this releases the page cache that keeps leaked (zombie) memory cgroups pinned, so they can be freed:
echo 3 > /proc/sys/vm/drop_caches
  5. If that still does not work, reboot the node.
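A minimal sketch combining steps 1 and 2: compare the number of memory cgroups the kernel reports with the number of directories that actually exist. The 65535 figure is the per-subsystem limit on these 3.10-series kernels, and the thresholds in the if test are only illustrative assumptions:

# Memory cgroups the kernel thinks exist (3rd column of /proc/cgroups)
reported=$(awk '$1 == "memory" {print $3}' /proc/cgroups)
# Memory cgroup directories actually present
actual=$(find -L /sys/fs/cgroup/memory -type d | wc -l)
echo "reported: $reported  actual: $actual  limit: 65535"
# Heuristic: near the limit and far above the real directory count => leaking
if [ "$reported" -gt 60000 ] && [ "$reported" -gt $((actual * 2)) ]; then
  echo "memory cgroups are probably leaking on this node"
fi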
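Steps 3-5 can be strung together as well. A rough sketch, assuming the node is addressed as <ip> and that evicting local data and non-DaemonSet pods is acceptable (the drain flags shown are the ones available in Kubernetes v1.17):

# From a machine with kubectl access: cordon the node and evict its pods
kubectl drain <ip> --ignore-daemonsets --delete-local-data
# On the node itself: flush dirty pages, then drop page cache and slab
sync
echo 3 > /proc/sys/vm/drop_caches
# Re-check the cgroup count; if it is still near the limit, reboot the node
cat /proc/cgroups | column -t
# Once the count looks healthy, put the node back into service
kubectl uncordon <ip>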