How to solve 'no space left on device' when scaling pods in a Kubernetes cluster
1. The purpose of this post
Sometimes, when you start a new pod in Kubernetes, you may see an error like this:
(combined from similar events): Failed create pod sandbox: rpc error: code = Unknown desc = failed to start sandbox container for pod "coredns-5c59fd465f-nmq29": Error response from daemon: OCI runtime create failed: container_linux.go:346: starting container process caused "process_linux.go:297: applying cgroup configuration for process caused \"mkdir /sys/fs/cgroup/memory/kubepods/burstable/podfd1a65b2-72c3-4f99-b256-2d8ea33036e4/2c380a403bca5da576cc9feff8bd8d892297c6aeeac281ad88717ce6f6cab557: no space left on device\"": unknown
2. Environments
- CentOS 7.6
- Kubernetes v1.17
- Docker 19.03.5
3. The solution
This error is usually caused by a cgroup leak on the node. Check it as follows:
- Check the cgroup counts reported by the kernel:
cat /proc/cgroups | column -t
- Count the cgroup directories that actually exist, to compare:
find -L /sys/fs/cgroup/memory -type d | wc -l
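The two checks above can be combined into one comparison. A minimal sketch, assuming a cgroup-v1 node: the 65535 figure is the kernel's memory cgroup ID cap (`MEM_CGROUP_ID_MAX`) on many 3.x/4.x kernels, and hitting it is what produces ENOSPC ("no space left on device") even though the disk has free space. In the leak scenario, the count reported by the kernel is typically far larger than the number of directories on disk.

```shell
# Compare the kernel's cgroup bookkeeping with the directories on disk.
# /proc/cgroups columns: subsys_name  hierarchy  num_cgroups  enabled
reported=$(awk '$1 == "memory" {print $3}' /proc/cgroups)
actual=$(find -L /sys/fs/cgroup/memory -type d 2>/dev/null | wc -l)

echo "kernel reports ${reported:-?} memory cgroups; ${actual} directories exist"

# 60000 is an assumed warning threshold, just below the 65535 ID cap.
if [ "${reported:-0}" -gt 60000 ]; then
    echo "WARNING: close to the 65535 memory cgroup limit - likely a leak"
fi
```

If the reported count is near the cap while only a few hundred directories exist, the node is holding on to dead ("zombie") memory cgroups and the steps below are worth trying.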
- Drain the node so its workloads are rescheduled elsewhere (this also cordons it; `--ignore-daemonsets` is usually needed because DaemonSet pods such as kube-proxy cannot be evicted)
kubectl drain <ip> --ignore-daemonsets
- Clear the kernel caches (run as root on the node; `sync` flushes dirty pages first)
sync && echo 3 > /proc/sys/vm/drop_caches
- If that still does not work, reboot the node. Remember to uncordon it afterwards (kubectl uncordon <ip>) so it can accept pods again
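The steps above can be sketched as a single helper. This is only an outline, not a turnkey script: `remediate_node` and its `$node` argument are hypothetical names, the kubectl commands need cluster credentials, and the cache drop and reboot must happen as root on the affected node itself.

```shell
#!/bin/sh
# Sketch of the full remediation as a function; nothing runs until you
# call it with a node name, e.g. `remediate_node 10.0.0.5`.
remediate_node() {
    node="$1"

    # 1. Evict workloads; DaemonSet pods cannot be evicted, hence the flag.
    kubectl drain "$node" --ignore-daemonsets

    # 2. On the node, as root: flush dirty pages, then drop caches.
    sync
    echo 3 > /proc/sys/vm/drop_caches

    # 3. Re-check the memory cgroup count; if it is still near 65535,
    #    reboot the node before continuing.
    awk '$1 == "memory" {print $3}' /proc/cgroups

    # 4. Once the count is sane, let the node accept pods again.
    kubectl uncordon "$node"
}
```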