Kubernetes Node Containerd Runtime Troubleshooting
Containerd
After a hard restart, containerd fails with: failed to recover state: failed to reserve sandbox name
node1 containerd[80463]: time="2022-08-02T17:13:13.092422629Z" level=fatal msg="Failed to run CRI service" error="failed to recover state: failed to reserve sandbox name "kube-scheduler-node1_kube-system_705e7ce1217a37349a5567101e60165d_2": name "kube-scheduler-node1_kube-system_705e7ce1217a37349a5567101e60165d_2" is reserved for "139bb0ac7e050e9e28b994e78f651a8609f426f1b5bbfc887a0d4a3350b4eee2""
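The log makes the cause fairly clear: a piece of stored container information is corrupted, so the sandbox name is already reserved for the container ID quoted at the end of the message and cannot be reserved again. Removing that record and restarting lets containerd rebuild fresh state.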
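The ID to clean up is the 64-character hex string that follows "is reserved for" in the error. As a minimal sketch, it can be pulled out of the log instead of being copied by hand; the function name `extract_reserved_id` is hypothetical, and the pattern assumes the message format shown above:

```shell
#!/bin/sh
# extract_reserved_id (hypothetical helper): read log lines on stdin and
# print the 64-hex-character ID that follows "is reserved for".
# Quotes or backslashes between the phrase and the ID are skipped, so both
# escaped (\") and unescaped (") forms of the message are handled.
extract_reserved_id() {
    sed -n 's/.*is reserved for [^0-9a-f]*\([0-9a-f]\{64\}\).*/\1/p'
}
```

For example, `journalctl -u containerd --no-pager | extract_reserved_id | sort -u` would list every conflicting ID reported by the unit, assuming containerd logs to the systemd journal on the node.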
Resolution steps
- Disable the CRI plugin, so that containerd can start without the failing CRI state recovery
$ cat /etc/containerd/config.toml
disabled_plugins = ["io.containerd.grpc.v1.cri"]
...
- Clean up the corrupted information: inspect the conflicting record first, then remove it
$ ctr -n=k8s.io containers info 139bb0ac7e050e9e28b994e78f651a8609f426f1b5bbfc887a0d4a3350b4eee2
$ ctr -n=k8s.io containers rm 139bb0ac7e050e9e28b994e78f651a8609f426f1b5bbfc887a0d4a3350b4eee2
- Re-enable the CRI plugin (undo the config change from the first step), then restart containerd and kubelet
$ systemctl daemon-reload && systemctl restart containerd && systemctl restart kubelet
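Because kubelet needs the CRI service, it is worth confirming that the plugin is no longer disabled before the final restart. The following is a rough sketch only: `cri_disabled` is a hypothetical helper, it assumes the key is spelled `disabled_plugins` and that the list sits on a single uncommented line, and it is a text match rather than a real TOML parser.

```shell
#!/bin/sh
# cri_disabled FILE (hypothetical helper): exit 0 when FILE contains an
# uncommented disabled_plugins line that lists the CRI plugin, either as
# "cri" or as "io.containerd.grpc.v1.cri"; exit non-zero otherwise.
cri_disabled() {
    grep -Eq '^[[:space:]]*disabled_plugins[[:space:]]*=.*"(io\.containerd\.grpc\.v1\.)?cri"' "$1"
}
```

Running `cri_disabled /etc/containerd/config.toml` and checking its exit status before restarting kubelet gives a quick signal that the temporary change from the first step has been reverted.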