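Problem
lxcfs has to be deployed on every Node of a Kubernetes cluster. When the lxcfs service restarts or crashes, the lxcfs mount points previously mounted under /proc in the containers become invalid, so commands such as free and top fail inside those containers.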
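Cause
Once the lxcfs process has restarted, reading /proc/cpuinfo and the other lxcfs-backed files inside a container fails with "Transport endpoint is not connected". This happens because the lxcfs mount directory (/var/lib/lxcfs by default, /var/lib/lxc/lxcfs in the unit file below) is deleted and recreated, so its inode changes and the existing bind mounts in the containers still point at the old one.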
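The failure described above can be detected from inside a container by trying to read one of the affected files and inspecting the error. The following is a minimal sketch; check_lxcfs_file is a hypothetical helper name, and the sketch assumes the head utility is available in the container image.

```shell
#!/bin/bash
# Sketch: classify a file as "ok", "stale" (FUSE endpoint gone) or "error".
# check_lxcfs_file is a hypothetical helper, not part of lxcfs.
check_lxcfs_file() {
    local path="$1"
    local err
    # Read a single byte; discard stdout and capture only stderr.
    # LC_ALL=C keeps the error message in English so it can be matched.
    if err=$(LC_ALL=C head -c 1 -- "$path" 2>&1 >/dev/null); then
        echo "ok"
        return 0
    fi
    case "$err" in
        *"Transport endpoint is not connected"*) echo "stale" ;;
        *) echo "error" ;;
    esac
    return 1
}
```

For example, running check_lxcfs_file /proc/cpuinfo inside an affected container should print stale until the file is re-mounted.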
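Solution
Run lxcfs as a systemd service on every node. When the service is restarted after a crash, and once lxcfs has started successfully, ExecStartPost runs the script /usr/local/bin/container_remount_lxcfs.sh, which re-mounts the lxcfs files in every container that had them mounted before.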
lxcfs.service:
[Unit]
Description=FUSE filesystem for LXC
ConditionVirtualization=!container
Before=lxc.service
Documentation=man:lxcfs(1)
[Service]
ExecStart=/usr/bin/lxcfs -l /var/lib/lxc/lxcfs/
KillMode=process
Restart=always
Delegate=yes
ExecStopPost=-/bin/fusermount -u /var/lib/lxc/lxcfs
ExecReload=/bin/kill -USR1 $MAINPID
# Added: run the remount script after lxcfs has started
ExecStartPost=/usr/local/bin/container_remount_lxcfs.sh
[Install]
WantedBy=multi-user.target
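One caveat, stated as an assumption rather than something the original post covers: the unit above sets no Type=, so systemd may run ExecStartPost as soon as the lxcfs process has been started, possibly before the FUSE mount is populated. A small polling helper at the top of the remount script is one way to guard against that. wait_for_path is a hypothetical name, and the 10 second timeout is an arbitrary choice.

```shell
#!/bin/bash
# Sketch: poll until a path exists or a timeout (in seconds) expires.
# wait_for_path is a hypothetical helper, not part of lxcfs or systemd.
wait_for_path() {
    local path="$1"
    local timeout="${2:-10}"
    local waited=0
    while [ ! -e "$path" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "timed out waiting for $path" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Possible use at the top of the remount script (path taken from the unit file):
# wait_for_path /var/lib/lxc/lxcfs/proc/meminfo 10 || exit 1
```

The check relies on /var/lib/lxc/lxcfs/proc/meminfo only existing once lxcfs has mounted its filesystem there.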
Contents of the container_remount_lxcfs.sh script:
#!/bin/bash
# Re-create the lxcfs bind mounts in running containers after lxcfs restarts.
PATH=$PATH:/bin
LXCFS="/var/lib/lxc/lxcfs"
LXCFS_ROOT_PATH="/var/lib/lxc"

# IDs of running containers, skipping pause and calico containers and the header row.
containers=$(docker ps | grep -v pause | grep -v calico | awk '{print $1}' | grep -v CONTAINE)

for container in $containers; do
    # Host path mounted at /var/lib/lxc inside the container (empty if there is none).
    mountpoint=$(docker inspect --format '{{ range .Mounts }}{{ if eq .Destination "/var/lib/lxc" }}{{ .Source }}{{ end }}{{ end }}' "$container")
    # Only handle containers whose pod already has the lxcfs mount point.
    if [ "$mountpoint" = "$LXCFS_ROOT_PATH" ]; then
        echo "remount $container"
        PID=$(docker inspect --format '{{.State.Pid}}' "$container")
        # Re-mount the lxcfs files under /proc.
        for file in meminfo cpuinfo loadavg stat diskstats swaps uptime; do
            echo nsenter --target "$PID" --mount -- mount -B "$LXCFS/proc/$file" "/proc/$file"
            nsenter --target "$PID" --mount -- mount -B "$LXCFS/proc/$file" "/proc/$file"
        done
        # Re-mount the lxcfs files under /sys.
        for file in online; do
            echo nsenter --target "$PID" --mount -- mount -B "$LXCFS/sys/devices/system/cpu/$file" "/sys/devices/system/cpu/$file"
            nsenter --target "$PID" --mount -- mount -B "$LXCFS/sys/devices/system/cpu/$file" "/sys/devices/system/cpu/$file"
        done
    fi
done
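To check that the remount took effect, one option is to look at the container's mount table from the host. The sketch below lists the mount points backed by lxcfs in a mountinfo file. list_lxcfs_mounts is a hypothetical helper, and it assumes lxcfs mounts show up with the filesystem type fuse.lxcfs.

```shell
#!/bin/bash
# Sketch: print every mount point in a mountinfo file whose filesystem type
# is fuse.lxcfs. list_lxcfs_mounts is a hypothetical helper name.
list_lxcfs_mounts() {
    # $1: path to a mountinfo file, e.g. /proc/<pid>/mountinfo
    awk '{
        # Fields 1-6 are fixed; optional fields follow until a lone "-",
        # and the field right after "-" is the filesystem type.
        for (i = 7; i <= NF; i++) {
            if ($i == "-") {
                if ($(i + 1) == "fuse.lxcfs") print $5
                break
            }
        }
    }' "$1"
}
```

On a node, this could be called as list_lxcfs_mounts "/proc/$PID/mountinfo", with PID obtained from docker inspect as in the script above. Because the script bind-mounts on top of the existing mount points, the same path may appear more than once after several restarts.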