
Let's say the output of the oc describe node command contains the following conditions.
Conditions:
  Type             Status    LastHeartbeatTime                 LastTransitionTime                Reason              Message
  MemoryPressure   Unknown   Wed, 11 Nov 2020 20:47:37 -0600   Wed, 11 Nov 2020 20:50:40 -0600   NodeStatusUnknown   Kubelet stopped posting node status
  DiskPressure     Unknown   Wed, 11 Nov 2020 20:47:37 -0600   Wed, 11 Nov 2020 20:50:40 -0600   NodeStatusUnknown   Kubelet stopped posting node status
  PIDPressure      Unknown   Wed, 11 Nov 2020 20:47:37 -0600   Wed, 11 Nov 2020 20:50:40 -0600   NodeStatusUnknown   Kubelet stopped posting node status
  Ready            Unknown   Wed, 11 Nov 2020 20:47:37 -0600   Wed, 11 Nov 2020 20:50:40 -0600   NodeStatusUnknown   Kubelet stopped posting node status
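All four conditions flip to Unknown with the NodeStatusUnknown reason when the kubelet stops posting status. As a quick offline sanity check, the Type and Status columns can be pulled apart with awk; the snippet below runs against condition rows captured from the output above (on a live cluster, something like `oc get node <node> -o jsonpath='{range .status.conditions[*]}{.type}{" "}{.status}{"\n"}{end}'` should yield the same two columns):

```shell
# Extract the condition types whose Status column (field 2) is Unknown,
# using rows captured from `oc describe node` output.
conditions='MemoryPressure Unknown NodeStatusUnknown
DiskPressure Unknown NodeStatusUnknown
PIDPressure Unknown NodeStatusUnknown
Ready Unknown NodeStatusUnknown'
printf '%s\n' "$conditions" | awk '$2 == "Unknown" {print $1}'
# prints MemoryPressure, DiskPressure, PIDPressure, Ready, one per line
```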
Use the oc get nodes -o wide or oc describe node command to determine the IP address of the node.
oc get nodes --output wide
. . .
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
node001 Ready infra 273d v1.11.0+d4cacc0 10.141.115.11 <none> Red Hat Enterprise Linux 3.10.0-1127.8.2.el7.x86_64 docker://1.13.1
node002 NotReady infra 273d v1.11.0+d4cacc0 10.141.115.12 <none> Red Hat Enterprise Linux 3.10.0-1127.8.2.el7.x86_64 docker://1.13.1
node003 Ready infra 273d v1.11.0+d4cacc0 10.141.115.13 <none> Red Hat Enterprise Linux 3.10.0-1127.8.2.el7.x86_64 docker://1.13.1
node004 Ready compute 273d v1.11.0+d4cacc0 10.141.115.14 <none> Red Hat Enterprise Linux 3.10.0-1127.8.2.el7.x86_64 docker://1.13.1
node005 Ready compute 273d v1.11.0+d4cacc0 10.141.115.15 <none> Red Hat Enterprise Linux 3.10.0-1127.8.2.el7.x86_64 docker://1.13.1
node006 Ready master 273d v1.11.0+d4cacc0 10.141.115.16 <none> Red Hat Enterprise Linux 3.10.0-1127.8.2.el7.x86_64 docker://1.13.1
node007 Ready master 273d v1.11.0+d4cacc0 10.141.115.17 <none> Red Hat Enterprise Linux 3.10.0-1127.8.2.el7.x86_64 docker://1.13.1
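With that output in hand, a small awk filter narrows the list to nodes that are not Ready (STATUS is field 2). On a live cluster you would pipe `oc get nodes --no-headers` into the filter; here it runs against a captured sample so the behavior is visible offline:

```shell
# Print the names of nodes whose STATUS is anything other than "Ready".
# Live cluster:  oc get nodes --no-headers | awk '$2 != "Ready" {print $1}'
sample='node001 Ready infra 273d
node002 NotReady infra 273d
node003 Ready infra 273d'
printf '%s\n' "$sample" | awk '$2 != "Ready" {print $1}'
# prints: node002
```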
You can start a debug pod on one of the nodes.
~]# oc debug node/my-node-5n4fj
Starting pod/my-node-5n4fj-debug ...
sh-4.4#
Typically you will first issue the chroot /host command to set /host as the root directory, because the node's root file system is mounted at /host in the debug pod.
sh-4.4# chroot /host
You can use systemctl to determine whether the kubelet service is running.
sh-5.1# systemctl status kubelet
● kubelet.service - Kubernetes Kubelet
Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; preset: disabled)
Drop-In: /etc/systemd/system/kubelet.service.d
└─01-kubens.conf, 10-mco-default-madv.conf, 20-aws-node-name.conf, 20-aws-providerid.conf, 20-logging.conf
Active: active (running) since Tue 2024-04-23 04:14:54 UTC; 1 day 1h ago
Here is a one-liner I use to loop through each node and return the status of the kubelet service.
for node in $(oc get nodes | grep -v ^NAME | awk '{print $1}'); do echo $node; oc debug node/$node -- chroot /host /usr/bin/systemctl status kubelet | grep Active:; done;
This should return something like the following.
infra-node-1
Active: active (running) since Thu 2024-04-04 16:31:39 UTC; 2 weeks 5 days ago
infra-node-2
Active: active (running) since Thu 2024-04-04 16:18:48 UTC; 2 weeks 5 days ago
infra-node-3
Active: active (running) since Thu 2024-04-04 16:26:03 UTC; 2 weeks 5 days ago
master-node-1
Active: active (running) since Thu 2024-04-04 16:34:01 UTC; 2 weeks 5 days ago
master-node-2
Active: active (running) since Thu 2024-04-04 16:21:05 UTC; 2 weeks 5 days ago
master-node-3
Active: active (running) since Thu 2024-04-04 16:27:32 UTC; 2 weeks 5 days ago
worker-node-1
Active: active (running) since Tue 2024-04-04 16:34:01 UTC; 1 day 1h ago
worker-node-2
Active: active (running) since Tue 2024-04-04 16:21:05 UTC; 1 day 1h ago
worker-node-3
Active: active (running) since Tue 2024-04-04 16:27:32 UTC; 1 day 2h ago
If you want to try restarting the kubelet service to see whether that resolves the issue, this one-liner restarts kubelet on each node.
for node in $(oc get nodes | grep -v ^NAME | awk '{print $1}'); do echo $node; oc debug node/$node -- chroot /host /usr/bin/systemctl restart kubelet; done;
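After the restart loop, a node can take a little while to report Ready again. Here is a minimal polling sketch; `wait_for_ready` is a hypothetical helper (the name, node argument, and retry count are placeholders, and it assumes oc is logged in to the cluster):

```shell
# Poll `oc get node` until the given node reports Ready, or give up.
# wait_for_ready is a hypothetical helper, not part of oc itself.
wait_for_ready() {
  node=$1
  tries=${2:-30}
  i=0
  while [ "$i" -lt "$tries" ]; do
    # STATUS is the second column of `oc get node` output.
    status=$(oc get node "$node" --no-headers 2>/dev/null | awk '{print $2}')
    if [ "$status" = "Ready" ]; then
      echo "$node is Ready"
      return 0
    fi
    sleep 10
    i=$((i + 1))
  done
  echo "$node still not Ready after $tries checks" >&2
  return 1
}
```

Usage would be something like `wait_for_ready node002` after restarting that node's kubelet.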
Use the journalctl command to check the journal for kubelet events at the emerg, alert, crit, warning, and notice log levels. There are usually no results at these levels, but many at the info level.
journalctl -p emerg | grep -i kubelet
journalctl -p alert | grep -i kubelet
journalctl -p crit | grep -i kubelet
journalctl -p warning | grep -i kubelet
journalctl -p notice | grep -i kubelet
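As a side note, journalctl's -p flag is a ceiling rather than an exact match: a single priority shows that level and everything more severe, so `journalctl -p notice` alone covers emerg through notice (it also includes err, which the list above skips, and still excludes the noisy info level). The standard syslog numbering, printed by this snippet, makes the ordering explicit:

```shell
# Single-command near-equivalent of the five checks above (run on the node):
#   journalctl -p notice -u kubelet.service --no-pager
# Standard syslog priorities: lower number = more severe.
for p in emerg:0 alert:1 crit:2 err:3 warning:4 notice:5 info:6 debug:7; do
  echo "${p%%:*}=${p##*:}"
done
# prints emerg=0 through debug=7, one per line
```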
Did you find this article helpful?
If so, consider buying me a coffee over at