Debugging NTP
I spent a little time debugging NTP for the nodes of my k8s cluster. I have a local NTP server, but the k8s nodes were not using it; they were using the Debian NTP pool instead. Debugging went:
- Is the DHCP server providing the NTP server address? Yes, visible in /var/lib/dhcp/dhclient.leases.
- Is systemd-timesyncd using this NTP server? No, visible in timedatectl timesync-status -a.
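The first check can be scripted. This is a minimal sketch, assuming the ISC dhclient lease format, where the offered servers appear on a line such as `option ntp-servers 192.168.1.48;`. The function name `lease_ntp_servers` is made up for illustration.

```shell
# Print the NTP servers recorded in a dhclient lease file, one per line.
# Assumes lines of the form:  option ntp-servers 192.168.1.48;
# Several servers on one line are assumed to be comma-separated.
lease_ntp_servers() {
    lease_file="${1:-/var/lib/dhcp/dhclient.leases}"
    sed -n 's/^[[:space:]]*option ntp-servers \(.*\);$/\1/p' "$lease_file" \
        | tr ',' '\n' | tr -d ' ' | sort -u
}
```

Running `lease_ntp_servers` with no argument reads the default lease path from above. Empty output would suggest the DHCP server is not handing out an NTP server at all.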
I first found the DHCP exit hook /etc/dhcp/dhclient-exit-hooks.d/timesyncd, which writes a timesyncd config under /run (but something cleans it up promptly). I then found this error in the timesyncd logs:
Timed out waiting for reply from 192.168.1.48:123 (192.168.1.48).
I then proved with tcpdump that the packets were making it as far as the NTP server before timing out (ping, traceroute and curl all worked too, suggesting this was not a network problem). From that I worked out that the NTP server was not accepting requests from the k8s VLAN. One reload later, and we were working.
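A capture along these lines shows whether NTP requests arrive and whether any reply comes back. It is only a sketch: the post does not record the exact tcpdump invocation, the `eth0` default is an assumption, and both `watch_ntp` and the `TCPDUMP` override variable are hypothetical names (the override exists so the command can be swapped out, for example for a dry run).

```shell
# Capture NTP traffic (UDP port 123) to and from a single server.
#   -n  do not resolve addresses to names
#   -i  interface to listen on (eth0 is an assumed default)
# TCPDUMP is a hypothetical override; it is left unquoted on purpose
# so that the default "sudo tcpdump" splits into two words.
watch_ntp() {
    ${TCPDUMP:-sudo tcpdump} -n -i "${2:-eth0}" udp port 123 and host "$1"
}
```

Run `watch_ntp 192.168.1.48` on the NTP server while a node retries. Requests arriving with no replies leaving point at the server's own access rules rather than the network.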
I wanted to fix all machines, so I reran the DHCP hooks on each node and then checked the result:
for n in node{2..6}; do ssh "$n" sudo dhclient -v eth0; done
for n in node{2..6}; do ssh "$n" timedatectl timesync-status; done
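To compare nodes at a glance, the status output can be reduced to just the server in use. A minimal sketch, assuming the usual `timedatectl timesync-status` layout with a `Server: ADDRESS (NAME)` line; the helper name `timesync_server` is made up for illustration.

```shell
# Read `timedatectl timesync-status` output on stdin and print only the
# address from the "Server:" line.
timesync_server() {
    awk -F': ' '/^ *Server:/ { split($2, a, " "); print a[1] }'
}

# Example use (bash, for the brace expansion):
#   for n in node{2..6}; do
#       printf '%s: ' "$n"
#       ssh "$n" timedatectl timesync-status | timesync_server
#   done
```

Any node that still reports a Debian pool address instead of the local server has not picked up the DHCP-provided setting.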
Useful commands
Show status of timesyncd.
timedatectl timesync-status -a
Query an NTP server directly (ntpdate has been reimplemented as ntpdig).
ntpdate -d 192.168.1.48
Show logs of timesyncd.
journalctl -u systemd-timesyncd --no-hostname --since "1 day ago"
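When the journal is long, it can help to summarise the timeouts. This sketch assumes log lines in the same form as the error quoted earlier (`Timed out waiting for reply from 192.168.1.48:123 (192.168.1.48).`); `timeout_counts` is a hypothetical helper name.

```shell
# Read timesyncd log lines on stdin and print "COUNT ADDRESS:PORT" for
# each server that timed out. The order of the output lines is unspecified.
timeout_counts() {
    awk '/Timed out waiting for reply from/ {
             for (i = 1; i < NF; i++)
                 if ($i == "from") count[$(i + 1)]++
         }
         END { for (s in count) print count[s], s }'
}

# Example use:
#   journalctl -u systemd-timesyncd --no-hostname --since "1 day ago" | timeout_counts
```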