Debugging NTP

I spent a little time debugging NTP for the nodes of my k8s cluster. I have a local NTP server, but the k8s nodes were not using it; they were using the Debian NTP pool instead. The debugging went like this:

Is the DHCP server providing the NTP server address? Yes, it is visible in /var/lib/dhcp/dhclient.leases.
Is systemd-timesyncd using this NTP server? No, as shown by timedatectl timesync-status -a.
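
The first check can be scripted. This is a minimal sketch, assuming the lease file records the server on lines of the form "option ntp-servers ADDRESS;" (the function name lease_ntp_servers is made up for illustration):

```shell
# Print the distinct NTP server entries recorded in a dhclient lease file.
# Usage: lease_ntp_servers [lease-file]
lease_ntp_servers() {
    lease_file="${1:-/var/lib/dhcp/dhclient.leases}"
    # Lease entries are assumed to look like:  option ntp-servers 192.168.1.48;
    sed -n 's/^[[:space:]]*option ntp-servers \(.*\);$/\1/p' "$lease_file" | sort -u
}
```

Running lease_ntp_servers with no argument reads the default lease file; the address it prints can then be compared by eye with the server reported by timedatectl timesync-status.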

I first found the DHCP exit hook /etc/dhcp/dhclient-exit-hooks.d/timesyncd, which writes a timesyncd config under /run (but something cleans it up promptly). I then found this error in the timesyncd logs:

Timed out waiting for reply from 192.168.1.48:123 (192.168.1.48).

I then proved with tcpdump that the packets were reaching the NTP server, yet the requests still timed out (ping, traceroute and curl all worked, so it was not a network problem). I then worked out that the NTP server was not accepting requests from the k8s VLAN. One reload later, and we were working.
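
For reference, a capture along these lines shows the NTP traffic. It is only a sketch: the interface name eth0 is an assumption for whichever machine runs the capture, and the helper name watch_ntp is made up.

```shell
# Watch NTP traffic (UDP port 123) to and from one server.
# Usage: watch_ntp [interface] [server-address]
watch_ntp() {
    iface="${1:-eth0}"            # assumed interface name
    server="${2:-192.168.1.48}"   # the local NTP server from the log line above
    # -n disables name resolution so addresses are printed as-is.
    sudo tcpdump -n -i "$iface" "udp port 123 and host $server"
}
```

Requests arriving with no replies going back points at the server refusing the client rather than at the network path.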

I wanted to fix all machines, so I ran

for n in node{2..6}; do ssh $n sudo dhclient -v eth0; done
for n in node{2..6}; do ssh $n timedatectl timesync-status; done

The first loop reruns the DHCP hooks on each node; the second shows which server timesyncd is now using.
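
Reading five status dumps by eye gets tedious, so the check could be reduced to one line per node. A rough sketch, assuming the nodes' timedatectl supports show-timesync with the ServerAddress property (the function name check_nodes is made up):

```shell
# Report, for each node, whether timesyncd is using the expected server.
# Usage: check_nodes EXPECTED_ADDRESS NODE...
check_nodes() {
    expected="$1"
    shift
    for n in "$@"; do
        # ServerAddress is the address of the server timesyncd is currently using.
        actual=$(ssh "$n" timedatectl show-timesync --property=ServerAddress --value)
        if [ "$actual" = "$expected" ]; then
            echo "$n ok"
        else
            echo "$n MISMATCH (using '$actual')"
        fi
    done
}
```

In bash this would be called as check_nodes 192.168.1.48 node{2..6}.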

Useful commands

Show status of timesyncd.

timedatectl timesync-status -a

Query the NTP server directly (ntpdate has been reimplemented as ntpdig).

ntpdate -d 192.168.1.48

Show the logs of timesyncd.

journalctl -u systemd-timesyncd --no-hostname --since "1 day ago"