Case files of a TSE: Would you have the time?
October 13, 2021 | By: Anthony Stalker
This is the first part of a series that attempts to showcase the kind of work that SUSE Support does and how we help customers resolve issues when running SUSE Products. The cases that are selected will be based on real cases. However, all details will be fully anonymized and stripped of identifying marks.
This is a case where the time from when I took it to when it was resolved happened to be about half an hour. Being half an hour late might not mean a lot to some people, but computer systems are much more sensitive to time and need it to be accurate and synchronized. That’s why it’s crucial to have a solid NTP (Network Time Protocol) infrastructure. This case shows how important attention to detail can be when troubleshooting a system.
There was also the output of `chronyc> sources` from the chrony shell, which showed polls of some NTP sources. It contained reachable local pools and some unreachable internet sources. For more information on interpreting the output of `ntpq -p` and `chronyc sources -v`, please see a TID that I wrote on the subject.
Notes on timekeeping
What the customer wants to achieve here is excellent practice in timekeeping. It’s correct that they want to use local time sources. This has the potential to improve security, and it’s the only way to give time to servers which are not connected to the internet. Perhaps the greatest advantage is that we would expect the local time sources to be consistent with each other, and while having accurate time is important, having a consistent relative time across the network is most important.

Good practice is having either 1 time-server configured (no redundancy, but also no dispute about the time) or 3 or more. Two is probably the worst number of time-servers, since if they report different times, there is no possibility of consensus. It’s one of the reasons we always want at least 3 nodes in a High Availability Cluster and why 2 is the worst number of hardware watchdog devices in a cluster.
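To make the point about consensus concrete, here is a minimal sketch in Python. It is a deliberately simplified majority vote and not chrony’s actual source selection algorithm; the function name `agreed_offset` and the 0.1 second tolerance are assumptions made purely for illustration.

```python
from typing import List, Optional


def agreed_offset(offsets: List[float], tolerance: float = 0.1) -> Optional[float]:
    """Return the mean offset of a strict majority of agreeing sources, else None.

    offsets   -- clock offsets in seconds reported by each time source
    tolerance -- how far apart two offsets may be and still "agree"
                 (hypothetical value, chosen only for this illustration)
    """
    best: List[float] = []
    for candidate in offsets:
        # Collect every source that is within tolerance of this candidate.
        group = [o for o in offsets if abs(o - candidate) <= tolerance]
        if len(group) > len(best):
            best = group
    # Only trust the group if it is more than half of all sources.
    if len(best) * 2 > len(offsets):
        return sum(best) / len(best)
    return None


one = agreed_offset([0.01])               # a single source is never disputed
two = agreed_offset([0.0, 5.0])           # two sources disagree: no majority
three = agreed_offset([0.0, 0.02, 5.0])   # two of three agree, outlier ignored
```

With one source there is nothing to dispute, with two disagreeing sources the function returns `None` because neither side has a majority, and with three sources the outlier is outvoted.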
What the customer has configured here, though, are pools of time-servers. A pool is basically just a list of servers, so if one of the servers shows a time that’s really off or if it’s unreachable, NTP will just get the time from the next server in the “list.” So, what the customer wants here is very reasonable. They’ve configured some local time-servers, but have all these unreachable internet time sources polluting their configuration. However, they can’t figure out where they are all coming from. Quite reasonably, they wrote in a specific question: Why is the system trying to reach these internet time sources? How do I fix it?
For illustration, I’m using the configuration from my desktop, which is running openSUSE Leap. It has the advantage that each line has a comment, so I know what it does without even looking at the manual. For my purposes, I actually want to get time from the internet. I want to be synchronized to the proper time, and I don’t have a pool on my local network. The customer had a configuration similar to mine, but unlike me, they wanted to avoid getting their time directly from the internet.
The problem and the fix
What the customer didn’t notice is that they had an “include” directive in their /etc/chrony.conf configuration:

# Also include any directives found in configuration files in /etc/chrony.d
include /etc/chrony.d/*.conf

And there is a drop-in file called ‘/etc/chrony.d/pool.conf’ with contents such as these:

pool 2.opensuse.pool.ntp.org iburst

What was happening is that chrony parsed /etc/chrony.conf, hit the “include” directive, followed it and included the internet pool.

The likely origin is that the customer left the option to synchronize to internet time-servers ticked when setting up the NTP service with YaST.

Now that we know what the problem was, how do we fix it? Since this is Linux, it’s straightforward and effortless. I suggested one of two options. The first is commenting out the ‘include’ line, which tells chrony to ignore anything outside /etc/chrony.conf; that is fine if no other include files provide important functionality. The second is deleting the ‘/etc/chrony.d/pool.conf’ file completely or commenting out its only line.
cat /etc/fstab
# This text can be considered inert.
# Nothing on this line will be parsed as part of the configuration.
# This can be useful to document changes and deviations from standard configurations.
UUID=f8626266-d78c-4c21-bc08-962bde178cdb / btrfs defaults 0 0
UUID=7facfae7-41e5-4699-9d78-555b032be5c8 swap swap defaults 0 0
UUID=f8626266-d78c-4c21-bc08-962bde178cdb /var btrfs subvol=/@/var 0 0
UUID=f8626266-d78c-4c21-bc08-962bde178cdb /usr/local btrfs subvol=/@/usr/local 0 0
UUID=f8626266-d78c-4c21-bc08-962bde178cdb /tmp btrfs subvol=/@/tmp 0 0
UUID=f8626266-d78c-4c21-bc08-962bde178cdb /srv btrfs subvol=/@/srv 0 0
UUID=f8626266-d78c-4c21-bc08-962bde178cdb /root btrfs subvol=/@/root 0 0
UUID=f8626266-d78c-4c21-bc08-962bde178cdb /opt btrfs subvol=/@/opt 0 0
UUID=f8626266-d78c-4c21-bc08-962bde178cdb /home btrfs subvol=/@/home 0 0
UUID=f8626266-d78c-4c21-bc08-962bde178cdb /boot/grub2/x86_64-efi btrfs subvol=/@/boot/grub2/x86_64-efi 0 0
UUID=f8626266-d78c-4c21-bc08-962bde178cdb /boot/grub2/i386-pc btrfs subvol=/@/boot/grub2/i386-pc 0 0
#UUID=6CC5-4290 /boot/efi vfat defaults 0 2
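The way chrony followed the “include” directive and picked up the internet pool can be retraced with a short script. The following is a minimal sketch in Python and not part of chrony or any SUSE tooling; the function name `find_time_sources` is hypothetical. It only follows `include` lines and deliberately ignores other ways of pulling in configuration, such as `confdir` and `sourcedir`.

```python
import glob
from typing import List, Optional, Set, Tuple

# chrony treats lines starting with any of these characters as comments.
COMMENT_PREFIXES = ("#", "!", ";", "%")
# Directives that define a time source.
SOURCE_DIRECTIVES = {"pool", "server", "peer"}


def find_time_sources(
    path: str, seen: Optional[Set[str]] = None
) -> List[Tuple[str, int, str]]:
    """Return (file, line number, line) for each time source reachable from path."""
    if seen is None:
        seen = set()
    if path in seen:  # guard against include loops
        return []
    seen.add(path)

    found: List[Tuple[str, int, str]] = []
    with open(path, encoding="utf-8") as conf:
        for number, raw in enumerate(conf, start=1):
            line = raw.strip()
            if not line or line.startswith(COMMENT_PREFIXES):
                continue  # blank or commented out: inert
            parts = line.split(None, 1)
            keyword = parts[0].lower()
            if keyword in SOURCE_DIRECTIVES:
                found.append((path, number, line))
            elif keyword == "include" and len(parts) == 2:
                # The include argument may contain wildcards, e.g. /etc/chrony.d/*.conf
                for included in sorted(glob.glob(parts[1])):
                    found.extend(find_time_sources(included, seen))
    return found
```

Calling `find_time_sources("/etc/chrony.conf")` on a system like the customer’s would be expected to list the `pool` line together with the drop-in file it came from, which is exactly the detail that was easy to overlook when reading /etc/chrony.conf alone.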
The customer wrote back shortly to confirm that they were satisfied with the solution and with how we handled the case.

Sometimes the solution can be easy, but spotting the problem can require attention to detail, close reading and a dash of knowing where to look.

Have you ever seen an issue where all it took to fix it was changing one character or removing a line? Did you ever read a config file over and over again, only for a colleague to point out the issue at first glance? Were you that colleague who spotted it? I know I’ve been both. Please don’t hesitate to share your stories in the comments below!
I’m a Frontline Technical Support Engineer working in the EMEA region.
If you open a Support Case with SUSE for a technical problem, maybe I will be the person who helps you get to a solution.