Overlay connectivity broken after a node reboot until flannel is restarted
This document (000020003) is provided subject to the disclaimer at the end of this document.
Situation
Issue
After rebooting a Kubernetes node, pod-to-pod network connectivity (via the overlay network) may not function correctly until you restart the canal workload on that node.
Pre-requisites
- Kubernetes cluster running canal or flannel as the CNI
- Linux nodes running systemd v242 or later
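To check whether a node falls in the affected systemd range, a minimal sketch along these lines may help. It assumes `systemctl --version` prints the version as the second field of its first line (for example `systemd 245 (245.4-4ubuntu3)`); the function names are hypothetical helpers, not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: read `systemctl --version` output on stdin and print
# the major version number found in the second field of the first line.
systemd_major_version() {
    awk 'NR == 1 { sub(/[^0-9].*/, "", $2); print $2 }'
}

# Hypothetical helper: succeed when the given version is 242 or later,
# the range named in the pre-requisites above.
is_affected_version() {
    [ -n "$1" ] && [ "$1" -ge 242 ]
}

# Only run the check where systemctl is available.
if command -v systemctl >/dev/null 2>&1; then
    version=$(systemctl --version | systemd_major_version)
    if is_affected_version "$version"; then
        echo "systemd $version: this node may be affected"
    else
        echo "systemd ${version:-unknown}: not in the affected range"
    fi
fi
```

Run it on each node (not inside a pod), since the systemd version that matters is the host's.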
Root Cause
This is caused by a race condition between flannel and systemd-networkd that is being tracked in this upstream issue.
This doesn't appear to affect Ubuntu 20.04, due to its use of netplan to manage networking configuration.
Workaround
Either restart canal on the node as needed (kubectl delete pod -n kube-system canal-XXXX), or change the MACAddressPolicy for the flannel interfaces on your nodes to none:
cat <<'EOF' > /etc/systemd/network/10-flannel.link
[Match]
OriginalName=flannel*
[Link]
MACAddressPolicy=none
EOF
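After creating the file, it can be useful to confirm what was written and to note the current MAC address of each flannel interface for comparison after the next reboot. The following is a rough sketch, not an official verification procedure: `link_mac_policy` is a hypothetical helper, and `LINK_FILE` is an illustrative variable that defaults to the path used above. It is an assumption here that the new policy is only picked up when the interface is next created, for example after a reboot.

```shell
#!/bin/sh
# Hypothetical helper: print the MACAddressPolicy value set in a .link file.
link_mac_policy() {
    sed -n 's/^[[:space:]]*MACAddressPolicy[[:space:]]*=[[:space:]]*//p' "$1"
}

# Path of the .link file from the workaround; override for testing.
LINK_FILE=${LINK_FILE:-/etc/systemd/network/10-flannel.link}

if [ -r "$LINK_FILE" ]; then
    echo "MACAddressPolicy in $LINK_FILE: $(link_mac_policy "$LINK_FILE")"
else
    echo "$LINK_FILE not found; the workaround is not applied on this node"
fi

# List the MAC address of every existing flannel interface so the values
# can be compared before and after a reboot.
for dev in /sys/class/net/flannel*; do
    [ -e "$dev/address" ] || continue
    echo "$(basename "$dev"): $(cat "$dev/address")"
done
```

If the file is in place, the first line of output should report `none`.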
Resolution
At present there is no resolution and this bug is still open upstream.
Disclaimer
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.
- Document ID: 000020003
- Creation Date: 06-May-2021
- Modified Date: 06-May-2021
- SUSE Rancher
For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com