SUSE Support

Unable to re-add worker node to custom RKE2 cluster

This document (000021443) is provided subject to the disclaimer at the end of this document.

Environment

RKE2 custom cluster managed by Rancher

Situation

When a cleaned worker node is re-added to a custom RKE2 cluster, the rke2-agent service on the node stays in the activating state and never finishes starting. Its status looks like this:
# systemctl status rke2-agent.service
● rke2-agent.service - Rancher Kubernetes Engine v2 (agent)
     Loaded: loaded (/usr/local/lib/systemd/system/rke2-agent.service; enabled; vendor preset: disabled)
     Active: activating (start) since Fri 2024-05-03 21:24:18 UTC; 2min 0s ago
       Docs: https://github.com/rancher/rke2#readme
    Process: 13885 ExecStartPre=/bin/sh -xc ! /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service (code=exited, status=0/SUCCESS)
    Process: 13887 ExecStartPre=/sbin/modprobe br_netfilter (code=exited, status=0/SUCCESS)
    Process: 13888 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)
   Main PID: 13889 (rke2)
      Tasks: 8
     CGroup: /system.slice/rke2-agent.service
             └─13889 /usr/local/bin/rke2 agent

May 03 21:25:06 ip-10-0-0-136 rke2[13889]: time="2024-05-03T21:25:06Z" level=info msg="Waiting to retrieve agent configuration; server is not ready: Node password rejected, duplicate hostname or contents of '/etc/rancher/node/password' may not match server node-passwd entry>
In the Rancher UI, the cluster is stuck in an Updating state, and the following message is shown:
configuring worker node(s) custom-64d1c30adc69: waiting for probes: calico, kubelet 
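
To confirm that a node is hitting this specific problem rather than some other startup failure, the agent's journal can be searched for the rejection message shown above. The following is a minimal sketch; the function name has_node_password_rejection is hypothetical, and it assumes the agent logs to the systemd journal under the rke2-agent.service unit, as in the status output above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the log text on stdin contains the
# "Node password rejected" message quoted in this article.
has_node_password_rejection() {
    grep -qF 'Node password rejected'
}

# Example usage on the affected node (not run automatically):
#   journalctl -u rke2-agent.service --no-pager | has_node_password_rejection \
#       && echo "node password was rejected by the server"
```

If the message is absent, the agent is probably failing for a different reason and the steps below may not apply.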

Resolution

1. Stop the rke2-agent service on the node that is being registered:
systemctl stop rke2-agent.service
2. Find the stale node-password secret that was left behind for the node. Run the following with a kubeconfig file that points to the affected RKE2 cluster:
kubectl get secret -n kube-system | grep node-password.rke2
The output lists secrets named in the format <NODE_NAME>.node-password.rke2
(in this example, the secret for the affected node is ip-10-0-0-136.node-password.rke2).

3. Delete the secret corresponding to the node in question:
kubectl delete secret -n kube-system ip-10-0-0-136.node-password.rke2
4. Start the rke2-agent service on the node that is being registered:
systemctl start rke2-agent.service
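
The cluster-side part of the procedure (steps 2 and 3) can be sketched as a small shell helper. This is an illustration only, not an official SUSE script: the function names are hypothetical, and it assumes kubectl is installed and the active kubeconfig points to the affected RKE2 cluster. The secret name is built from the <NODE_NAME>.node-password.rke2 format described in step 2.

```shell
#!/bin/sh
# Hypothetical helper: build the node-password secret name for a node.
node_password_secret() {
    printf '%s.node-password.rke2\n' "$1"
}

# Hypothetical helper: delete the stale secret for one node if it exists.
# Assumes kubectl uses a kubeconfig for the affected cluster.
delete_node_password_secret() {
    secret=$(node_password_secret "$1")
    if kubectl get secret -n kube-system "$secret" >/dev/null 2>&1; then
        kubectl delete secret -n kube-system "$secret"
    else
        echo "secret $secret not found in kube-system" >&2
        return 1
    fi
}

# Example usage (not run automatically):
#   delete_node_password_secret ip-10-0-0-136
```

The rke2-agent service still has to be stopped beforehand and started afterwards on the node itself, as in steps 1 and 4.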

Cause

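A node-password secret for the node (<NODE_NAME>.node-password.rke2 in the kube-system namespace) was not cleaned up when the node was previously removed from the cluster. When the cleaned node registers again, the contents of its /etc/rancher/node/password file do not match the stale secret, so the server rejects the node password, as the agent log message indicates.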
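
Leftover secrets of this kind can be spotted by comparing the node-password secrets against the nodes currently in the cluster. The sketch below only lists candidates and deletes nothing. The function name and the file names nodes.txt and secrets.txt are hypothetical, and it assumes that each secret is named after the node's name in the cluster, as in the example above.

```shell
#!/bin/sh
# Hypothetical helper: print node-password secrets that have no matching node.
# $1: file with one node name per line
# $2: file with one secret name per line
stale_node_password_secrets() {
    while IFS= read -r secret; do
        case $secret in
            *.node-password.rke2) node=${secret%.node-password.rke2} ;;
            *) continue ;;  # skip unrelated secrets
        esac
        # Report the secret only if no node has exactly this name.
        grep -qxF -- "$node" "$1" || printf '%s\n' "$secret"
    done < "$2"
}

# Example usage (not run automatically):
#   kubectl get nodes -o name | sed 's|^node/||' > nodes.txt
#   kubectl get secret -n kube-system -o name | sed 's|^secret/||' > secrets.txt
#   stale_node_password_secrets nodes.txt secrets.txt
```

Review the list before deleting anything, since a node that is still registering may not appear in the node list yet.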

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 000021443
  • Creation Date: 03-May-2024
  • Modified Date: 03-May-2024
    • SUSE Rancher