Upgrade to Rancher v2.2.4 fails for instances managing OpenStack CloudProvider enabled clusters with a Loadbalancer config: 'cannot unmarshal number into Go value of type string'
This document (000020233) is provided subject to the disclaimer at the end of this document.
Situation
Issue
When upgrading to Rancher v2.2.4, where the Rancher instance manages an OpenStack Cloud Provider enabled Kubernetes cluster with a Loadbalancer config, the Rancher server fails to start. Logs for the Rancher pods show error messages of the format:
E0606 07:39:20.296926 8 reflector.go:134] github.com/rancher/norman/controller/generic_controller.go:175: Failed to list *v3.Cluster: json: cannot unmarshal number into Go value of type string
Pre-requisites
- Upgrading Rancher to v2.2.4
- A Rancher-launched Kubernetes cluster with the OpenStack Cloud Provider enabled and a Loadbalancer config.
Root cause
In order to resolve Rancher/14577, the monitor-delay and monitor-timeout parameters for OpenStack cluster loadbalancer healthchecks were changed from an integer type to a string type in Rancher v2.2.4.
Because the Rancher API framework had defaulted these values to 0, the upgrade to Rancher v2.2.4 fails when Rancher attempts to unmarshal the stored integer value 0 into a string type. If the values had been manually set to a non-zero integer, which previously caused kubelet failures in the OpenStack cluster itself, they now cause the Rancher pods themselves to fail.
Resolution
You can apply a one-time fix to work around this issue by manually editing the monitor-delay and monitor-timeout values of the cluster Custom Resource of each affected cluster, via kubectl run against the Rancher management cluster.
Using your RKE-generated kube config, perform the following operations:
- Identify affected clusters by running kubectl get clusters and checking for those with a spec.rancherKubernetesEngineConfig.cloudProvider.openstackCloudProvider.loadBalancer definition.
- For each affected cluster, run kubectl edit cluster <cluster name>, where <cluster name> is the metadata.name value for the cluster, and update the spec.rancherKubernetesEngineConfig.cloudProvider.openstackCloudProvider.loadBalancer.monitor-delay and spec.rancherKubernetesEngineConfig.cloudProvider.openstackCloudProvider.loadBalancer.monitor-timeout fields to a quoted string. Example: if the value was 30, change it to "30s"; if it was 0, change it to "".
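The two steps above can be sketched as a small read-only check. This is only an illustration, not part of the official fix: it assumes the output of kubectl get clusters -o json has already been parsed into a dictionary, and the helper names, the sample data and the cluster name c-abc12 are all hypothetical. It reports which fields need editing and the replacement value, following the 30 to "30s" and 0 to "" rule; it does not modify anything.

```python
# Field path taken from the steps above; everything else is illustrative.
LB_PATH = ("spec", "rancherKubernetesEngineConfig", "cloudProvider",
           "openstackCloudProvider", "loadBalancer")
FIELDS = ("monitor-delay", "monitor-timeout")


def load_balancer_config(cluster):
    """Return the OpenStack loadBalancer dict of a cluster object, or None."""
    node = cluster
    for key in LB_PATH:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node if isinstance(node, dict) else None


def fixed_value(value):
    """Map a legacy integer to the quoted-string form: 30 -> "30s", 0 -> ""."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        return value  # already a string, or absent: leave untouched
    return "" if value == 0 else f"{value}s"


def pending_fixes(cluster_list):
    """Yield (cluster name, field, old value, new value) for each field to edit."""
    for cluster in cluster_list.get("items", []):
        lb = load_balancer_config(cluster)
        if lb is None:
            continue  # no loadBalancer definition: cluster is not affected
        for field in FIELDS:
            old = lb.get(field)
            new = fixed_value(old)
            if new != old:
                yield cluster["metadata"]["name"], field, old, new


# Hypothetical stand-in for the parsed output of: kubectl get clusters -o json
SAMPLE = {"items": [
    {"metadata": {"name": "c-abc12"},
     "spec": {"rancherKubernetesEngineConfig": {"cloudProvider": {
         "openstackCloudProvider": {"loadBalancer": {
             "monitor-delay": 30, "monitor-timeout": 0}}}}}},
    {"metadata": {"name": "local"}, "spec": {}},
]}

for name, field, old, new in pending_fixes(SAMPLE):
    print(f"{name}: {field} {old!r} -> {new!r}")
```

To run this against a real management cluster, the SAMPLE dictionary could be replaced by json.load(sys.stdin) with the kubectl output piped in. The edit itself is still made by hand with kubectl edit, as described above.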
Disclaimer
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.
- Document ID: 000020233
- Creation Date: 06-May-2021
- Modified Date: 06-May-2021
- SUSE Rancher
For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com