How to enable and query the supervisor loadbalancer metrics in an RKE2 or K3s cluster
This document (000021744) is provided subject to the disclaimer at the end of this document.
Environment
A Rancher-provisioned or standalone RKE2 or K3s cluster running RKE2 v1.29.12+rke2r1, v1.30.8+rke2r1, v1.31.4+rke2r1 or above; or K3s v1.29.12+k3s1, v1.30.8+k3s1, v1.31.4+k3s1 or above
Situation
The supervisor loadbalancer in RKE2 and K3s clusters load-balances kube-apiserver, etcd and supervisor traffic between cluster nodes. More information on the supervisor process and load-balancing can be found in the RKE2 and K3s documentation. This article provides instructions on how to query metrics for this loadbalancer process.
Resolution
The presence of the loadbalancer depends upon the node's roles, as follows:
- A server node with all roles, or with both the control-plane and etcd roles, has no loadbalancer, since every component is available locally.
- A server node with only the control-plane role has a loadbalancer to connect to etcd nodes.
- A server node with only the etcd role has a loadbalancer to connect to control-plane nodes.
- An agent (worker role only) node has a loadbalancer to connect to control-plane nodes.
The command for querying the loadbalancer metrics depends upon the cluster and node type, as detailed in the following instructions.
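Before enabling metrics, it can be useful to confirm which loadbalancer listeners actually exist on a given node. The following is a minimal sketch, not taken from the product documentation: it simply lists the TCP listening sockets owned by the RKE2 or K3s processes, among which the supervisor loadbalancer's loopback listeners (where present) will appear; the exact port numbers depend on the distribution and configuration.

# List TCP listeners owned by the rke2/k3s processes on this node.
# Run as root so that the owning process names are shown by ss.
# On nodes that run a supervisor loadbalancer, its loopback listeners
# appear alongside the other component ports.
sudo ss -ltnp | grep -E 'k3s|rke2'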
RKE2
In an RKE2 cluster the supervisor's loadbalancer metrics are exposed on nodes via the supervisor port (9345).
To query the loadbalancer metrics in an RKE2 cluster:
- Enable the supervisor metrics in the RKE2 configuration:
  - For a standalone RKE2 cluster node: edit the configuration file on the node under /etc/rancher/rke2/config.yaml, add supervisor-metrics: true and restart the rke2-server/rke2-agent process, depending upon the node type (`systemctl restart rke2-server` or `systemctl restart rke2-agent`):

## sample /etc/rancher/rke2/config.yaml
[...]
supervisor-metrics: true
[...]
  - For a Rancher-provisioned RKE2 cluster: in the Cluster Management interface of the Rancher UI, click Edit YAML for the applicable cluster, add the configuration "supervisor-metrics: true" into the machineGlobalConfig block, and click Save:

[...]
machineGlobalConfig:
  supervisor-metrics: true
[...]
- Check the metrics are enabled: on an all role or control-plane role only node in the cluster (not via the Rancher-proxied Kubernetes API endpoint or Authorized Cluster Endpoint), query the metrics per the example below. Replace <node-ip> with the IP of the node you wish to query. N.B. The node from which you are running the command will need to be able to reach port 9345 on the node you are querying.
export KUBECONFIG=/etc/rancher/rke2/rke2.yaml
alias kubectl=/var/lib/rancher/rke2/bin/kubectl
kubectl get --server https://<node-ip>:9345 --raw /metrics | grep load

# HELP rke2_loadbalancer_dial_duration_seconds Time taken to dial a connection to a backend server
# TYPE rke2_loadbalancer_dial_duration_seconds histogram
rke2_loadbalancer_dial_duration_seconds_bucket{name="rke2-etcd-server-load-balancer",status="success",le="0.001"} 13
rke2_loadbalancer_dial_duration_seconds_bucket{name="rke2-etcd-server-load-balancer",status="success",le="0.002"} 22
rke2_loadbalancer_dial_duration_seconds_bucket{name="rke2-etcd-server-load-balancer",status="success",le="0.004"} 51
rke2_loadbalancer_dial_duration_seconds_bucket{name="rke2-etcd-server-load-balancer",status="success",le="0.008"} 79
rke2_loadbalancer_dial_duration_seconds_bucket{name="rke2-etcd-server-load-balancer",status="success",le="0.016"} 123
rke2_loadbalancer_dial_duration_seconds_bucket{name="rke2-etcd-server-load-balancer",status="success",le="0.032"} 154
rke2_loadbalancer_dial_duration_seconds_bucket{name="rke2-etcd-server-load-balancer",status="success",le="0.064"} 163
rke2_loadbalancer_dial_duration_seconds_bucket{name="rke2-etcd-server-load-balancer",status="success",le="0.128"} 163
rke2_loadbalancer_dial_duration_seconds_bucket{name="rke2-etcd-server-load-balancer",status="success",le="0.256"} 163
rke2_loadbalancer_dial_duration_seconds_bucket{name="rke2-etcd-server-load-balancer",status="success",le="0.512"} 163
rke2_loadbalancer_dial_duration_seconds_bucket{name="rke2-etcd-server-load-balancer",status="success",le="1.024"} 163
rke2_loadbalancer_dial_duration_seconds_bucket{name="rke2-etcd-server-load-balancer",status="success",le="2.048"} 163
rke2_loadbalancer_dial_duration_seconds_bucket{name="rke2-etcd-server-load-balancer",status="success",le="4.096"} 163
rke2_loadbalancer_dial_duration_seconds_bucket{name="rke2-etcd-server-load-balancer",status="success",le="8.192"} 163
rke2_loadbalancer_dial_duration_seconds_bucket{name="rke2-etcd-server-load-balancer",status="success",le="16.384"} 163
rke2_loadbalancer_dial_duration_seconds_bucket{name="rke2-etcd-server-load-balancer",status="success",le="+Inf"} 163
rke2_loadbalancer_dial_duration_seconds_sum{name="rke2-etcd-server-load-balancer",status="success"} 1.7390672159999991
rke2_loadbalancer_dial_duration_seconds_count{name="rke2-etcd-server-load-balancer",status="success"} 163
# HELP rke2_loadbalancer_server_connections Count of current connections to loadbalancer server
# TYPE rke2_loadbalancer_server_connections gauge
rke2_loadbalancer_server_connections{name="rke2-etcd-server-load-balancer",server="24.199.104.66:2379"} 0
rke2_loadbalancer_server_connections{name="rke2-etcd-server-load-balancer",server="24.199.96.251:2379"} 0
rke2_loadbalancer_server_connections{name="rke2-etcd-server-load-balancer",server="64.23.213.163:2379"} 153
# HELP rke2_loadbalancer_server_health Current health value of loadbalancer server
# TYPE rke2_loadbalancer_server_health gauge
rke2_loadbalancer_server_health{name="rke2-etcd-server-load-balancer",server="24.199.104.66:2379"} 5
rke2_loadbalancer_server_health{name="rke2-etcd-server-load-balancer",server="24.199.96.251:2379"} 5
rke2_loadbalancer_server_health{name="rke2-etcd-server-load-balancer",server="64.23.213.163:2379"} 7
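Once the full scrape above works, a narrower filter on the gauge metrics gives a quick per-backend view. This is simply the same query with a different grep pattern, reusing the KUBECONFIG and kubectl alias from the example above:

# Show only the per-server health and connection gauges for the supervisor loadbalancer(s)
kubectl get --server https://<node-ip>:9345 --raw /metrics | grep -E 'rke2_loadbalancer_server_(health|connections)'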
K3s
In a K3s cluster the loadbalancer metrics are exposed on agent (worker role only) nodes via the kubelet metrics port (10250) and on server nodes via the kube-apiserver port (6443).
To query the loadbalancer metrics in a K3s cluster:
- Enable the supervisor metrics in the K3s configuration:
  - For a standalone K3s cluster node: edit the configuration file on the node under /etc/rancher/k3s/config.yaml, add supervisor-metrics: true and restart the K3s service (`systemctl restart k3s`):

## sample /etc/rancher/k3s/config.yaml
[...]
supervisor-metrics: true
[...]
  - For a Rancher-provisioned K3s cluster: in the Cluster Management interface of the Rancher UI, click Edit YAML for the applicable cluster, add the configuration "supervisor-metrics: true" into the machineGlobalConfig block, and click Save:

[...]
machineGlobalConfig:
  supervisor-metrics: true
[...]
- Check the metrics are enabled: on a server node in the cluster (not via the Rancher-proxied Kubernetes API endpoint or Authorized Cluster Endpoint), query the metrics per the example below. Replace <node-ip> with the IP of the node you wish to query and <port> with 10250 if querying an agent node, or 6443 if querying a server node. N.B. The node from which you are running the command will need to be able to reach <port> on the node you are querying.
kubectl get --server https://<node-ip>:<port> --raw /metrics | grep -i k3s_load

# HELP k3s_loadbalancer_dial_duration_seconds Time taken to dial a connection to a backend server
# TYPE k3s_loadbalancer_dial_duration_seconds histogram
k3s_loadbalancer_dial_duration_seconds_bucket{name="k3s-etcd-server-load-balancer",status="success",le="0.001"} 218
k3s_loadbalancer_dial_duration_seconds_bucket{name="k3s-etcd-server-load-balancer",status="success",le="0.002"} 239
k3s_loadbalancer_dial_duration_seconds_bucket{name="k3s-etcd-server-load-balancer",status="success",le="0.004"} 253
k3s_loadbalancer_dial_duration_seconds_bucket{name="k3s-etcd-server-load-balancer",status="success",le="0.008"} 264
k3s_loadbalancer_dial_duration_seconds_bucket{name="k3s-etcd-server-load-balancer",status="success",le="0.016"} 278
k3s_loadbalancer_dial_duration_seconds_bucket{name="k3s-etcd-server-load-balancer",status="success",le="0.032"} 290
k3s_loadbalancer_dial_duration_seconds_bucket{name="k3s-etcd-server-load-balancer",status="success",le="0.064"} 293
k3s_loadbalancer_dial_duration_seconds_bucket{name="k3s-etcd-server-load-balancer",status="success",le="0.128"} 294
k3s_loadbalancer_dial_duration_seconds_bucket{name="k3s-etcd-server-load-balancer",status="success",le="0.256"} 294
k3s_loadbalancer_dial_duration_seconds_bucket{name="k3s-etcd-server-load-balancer",status="success",le="0.512"} 294
k3s_loadbalancer_dial_duration_seconds_bucket{name="k3s-etcd-server-load-balancer",status="success",le="1.024"} 294
k3s_loadbalancer_dial_duration_seconds_bucket{name="k3s-etcd-server-load-balancer",status="success",le="2.048"} 294
k3s_loadbalancer_dial_duration_seconds_bucket{name="k3s-etcd-server-load-balancer",status="success",le="4.096"} 294
k3s_loadbalancer_dial_duration_seconds_bucket{name="k3s-etcd-server-load-balancer",status="success",le="8.192"} 294
k3s_loadbalancer_dial_duration_seconds_bucket{name="k3s-etcd-server-load-balancer",status="success",le="16.384"} 294
k3s_loadbalancer_dial_duration_seconds_bucket{name="k3s-etcd-server-load-balancer",status="success",le="+Inf"} 294
k3s_loadbalancer_dial_duration_seconds_sum{name="k3s-etcd-server-load-balancer",status="success"} 0.8818124119999996
k3s_loadbalancer_dial_duration_seconds_count{name="k3s-etcd-server-load-balancer",status="success"} 294
# HELP k3s_loadbalancer_server_connections Count of current connections to loadbalancer server
# TYPE k3s_loadbalancer_server_connections gauge
k3s_loadbalancer_server_connections{name="k3s-etcd-server-load-balancer",server="164.92.125.58:2379"} 102
# HELP k3s_loadbalancer_server_health Current health value of loadbalancer server
# TYPE k3s_loadbalancer_server_health gauge
k3s_loadbalancer_server_health{name="k3s-etcd-server-load-balancer",server="164.92.125.58:2379"} 7
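To spot-check every server node in one pass, a small loop over the node list can be run from a server node. This is a sketch only, assuming the default admin kubeconfig at /etc/rancher/k3s/k3s.yaml, that server nodes carry the node-role.kubernetes.io/control-plane=true label, and that their InternalIP addresses are reachable on port 6443 from where the loop runs:

# Query the loadbalancer health gauge on each control-plane node's kube-apiserver port.
# Nodes that run no loadbalancer simply return no matching lines.
export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
for ip in $(kubectl get nodes -l node-role.kubernetes.io/control-plane=true \
    -o jsonpath='{.items[*].status.addresses[?(@.type=="InternalIP")].address}'); do
  echo "### ${ip}"
  kubectl get --server "https://${ip}:6443" --raw /metrics | grep k3s_loadbalancer_server_health
done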
Disclaimer
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.
- Document ID: 000021744
- Creation Date: 18-Mar-2025
- Modified Date: 19-Aug-2025
- SUSE Rancher
For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com