How to prevent Prometheus from scraping noisy kube-apiserver metrics

This document (000021191) is provided subject to the disclaimer at the end of this document.

Environment

Rancher 2.6.x, 2.7.x with cluster monitoring V2 enabled.

Situation

Prometheus memory utilization grows with the number of time series it ingests, so a large cluster with many workloads and several kube-apiserver instances can drive memory usage up considerably. Running the following query in the Prometheus expression browser lists the 10 metric names with the highest number of series:
topk(10, count by (__name__)({__name__=~".+"}))
Note:
The example below assumes that apiserver_request_duration_seconds_bucket is the metric that needs to be dropped.

Resolution

  • Go to the Cluster Explorer section of Rancher and click the required cluster.
  • Find Rancher monitoring in the Installed Apps section and choose the Edit & Upgrade option[1].
  • Leave the version unchanged and use the Edit YAML section to update the values for the chart. The values of the monitoring chart can be found in the upstream Rancher chart repo[2].
  • Locate the kubeApiServer section in the values of the Helm chart and add the drop config to its serviceMonitor.metricRelabelings list. Value keys are case-sensitive, so match the spelling used in the chart's values.yaml (kubeApiServer in the upstream chart). The example config below drops the apiserver_request_duration_seconds_bucket metric after each scrape, before its samples are stored. More information on metric relabelings can be found in the Prometheus documentation[3].

kubeApiServer:
  serviceMonitor:
    metricRelabelings:
    - action: drop
      regex: apiserver_request_duration_seconds_bucket
      sourceLabels:
      - __name__
  • Once the config is added to the chart, click Update. After the update completes, restart the Prometheus statefulset for the changes to take effect.
kubectl rollout restart statefulset prometheus-rancher-monitoring-prometheus -n cattle-monitoring-system
  • Check the status of the statefulset and verify the pod is recreated.
kubectl rollout status statefulset prometheus-rancher-monitoring-prometheus -n cattle-monitoring-system 
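  • Use the Prometheus expression browser to verify that apiserver_request_duration_seconds_bucket no longer returns current samples. Samples stored before the change remain queryable over historical ranges until they age out of retention.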
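
The drop rule from the Resolution steps can be widened to cover more than one noisy metric. The fragment below is a minimal sketch of that variation, not a recommended default. It assumes the same kubeApiServer.serviceMonitor.metricRelabelings path as the upstream chart values, and apiserver_response_sizes_bucket is only an illustrative second metric name; substitute whichever names the topk query reports for your cluster. Prometheus anchors relabeling regexes at both ends, so each alternative must be a complete metric name.

```yaml
kubeApiServer:
  serviceMonitor:
    metricRelabelings:
    # One rule, several metrics: the alternation lists full metric names.
    # The second name is an illustrative assumption, not taken from this article.
    - action: drop
      regex: '(apiserver_request_duration_seconds_bucket|apiserver_response_sizes_bucket)'
      sourceLabels:
      - __name__
```

Apply it the same way as the single-metric rule: paste it into Edit YAML, click Update, and re-run the topk query to confirm that the listed names no longer appear.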

Additional Information

References:
  • [1] https://ranchermanager.docs.rancher.com/pages-for-subheaders/monitoring-and-alerting
  • [2] https://github.com/rancher/charts
  • [3] https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 000021191
  • Creation Date: 18-Oct-2023
  • Modified Date: 18-Oct-2023
  • Product: SUSE Rancher
