Many rancher-agent containers running on a Rancher v2.x provisioned Kubernetes cluster where stopped containers are regularly deleted on hosts

This document (000020202) is provided subject to the disclaimer at the end of this document.

Situation

Issue

On a Rancher v2.x provisioned cluster, a host shows a large number of containers running the rancher-agent image, per the following output of docker ps | grep rancher-agent:

$ docker ps | grep rancher-agent
...
aeffe9725521        rancher/rancher-agent:v2.3.3         "run.sh --server htt…"   About a minute ago   Up About a minute                       sleepy_hopper
130120f49b71        rancher/rancher-agent:v2.3.3         "run.sh --server htt…"   6 minutes ago        Up 6 minutes                            stoic_hypatia
498b923d9b6e        rancher/rancher-agent:v2.3.3         "run.sh --server htt…"   11 minutes ago       Up 11 minutes                           laughing_elbakyan
3453865e5f70        rancher/rancher-agent:v2.3.3         "run.sh --server htt…"   16 minutes ago       Up 16 minutes                           wonderful_gagarin
f925209cd16a        rancher/rancher-agent:v2.3.3         "run.sh --server htt…"   21 minutes ago       Up 21 minutes                           silly_shannon
7d7fb5d4bf04        rancher/rancher-agent:v2.3.3         "run.sh --server htt…"   26 minutes ago       Up 26 minutes                           gifted_elgamal
...

Running docker inspect <container_id> against these containers shows that the Path and Args are of the following format:

"Path": "run.sh",
"Args": [
    "--server",
    "https://167.172.96.240",
    "--token",
    "gwrp7zlnwvsnzh2nhbvwcgdw45ccv6cq9pztzdd92j6xlv69xxhvnp",
    "--ca-checksum",
    "bbc8c7ca05c87a7140154554fa1a516178852f2710538c57718f4c874c29533c",
    "--no-register",
    "--only-write-certs"
],
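
The check above can be repeated across all containers on a host. The following is a minimal sketch, not an official tool: the function name list_cert_writer_agents is hypothetical, and it assumes the image name starts with rancher/rancher-agent: (a private registry prefix would need a different pattern). It prints the IDs of running rancher-agent containers whose Args include --only-write-certs.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption, not part of Rancher or Docker):
# print the IDs of running rancher-agent containers that were started
# with the --only-write-certs argument.
list_cert_writer_agents() {
    docker ps --no-trunc --format '{{.ID}} {{.Image}}' |
    while read -r id image; do
        # Assumption: the image is pulled as rancher/rancher-agent:<tag>.
        case "$image" in
            rancher/rancher-agent:*) ;;
            *) continue ;;
        esac
        # .Args is the same "Args" array shown by docker inspect.
        if docker inspect --format '{{json .Args}}' "$id" </dev/null |
            grep -q -- '"--only-write-certs"'; then
            echo "$id"
        fi
    done
}
```

Calling list_cert_writer_agents | wc -l on a host then gives the number of such containers currently running.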

Pre-requisites

  • A Rancher v2.x provisioned Kubernetes cluster, using either custom nodes or nodes hosted in an infrastructure provider.
  • Repeated deletion of stopped containers on hosts in the cluster, e.g. by running docker system prune, either manually or as part of an automated process such as a cron job.

Root cause

This behaviour is a result of the issue tracked in Rancher GitHub issue #15364.

The share-mnt container is created on a Rancher provisioned Kubernetes cluster and exits upon completion, but is not removed, so that it can be invoked again.

Meanwhile, the Rancher node-agent Pod on a host will spawn a new share-mnt container if the existing share-mnt container is removed. Upon starting, the share-mnt process spawns a rancher-agent container to write certificates. This rancher-agent container runs indefinitely, until the node-agent is triggered to reconnect to the Rancher server or the node-agent process is restarted.

As a result, if the share-mnt container on a host is removed repeatedly, either manually or by an automated process, multiple running rancher-agent containers accumulate on that host.

Workaround

To trigger automatic removal of the rancher-agent containers, restart the node-agent container on the host. Identify the running agent container with docker ps | grep k8s_agent_cattle-node, then restart it with docker restart <container_id>.
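
The restart described above could be scripted roughly as follows. This is a sketch under stated assumptions: the function name restart_node_agent is hypothetical, and it relies on the node-agent container name containing k8s_agent_cattle-node, as in the grep pattern above. It uses Docker's name filter in place of grep.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption): restart every running
# container whose name contains k8s_agent_cattle-node.
restart_node_agent() {
    # --filter name=... matches on part of the container name.
    ids=$(docker ps --quiet --filter 'name=k8s_agent_cattle-node')
    if [ -z "$ids" ]; then
        echo "no running k8s_agent_cattle-node container found" >&2
        return 1
    fi
    for id in $ids; do
        docker restart "$id"
    done
}
```

Run restart_node_agent on each affected host; the function returns a non-zero status if no matching container is running.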

In addition, you can prevent further creation of multiple rancher-agent container instances by removing whichever process is triggering the deletion of stopped containers.
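
If it is not obvious which process is deleting stopped containers, scheduled jobs are one place to look. The sketch below is illustrative only: the function name find_prune_jobs is hypothetical, and the pattern covers just docker system prune and docker container prune, so a job that removes containers by other means would not be matched.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption): print, with line numbers,
# lines that invoke "docker system prune" or "docker container prune".
# Reads the files given as arguments, or standard input if none are given.
find_prune_jobs() {
    grep -nE 'docker[[:space:]]+(system|container)[[:space:]]+prune' "$@"
}
```

For example, crontab -l | find_prune_jobs checks the current user's crontab, and find_prune_jobs /etc/crontab checks the system crontab where that file exists.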

Resolution

An enhancement request to prevent the creation of multiple long-running rancher-agent containers when the share-mnt container is repeatedly deleted is tracked in Rancher GitHub issue #15364.

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 000020202
  • Creation Date: 06-May-2021
  • Modified Date: 06-May-2021
    • SUSE Rancher
