kube-apiserver "socket: too many open files" error messages

This document (000020016) is provided subject to the disclaimer at the end of this document.

Situation

Issue

During normal operation of a Kubernetes cluster, you may experience intermittent stability issues and the kube-apiserver logs may contain messages of the following format:

  • clientconn.go:1208] grpc: addrConn.createTransport failed to connect to {https://x.x.x.x:2379 <nil> 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp x.x.x.x:2379: socket: too many open files". Reconnecting...
  • clientconn.go:1208] grpc: addrConn.createTransport failed to connect to {https://x.x.x.x:2379 <nil> 0 <nil>}. Err :connection error: desc = "transport: authentication handshake failed: context canceled". Reconnecting...
  • clientconn.go:1208] grpc: addrConn.createTransport failed to connect to {https://x.x.x.x:2379 <nil> 0 <nil>}. Err :connection error: desc = "transport: authentication handshake failed: context deadline exceeded". Reconnecting...
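
To gauge how often these messages occur, the log can be scanned for the distinguishing substrings. The following is a minimal sketch, assuming the kube-apiserver log lines are already available as text; the function name, the category labels and the sample lines are illustrative and not part of any Kubernetes tooling.

```python
from collections import Counter

# Substrings taken from the log messages quoted above.
PATTERNS = {
    "too_many_open_files": "socket: too many open files",
    "handshake_canceled": "authentication handshake failed: context canceled",
    "handshake_deadline": "authentication handshake failed: context deadline exceeded",
}


def count_symptoms(lines):
    """Count how many log lines contain each known symptom substring."""
    counts = Counter()
    for line in lines:
        for name, needle in PATTERNS.items():
            if needle in line:
                counts[name] += 1
    return counts


# Hypothetical sample lines, shortened from the format shown above.
sample = [
    'Err :connection error: desc = "transport: Error while dialing dial tcp '
    'x.x.x.x:2379: socket: too many open files". Reconnecting...',
    'Err :connection error: desc = "transport: authentication handshake failed: '
    'context canceled". Reconnecting...',
    "an unrelated log line",
]
print(dict(count_symptoms(sample)))
```

A steadily growing count for the first category points towards the open file limit rather than a transient network problem.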

Root Cause

These symptoms can occur when the kube-apiserver is constrained by configuration that limits the number of files a process can have open. Because sockets count as open files, a process that has reached this limit cannot open new connections, for example to etcd on port 2379. The same limit can also affect other components and OS services.

This is typically the result of restrictive ulimits or a high number of open connections.

Below is a non-exhaustive list of places where the number of open files ulimit can be set for a Docker container.

System ulimits (/etc/security/limits.conf):

This file holds the persistent configuration for the system-wide ulimits, such as file size limits and how much memory the different parts of a process (the stack, data and text segments) may use.

The limit of interest is the nofile limit, which defines the number of files a process can have open at any given time. This can be set per user or for all users (*), and there are two limits to define:

  • Soft limit - The limit that is actually applied to a process. A user can move it up or down within the range permitted by the hard limit, for example by running ulimit -Sn X, where X is the desired new value. (In bash, ulimit -n X without -S or -H sets both the soft and the hard limit.)
  • Hard limit - The ceiling for the soft limit. It is set by the superuser and enforced by the kernel. Users cannot exceed it.

The nofile hard limit for the current user can be seen by running ulimit -Hn and the soft limit can be seen by running ulimit -Sn.
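
The same two values can be read from inside a process. The sketch below uses Python's standard resource module (Unix only) to print the soft and hard nofile limits of the current process; the helper names are illustrative.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe(value):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


soft, hard = nofile_limits()
print(f"soft={describe(soft)} hard={describe(hard)}")
```

The values reported are those of the Python process itself, which it inherits from the shell or service that started it, so they correspond to ulimit -Sn and ulimit -Hn only when run from that same shell.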

More information on limits.conf can be found in the limits.conf(5) man page.
 

k3s and rke2 configuration

The k3s and rke2 install scripts both define LimitNOFILE=1048576 on the respective services. If you don't use the install scripts, you may need to configure ulimits as described below.
 

Systemd configuration

By design, systemd ignores ulimits set via /etc/security/limits.conf for the services it starts, and instead applies its own limits. These can be configured per-service or system-wide.
The system-wide systemd nofile limit is defined in /etc/systemd/system.conf as DefaultLimitNOFILE=X:Y, where X is the soft limit and Y is the hard limit.
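
To make the X:Y notation concrete, here is a minimal sketch of how such a value can be split into a soft and a hard limit. It assumes only the forms discussed in this document (a soft:hard pair, a single value that applies to both, and the keyword infinity); the function name is hypothetical, and the soft-versus-hard check is a sanity check of the sketch rather than a statement about systemd's own parser.

```python
import math


def parse_limit_nofile(value):
    """Parse a systemd-style nofile value into a (soft, hard) tuple.

    Accepts "X:Y" (soft:hard) or a single "X", which is used for both.
    The keyword "infinity" is mapped to math.inf.
    """
    def one(token):
        token = token.strip()
        return math.inf if token == "infinity" else int(token)

    soft_text, sep, hard_text = value.partition(":")
    soft = one(soft_text)
    hard = one(hard_text) if sep else soft
    if soft > hard:
        raise ValueError(f"soft limit {soft} exceeds hard limit {hard}")
    return soft, hard


print(parse_limit_nofile("1024:524288"))
print(parse_limit_nofile("1048576"))
```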

It is possible to set nofile for a specific service, either by defining LimitNOFILE within the service file itself or creating an override file. For example, defining it directly within the docker systemd service file (/lib/systemd/system/docker.service):

[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
BindsTo=containerd.service
After=network-online.target firewalld.service containerd.service
Wants=network-online.target
Requires=docker.socket

[Service]
Type=notify
ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
ExecReload=/bin/kill -s HUP $MAINPID
TimeoutSec=0
RestartSec=2
Restart=always

LimitNOFILE=infinity

Or by creating a systemd override file (/etc/systemd/system/docker.service.d/override.conf):

[Service]
LimitNOFILE=infinity

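Note: The drop-in directory must be named after the unit it overrides (for example, docker.service.d for docker.service), so the name may differ if your Linux distribution names the unit differently. Creating an override is usually recommended, as it persists through system updates. After changing a unit file or override, run systemctl daemon-reload and restart the service for the new limit to take effect.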

Note: On older versions of systemd, LimitNOFILE=infinity results in a limit of 65535. This was fixed by a commit merged in systemd v234.

Docker daemon configuration

Docker can also enforce its own open file limits. For a specific container, pass --ulimit nofile=X:Y to docker run. To set a default for all containers, start dockerd with the command line flag --default-ulimit nofile=X:Y.

The default for all containers can also be set in the /etc/docker/daemon.json configuration file:

{
  "default-ulimits": {
    "nofile": {
      "Name": "nofile",
      "Hard": 64000,
      "Soft": 64000
    }
  }
}
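
Because daemon.json must be valid JSON, it can help to generate or check the fragment programmatically before placing it on a node. The sketch below builds the default-ulimits structure described above and round-trips it through Python's json module; the function name and the chosen values are illustrative, and merging the result into an existing daemon.json is left out.

```python
import json


def default_ulimits(soft, hard):
    """Build the "default-ulimits" fragment for nofile as a dict."""
    if soft > hard:
        raise ValueError("soft limit must not exceed hard limit")
    return {
        "default-ulimits": {
            "nofile": {"Name": "nofile", "Hard": hard, "Soft": soft}
        }
    }


fragment = default_ulimits(64000, 64000)
text = json.dumps(fragment, indent=2)
json.loads(text)  # round-trip: raises ValueError if the text is not valid JSON
print(text)
```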

Resolution

If any non-default configuration is applying nofile restrictions to Docker or to its containers, either revert it to the default configuration or increase the limits, then re-test.
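
When re-testing, it is worth confirming which limit a running process actually received. On Linux, the effective limits are listed in /proc/<pid>/limits, and the open descriptors appear under /proc/<pid>/fd. The following is a minimal sketch that parses the "Max open files" row; the function names and the sample text are illustrative, and the process ID to inspect (for example that of kube-apiserver or dockerd) has to be looked up separately.

```python
import os


def parse_max_open_files(limits_text):
    """Extract (soft, hard) for 'Max open files' from /proc/<pid>/limits text.

    A value of "unlimited" is returned as None.
    """
    def to_int(token):
        return None if token == "unlimited" else int(token)

    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            fields = line.split()
            # Row layout: "Max open files  <soft>  <hard>  files"
            return to_int(fields[3]), to_int(fields[4])
    raise ValueError("no 'Max open files' line found")


def open_fd_count(pid="self"):
    """Count the open file descriptors of a process (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


# Sample text in the layout of /proc/<pid>/limits, with illustrative values.
SAMPLE = (
    "Limit                     Soft Limit           Hard Limit           Units\n"
    "Max open files            1024                 524288               files\n"
)
print(parse_max_open_files(SAMPLE))
```

On a node, the text would come from reading /proc/<pid>/limits for the process in question. Comparing the soft limit with open_fd_count for the same process shows how close it is to exhausting its descriptors; reading another user's process usually requires root.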
 

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 000020016
  • Creation Date: 07-Mar-2024
  • Modified Date: 08-Mar-2024
    • SUSE Rancher
