rancher-logging-root-fluentd-0 pod keeps restarting continuously with exit code 137 even after increasing memory

This document (000021839) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Rancher 2.9.x
Rancher-logging 104.1.x+

Situation

In a cluster with rancher-logging installed, the rancher-logging-root-fluentd-0 pod restarts continuously with exit code 137. The issue persists even after the pod's memory is increased significantly.

The rancher-logging-root-fluentd-0 pod shows only the following error:

Last status: Exited with 137: Error, Started: Fri Feb 28, 2025 4:50:53 PM, Exited: Fri Feb 28, 2025 4:57:07 PM
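As background for the status above, exit codes greater than 128 conventionally mean the process was terminated by a signal (128 plus the signal number). The following is a minimal sketch, assuming a POSIX system; the helper name describe_exit_code is hypothetical and not part of Rancher or Kubernetes. A SIGKILL is often, but not always, the result of the kernel OOM killer.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Map a container exit code to a signal name when it follows the 128 + N convention."""
    if code > 128:
        try:
            # 128 + N means the process was terminated by signal number N.
            return signal.Signals(code - 128).name
        except ValueError:
            return f"unknown signal {code - 128}"
    return f"exited with status {code}"


print(describe_exit_code(137))  # SIGKILL on Linux (137 - 128 = 9)
```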

Upon further investigation, the rancher-logging-root-fluentbit pod shows the following errors:

[2025/03/04 11:01:38] [error] [net] TCP connection failed: rancher-logging-root-fluentd.cattle-logging-system.svc.cluster.local:24240 (Connection refused)
[2025/03/04 11:01:38] [error] [output:forward:forward.0] no upstream connections available
[2025/03/04 11:01:38] [ warn] [engine] failed to flush chunk '1-1741004097.135890300.flb', retry in 320 seconds: task_id=147, input=tail.0 > output=forward.0 (out_id=0)
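To gauge how much data is backing up, the retry warnings can be extracted from the Fluent Bit log. The sketch below is only an illustration: the regular expression is inferred from the sample lines above rather than from a documented log format, and the function name parse_retries is hypothetical.

```python
import re

# Pattern inferred from the sample warning above; other Fluent Bit versions may log differently.
RETRY_RE = re.compile(
    r"failed to flush chunk '(?P<chunk>[^']+)', retry in (?P<seconds>\d+) seconds: "
    r"task_id=(?P<task_id>\d+), input=(?P<input>\S+) > output=(?P<output>\S+)"
)


def parse_retries(lines):
    """Return one dict per 'failed to flush chunk' warning found in the given log lines."""
    retries = []
    for line in lines:
        match = RETRY_RE.search(line)
        if match:
            retries.append(
                {
                    "chunk": match["chunk"],
                    "retry_seconds": int(match["seconds"]),
                    "input": match["input"],
                    "output": match["output"],
                }
            )
    return retries


sample_log = [
    "[2025/03/04 11:01:38] [error] [net] TCP connection failed: "
    "rancher-logging-root-fluentd.cattle-logging-system.svc.cluster.local:24240 (Connection refused)",
    "[2025/03/04 11:01:38] [error] [output:forward:forward.0] no upstream connections available",
    "[2025/03/04 11:01:38] [ warn] [engine] failed to flush chunk '1-1741004097.135890300.flb', "
    "retry in 320 seconds: task_id=147, input=tail.0 > output=forward.0 (out_id=0)",
]

for retry in parse_retries(sample_log):
    print(retry)
```

Many pending retries with long retry intervals suggest that Fluent Bit cannot reach Fluentd and is holding chunks while it waits.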

Resolution

Configure the output buffer to use the type 'file' instead of 'memory'.
Below is an example Output snippet for Elasticsearch:

apiVersion: logging.banzaicloud.io/v1beta1
kind: Output
metadata:
  name: efk
  namespace: cattle-logging-system
spec:
  elasticsearch:
    buffer:
      flush_interval: 30s
      flush_mode: interval
      flush_thread_count: 4
      queued_chunks_limit_size: 300
      type: file  # use a file buffer instead of a memory buffer

Furthermore, log in to Rancher, explore the desired cluster, and go to Apps >> Installed Apps >> Rancher-Logging. Click "Edit/Upgrade" and review whether the 'Buffer_Chunk_Size' and 'Buffer_Max_Size' settings shown below can be tuned to values that best suit the cluster's needs, as discussed in https://github.com/rancher/rancher-docs/issues/90

inputTail:
    Buffer_Chunk_Size: ''
    Buffer_Max_Size: ''

After applying these changes, verify that the rancher-logging-root-fluentd-0 pod no longer restarts and that logs are sent successfully.
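One way to check this is to read the restart count and last termination state from the pod status, for example from the JSON returned by `kubectl get pod rancher-logging-root-fluentd-0 -n cattle-logging-system -o json`. The sketch below uses a hypothetical, trimmed sample of that JSON; the container name "fluentd" and the restart count are illustrative assumptions, and the helper name container_restart_summary is not part of any Rancher tooling.

```python
def container_restart_summary(pod: dict) -> list:
    """Summarise restart count and last termination details for each container in a pod."""
    summary = []
    for status in pod.get("status", {}).get("containerStatuses", []):
        # lastState.terminated is only present if the container has terminated before.
        terminated = status.get("lastState", {}).get("terminated") or {}
        summary.append(
            {
                "container": status["name"],
                "restarts": status["restartCount"],
                "last_exit_code": terminated.get("exitCode"),
                "last_reason": terminated.get("reason"),
            }
        )
    return summary


# Hypothetical sample; real data would come from the kubectl command mentioned above.
sample_pod = {
    "metadata": {"name": "rancher-logging-root-fluentd-0"},
    "status": {
        "containerStatuses": [
            {
                "name": "fluentd",
                "restartCount": 12,
                "lastState": {"terminated": {"exitCode": 137, "reason": "Error"}},
            }
        ]
    },
}

for entry in container_restart_summary(sample_pod):
    print(entry)
```

If the restart count stays the same across two checks taken some minutes apart, the pod has stopped restarting. Real output can be loaded with `json.load` before being passed to the function.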

Cause

By default, Fluent Bit uses memory as the primary, temporary place to store records while it processes data. In some scenarios, a persistent, filesystem-based buffering mechanism is preferable because it provides aggregation and data safety capabilities.
Memory-based buffering can lead to the issues described above when the destination is slow or the cluster produces large volumes of data.
It is therefore important to configure buffering correctly for slow destinations or heavy backpressure.
More information can be found here: https://docs.fluentbit.io/manual/administration/buffering-and-storage

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

Document ID: 000021839
Creation Date: 15-May-2025
Modified Date: 22-May-2025
- SUSE Rancher


For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com
