Quickly vanishing disk space for the /var/lib/pgsql volume on SUSE Manager deployments

This document (000020854) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Manager 4.x (could apply to any installed version)

Situation

Disk space on the /var/lib/pgsql volume is depleting quickly. Even after the respective logical volume and its filesystem have been resized, the newly added space is consumed again.

Resolution

Please note:
The backup directory should be located on a different volume than the PostgreSQL database itself.
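
Whether two directories share a volume can be checked by comparing the device numbers of their filesystems. The following is a minimal sketch, assuming GNU coreutils stat is available; same_volume is a hypothetical helper name and not part of smdba.

```shell
#!/bin/sh
# Sketch only: same_volume is a hypothetical helper, not an smdba command.
# Returns 0 if both paths are on the same filesystem, 1 if they are not,
# and 2 if one of the paths cannot be examined.
same_volume() {
    # stat -c '%d' prints the device number of the filesystem holding the path
    dev_a=$(stat -c '%d' "$1") || return 2
    dev_b=$(stat -c '%d' "$2") || return 2
    [ "$dev_a" = "$dev_b" ]
}

# Example with the paths used in this document:
#   if same_volume /var/lib/pgsql /data/backup-db; then
#       echo "WARNING: backup directory shares a volume with the database" >&2
#   fi
```
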

1.) Stop spacewalk-service: 

spacewalk-service stop

2.) Stop the postgresql database:

systemctl stop postgresql.service

3.) Move the contents of the backup directory (in this example /data/backup-db) to a different location.

4.) Start the postgresql database again:

systemctl start postgresql.service

5.) Start the spacewalk-service again:

spacewalk-service start

6.) Create a new hot-backup of the database:

/usr/bin/smdba backup-hot --enable=on --backup-dir=/data/backup-db

7.) Monitor disk space usage of /var/lib/pgsql.
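
Steps 3 and 7 above have no command attached, so here is one possible way to script them. This is a sketch under assumptions: move_backup_contents and usage_pct are hypothetical helper names, the target directory must lie outside the backup directory, and the mount point is assumed to contain no spaces.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical and not part of smdba.

# Step 3: move every entry (including dot files) out of the backup directory.
# The destination must not be located inside the source directory.
move_backup_contents() {
    src=$1
    dst=$2
    [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
    mkdir -p "$dst" || return 1
    # The three patterns cover regular names, dot files, and names starting with ".."
    for entry in "$src"/* "$src"/.[!.]* "$src"/..?*; do
        # An unmatched pattern stays literal and does not exist, so skip it
        [ -e "$entry" ] || [ -L "$entry" ] || continue
        mv -- "$entry" "$dst"/ || return 1
    done
}

# Step 7: print the used percentage of the filesystem that holds a path.
# df -P guarantees one line per filesystem with the capacity in column 5.
usage_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Example with the paths used in this document (the target path is made up):
#   move_backup_contents /data/backup-db /data/backup-db.old
#   echo "/var/lib/pgsql is $(usage_pct /var/lib/pgsql)% full"
```

Calling usage_pct periodically, for example from cron, and comparing the values over time shows whether the volume is still filling up after the new hot backup.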

Cause

A possible cause of this behaviour is that the smdba-pgarchive command encountered an error at some point and can no longer archive certain write-ahead log (WAL) files located in /var/lib/pgsql/data/pg_wal/. PostgreSQL keeps WAL files until they have been archived successfully, so the unarchived files accumulate and fill the volume. First check the latest PostgreSQL logs in /var/lib/pgsql/data/log/*.log for error messages similar to:

2022-11-10 00:00:04.436 CET   [1977]DETAIL:  The failed archive command was: /usr/bin/smdba-pgarchive --source "pg_wal/000000010000037B000000A9" --destination "/data/backup-db/000000010000037B000000A9"
File already exists: /data/backup-db/000000010000037B000000A9
2022-11-10 00:00:05.440 CET   [1977]LOG:  archive command failed with exit code 1
2022-11-10 00:00:05.440 CET   [1977]DETAIL:  The failed archive command was: /usr/bin/smdba-pgarchive --source "pg_wal/000000010000037B000000A9" --destination "/data/backup-db/000000010000037B000000A9"
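
In the excerpt above, the "File already exists" line names the file in the backup directory that blocks the archive command. The following sketch lists each such file once. It assumes the message format shown in the excerpt; list_blocking_files is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: list_blocking_files is a hypothetical helper, not an smdba command.
# Prints every distinct path reported as "File already exists" in the given logs.
list_blocking_files() {
    # -n with the p flag prints only lines where the substitution matched,
    # leaving just the path that follows the message text
    sed -n 's/^.*File already exists: //p' "$@" | sort -u
}

# Example with the log location used in this document:
#   list_blocking_files /var/lib/pgsql/data/log/*.log
```
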

Additional Information

Using the command
grep -l "pg_wal/000000010000037B000000A9" /var/lib/pgsql/data/log/*.log | sort -nr
should help to identify the log file in which the error occurred for the first time. This data point can be used to verify whether the onset of the disk space depletion matches the timestamps in the monitoring system's reports.
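
To get the timestamp of the first failure rather than just the log file, the matching lines themselves can be sorted. This is a sketch that assumes every matching line starts with a timestamp in the format shown in the Cause section, so that a plain sort is chronological (a time zone change within the logs would break this assumption); first_failure is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: first_failure is a hypothetical helper, not an smdba command.
# Usage: first_failure WAL_FILE_NAME LOGFILE...
# Prints the earliest log line that mentions the given WAL file.
first_failure() {
    segment=$1
    shift
    # -h omits file name prefixes, -F treats the pattern as a fixed string;
    # LC_ALL=C makes sort compare bytes, so timestamps order chronologically
    grep -hF "pg_wal/$segment" "$@" | LC_ALL=C sort | head -n 1
}

# Example with the values used in this document:
#   first_failure 000000010000037B000000A9 /var/lib/pgsql/data/log/*.log
```
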

To see which /var/lib/pgsql/data/pg_wal files are affected and how many times the archive command failed for each of them, run the following from the log directory:
> grep "The failed archive command was: /usr/bin/smdba-pgarchive" *.log|grep "pg_wal"|cut -d \" -f 2|sort|uniq -c
      9 pg_wal/000000010000029D000000D4
      6 pg_wal/00000001000002C300000035
      6 pg_wal/0000000100000310000000C9
      9 pg_wal/000000010000031C000000A3
      6 pg_wal/000000010000034100000089
      6 pg_wal/000000010000034E000000D4
 233628 pg_wal/000000010000037B000000A9

Please adjust the respective commands where necessary.

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 000020854
  • Creation Date: 11-Nov-2022
  • Modified Date: 11-Nov-2022
    • SUSE Manager Server
