SUSE Conversations


Troubleshooting a supportconfig Hang



By: jrecord

February 7, 2008 10:42 pm


Problem

On occasion, supportconfig hangs while gathering data. The supportconfig tool is just a bash script that runs system commands. Now and then it runs a command at an inappropriate time and causes the hang itself, but this is rare. Most of the time, supportconfig simply exposes a problem you already have with a normal system command. So how can you find out which system command supportconfig is hanging on?

Solution

If you want to attempt to skip the hanging command, simply press Ctrl-\ once or twice. If this doesn’t work, you will need to open an additional terminal, and follow the troubleshooting steps below.

When you observe a hang condition, do the following:

  1. Notice the last message of the supportconfig output on the screen.
  2. Match the message with the corresponding supportconfig file.
  3. Look at the last line in the file to see which system command is hanging.
  4. Finally, run the Binary Check Tool (chkbin) against the command to help troubleshoot the hang.
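
Steps 2 and 3 above can be partly automated from the second terminal. The following is a minimal sketch, assuming supportconfig's working directory is named nts_* under /var/log and that each command in its text files is introduced by a "#==[ Command ]" header line, as in the example output later in this article. The helper names newest_nts_file and last_command are made up for illustration and are not part of supportconfig.

```shell
#!/bin/bash
# Sketch only: helper names are hypothetical, not part of supportconfig.

# Print the most recently modified .txt file in the newest nts_* directory.
# $1 is the base directory (defaults to /var/log, as in the article).
newest_nts_file() {
    local base=${1:-/var/log} dir
    # The trailing slash restricts the glob to directories, newest first.
    dir=$(ls -1dt "$base"/nts_*/ 2>/dev/null | head -n 1)
    if [ -z "$dir" ]; then
        echo "no nts_* directory under $base" >&2
        return 1
    fi
    # The file supportconfig wrote to last is the likely place of the hang.
    ls -1t "$dir"*.txt 2>/dev/null | head -n 1
}

# Print the command that follows the final "#==[ Command ]" header in $1.
last_command() {
    awk '
        /^#==\[ Command \]/ { if ((getline line) > 0) cmd = line }
        END { sub(/^# /, "", cmd); if (cmd != "") print cmd }
    ' "$1"
}
```

For example, f=$(newest_nts_file) && last_command "$f" would print the last command logged before the hang. Treat the result as a hint and confirm it against the last message on the screen.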

Let’s step through an example. In this case, the supportconfig hangs gathering Network Time Protocol (NTP) information.

  1. Notice the last message of the supportconfig output on the screen.

      Gathering system information
      Basic Server Health Check...       Done
      RPM Database...                    Done
      Basic Environment...               Done
      System Modules...                  Done
      Memory Details...                  Done
      Disk I/O...                        Done
      System Logs...                     Done
      YaST Files...                      Done
      File System List...                Skipped
      Crash Info...                      Skipped
      NTP...
    

    “NTP…” is the last message on the screen.

  2. Match the message with the corresponding supportconfig file.

    Most of the supportconfig text filenames closely match the displayed message: “RPM Database” corresponds to rpm.txt, and “Crash Info” to crash.txt. You can also grep the supportconfig script itself for the filename.

    # grep -A3 'NTP...' /sbin/supportconfig
       printlog "NTP..."
       test $OPTION_NTP -eq 0 && { echolog Excluded; return 1; }
       OF=ntp.txt
       if rpm_verify $OF xntp
    

    The OF=ntp.txt line shows that supportconfig writes its NTP information to ntp.txt. You can also see that the OPTION_NTP variable is used to exclude all NTP information. To bypass the hang, change OPTION_NTP=1 to OPTION_NTP=0 in /etc/supportconfig.conf, then run supportconfig again to get a complete archive without the problematic section.
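
The configuration change described here can also be scripted. This is a rough sketch, assuming the option appears in the file as a plain OPTION_NTP=1 line at the start of a line, as the text describes. The function name exclude_section is hypothetical, and the backup file name is simply a choice made for this example.

```shell
#!/bin/bash
# Sketch only: exclude_section is a hypothetical helper, not a supportconfig command.
# $1 is the option name (for example OPTION_NTP); $2 is the config file.
exclude_section() {
    local option=$1
    local conf=${2:-/etc/supportconfig.conf}
    # Keep a backup so the section can be switched back on later.
    cp -p "$conf" "$conf.bak" || return 1
    # Rewrite OPTION_X=1 as OPTION_X=0; every other line passes through unchanged.
    sed "s/^${option}=1/${option}=0/" "$conf.bak" > "$conf"
}
```

Running exclude_section OPTION_NTP as root would switch off the NTP section. Copying the .bak file back restores the original settings once the underlying problem is fixed.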

  3. Look at the last line in the file to see which system command is hanging.

    The supportconfig creates a directory in /var/log as it’s gathering information. Upon successful completion, it tars up the directory and then deletes it. Since the supportconfig hung, the directory should still be in /var/log with the format /var/log/nts_hostname_date_time.

    larktop:~ # cd /var/log/nts_larktop_080205_2246/
    
    larktop:/var/log/nts_larktop_080205_2246 # tail ntp.txt
    #==[ Command ]======================================#
    # /sbin/chkconfig ntp --list
    ntp    0:off  1:off  2:on   3:on   4:off  5:on   6:off
    
    #==[ Command ]======================================#
    # /etc/init.d/ntp status
    Checking for network time protocol daemon (NTPD): ..unused
    
    #==[ Command ]======================================#
    # /usr/sbin/ntpq -p
    

    The last command to be executed prior to the hang is ntpq.

  4. Finally, run the Binary Check Tool (chkbin) against the command to help troubleshoot the hang.

    #--[ Checking File Ownership ]-----------------------#
    /usr/sbin/ntpq           - from RPM: xntp-4.2.0a-70.14
     :Shell script
    
    #--[ Validating Unique RPMs ]------------------------#
    Validating RPM: xntp-4.2.0a-70.14      [   Warning   ]
    S.5....T  c /etc/ntp.conf
    S.5....T    /usr/sbin/ntpq
    

    The RPM validation shows that the size, md5sum and timestamp of the ntpq executable have all changed. The file ownership check also reports ntpq as a shell script. Something is very wrong, since ntpq is supposed to be a dynamically linked executable. The best course of action is to reinstall the xntp RPM package.
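
The attribute string in the validation output (S.5....T) appears to follow the format printed by rpm -V, where each letter marks an attribute that no longer matches the RPM database and a dot means that attribute is unchanged. The sketch below decodes such a string. The function name decode_verify_flags is invented for this example and is not part of chkbin or rpm.

```shell
#!/bin/bash
# Sketch only: decode_verify_flags is a hypothetical helper.
# It explains the attribute string that "rpm -V" prints in front of each file.
decode_verify_flags() {
    # Split the string into one character per line, then describe each flag.
    printf '%s\n' "$1" | fold -w 1 | while read -r c; do
        case $c in
            S) echo "size differs" ;;
            M) echo "mode differs" ;;
            5) echo "digest differs" ;;
            D) echo "device number differs" ;;
            L) echo "symlink target differs" ;;
            U) echo "owner differs" ;;
            G) echo "group differs" ;;
            T) echo "modification time differs" ;;
        esac
    done
}
```

Calling decode_verify_flags 'S.5....T' lists the size, digest and modification time as changed, which matches the reading given above. A changed configuration file such as /etc/ntp.conf is normal, but a changed binary is not.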

Categories: SUSE Linux Enterprise Server, Technical Solutions

Disclaimer: As with everything else at SUSE Conversations, this content is definitely not supported by SUSE (so don't even think of calling Support if you try something and it blows up).  It was contributed by a community member and is published "as is." It seems to have worked for at least one person, and might work for you. But please be sure to test, test, test before you do anything drastic with it.
