51.6 Data Problems

Data problems occur when the machine may or may not boot properly but, in either case, data on the system is clearly corrupted and the system needs to be recovered. These situations call for a backup of your critical data, which enables you to restore the system to a state from before the failure. SUSE Linux Enterprise offers dedicated YaST modules for system backup and restoration as well as a rescue system that can be used to recover a corrupted system from the outside.

51.6.1 Backing Up Critical Data

System backups can be easily managed using the YaST System Backup module:

  1. As root, start YaST and select System > System Backup.

  2. Create a backup profile that holds all details needed for the backup, such as the filename of the archive, the scope, and the type of backup:

    1. Select Profile Management > Add.

    2. Enter a name for the archive.

    3. Enter the path to the location of the backup if you want to keep a local backup. For your backup to be archived on a network server (via NFS), enter the IP address or name of the server and the directory that should hold your archive.

    4. Determine the archive type and click Next.

    5. Determine the backup options to use, such as whether files not belonging to any package should be backed up and whether a list of files should be displayed prior to creating the archive. Also determine whether changed files should be identified using the time-consuming MD5 mechanism.

      Use Expert to enter a dialog for the backup of entire hard disk areas. Currently, this option only applies to the Ext2 file system.

    6. Finally, set the search constraints to exclude system areas that do not need to be backed up, such as lock files or cache files. Add, edit, or delete items until your needs are met and leave with OK.

  3. Once you have finished the profile settings, you can start the backup right away with Create Backup or configure automatic backup. It is also possible to create other profiles tailored for various other purposes.
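The YaST module is graphical, but the idea behind a profile (an archive name, a location, and a set of excluded areas) can be illustrated on the command line. The following is a minimal sketch using tar, not a description of what YaST does internally. The function name backup_tree, the exclude patterns, and the paths in the usage note are assumptions made for illustration.

```shell
# Sketch: archive a directory tree while skipping lock files and cache
# directories. backup_tree and the exclude patterns are illustrative
# assumptions, not part of YaST System Backup.
backup_tree() (
    src_dir=$1      # directory tree to back up
    archive=$2      # path of the .tar.gz archive to create
    # -C switches into the source tree so archive members get relative names.
    # The exclude options must precede the members they apply to.
    tar -czf "$archive" \
        --exclude='*.lock' \
        --exclude='cache' \
        -C "$src_dir" .
)
```

A call such as backup_tree /etc /root/etc-backup.tar.gz would archive /etc under these assumptions. Keep the archive outside the tree being archived so that tar does not try to include its own output.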

To configure automatic backup for a given profile, proceed as follows:

  1. Select Automatic Backup from the Profile Management menu.

  2. Select Start Backup Automatically.

  3. Determine the backup frequency. Choose daily, weekly, or monthly.

  4. Determine the backup start time. These settings depend on the backup frequency selected.

  5. Decide whether to keep old backups and how many should be kept. To receive an automatically generated status message of the backup process, check Send Summary Mail to User root.

  6. Click OK to apply your settings and have the first backup start at the time specified.

51.6.2 Restoring a System Backup

Use the YaST System Restoration module to restore the system configuration from a backup. Restore the entire backup or select specific components that were corrupted and need to be reset to their old state.

  1. Start YaST > System > System Restoration.

  2. Enter the location of the backup file. This could be a local file, a network mounted file, or a file on a removable device, such as a floppy or a CD. Then click Next.

    The following dialog displays a summary of the archive properties, such as the filename, date of creation, type of backup, and optional comments.

  3. Review the archived content by clicking Archive Content. Clicking OK returns you to the Archive Properties dialog.

  4. Expert Options opens a dialog in which to fine-tune the restore process. Return to the Archive Properties dialog by clicking OK.

  5. Click Next to open the view of packages to restore. Press Accept to restore all files in the archive or use the various Select All, Deselect All, and Select Files buttons to fine-tune your selection. Only use the Restore RPM Database option if the RPM database is corrupted or deleted and this file is included in the backup.

  6. After you click Accept, the backup is restored. Click Finish to leave the module after the restore process is completed.

51.6.3 Recovering a Corrupted System

There are several reasons why a system could fail to come up and run properly. A corrupted file system after a system crash, corrupted configuration files, or a corrupted boot loader configuration are the most common ones.

SUSE Linux Enterprise offers two different methods to cope with this kind of situation. You can either use the YaST System Repair functionality or boot the rescue system. The following sections cover both flavors of system repair.

Using YaST System Repair

Before launching the YaST System Repair module, determine which mode best fits your needs. Depending on the severity and cause of your system failure and on your expertise, there are three different modes to choose from:

Automatic Repair

If your system failed due to an unknown cause and you do not know which part of the system is to blame for the failure, use Automatic Repair. An extensive automated check will be performed on all components of your installed system. For a detailed description of this procedure, refer to Automatic Repair.

Customized Repair

If your system failed and you already know which component is to blame, you can cut the lengthy system check with Automatic Repair short by limiting the scope of the system analysis to those components. For example, if the system messages prior to the failure seem to indicate an error with the package database, you can limit the analysis and repair procedure to checking and restoring this aspect of your system. For a detailed description of this procedure, refer to Customized Repair.

Expert Tools

If you already have a clear idea of what component failed and how this should be fixed, you can skip the analysis runs and directly apply the tools necessary for the repair of the respective component. For details, refer to Expert Tools.

Choose one of the repair modes as described above and proceed with the system repair as outlined in the following sections.

Automatic Repair

To start the automatic repair mode of YaST System Repair, proceed as follows:

  1. Insert the first installation medium of SUSE Linux Enterprise into your CD or DVD drive.

  2. Reboot the system.

  3. At the boot screen, select Installation.

  4. Select the language and click Next.

  5. Confirm the license agreement and click Next.

  6. In System Analysis, select Other > Repair Installed System.

  7. Select Automatic Repair.

    YaST now launches an extensive analysis of the installed system. The progress of the procedure is displayed at the bottom of the screen with two progress bars. The upper bar shows the progress of the currently running test. The lower bar shows the overall progress of the analysis. The log window in the top section tracks the currently running test and its result. See Figure 51-2. The following main test runs are performed with every run. They contain, in turn, a number of individual subtests.

    Figure 51-2 Automatic Repair Mode

    Partition Tables of All Hard Disks

    Checks the validity and coherence of the partition tables of all detected hard disks.

    Swap Partitions

    The swap partitions of the installed system are detected, tested, and offered for activation where applicable. The offer should be accepted for the sake of a higher system repair speed.

    File Systems

    All detected file systems are subjected to a file system–specific check.

    Entries in the File /etc/fstab

    The entries in the file are checked for completeness and consistency. All valid partitions are mounted.

    Boot Loader Configuration

    The boot loader configuration of the installed system (GRUB or LILO) is checked for completeness and coherence. Boot and root devices are examined and the availability of the initrd modules is checked.

    Package Database

    This checks whether all packages necessary for the operation of a minimal installation are present. Optionally, the base packages can also be analyzed, but this takes a long time because of their vast number.

  8. Whenever an error is encountered, the procedure stops and a dialog opens outlining the details and possible solutions.

    Read the screen messages carefully before accepting the proposed fix. If you decide to decline a proposed solution, your system remains unchanged.

  9. After the repair process has been terminated successfully, click OK and Finish and remove the installation media. The system automatically reboots.

Customized Repair

To launch the Customized Repair mode and selectively check certain components of your installed system, proceed as follows:

  1. Insert the first installation medium of SUSE Linux Enterprise into your CD or DVD drive.

  2. Reboot the system.

  3. At the boot screen, select Installation.

  4. Select the language and click Next.

  5. Confirm the license agreement and click Next.

  6. In System Analysis, select Other > Repair Installed System.

  7. Select Customized Repair.

    Choosing Customized Repair shows a list of test runs that are all marked for execution at first. The total range of tests matches that of automatic repair. If you already know where no damage is present, unmark the corresponding tests. Clicking Next starts a narrower test procedure that usually takes significantly less time.

    Not all test groups can be applied individually. The analysis of the fstab entries is always bound to an examination of the file systems, including existing swap partitions. YaST automatically resolves such dependencies by selecting the smallest number of necessary test runs.

  8. Whenever an error is encountered, the procedure stops and a dialog opens outlining the details and possible solutions.

    Read the screen messages carefully before accepting the proposed fix. If you decide to decline a proposed solution, your system remains unchanged.

  9. After the repair process has been terminated successfully, click OK and Finish and remove the installation media. The system automatically reboots.

Expert Tools

If you are familiar with SUSE Linux Enterprise and already have a very clear idea of what needs to be repaired in your system, you can skip the system analysis and apply the tools directly.

To make use of the Expert Tools feature of the YaST System Repair module, proceed as follows:

  1. Boot the system with the original installation medium used for the initial installation (as outlined in Section 3.0, Installation with YaST).

  2. In System Analysis, select Other > Repair Installed System.

  3. Select Expert Tools and choose one or more repair options.

  4. After the repair process has been terminated successfully, click OK and Finish and remove the installation media. The system automatically reboots.

Expert Tools provides the following options to repair your faulty system:

Install New Boot Loader

This starts the YaST boot loader configuration module. Find details in Section 21.3, Configuring the Boot Loader with YaST.

Start Partitioning Tool

This starts the expert partitioning tool in YaST.

Repair File System

This checks the file systems of your installed system. You are first offered a selection of all detected partitions and can then choose the ones to check.

Recover Lost Partitions

It is possible to attempt to reconstruct damaged partition tables. A list of detected hard disks is presented first for selection. Clicking OK starts the examination. This can take a while depending on the processing power and size of the hard disk.

IMPORTANT: Reconstructing a Partition Table

The reconstruction of a partition table is tricky. YaST attempts to recognize lost partitions by analyzing the data sectors of the hard disk. The lost partitions are added to the rebuilt partition table when recognized. This does not succeed in every case, however.

Save System Settings to Floppy

This option saves important system files to a floppy disk. If one of these files becomes damaged, it can be restored from disk.

Verify Installed Software

This checks the consistency of the package database and the availability of the most important packages. Any damaged installed packages can be reinstalled with this tool.

Using the Rescue System

SUSE Linux Enterprise contains a rescue system. The rescue system is a small Linux system that can be loaded into a RAM disk and mounted as root file system, allowing you to access your Linux partitions from the outside. Using the rescue system, you can recover or modify any important aspect of your system:

  • Manipulate any type of configuration file.

  • Check the file system for defects and start automatic repair processes.

  • Access the installed system in a change root environment.

  • Check, modify, and reinstall the boot loader configuration.

  • Resize partitions using the parted command. Find more information about this tool at the Web site of GNU Parted (http://www.gnu.org/software/parted/parted.html).

The rescue system can be loaded from various sources and locations. The simplest option is to boot the rescue system from the original installation CD or DVD:

  1. Insert the installation medium into your CD or DVD drive.

  2. Reboot the system.

  3. At the boot screen, choose the Rescue System option.

  4. Enter root at the Rescue: prompt. A password is not required.

If your hardware setup does not include a CD or DVD drive, you can boot the rescue system from a network source. The following example applies to a remote boot scenario. If you use another boot medium, such as a floppy disk, modify the info file accordingly and boot as you would for a normal installation.

  1. Enter the configuration of your PXE boot setup and replace install=protocol://instsource with rescue=protocol://instsource. As with a normal installation, protocol stands for any of the supported network protocols (NFS, HTTP, FTP, etc.) and instsource for the path to your network installation source.

  2. Boot the system using Wake on LAN, as described in Section 4.3.7, Wake on LAN.

  3. Enter root at the Rescue: prompt. A password is not required.
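The parameter change in step 1 can be sketched as a small text substitution. The example below is an illustration only: the function name install_to_rescue, the layout of the append line, the server address, and the path are all hypothetical, and your PXE configuration may look different.

```shell
# Sketch: turn an installation boot line into a rescue boot line by
# replacing the install= parameter with rescue=.
install_to_rescue() {
    # Match install= only at the start of the line or after whitespace so
    # that other parameters ending in "install=" are left untouched.
    sed -E 's/(^|[[:space:]])install=/\1rescue=/'
}

# Hypothetical append line; the server address and path are placeholders.
printf '%s\n' 'append initrd=initrd install=nfs://192.168.0.10/install/sles' |
    install_to_rescue
```

The rest of the line, including the protocol and the path to the installation source, is passed through unchanged.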

Once you have entered the rescue system, you can make use of the virtual consoles that can be reached with Alt+F1 to Alt+F6.

A shell and many other useful utilities, such as the mount program, are available in the /bin directory. The /sbin directory contains important file and network utilities for reviewing and repairing the file system. It also contains the most important binaries for system maintenance, such as fdisk, mkfs, mkswap, mount, init, and shutdown, as well as ifconfig, ip, route, and netstat for maintaining the network. The directory /usr/bin contains the vi editor, find, less, and ssh.

To see the system messages, either use the command dmesg or view the file /var/log/messages.
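When the log is long, it can help to filter it for likely problem reports. The following is a minimal sketch; the function name recent_problems and the search pattern are assumptions, and the pattern will not catch every kind of failure message.

```shell
# Sketch: show the most recent log lines that mention errors or failures.
# recent_problems and the search pattern are illustrative assumptions.
recent_problems() (
    log_file=${1:-/var/log/messages}   # log file to search
    max_lines=${2:-20}                 # number of matching lines to show
    grep -i -E 'error|fail' "$log_file" | tail -n "$max_lines"
)
```

Inside the rescue system, /var/log/messages belongs to the rescue system itself. To read the log of the installed system, pass its path below the mount point, for example recent_problems /mnt/var/log/messages once the root file system is mounted on /mnt.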

Checking and Manipulating Configuration Files

As an example of a configuration that might be fixed using the rescue system, imagine you have a broken configuration file that prevents the system from booting properly. You can fix this using the rescue system.

To manipulate a configuration file, proceed as follows:

  1. Start the rescue system using one of the methods described above.

  2. To mount a root file system located under /dev/sda6 to the rescue system, use the following command:

    mount /dev/sda6 /mnt

    All directories of the system are now located under /mnt.

  3. Change the directory to the mounted root file system:

    cd /mnt

  4. Open the problematic configuration file in the vi editor. Adjust and save the configuration.

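  5. Leave the mounted directory tree, because a file system cannot be unmounted while it contains the shell's working directory, then unmount the root file system from the rescue system:

    cd /
    umount /mnt

  6. Reboot the machine.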
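Before editing a file in step 4, it can be useful to keep a copy so that the change is easy to revert. This is a minimal sketch, not part of the documented procedure; the function name keep_original, the .orig suffix, and the file named in the usage note are assumptions.

```shell
# Sketch: keep a copy of a configuration file before editing it.
# keep_original and the .orig suffix are illustrative assumptions.
keep_original() (
    config_file=$1
    # -p preserves the mode, ownership, and timestamps of the original file.
    cp -p "$config_file" "$config_file.orig"
)
```

For example, keep_original /mnt/etc/fstab followed by vi /mnt/etc/fstab leaves the previous version in /mnt/etc/fstab.orig.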

Repairing and Checking File Systems

Generally, file systems cannot be repaired on a running system. If you encounter serious problems, you may not even be able to mount your root file system and the system boot may end with a kernel panic. In this case, the only option is to repair the system from the outside. Using YaST System Repair for this task is strongly recommended (see Expert Tools for details). However, if you need to do a manual file system check or repair, boot the rescue system. It contains the utilities to check and repair the ext2, ext3, reiserfs, xfs, dosfs, and vfat file systems.
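Each file system type has its own check utility. The sketch below maps a file system type to a check command that does not modify the file system. The function name check_command is hypothetical, and the utility names are the standard ones for these file systems; it is an assumption that your rescue system provides them under exactly these names. The function only prints the command so that you can review it before running it on an unmounted partition.

```shell
# Sketch: print a non-modifying check command for a file system type.
# check_command is an illustrative helper, not a rescue system tool.
check_command() (
    fs_type=$1      # for example ext3
    device=$2       # for example /dev/sda6 (must not be mounted)
    case $fs_type in
        # -f forces a check, -n answers "no" to every repair prompt.
        ext2|ext3)        echo "e2fsck -f -n $device" ;;
        # --check reports problems without repairing them.
        reiserfs)         echo "reiserfsck --check $device" ;;
        # -n is the no-modify mode of xfs_repair.
        xfs)              echo "xfs_repair -n $device" ;;
        # -n checks non-interactively without writing to the device.
        dosfs|vfat|msdos) echo "dosfsck -n $device" ;;
        *)
            echo "unsupported file system type: $fs_type" >&2
            exit 1
            ;;
    esac
)
```

For example, check_command ext3 /dev/sda6 prints e2fsck -f -n /dev/sda6. Only move on to a repairing invocation after reviewing what the read-only run reports.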

Accessing the Installed System

If you need to access the installed system from the rescue system to, for example, modify the boot loader configuration, or to execute a hardware configuration utility, you need to do this in a change root environment.

To set up a change root environment based on the installed system, proceed as follows:

  1. First mount the root partition from the installed system and the device file system:

    mount /dev/sda6 /mnt
    mount --bind /dev /mnt/dev
    
  2. Now you can change root into the new environment:

    chroot /mnt

  3. Then mount /proc and /sys:

    mount /proc
    mount /sys
    
  4. Finally, mount the remaining partitions from the installed system:

    mount -a

  5. Now you have access to the installed system. Before rebooting the system, unmount the partitions with umount -a and leave the change root environment with exit.

WARNING: Limitations

Although you have full access to the files and applications of the installed system, there are some limitations. The kernel that is running is the one that was booted with the rescue system. It only supports essential hardware and it is not possible to add kernel modules from the installed system unless the kernel versions are exactly the same (which is unlikely). So you cannot access a sound card, for example. It is also not possible to start a graphical user interface.

Also note that you leave the change root environment when you switch the console with Alt+F1 to Alt+F6.

Modifying and Reinstalling the Boot Loader

Sometimes a system cannot boot because the boot loader configuration is corrupted. The start-up routines cannot, for example, translate physical drives to the actual locations in the Linux file system without a working boot loader.

To check the boot loader configuration and reinstall the boot loader, proceed as follows:

  1. Perform the necessary steps to access the installed system as described in Accessing the Installed System.

  2. Check whether the following files are correctly configured according to the GRUB configuration principles outlined in Section 21.0, The Boot Loader.

    • /etc/grub.conf

    • /boot/grub/device.map

    • /boot/grub/menu.lst

    Apply fixes to the device mapping (device.map) or the location of the root partition and configuration files, if necessary.

  3. Reinstall the boot loader using the following command sequence:

    grub --batch < /etc/grub.conf

  4. Unmount the partitions, log out from the change root environment, and reboot the system:

    umount -a
    exit
    reboot
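Before reinstalling the boot loader in step 3, a quick check that the three files from step 2 exist can save a failed attempt. The following is a minimal sketch; the function name missing_grub_files is hypothetical, and the check only tests for presence and non-zero size, not for correct content.

```shell
# Sketch: report which GRUB configuration files are missing or empty
# below a given root directory (for example /mnt, or / inside the
# change root environment). missing_grub_files is an illustrative helper.
missing_grub_files() (
    root_dir=${1:-/}
    status=0
    for rel in etc/grub.conf boot/grub/device.map boot/grub/menu.lst; do
        # -s is true only if the file exists and is not empty.
        if [ ! -s "${root_dir%/}/$rel" ]; then
            echo "/$rel"
            status=1
        fi
    done
    exit "$status"
)
```

The function prints the path of each missing or empty file and returns a non-zero status if any were found, so it can be used as a guard, for example missing_grub_files / && grub --batch < /etc/grub.conf inside the change root environment.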