Migrating SLES to Software RAID1
Overview
Prepare the non-RAID Disk
Create a Degraded Array
Copy the System Disk to the Degraded Array
Build the Finished Array
Related Links
Conclusion
Overview
The purpose of this document is to show how to migrate an existing SUSE Linux Enterprise Server (SLES) system to a mirrored array (RAID1). This is a dangerous operation! You could leave your system in an unbootable, unrecoverable state, so back up your data before you begin. When you are done, your system will boot from a two-disk software RAID1 array. All examples are based on SLES10 SP1; however, the technique works for both SLES9 and SLES10, except where noted.
/dev/sda   – Installed non-RAID system disk
/dev/sda1  – swap partition
/dev/sda2  – root partition
/dev/sdb   – First empty disk for the RAID mirror
/dev/md0   – Mirrored swap partition
/dev/md1   – Mirrored root partition
Prepare the non-RAID Disk
- You backed up your system, right?
- Confirm that both disks are the same size.
linux:~ # cat /proc/mdstat
Personalities :
unused devices: <none>
linux:~ # cat /proc/partitions
major minor  #blocks  name
   8     0    2097152 sda
   8     1     514048 sda1
   8     2    1582402 sda2
   8    16    2097152 sdb
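If you want an explicit size check before going any further, fdisk can report both disks side by side. A minimal sketch, assuming the 2 GB example disks used throughout this article:

linux:~ # fdisk -l /dev/sda /dev/sdb | grep Disk
Disk /dev/sda: 2147 MB, 2147483648 bytes
Disk /dev/sdb: 2147 MB, 2147483648 bytes

As the comments below note, only the partitions really need to match; a larger second disk simply leaves unused space behind the mirror.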
- It is recommended that you perform the migration in run level 1 to minimize the risk of file corruption.
Change the default run level to 1.
linux:~ # init 1
linux:~ # cat /etc/inittab | grep default:
id:3:initdefault:
linux:~ # vi /etc/inittab
id:1:initdefault:
- Make sure that your devices do not have labels and that you are referencing the disks by device name.
linux:~ # cat /etc/fstab
/dev/sda2  /     reiserfs  defaults  1 1
/dev/sda1  swap  swap      pri=42    0 0
linux:~ # cat /boot/grub/menu.lst
default 0
timeout 8
title SUSE Linux Enterprise Server 10 SP1
    root (hd0,1)
    kernel /boot/vmlinuz-2.6.16.46-0.12-default root=/dev/sda2 resume=/dev/sda1 showopts
    initrd /boot/initrd-2.6.16.46-0.12-default.orig
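If you want to double-check for label- or UUID-based references before continuing, a simple grep over both files will show them; no output means everything already uses plain device names (a minimal sketch using the paths from this article):

linux:~ # grep -E 'LABEL=|UUID=' /etc/fstab /boot/grub/menu.lst
linux:~ #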
- Change the partition type on the existing non-RAID disk to type ‘fd’ (Linux raid autodetect).
linux:~ # fdisk /dev/sda

Command (m for help): p

Disk /dev/sda: 2147 MB, 2147483648 bytes
255 heads, 63 sectors/track, 261 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1          64      514048+  82  Linux swap
/dev/sda2   *          65         261     1582402+  83  Linux

Command (m for help): t
Partition number (1-4): 1
Hex code (type L to list codes): fd
Changed system type of partition 1 to fd (Linux raid autodetect)

Command (m for help): t
Partition number (1-4): 2
Hex code (type L to list codes): fd
Changed system type of partition 2 to fd (Linux raid autodetect)

Command (m for help): p

Disk /dev/sda: 2147 MB, 2147483648 bytes
255 heads, 63 sectors/track, 261 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1          64      514048+  fd  Linux raid autodetect
/dev/sda2   *          65         261     1582402+  fd  Linux raid autodetect

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.

WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
The kernel still uses the old table.
The new table will be used at the next reboot.
Syncing disks.
- Copy the non-RAID disk’s partition table to the empty disk.
linux:~ # sfdisk -d /dev/sda > partitions.txt
linux:~ # sfdisk /dev/sdb < partitions.txt
Checking that no-one is using this disk right now ...
OK

Disk /dev/sdb: 261 cylinders, 255 heads, 63 sectors/track

sfdisk: ERROR: sector 0 does not have an msdos signature
 /dev/sdb: unrecognized partition
Old situation:
No partitions found
New situation:
Units = sectors of 512 bytes, counting from 0

   Device Boot    Start       End   #sectors  Id  System
/dev/sdb1            63   1028159    1028097  fd  Linux raid autodetect
/dev/sdb2   *   1028160   4192964    3164805  fd  Linux raid autodetect
/dev/sdb3             0         -          0   0  Empty
/dev/sdb4             0         -          0   0  Empty
Successfully wrote the new partition table

Re-reading the partition table ...

If you created or changed a DOS partition, /dev/foo7, say, then use dd(1)
to zero the first 512 bytes:  dd if=/dev/zero of=/dev/foo7 bs=512 count=1
(See fdisk(8).)
linux:~ # cat /proc/partitions
major minor  #blocks  name
   8     0    2097152 sda
   8     1     514048 sda1
   8     2    1582402 sda2
   8    16    2097152 sdb
   8    17     514048 sdb1
   8    18    1582402 sdb2
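Before rebooting, you can confirm that the two partition tables now match. This sketch compares the sfdisk dumps after rewriting the device names, and assumes a bash shell (for the process substitution); no output means the tables are identical:

linux:~ # diff <(sfdisk -d /dev/sda | sed 's/sda/sdb/g') <(sfdisk -d /dev/sdb)
linux:~ #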
- Reboot the server.
Create a Degraded Array
- Create the degraded arrays on the empty disk, leaving out the existing system disk for now.
linux:~ # mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 missing
mdadm: array /dev/md0 started.
linux:~ # mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdb2 missing
mdadm: array /dev/md1 started.
linux:~ # cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sdb2[0]
      1582336 blocks [2/1] [U_]

md0 : active raid1 sdb1[0]
      513984 blocks [2/1] [U_]

unused devices: <none>
- Format the degraded array devices.
linux:~ # mkswap /dev/md0
Setting up swapspace version 1, size = 526315 kB
linux:~ # mkreiserfs /dev/md1
--snip--
ReiserFS is successfully created on /dev/md1.
- Back up the original initrd.
linux:~ # ls -l /boot/initrd*
lrwxrwxrwx 1 root root      29 Apr  2 12:03 initrd -> initrd-2.6.16.46-0.12-default
-rw-r--r-- 1 root root 2930512 Apr  2 10:15 initrd-2.6.16.46-0.12-default
linux:/boot # mv initrd-2.6.16.46-0.12-default initrd-2.6.16.46-0.12-default.orig
- Recreate the initrd with Software RAID1 support.
- Creating a SLES10+ initrd
On SLES10 and later, mkinitrd can add Software RAID (md) support directly:
linux:~ # mkinitrd -f md
Root device:    /dev/sda2 (mounted on / as reiserfs)
Module list:    piix mptspi ide-generic processor thermal fan reiserfs edd (xennet xenblk)
Kernel image:   /boot/vmlinuz-2.6.16.46-0.12-default
Initrd image:   /boot/initrd-2.6.16.46-0.12-default
Shared libs:    lib/ld-2.4.so lib/libacl.so.1.1.0 lib/libattr.so.1.1.0 lib/libc-2.4.so lib/libdl-2.4.so lib/libhistory.so.5.1 lib/libncurses.so.5.5 lib/libpthread-2.4.so lib/libreadline.so.5.1 lib/librt-2.4.so lib/libuuid.so.1.2 lib/libnss_files-2.4.so lib/libnss_files.so.2 lib/libgcc_s.so.1
Driver modules: ide-core ide-disk scsi_mod sd_mod piix scsi_transport_spi mptbase mptscsih mptspi ide-generic processor thermal fan edd raid0 raid1 xor raid5 linear libata ahci ata_piix
Filesystem modules: reiserfs
Including:      initramfs mdadm fsck.reiserfs
Bootsplash:     SuSE-SLES (800x600)
13481 blocks
linux:~ # ls -l /boot/initrd*
lrwxrwxrwx 1 root root      29 Apr  2 12:15 initrd -> initrd-2.6.16.46-0.12-default
-rw-r--r-- 1 root root 3045558 Apr  2 12:15 initrd-2.6.16.46-0.12-default
-rw-r--r-- 1 root root 2930512 Apr  2 10:15 initrd-2.6.16.46-0.12-default.orig
- Creating a SLES9 initrd
On SLES9, add the raid1 module to INITRD_MODULES in /etc/sysconfig/kernel, then rerun mkinitrd:
linux:~ # head /etc/sysconfig/kernel | grep INITRD_MODULES
INITRD_MODULES="mptscsih reiserfs"
linux:~ # vi /etc/sysconfig/kernel
linux:~ # head /etc/sysconfig/kernel | grep INITRD_MODULES
INITRD_MODULES="raid1 mptscsih reiserfs"
linux:~ # mkinitrd
Root device:  /dev/sda2 (mounted on / as reiserfs)
Module list:  raid1 mptspi reiserfs
Kernel image: /boot/vmlinuz-2.6.5-7.311-default
Initrd image: /boot/initrd-2.6.5-7.311-default
Shared libs:  lib/ld-2.3.3.so lib/libblkid.so.1.0 lib/libc.so.6 lib/libpthread.so.0 lib/libselinux.so.1 lib/libuuid.so.1.2
Modules:      kernel/drivers/scsi/scsi_mod.ko kernel/drivers/scsi/sd_mod.ko kernel/drivers/md/raid1.ko kernel/drivers/message/fusion/mptbase.ko kernel/drivers/message/fusion/mptscsih.ko kernel/drivers/message/fusion/mptspi.ko kernel/fs/reiserfs/reiserfs.ko
Including:    udev raidautorun
linux:~ # ls -l /boot/initrd*
lrwxrwxrwx 1 root root      26 Apr  3 02:41 initrd -> initrd-2.6.5-7.311-default
-rw-r--r-- 1 root root 1428402 Apr  3 02:41 initrd-2.6.5-7.311-default
-rw-r--r-- 1 root root 1425140 Apr  2 09:40 initrd-2.6.5-7.311-default.orig
- Create an additional boot entry in the menu.lst file, keeping the original non-RAID entry (pointing at the backed-up .orig initrd) and adding a LinuxRAID1 entry that uses the rebuilt initrd and the md devices, so you can boot either configuration if you make a mistake during the migration.
linux:~ # vi /boot/grub/menu.lst
default 0
timeout 8
title SUSE Linux Enterprise Server 10 SP1
    root (hd0,1)
    kernel /boot/vmlinuz-2.6.16.46-0.12-default root=/dev/sda2 resume=/dev/sda1 showopts
    initrd /boot/initrd-2.6.16.46-0.12-default.orig
title SUSE Linux Enterprise Server 10 SP1 LinuxRAID1
    root (hd0,1)
    kernel /boot/vmlinuz-2.6.16.46-0.12-default root=/dev/md1 resume=/dev/md0 showopts
    initrd /boot/initrd-2.6.16.46-0.12-default
If you attempt to boot the degraded array without referencing an initrd that contains the raid1 driver (or raidautorun on SLES9), you will get a message that the /dev/md1 device cannot be found, and the server will hang. If this happens, reboot into the non-RAID configuration and rebuild the initrd properly.
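Before rebooting, you can check that the rebuilt initrd really contains the RAID pieces. A quick, hedged sanity check, assuming a SLES10-style initrd (a gzip-compressed cpio archive) and the kernel version used in this article; you should see the raid1 module and the mdadm binary in the listing:

linux:~ # zcat /boot/initrd-2.6.16.46-0.12-default | cpio -it 2>/dev/null | grep -E 'raid1|mdadm'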
Copy the System Disk to the Degraded Array
- Mount the degraded array system device to a temporary mount point on the non-RAID system disk.
linux:~ # mkdir -p /mnt/newroot
linux:~ # mount /dev/md1 /mnt/newroot
linux:~ # mount | grep newroot
/dev/md1 on /mnt/newroot type reiserfs (rw)
- Do not copy mnt or proc to the degraded array, but create placeholders for them. The /mnt/newroot/proc directory is used as the proc file system mount point. If it is missing, you will get an error saying /proc is not mounted, and the system will hang at boot time.
linux:~ # mkdir /mnt/newroot/mnt /mnt/newroot/proc
- Copy the non-RAID system disk to the degraded array file system.
linux:~ # ls /
.  ..  bin  boot  dev  etc  home  lib  media  mnt  opt  proc  root  sbin  srv  sys  tmp  usr  var
linux:/mnt/newroot # for i in bin boot dev etc home lib media opt root sbin srv sys tmp var usr
> do
>   printf "Copy files: /$i -> /mnt/newroot/$i ... "
>   cp -a /$i /mnt/newroot
>   echo Done
> done
Copy files: /bin -> /mnt/newroot/bin ... Done
Copy files: /boot -> /mnt/newroot/boot ... Done
Copy files: /dev -> /mnt/newroot/dev ... Done
Copy files: /etc -> /mnt/newroot/etc ... Done
Copy files: /home -> /mnt/newroot/home ... Done
Copy files: /lib -> /mnt/newroot/lib ... Done
Copy files: /media -> /mnt/newroot/media ... Done
Copy files: /opt -> /mnt/newroot/opt ... Done
Copy files: /root -> /mnt/newroot/root ... Done
Copy files: /sbin -> /mnt/newroot/sbin ... Done
Copy files: /srv -> /mnt/newroot/srv ... Done
Copy files: /sys -> /mnt/newroot/sys ... Done
Copy files: /tmp -> /mnt/newroot/tmp ... Done
Copy files: /var -> /mnt/newroot/var ... Done
Copy files: /usr -> /mnt/newroot/usr ... Done
If you copy files that have ACLs, you will get a warning that the original permissions cannot be restored; you will need to restore any ACLs manually. You may also get some permission denied errors on files in the sys directory. Check those files, but the errors are normally nothing to worry about.
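If you do have ACLs to carry over, the getfacl and setfacl tools (from the acl package) can record and replay them. A minimal sketch, assuming /srv is the tree carrying ACLs on your system; getfacl strips the leading '/' from the recorded paths, which is why the restore is run from inside /mnt/newroot:

linux:~ # getfacl -R --skip-base /srv > /root/srv-acls.txt
linux:~ # cd /mnt/newroot && setfacl --restore=/root/srv-acls.txt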
- Confirm the copy was successful.
linux:~ # ls /
.  ..  bin  boot  dev  etc  home  lib  media  mnt  opt  proc  root  sbin  srv  sys  tmp  usr  var
linux:~ # ls /mnt/newroot
.  ..  bin  boot  dev  etc  home  lib  media  mnt  opt  proc  root  sbin  srv  sys  tmp  usr  var
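Comparing used space on the two roots is another quick plausibility check; the figures should be close, though not identical, since proc and mnt were skipped:

linux:~ # df -h / /mnt/newroot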
- Modify the /mnt/newroot/etc/fstab file on the degraded array so the system boots properly.
linux:~ # cat /mnt/newroot/etc/fstab
/dev/sda2  /     reiserfs  defaults  1 1
/dev/sda1  swap  swap      pri=42    0 0
linux:~ # vi /mnt/newroot/etc/fstab
/dev/md1   /     reiserfs  defaults  1 1
/dev/md0   swap  swap      pri=42    0 0
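The same edit can also be made non-interactively with sed, if you prefer; this sketch assumes the exact device names used in this article:

linux:~ # sed -i -e 's|/dev/sda2|/dev/md1|' -e 's|/dev/sda1|/dev/md0|' /mnt/newroot/etc/fstab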
- Reboot and select the degraded array “LinuxRAID1”.
Build the Finished Array
- At this point you should be running your system from the degraded array, and the non-RAID disk is not even mounted.
linux:~ # mount
/dev/md1 on / type reiserfs (rw,acl,user_xattr)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
debugfs on /sys/kernel/debug type debugfs (rw)
udev on /dev type tmpfs (rw)
devpts on /dev/pts type devpts (rw,mode=0620,gid=5)
linux:~ # mdadm --detail /dev/md1
/dev/md1:
        Version : 00.90.03
  Creation Time : Wed Apr  2 10:31:50 2008
     Raid Level : raid1
     Array Size : 1582336 (1545.51 MiB 1620.31 MB)
  Used Dev Size : 1582336 (1545.51 MiB 1620.31 MB)
   Raid Devices : 2
  Total Devices : 1
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Wed Apr  2 12:53:16 2008
          State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           UUID : 413ef3b7:cd4b903b:8b845b16:7261679c
         Events : 0.661

    Number   Major   Minor   RaidDevice State
       0       8       18        0      active sync   /dev/sdb2
       1       0        0        1      removed
- Create a RAID configuration file.
linux:~ # cat << EOF > /etc/mdadm.conf
> DEVICE /dev/sdb1 /dev/sdb2 /dev/sda1 /dev/sda2
> ARRAY /dev/md0 devices=/dev/sdb1,/dev/sda1
> ARRAY /dev/md1 devices=/dev/sdb2,/dev/sda2
> EOF
linux:~ # cat /etc/mdadm.conf
DEVICE /dev/sdb1 /dev/sdb2 /dev/sda1 /dev/sda2
ARRAY /dev/md0 devices=/dev/sdb1,/dev/sda1
ARRAY /dev/md1 devices=/dev/sdb2,/dev/sda2
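If you would rather not type the ARRAY lines by hand, mdadm can generate equivalent entries (keyed by array UUID rather than a device list) from the running arrays. A hedged alternative; review the resulting file before relying on it:

linux:~ # echo 'DEVICE /dev/sda[12] /dev/sdb[12]' > /etc/mdadm.conf
linux:~ # mdadm --detail --scan >> /etc/mdadm.conf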
- Add the non-RAID disk partitions into their respective RAID arrays.
WARNING: This is the point of no return.
linux:~ # mdadm /dev/md0 -a /dev/sda1
mdadm: hot added /dev/sda1
linux:~ # mdadm /dev/md1 -a /dev/sda2
mdadm: hot added /dev/sda2
linux:~ # cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sda2[2] sdb2[0]
      1582336 blocks [2/1] [U_]
      [==================>..]  recovery = 92.8% (1469568/1582336) finish=0.3min speed=5058K/sec

md0 : active raid1 sda1[1] sdb1[0]
      513984 blocks [2/2] [UU]

unused devices: <none>
linux:~ # cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sda2[1] sdb2[0]
      1582336 blocks [2/2] [UU]

md0 : active raid1 sda1[1] sdb1[0]
      513984 blocks [2/2] [UU]

unused devices: <none>
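The rebuild runs in the background and the arrays stay usable while it proceeds. If you want to watch the progress until both arrays show [UU], a simple approach (assuming the watch utility is installed) is:

linux:~ # watch -n 5 cat /proc/mdstat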
- You have now wiped GRUB out of your boot sector! Reinstall GRUB onto both disks; you will need to do this manually.
linux:~ # grub

    GNU GRUB  version 0.97  (640K lower / 3072K upper memory)

 [ Minimal BASH-like line editing is supported.  For the first word, TAB
   lists possible command completions.  Anywhere else TAB lists the possible
   completions of a device/filename. ]

grub> find /boot/grub/stage1
 (hd0,1)
 (hd1,1)
grub> root (hd0,1)
 Filesystem type is reiserfs, partition type 0xfd
grub> setup (hd0)
 Checking if "/boot/grub/stage1" exists... yes
 Checking if "/boot/grub/stage2" exists... yes
 Checking if "/boot/grub/reiserfs_stage1_5" exists... yes
 Running "embed /boot/grub/reiserfs_stage1_5 (hd0)"...  18 sectors are embedded.
succeeded
 Running "install /boot/grub/stage1 (hd0) (hd0)1+18 p (hd0,1)/boot/grub/stage2 /boot/grub/menu.lst"... succeeded
Done.
grub> root (hd1,1)
 Filesystem type is reiserfs, partition type 0xfd
grub> setup (hd1)
 Checking if "/boot/grub/stage1" exists... yes
 Checking if "/boot/grub/stage2" exists... yes
 Checking if "/boot/grub/reiserfs_stage1_5" exists... yes
 Running "embed /boot/grub/reiserfs_stage1_5 (hd1)"...  18 sectors are embedded.
succeeded
 Running "install /boot/grub/stage1 (hd1) (hd1)1+18 p (hd1,1)/boot/grub/stage2 /boot/grub/menu.lst"... succeeded
Done.
grub> quit
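The same installation can be scripted rather than typed interactively. A minimal sketch using GRUB legacy's batch mode and the device layout from this article:

linux:~ # grub --batch <<EOF
root (hd0,1)
setup (hd0)
root (hd1,1)
setup (hd1)
quit
EOF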
WARNING: If you do not reinstall GRUB, you will get a GRUB error after rebooting. If that happens, boot from your installation CD 1, select Installation, your language, then Other and Boot installed system. Once the system is up, follow the steps above to install GRUB onto both drives.
- Remove the original initrd. It is useless at this point.
linux:~ # rm /boot/initrd-2.6.16.46-0.12-default.orig
- Remove the non-RAID boot option in the menu.lst file; it’s also useless.
linux:~ # vi /boot/grub/menu.lst
default 0
timeout 8
title SUSE Linux Enterprise Server 10 SP1 LinuxRAID1
    root (hd0,1)
    kernel /boot/vmlinuz-2.6.16.46-0.12-default root=/dev/md1 resume=/dev/md0 showopts
    initrd /boot/initrd-2.6.16.46-0.12-default
- Change back to your default run level.
linux:~ # vi /etc/inittab
id:3:initdefault:
- You should now have only LinuxRAID1 as a boot option. Reboot and confirm.
Congratulations! You now have a mirrored SLES system booting from the software RAID1 array.
linux:~ # mount
/dev/md1 on / type reiserfs (rw)
proc on /proc type proc (rw)
tmpfs on /dev/shm type tmpfs (rw)
devpts on /dev/pts type devpts (rw,mode=0620,gid=5)
/dev/hdc on /media/dvd type subfs (ro,nosuid,nodev,fs=cdfss,procuid,iocharset=utf8)
/dev/fd0 on /media/floppy type subfs (rw,nosuid,nodev,sync,fs=floppyfss,procuid)
usbfs on /proc/bus/usb type usbfs (rw)
linux:~ # df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/md1              1.6G  421M  1.1G  28% /
tmpfs                 126M  8.0K  126M   1% /dev/shm
linux:~ # mdadm --detail --scan
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=66e0c793:ebb91af6:f1d5cde8:81f9b986
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=0c70c3f5:28556506:9bd29f42:0486b2ea
linux:~ # mdadm --detail /dev/md1
/dev/md1:
        Version : 00.90.03
  Creation Time : Wed Apr  2 10:31:50 2008
     Raid Level : raid1
     Array Size : 1582336 (1545.51 MiB 1620.31 MB)
  Used Dev Size : 1582336 (1545.51 MiB 1620.31 MB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Wed Apr  2 13:42:10 2008
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : 413ef3b7:cd4b903b:8b845b16:7261679c
         Events : 0.1752

    Number   Major   Minor   RaidDevice State
       0       8       18        0      active sync   /dev/sdb2
       1       8        2        1      active sync   /dev/sda2
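Once the resync has finished, it can be worth rehearsing a disk failure on a test system to confirm the mirror behaves as expected. A hedged sketch using mdadm's fail, remove and add options on the root array; wait for /proc/mdstat to show [UU] again before calling the test a success:

linux:~ # mdadm /dev/md1 --fail /dev/sda2
linux:~ # mdadm /dev/md1 --remove /dev/sda2
linux:~ # mdadm /dev/md1 --add /dev/sda2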
Related Links
- Remote Conversion to Linux Software RAID-1 for Crazy Sysadmins HOWTO (http://togami.com/~warren/guides/remoteraidcrazies)
- RAID1 ROOT HOWTO (http://www.parisc-linux.org/faq/raidboot-howto.html)
- Installing SLES on Software RAID1 (http://www.novell.com/communities/node/4132)
Conclusion
There are advantages and disadvantages to running software RAID1 in a production environment. Hardware RAID1 is preferred, but when budgets are tight, software RAID can provide a level of redundancy for your SLES servers. Migrating a post-installed SLES server to software RAID1 is a straightforward process. Remember, a redundancy strategy is not a backup strategy: always maintain proper, functional backups.
Comments
Whilst I appreciate the tip, I can't help but make a comment…
Is this what we call “Progress” in the 21st Century??
In the bad old days of Netware 6.5 it was so very complicated:
(1) Add a Hard Disk – same or larger capacity
(2) At the Server console type NSSMU press Enter
(3) From the menu that comes up Select “Partitions”
(4) Select a partition from the displayed list, i.e. SYS
(5) Press F3 to Mirror
(6) Repeat steps 4 & 5 for the other partitions
The END
WoW!! – what progress.
The Linux community and Novell should pat themselves on the back for all the work that has gone into this.
Yes, NSS volumes are still easy to mirror, even on OES2 for Linux. System disks with Linux file systems are not as easy. That is why I thought it a good idea to document it. Thanks for the feedback.
Do the disks really have to be the same size, or just the partitions?
Just the partitions. Thanks for the clarification.
Also, can I use a procedure something like this with LVM partitions?
Short answer, I don't know. However, you can run mdadm --create on an LVM device. This suggests it's possible, but I've never tried it. Let us know what you find out 🙂
Hi
Very useful and well documented instructions – thanks.
But I had a lot of trouble at the reboot of the degraded array – always waiting for /dev/md0 to appear, despite numerous remakes using mkinitrd -f md.
I had to use mkinitrd -f md -d /dev/md0 to change the root device of the initrd to the newly created array. Using the -v (verbose) flag is also useful. (Step 4 of the "Create a Degraded Array" section; I am booting off the RAID-1 array.)
This may be because I am using opensuse 10.3 rather than the precise version of SUSE used in these notes.
kgb
Hello,
is it possible to use this scenario on top of device mapper multipath devices? We tried to mirror from one storage box to the other, with 8 paths for each LUN. The only result was an unusable busybox prompt and the error message "waiting for device /dev/md1 to appear ... exiting to /bin/sh". The platform was SLES 10 SP2 with the latest patches.
Bye
I have never tried it. Let us know if you get it to work, and how. You might even consider writing your own Cool Solution. 🙂
I need to clone a mirrored HDD image with installed software and data. I found a good solution for it, Clonezilla.org, but Clonezilla still does not support MD devices. So I wrote this script to set up the mirror after cloning.
You can try it (click to download)
The only thing you need to do is run stage1 and then reboot the server. The script does all the work automatically; after about 10 minutes it will start up in runlevel 5 and you will have a mirrored system (with the sync still in progress).
Note: it works only with ext3 file systems and swap. ReiserFS is not supported (I don't use it), but you can make a few changes if you need it.