Date Created: 13 March 2008
Last Updated: 04 June 2008
Author: Sam Prior
Architecture: IBM BladeCenter E with LS41 double wide blades. Connected to an IBM DS3400 storage system via two Cisco Fibre Channel Switch Modules (multiple physical paths).
If you want to run your XEN virtual machines from a storage system on your SAN then ideally you need to ensure that you have multiple physical connections/paths between your servers/blades and the storage system. This will provide redundancy in the event of a fibre channel switch failure, HBA failure or the loss of one of your fibre connections.
Multipathing tells the host server that there are multiple physical connections to the same logical drive on a storage system. The server is made aware of these connections, but each logical drive is presented to the operating system only once (as a virtual drive). There are two ways to implement multipathing:
- Use the native Linux multipath tools included in the SLES 10 SP1 distribution.
- Use drivers and tools supplied by your storage system manufacturer.
Both methods have advantages and disadvantages. For this implementation I have decided to use the multipath tools provided by IBM. The disadvantage is that you must ensure these 3rd party tools and drivers are compatible with any updates made to the SLES 10 operating system (always test new updates on one host server first). However, I have found the following advantages when using the IBM supplied software:
- It removes the burden of multipathing from the kernel.
- IBM have seen a dramatic increase in performance with the RDAC drivers compared to the native Linux tools.
- I also prefer the way the IBM RDAC drivers present the multipathed devices to the operating system.
You can download the Linux RDAC package from the IBM support site (in the Downloads section for the DS3400 storage system).
For multipathing to work correctly you first need to ensure that any failover support on your Host Bus Adapter (HBA) is disabled. This feature is used by the HBA to detect a loss of connection on your fibre network. In my system I am using a QLogic HBA, and the server uses the qla2xxx kernel module for it. You can confirm which module is in use by typing the lsmod command at your server console.
Depending on the make of your HBA you need to enter parameters into the /etc/modprobe.conf.local file to disable the HBA failover support. If you have a QLogic HBA and are using the qla2xxx kernel module do the following to disable failover support:
- Boot the server using the XEN kernel and login as root.
- At the server console type vi /etc/modprobe.conf.local.
- Navigate to the last line in the file and then press Ins to enter Input mode.
Press the Enter key to start a new line and type options qla2xxx qlport_down_retry=1.
- Press Esc to return to Command mode. Type :wq and then press Enter to save the file and exit vi.
- At the console type mkinitrd && reboot. This will create a new Initial RAM disk and, if successful, reboot the server.
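The manual vi edit above can also be scripted. The following is a minimal sketch, assuming the option line from the steps above is the one you want; the function name add_qla_option is hypothetical and not part of any IBM or QLogic tool. It appends the line only if it is not already in the file, so it can safely be run more than once.

```shell
# Hypothetical helper: append the qla2xxx option line to a modprobe
# configuration file unless the same line is already there.
add_qla_option() {
    conf_file="$1"
    line='options qla2xxx qlport_down_retry=1'
    # -x matches the whole line, -F treats the pattern as a fixed string
    if grep -qxF "$line" "$conf_file" 2>/dev/null; then
        echo "already present in $conf_file"
    else
        printf '%s\n' "$line" >> "$conf_file"
        echo "added to $conf_file"
    fi
}
```

On a host you would call it as add_qla_option /etc/modprobe.conf.local and then run mkinitrd && reboot as described in the steps above. The sketch assumes the file already ends with a newline.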
Before proceeding check that your host server type is set correctly on your storage system. On IBM storage systems you should use the LNXCLVMWARE option rather than Linux. This is because multiple servers will be accessing the same logical drive. On the IBM DS3400 the host server type is changed using the IBM Storage Manager software.
Each blade server in my environment has a single HBA with two ports. Each of these ports is connected to a separate Fibre Channel switch in the chassis. Each Fibre Channel switch has a connection to each of the controllers on the storage system (the DS3400 has two controllers). This means that each blade has four different physical paths to each logical drive on the storage system. A SLES server without multipathing will therefore see each logical drive four times when you use the lsscsi command. You cannot access a logical drive via both controllers at the same time. However, to allow for failover on the fibre channel paths, the server needs to be able to at least “see” the logical drive via the second controller (even though it can’t access it). This is required to enable redundancy between the controllers.
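The path count above is simple arithmetic, sketched here with the values from this environment. The variable names are only illustrative.

```shell
# Values taken from the environment described above.
hba_ports=2      # ports on the single HBA, one per Fibre Channel switch
controllers=2    # storage controllers reachable through each switch
paths=$((hba_ports * controllers))
echo "Each logical drive appears $paths times without multipathing"
```

With two ports and two controllers this prints a count of 4, which matches the four entries per logical drive that lsscsi shows without multipathing.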
If you have multiple logical drives that are the same size I would recommend creating them one at a time on the storage system so that you can be sure which logical drive has which device name on your host servers. At a later date I would recommend mounting or accessing your logical drives via the ID or UUID rather than the device name as this may not be persistent.
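To find a persistent name for a device, you can look for the symbolic links that point at it. The following is a minimal sketch, assuming your system populates /dev/disk/by-id and that readlink supports the -f option (GNU coreutils); the function name links_for_device is hypothetical.

```shell
# Hypothetical helper: list the symbolic links in a directory (by default
# /dev/disk/by-id) that resolve to the given device node.
links_for_device() {
    device=$(readlink -f "$1")
    link_dir="${2:-/dev/disk/by-id}"
    for link in "$link_dir"/*; do
        [ -L "$link" ] || continue
        if [ "$(readlink -f "$link")" = "$device" ]; then
            echo "$link"
        fi
    done
}
```

For example, links_for_device /dev/sdb would print the by-id names that currently refer to /dev/sdb, and one of those names can then be used in place of the device name.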
Before you proceed ensure the HBA can see at least one logical drive on the storage system by using the BIOS utility included with the HBA. If you’re using a QLogic HBA you can start this utility by pressing Ctrl + Q at server boot up.
Before you install the 3rd party IBM software ensure the following is in place:
- Check the RDAC Readme file for supported HBAs and the required driver version.
- Ensure the HBA driver is installed and working.
- The RDAC driver does not support auto-volume transfer/auto-disk transfer (AVT/ADT). This is why we set the host type on the storage system to LNXCLVMWARE.
- If you have multiple HBAs in your server only one of them should be connected to the storage system.
Also before you start be aware of the following gotchas:
- The Linux SCSI layer does not support skipped (sparse) LUNs. Always use consecutive LUN numbers when creating your logical drives.
- If your server has multiple HBA ports and each port can see both controllers on the storage system (on an unzoned switch) the Linux RDAC driver might return I/O errors during controller failover. To solve this, use multiple unconnected Fibre Channel switches, or zone the Fibre Channel switch into multiple zones, so that each port can only see one of the controllers.
- If you delete a logical drive you must restart the server for the change to be detected.
- Do not load or unload the RDAC driver stack or HBA driver using the “modprobe” utility.
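The consecutive LUN rule in the list above is easy to check before you map drives to a host. The following is a minimal sketch, assuming numbering is expected to start at 0; the function name luns_are_consecutive is hypothetical and the LUN numbers are supplied by you rather than read from the storage system.

```shell
# Hypothetical helper: succeed only if the given LUN numbers, once sorted,
# run 0, 1, 2, ... with no gaps. Assumes numbering starts at 0.
luns_are_consecutive() {
    expected=0
    for lun in $(printf '%s\n' "$@" | sort -n); do
        if [ "$lun" -ne "$expected" ]; then
            echo "gap: expected LUN $expected, found LUN $lun"
            return 1
        fi
        expected=$((expected + 1))
    done
    return 0
}
```

Calling luns_are_consecutive 0 1 2 succeeds silently, while luns_are_consecutive 0 1 3 reports the gap and returns a non-zero status.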
The steps outlined below assume you have downloaded the RDAC Linux driver onto a floppy disk. However downloading it from a server, for example your FTP installation server, is much quicker and easier.
- Insert the floppy disk into the media tray and make the blade you wish to install the RDAC driver onto the media tray owner.
- Boot the server using the XEN kernel.
- Login to the server as root.
- I often create a folder in /root to contain drivers and other software I need to install on the server. This provides easy access to the installation files in future and acts as a reference for version checking.
- Open YaST. Go to Software > Software Management. In the Search field type kernel-source and then click on Search. You should see the kernel-source package appear in the right hand screen. Tick the box next to the package. Now type gcc in the Search field and click Search. Again in the right hand window place a tick in the box next to the gcc package. Click Accept. You may see a prompt about Automatic Changes that have been made to resolve dependencies. Click Continue.
Once the software has installed click No when prompted if you wish to install/remove more packages.
- Check that a soft link exists to the newly installed kernel source files and if not create one:
ls -l /usr/src
This lists the contents of the /usr/src directory. If a link already exists you will see linux -> linux-2.6.16.54-0.2.5. If the link doesn’t already exist use the command below to create it. This link gets created by default on SLES 10 SP1.
ln -sf /usr/src/linux-2.6.16.54-0.2.5 /usr/src/linux
After executing this command repeat the ls -l /usr/src command to ensure the link has been successfully created.
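The check-then-create logic for the soft link can be combined into one step. The following is a minimal sketch; the function name ensure_src_link is hypothetical, and it deliberately leaves an existing link untouched rather than overwriting it.

```shell
# Hypothetical helper: report an existing link, otherwise create it.
ensure_src_link() {
    target="$1"   # the versioned kernel source directory
    link="$2"     # the link name, for example /usr/src/linux
    if [ -L "$link" ]; then
        echo "link exists: $link -> $(readlink "$link")"
    else
        ln -s "$target" "$link" && echo "link created: $link -> $target"
    fi
}
```

You would pass the versioned source directory shown by ls -l /usr/src as the first argument and /usr/src/linux as the second.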
- Next type the following commands to ensure the kernel version is synchronised between the device driver and the running kernel:
- cd /usr/src/linux
Uses the soft link to change to the source files directory.
- make mrproper
This completely cleans the kernel tree.
- cp /boot/config-`uname -r` .config
Copies the configuration file of the running kernel into the source tree.
- make oldconfig
Updates the configuration using the .config file.
- make modules
Builds the modules (this isn’t required for some kernel versions but use it on SLES 10).
- Now we need to build the driver:
- cd ~/software/linuxrdac-09.01.C5.06
Changes to the directory created when the RDAC package was unpacked.
- make clean
Removes the old driver modules in the directory.
- make
Compiles all driver modules and utilities in a server with multiple processors (SMP kernel).
- To build a new Initial RAM disk do the following:
- make install
This copies the driver modules to the kernel module tree and builds the new RAMdisk image (*.img), which includes the RDAC modules. Ensure your current directory is ~/software/linuxrdac-09.01.C5.06 otherwise this command may not work.
Follow the instructions that are displayed at the end of the build process to add a new boot menu option which uses /boot/mpp-<kernel version>.img as the initial RAMdisk image. The path for the initial RAM disk is relative to the boot device (referred to in line “root (hd0,6)”). During the build process you are prompted to confirm that only one type of HBA is connected to the storage system. Ensure this is correct and then type “yes” followed by Enter.
If near the end of the RDAC install you see error message “All of your loopback devices are in use” refer to the installation documentation.
At the end of the build process you will see instructions similar to “You must now edit your boot loader configuration file /boot/grub/menu.lst to add a new boot menu which uses mpp-2.6.16.54-0.2.5-xenpae.img as the initrd image”. Make a note of the new file name.
- You can update the GRUB boot menu either using YaST or the console. I would recommend copying the existing XEN boot option and then editing the copy so that the original can be used as a fallback. Ensure you create the new entry correctly including updating the title name and modifying the initial ram disk file name including the .img extension. Set the new boot option as the default.
- Reboot the server.
- Type “lsmod” to ensure the driver stack is correctly loaded. When multipathing is enabled you should see sg, mppUpper, mppVhba and the HBA driver (qla2xxx).
- To ensure the RDAC driver discovered the available physical LUNs and created virtual LUNs for them type ls -lR /proc/mpp.
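The lsmod check in the steps above can be automated. The following is a minimal sketch that reads lsmod-style output on standard input and looks for the four modules named above; the function name check_rdac_modules is hypothetical, and the module list would need changing for a different HBA driver.

```shell
# Hypothetical helper: read lsmod-style output on standard input and report
# any of the expected modules that are missing.
check_rdac_modules() {
    loaded=$(awk 'NR > 1 { print $1 }')   # skip the lsmod header line
    missing=0
    for module in sg mppUpper mppVhba qla2xxx; do
        if ! printf '%s\n' "$loaded" | grep -qx "$module"; then
            echo "missing: $module"
            missing=1
        fi
    done
    return "$missing"
}
```

On the server you would run lsmod | check_rdac_modules. It prints nothing and returns zero when all four modules are loaded.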
The commands below are typed into the server console (each followed by Enter) to copy the RDAC package onto the server. Each command is followed by a short explanation.
cd /root
This ensures you’re in the /root directory.
mkdir software
Creates a directory to store my drivers etc.
If the server is running in runlevel 5 the floppy drive should automatically be mounted to /media. If you are running in runlevel 3 or the floppy drive is not automatically mounted do the following:
mkdir /mnt/floppy
Creates a directory to be used as my mount point.
mount /dev/sdf /mnt/floppy
Mounts the floppy drive to /mnt/floppy. The device name (/dev/sdf) will vary depending upon your system devices.
cp /mnt/floppy/rdac-LINUX-09.01.C5.06-source.tar.gz ~/software
Copies the RDAC driver from the floppy to my software folder.
cd ~/software
Changes to the software folder.
tar -zxvf rdac-LINUX-09.01.C5.06-source.tar.gz
Unpacks the file to a directory called linuxrdac-09.01.C5.06 in the software directory.
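If you install the driver on several blades, the copy and unpack steps can be wrapped in one function. The following is a minimal sketch; the function name stage_driver is hypothetical, and the archive path is whatever location you downloaded or mounted the package at.

```shell
# Hypothetical helper: copy a driver archive into a software folder and
# unpack it there.
stage_driver() {
    archive="$1"                    # path to the .tar.gz file
    dest="${2:-$HOME/software}"     # folder that holds drivers
    mkdir -p "$dest" || return 1
    cp "$archive" "$dest"/ || return 1
    # Run tar in a subshell so the caller's working directory is unchanged.
    ( cd "$dest" && tar -zxf "$(basename "$archive")" )
}
```

For example, stage_driver /mnt/floppy/rdac-LINUX-09.01.C5.06-source.tar.gz would leave both the archive and the unpacked directory under ~/software.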
You can use the multipath tools to test the configuration. However a very simple way to properly check the system is to create a logical drive and mount this on each server. Create a filesystem on the logical drive and an empty file using the touch command, e.g. touch test. To ensure that the multipath driver is working correctly use a command such as “cat /dev/random > /vms/test” on the Linux server. This reads from the random device (which generates random data) and redirects the output to a file rather than the screen. This file should be the one you created on the logical drive. Leave this command running and turn off one of the fibre channel switches and/or remove fibre connections to the storage system. If multipathing is working correctly the cat /dev/random > /vms/test command will continue to run without any errors. Also, using the IBM Storage Manager software, you can check controller ownership. If the multipath driver is working correctly the logical drive should have failed over to the second controller when you turned the fibre channel switch off.
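As an alternative to the cat test described above, a loop that writes numbered lines makes it easier to see afterwards whether any write was lost. The following is a minimal sketch, not part of the IBM tools; the function name write_test is hypothetical, and the date +%s format is assumed to be supported by your date command. Writes go through the page cache, so an I/O error may surface later than the write that caused it.

```shell
# Hypothetical helper: append a numbered, timestamped line to a file at a
# fixed interval and stop at the first write that reports an error.
write_test() {
    file="$1"             # file on the multipathed logical drive
    count="$2"            # number of lines to write
    interval="${3:-1}"    # seconds to wait between writes
    i=1
    while [ "$i" -le "$count" ]; do
        if ! echo "$i $(date +%s)" >> "$file"; then
            echo "write $i failed"
            return 1
        fi
        sleep "$interval"
        i=$((i + 1))
    done
    echo "wrote $count lines without error"
}
```

For example, write_test /vms/test 600 would write one line per second for ten minutes while you switch off a fibre channel switch or pull a fibre connection.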
For advice/assistance email slesNOSPAM@runbox.com (without the NO SPAM) or better still post a comment to this article so other users can see. I endeavour to respond to all emails, depending on demand!