Deploying clustered controller nodes with DRBD on SUSE cloud
To deploy controllers that work together as a cluster, you need to deploy Pacemaker on both nodes. Simple in principle, right? Well, when you do this on SUSE Cloud with DRBD mirroring, there are a few things to keep in mind, and that is what this tutorial is about.
To get started, open the Crowbar interface on the admin node, then open the menu “Barclamps” > “OpenStack”. In the OpenStack list, click the “Edit” button for the “Pacemaker” option. Add a name for the proposal if you want, then click “Create”.
Once the new proposal is created, there are many options to choose from before applying the barclamp to the nodes. If you are not sure what to select and want more detail on each individual option, this might help you.
For this tutorial I will be using the following setup:
Of all the options available here, “Attributes”, “DRBD” (optional) and “Stonith” need special attention. These are the ones most likely to cause deployment problems if set incorrectly.
Under “Attributes”, set the policy to “ignore” ONLY if you are deploying the Pacemaker barclamp on exactly two controllers. With “ignore”, the cluster keeps managing resources even when quorum is lost, which a two-node cluster cannot avoid when one node fails or a split-brain situation occurs. If you are using more than two controllers, consider a different policy.
If you decide to use an SBD device for STONITH, make sure the device is correctly set up on the nodes before deploying the barclamp. In the scenario illustrated above, /dev/sda is a SCSI disk on a different machine that is configured to be connected when the system boots.
NOTE: you can only specify the path to the STONITH devices after adding both nodes as “pacemaker-cluster-member” at the bottom of the page and clicking the “Save” button.
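Before saving the proposal, it can be worth confirming on each controller that the path you plan to enter really is a block device. The snippet below is a minimal sketch of such a check. The function name check_sbd_path is made up for this example, and it only tests that the device exists on the local node, not that both nodes see the same shared disk.

```shell
# Hypothetical helper: report whether a path is a block device on this node.
check_sbd_path() {
    dev="$1"
    if [ -b "$dev" ]; then
        echo "ok: $dev is a block device"
    else
        echo "missing: $dev is not a block device on this node"
        return 1
    fi
}

# Example (run on each controller; /dev/sda is the device used in this tutorial):
#   check_sbd_path /dev/sda
```

If the check fails on either controller, fix the disk connection first and only then enter the path in the proposal.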
If you decide to use DRBD as the shared device for the resources, keep in mind that DRBD is essentially RAID 1 over the network. To deploy it correctly, you need one free disk on the first controller and one free disk on the second, so that DRBD can sync them. Make sure these free disks are available before applying the Pacemaker barclamp. The barclamp should also install every package needed to deploy DRBD.
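A quick way to see whether each controller has a spare disk is to look at lsblk output. The helper below is only a sketch, and the name unused_disks is invented here. It lists whole disks that have no partitions or other child devices. Crowbar keeps its own record of which disks are claimed, so treat this as a rough sanity check, not as the logic the barclamp actually applies.

```shell
# Hypothetical helper: read `lsblk -rn -o NAME,TYPE,PKNAME` output on stdin
# and print whole disks that no other device names as its parent.
# Limitation: a disk with a file system written directly on it (no partition
# table) has no children either and would still be listed.
unused_disks() {
    awk '
        $2 == "disk" { disks[$1] = 1 }   # remember every whole disk
        $3 != ""     { used[$3] = 1 }    # remember every parent device
        END {
            for (d in disks)
                if (!(d in used)) print d
        }
    ' | sort
}

# Example (run on each controller):
#   lsblk -rn -o NAME,TYPE,PKNAME | unused_disks
```

An empty result on either controller suggests there is no obviously free disk for DRBD on that node.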
The working process is similar to this:
- barclamp(pacemaker) sets up LVM on the first available disk it finds.
- barclamp(postgres) creates a logical volume on that LVM.
- barclamp(postgres) then creates DRBD on that LV and puts a file system on top of it.
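To make the layering in the list above more concrete, here is a sketch that prints (it does not run) a manual equivalent of the same stack using standard LVM and DRBD tools. This is not what the barclamps literally execute. The volume group, logical volume and resource names, the size, the file system type and /dev/drbd0 are all assumptions chosen for illustration, and the drbdadm lines assume DRBD 8.4 or later with a matching resource file already present under /etc/drbd.d/.

```shell
# Hypothetical helper: print the commands for an LVM -> LV -> DRBD -> file
# system stack on the given disk. Nothing is executed.
print_drbd_stack_plan() {
    disk="$1"            # e.g. /dev/sdb (assumed free disk)
    vg="drbd"            # assumed volume group name
    lv="postgresql"      # assumed logical volume name
    res="postgresql"     # assumed DRBD resource name
    printf '%s\n' \
        "pvcreate $disk" \
        "vgcreate $vg $disk" \
        "lvcreate --name $lv --size 1G $vg" \
        "drbdadm create-md $res" \
        "drbdadm up $res" \
        "drbdadm primary --force $res" \
        "mkfs.xfs /dev/drbd0"
}

# Example:
#   print_drbd_stack_plan /dev/sdb
```

The first two commands correspond to the first bullet, lvcreate to the second, and the drbdadm and mkfs commands to the third. Promoting the resource to primary and creating the file system would happen on one node only.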
The first step is the tricky one: the system sets up LVM on the first available disk it finds, which, as you can imagine, might not be optimal for the customer's environment. Sometimes the “first available disk” is not the disk we want, but the second or third one, for example. Unfortunately, there is no “easy” way to tell Crowbar exactly which disk to use for DRBD. If no suitable disk is found, the following error appears when you try to apply the Pacemaker barclamp:
[2017-07-18T14:11:50+08:00] FATAL: RuntimeError: There is no suitable disk for LVM for DRBD!
To set exactly which device to use, SSH into the admin node and run the following command:
# crowbarctl node list
This shows a list of all available nodes with their current status. Here is an example:
| Name | Alias | Group | Status |
| cloudadmin.cloudlab | cloudadmin | sw-unknown | ready |
| d52-54-00-0e-e4-e1.cloudlab | cloudcompute | sw-unknown | ready |
| d52-54-00-6f-ed-db.cloudlab | cloudcontroller1 | sw-unknown | ready |
| d52-54-00-8e-2f-14.cloudlab | cloudcontroller2 | sw-unknown | ready |
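If you prefer not to copy the long node names by hand, a small filter can pull the full name for a given alias out of that table. The sketch below assumes the pipe-separated layout shown in the example above. The function name node_name_for_alias is made up for this illustration.

```shell
# Hypothetical helper: read the `crowbarctl node list` table on stdin and
# print the full node name whose Alias column matches the first argument.
node_name_for_alias() {
    wanted="$1"
    awk -F'|' -v wanted="$wanted" '
        {
            name = $2; al = $3
            # trim the padding around each cell
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            gsub(/^[ \t]+|[ \t]+$/, "", al)
        }
        al == wanted { print name }
    '
}

# Example:
#   crowbarctl node list | node_name_for_alias cloudcontroller1
```

With the example table above, the alias cloudcontroller1 resolves to d52-54-00-6f-ed-db.cloudlab.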
Next, edit the configuration of the controller nodes on the admin node’s Chef server with these commands:
# knife node edit d52-54-00-6f-ed-db.cloudlab
# knife node edit d52-54-00-8e-2f-14.cloudlab
It is recommended to use the full name of the node instead of its alias. Each of the above commands should open a configuration file. Once in there, search for these specific lines:
Make sure the device path for LVM_DRBD is correct there on both controller nodes. If it needs changing, change it there, save and close the file, and then try to re-apply the Pacemaker barclamp.
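To check the result without opening an editor, you can dump the node's attributes and look for the LVM_DRBD entries. This is a sketch that relies on the -l (long) and -F json options of knife node show. The helper name show_lvm_drbd is made up for this example, and the surrounding attribute layout is deliberately not reproduced here, so compare the output with what you saw in the editor.

```shell
# Hypothetical helper: read a node's JSON attributes on stdin and print every
# line mentioning LVM_DRBD with a few lines of context and line numbers.
# Returns non-zero when no LVM_DRBD entry is present.
show_lvm_drbd() {
    grep -n -B 3 -A 3 'LVM_DRBD'
}

# Example (run on the admin node, once per controller):
#   knife node show d52-54-00-6f-ed-db.cloudlab -l -F json | show_lvm_drbd
#   knife node show d52-54-00-8e-2f-14.cloudlab -l -F json | show_lvm_drbd
```

Running it before and after the edit on both controllers gives a read-only confirmation that the device path was saved as intended.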