Joining a CTDB cluster into a domain
This document (7006496) is provided subject to the disclaimer at the end of this document.
The winbind log file /var/log/samba/log.winbindd contains entries like
[2010/07/20 17:32:59, 0] winbindd/winbindd_util.c:782(init_domain_list)
Could not fetch our SID - did we join?
The cluster also logs messages in /var/log/messages stating that winbind failed and CTDB cannot start.
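A quick way to confirm this symptom is to search the winbind log for the message quoted above. The following is a minimal sketch; the function name winbind_join_missing is made up for illustration, and the default path is the one given in this document.

```shell
# Return 0 if the given winbind log contains the "did we join?" message.
# winbind_join_missing is a hypothetical helper name, not part of Samba or CTDB.
winbind_join_missing() {
    logfile="${1:-/var/log/samba/log.winbindd}"
    # -q: no output, only the exit status; -F: treat the pattern as a fixed string
    grep -qF "Could not fetch our SID - did we join?" "$logfile"
}

# Usage:
#   winbind_join_missing && echo "winbind has no domain keys yet"
```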
The reason is that CTDB, on start, parses /etc/samba/smb.conf and, if it sees entries for a domain membership, tries to start winbind. At this point, however, no keys for the domain are stored. winbind therefore uses the information in smb.conf and krb5.conf to contact the KDC, but without the keys it cannot do so, and its start fails. Because winbind is started as part of CTDB, the whole CTDB Resource Agent fails and is stopped. A cleanup or restart of the Resource Agent does not solve the issue, because the missing information remains missing.
Parameter in smb.conf to one value over the whole cluster.
To remedy the situation, CTDB has to be started without winbind. The process is outlined below.
All nodes apart from one should be set to standby. This ensures that only one node will run CTDB after the changes.
The no-quorum-policy has to be set to ignore. The reason is that the necessary changes are temporary and should be made on as few machines as possible, to minimize work and sources of error. This point does not apply to a two-node cluster, where the no-quorum-policy has to be ignore anyway.
The CTDB Resource Agent should be set to stop. Even though it failed before, this ensures that the Resource Agent causes no additional issues.
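The three preparation steps above could be scripted with crmsh roughly as follows. This is a sketch under assumptions: the resource ID and node names are placeholders to be replaced with the values from your own cluster configuration, and the helper names run and prepare_cluster are made up for illustration. By default the commands are only printed; set DRY_RUN=0 to execute them.

```shell
# Print a command, or execute it when DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "$*"; fi
}

# prepare_cluster RESOURCE NODE...
#   RESOURCE: ID of the CTDB resource in the cluster configuration (placeholder)
#   NODE...:  every node except the one that stays active (placeholders)
prepare_cluster() {
    resource="$1"
    shift
    for node in "$@"; do
        run crm node standby "$node"
    done
    # Not needed on a two-node cluster, where the policy is ignore already.
    run crm configure property no-quorum-policy=ignore
    run crm resource stop "$resource"
}

# Usage (dry run): prepare_cluster ctdb node2 node3
```

Printing first and executing second makes it easier to review the commands before they touch the cluster.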
The first line of /etc/sysconfig/ctdb is checked. If it contains an entry like "# CTDB-RA: Auto-generated by", then the CTDB script failed once on stop, and this file, /etc/sysconfig/ctdb, will no longer be changed by the CTDB Resource Agent. This means that all changes have to be made in /etc/sysconfig/ctdb.
If there is NO entry like the one mentioned above, then the CTDB script did not fail on stop. It WILL change /etc/sysconfig/ctdb on the next start of the CTDB Resource Agent, overwrite ALL changes with its own values, and restore the old values on stop. In this case the changes have to be made in /usr/lib/ocf/resource.d/heartbeat/CTDB.
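The check of the first line can be done by hand, or with a small helper like the sketch below. The function name ctdb_sysconfig_is_ra_generated is hypothetical; the marker string and the default path are the ones quoted in this document.

```shell
# Return 0 if the first line of the file carries the Resource Agent marker,
# 1 otherwise (including when the file is empty or unreadable).
ctdb_sysconfig_is_ra_generated() {
    file="${1:-/etc/sysconfig/ctdb}"
    # Only the first line matters; a marker further down does not count.
    head -n 1 "$file" 2>/dev/null | grep -qF "# CTDB-RA: Auto-generated by"
}

# Usage:
#   if ctdb_sysconfig_is_ra_generated; then
#       echo "marker found: the Resource Agent will not rewrite this file"
#   else
#       echo "no marker: the Resource Agent will overwrite this file on start"
#   fi
```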
In case of /etc/sysconfig/ctdb the change is to set
# if left comented out then it will be autodetected based on smb.conf
In case of /usr/lib/ocf/resource.d/heartbeat/CTDB the change is to set
After these changes the CTDB Resource Agent can be started and will start without error, because only nmb and smb are started, but not winbind.
Now the domain can be joined.
After the successful join of the domain the CTDB Resource Agent should be stopped again.
Then the changes in /usr/lib/ocf/resource.d/heartbeat/CTDB or /etc/sysconfig/ctdb should be reverted, which means that the value for
is set in the file that was changed before.
Then the CTDB Resource Agent can be started again and will work.
If the no-quorum-policy was changed to ignore, it should be set back to its original value. This does not apply to a two-node cluster, where the no-quorum-policy has to be ignore anyway.
After this the nodes in standby can be set to active.
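The final steps (starting the Resource Agent, restoring the quorum policy, and reactivating the standby nodes) could look like this with crmsh. As before, this is only a sketch: the resource ID, the original policy value, and the node names are placeholders, and the helper names run and restore_cluster are made up for illustration. Commands are printed unless DRY_RUN=0 is set.

```shell
# Print a command, or execute it when DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "$*"; fi
}

# restore_cluster RESOURCE POLICY NODE...
#   RESOURCE: ID of the CTDB resource (placeholder)
#   POLICY:   original no-quorum-policy value; pass "" to leave it untouched,
#             for example on a two-node cluster
#   NODE...:  the nodes that were put into standby (placeholders)
restore_cluster() {
    resource="$1"
    policy="$2"
    shift 2
    run crm resource start "$resource"
    if [ -n "$policy" ]; then
        run crm configure property "no-quorum-policy=$policy"
    fi
    for node in "$@"; do
        run crm node online "$node"
    done
}

# Usage (dry run): restore_cluster ctdb stop node2 node3
```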
CTDB will now run as a domain member on all nodes. Tests with
will succeed on all CTDB nodes.
- Document ID: 7006496
- Creation Date: 23-Jul-2010
- Modified Date: 03-Mar-2020
- SUSE Linux Enterprise Server
For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com