How to restore Crowbar backup on the admin node

This document (7022979) is provided subject to the disclaimer at the end of this document.

Environment

SUSE OpenStack Cloud 7

Situation

The admin node is in a state that requires restoring it from a Crowbar backup.

Resolution

The following steps clean the admin node and restore the backup without the need to reinstall the admin node.

0. Create a snapshot prior to cleaning up (optional for btrfs)

snapper create -d "admin node prior to cleanup"

1. Check that the admin node uses localhost and the IP address of the admin server as DNS servers.

/etc/resolv.conf can be restored from the cache if needed:

cp -a /var/lib/crowbar/cache/etc/resolv.conf /etc/resolv.conf

DNS forwarders can be added to

/etc/bind/named.conf

Example resolv.conf:

cat >/etc/resolv.conf <<EOF
search examplecloud.suse.com
nameserver 127.0.0.1
nameserver 192.168.124.10
EOF
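
The resolver setup from step 1 can be verified with a short check. The following is a minimal sketch and not part of the original procedure: check_resolv_conf is a hypothetical helper name, and the admin IP 192.168.124.10 is simply the one from the example above.

```shell
# Hypothetical helper (not part of Crowbar): succeeds only if the given
# resolv.conf lists both 127.0.0.1 and the admin IP as nameservers.
check_resolv_conf() {
    conf="$1"
    admin_ip="$2"
    for ns in 127.0.0.1 "$admin_ip"; do
        if ! awk -v ns="$ns" \
            '$1 == "nameserver" && $2 == ns { found = 1 } END { exit !found }' \
            "$conf"; then
            echo "missing nameserver $ns in $conf" >&2
            return 1
        fi
    done
}
```

Example call: check_resolv_conf /etc/resolv.conf 192.168.124.10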

2. Stop all cloud services

for service in crowbar crowbar-jobs crowbar-init \
    tftp chef-{server,solr,expander,client} couchdb \
    apache2 named dhcpd xinetd rabbitmq-server postgresql dnsmasq
do
    systemctl stop "$service"
done

2.1 Stop any remaining processes (if they exist)

killall epmd # part of rabbitmq
killall looper_chef_client.sh
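
To confirm that step 2 really stopped everything, the units can be queried with "systemctl is-active". The following is a minimal sketch: still_active is a hypothetical helper name, and the SYSTEMCTL override exists only to allow dry runs.

```shell
# Hypothetical helper: print every unit from the argument list that
# systemd still reports as active. SYSTEMCTL may be overridden for dry
# runs (an assumption of this sketch, not a Crowbar feature).
still_active() {
    for unit in "$@"; do
        if "${SYSTEMCTL:-systemctl}" is-active --quiet "$unit"; then
            echo "$unit"
        fi
    done
}
```

Example call: still_active crowbar crowbar-jobs crowbar-init apache2 named dhcpd
Empty output means that none of the listed units is active anymore.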

3. Uninstall cloud packages

zypper rm \
  $(rpm -qa | grep -e crowbar -e chef -e cloud -e apache2) \
  couchdb createrepo erlang rabbitmq-server sleshammer yum-common \
  bind bind-chrootenv dhcp-server tftp

4. Delete content in cloud directories

rm -rf /opt/dell/* \
  /etc/{bind,chef,crowbar,crowbar.install.key,dhcp3,xinetd.d/tftp} \
  /etc/sysconfig/{dhcpd,named/*,rabbitmq-server} \
  /var/lib/{chef,couchdb,crowbar,dhcp,rabbitmq} \
  /var/run/{chef,crowbar,named/*,rabbitmq} \
  /var/log/{apache2,chef,couchdb,crowbar,nodes,rabbitmq} \
  /var/cache/chef \
  /var/chef \
  /var/lib/pgsql/* \
  /srv/tftpboot/{discovery/pxelinux.cfg/*,nodes,validation.pem}

(Note: On a non-btrfs root partition, some directories need to be recreated.)
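
Because "rm -rf" is irreversible without a snapshot, it can help to review which of the target paths actually exist before deleting them. The following is a minimal sketch: list_existing is a hypothetical helper name, and it only prints paths, it deletes nothing.

```shell
# Hypothetical helper: print only those arguments that exist on disk, so
# the list can be reviewed before it is passed to "rm -rf".
list_existing() {
    for path in "$@"; do
        if [ -e "$path" ] || [ -L "$path" ]; then
            echo "$path"
        fi
    done
}
```

Example call (the brace expansion requires bash): list_existing /opt/dell/* /var/lib/{chef,couchdb,crowbar,dhcp,rabbitmq}
Patterns that match nothing are passed through literally by the shell and are filtered out by the existence test.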

5. Verify that subprocesses are also down after uninstalling (usually no longer needed)

killall epmd

############ #clean state #############

6. Create snapshot of cleaned admin node (optional for btrfs)

snapper create -d "cleaned admin node"

7. Ensure the admin node is resolvable and that nslookup and ping work
# e.g. start dnsmasq and add IP/name to /etc/hosts as needed

systemctl start dnsmasq.service
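
Adding the admin node to /etc/hosts, as mentioned in step 7, can be done in a repeatable way. The following is a minimal sketch: add_host_entry is a hypothetical helper name, and the host name admin.examplecloud.suse.com used in the example call is an assumption built from the search domain shown in step 1.

```shell
# Hypothetical helper: append "IP FQDN SHORTNAME" to a hosts file unless
# the FQDN is already present as a name on a non-comment line.
add_host_entry() {
    hosts_file="$1"
    ip="$2"
    fqdn="$3"
    short="${fqdn%%.*}"
    if awk -v name="$fqdn" '
        { sub(/#.*/, ""); for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit !found }' "$hosts_file"; then
        return 0    # already listed, leave the file untouched
    fi
    if [ "$short" = "$fqdn" ]; then
        printf '%s %s\n' "$ip" "$fqdn" >> "$hosts_file"
    else
        printf '%s %s %s\n' "$ip" "$fqdn" "$short" >> "$hosts_file"
    fi
}
```

Example call: add_host_entry /etc/hosts 192.168.124.10 admin.examplecloud.suse.com
Afterwards, nslookup and ping against that name should succeed.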

8. Install required patterns

zypper in -t pattern base cloud_admin

9. Apply the PTF fix for the hostname check against /etc/hostname instead of "hostname -f", if still needed

rpm -Fhv /ptf/*.rpm

10. Prepare backup in place

mkdir restore

cd restore

tar -xvf ../backup-restore.tar.gz

10.1 Make sure the admin node is compliant with network.json: the admin IP must match the first IP of an interface (usually eth0)

cp crowbar/configs/crowbar/network.json /etc/crowbar/network.json

10.2 Copy or add hostnames from the backup to "/etc/hosts" if the existing entries are not sufficient
cp crowbar/configs/hostname /etc/hostname

10.3 If using a remote SMT server, add its IP/name to the backup tar (not needed if it is reachable
via a separate bastion network)
echo "10.10.10.10 smt-server.suseexample.com smt-server">> crowbar/configs/hosts
echo "10.10.10.0 192.170.124.1" >> /etc/sysconfig/network/routes

10.4 Create new tar-archive with changes

tar -czvf restore.tar.gz crowbar/ knife/ meta.yml
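
Before uploading, the repacked archive can be checked for the three top-level entries used in step 10.4. The following is a minimal sketch: check_restore_archive is a hypothetical helper name, and it only checks that the entries are present, not that their content is valid.

```shell
# Hypothetical helper: succeeds only if the archive lists crowbar/, knife/
# and meta.yml as top-level entries.
check_restore_archive() {
    archive="$1"
    listing=$(tar -tzf "$archive") || return 1
    for entry in crowbar/ knife/ meta.yml; do
        if ! printf '%s\n' "$listing" | grep -qx -- "$entry"; then
            echo "missing $entry in $archive" >&2
            return 1
        fi
    done
}
```

Example call: check_restore_archive restore.tar.gz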

11. Start crowbar-init

systemctl start crowbar-init

12. Create an empty Crowbar database

crowbarctl database create

(To see which sanity checks cause issues, open the following URL in a browser:

http://localhost:3000/sanity)

######## #restore ########

13. Create a snapshot prior to the restore (optional for btrfs)

snapper create -d "clean admin node prior uploading/restore backup"

14. Upload the restore tarball

crowbarctl backup upload restore.tar.gz --debug --anonymous

(Note: If the upload still does not work, check that the FQDN is used in /etc/hostname, or get the PTF installed.)

15. Restore the backup

crowbarctl backup list
crowbarctl backup restore <name-from-list>

16. Monitor progress

tail -f /var/log/apache2/*log /var/log/crowbar/*

(or access the installer in a browser:

http://localhost:3000/installer)

17. Finally, check that the services are up; if not, start them:

for service in crowbar crowbar-jobs \
    tftp chef-{server,solr,expander,client} couchdb \
    apache2 named dhcpd xinetd rabbitmq-server postgresql dnsmasq
do
    systemctl status "$service"
done
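
The loop in step 17 only shows the status. Starting the units that are not running could be sketched as follows. This is a minimal example and not part of the original procedure: start_if_inactive is a hypothetical helper name, and the SYSTEMCTL override exists only to allow dry runs.

```shell
# Hypothetical helper: start every unit from the argument list that is
# not active, and report what was started. SYSTEMCTL may be overridden
# for dry runs (an assumption of this sketch).
start_if_inactive() {
    for unit in "$@"; do
        if ! "${SYSTEMCTL:-systemctl}" is-active --quiet "$unit"; then
            echo "starting $unit"
            "${SYSTEMCTL:-systemctl}" start "$unit"
        fi
    done
}
```

Example call: start_if_inactive crowbar crowbar-jobs apache2 named dhcpd postgresql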

Cause

A Crowbar backup can only be restored on a fresh/clean admin node.

Additional Information

Crowbar backups can be created regularly, e.g. by running the following as a cron job:
/usr/bin/crowbarctl backup create "$(date '+%Y%m%d-%H%M%S')"
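
When the command is placed directly into a crontab line, each % has to be written as \% because cron treats an unescaped % as a newline. A wrapper script avoids that. The following is a minimal sketch: backup_name and create_backup are hypothetical function names, and the CROWBARCTL override exists only to allow dry runs.

```shell
# Hypothetical wrapper for cron. CROWBARCTL may be overridden for dry runs
# (an assumption of this sketch).

# Build the timestamp used as the backup name, in the form YYYYMMDD-HHMMSS.
backup_name() {
    date '+%Y%m%d-%H%M%S'
}

# Create one backup named after the current time.
create_backup() {
    "${CROWBARCTL:-/usr/bin/crowbarctl}" backup create "$(backup_name)"
}
```

Saved as a script that ends with a call to create_backup, it can be referenced from a crontab entry without any escaping.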

In addition, because the backup routine dumps the configuration, check whether any of the
barclamp proposals has a bad status:

mkdir check
cd check
tar -xzvf <path to backup tar>
grep "crowbar-" knife/roles/*.json|grep -i failed
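
To get only the names of the affected role files, the same text match can be wrapped in a small function. The following is a minimal sketch: failed_roles is a hypothetical helper name, and since the JSON layout of the role files is not described here, it matches plain text exactly like the grep pipeline above rather than parsing JSON.

```shell
# Hypothetical helper: print the role files in a directory that contain a
# line matching both "crowbar-" and "failed" (the latter case-insensitive),
# i.e. the same text match as the grep pipeline in the check above.
failed_roles() {
    roles_dir="$1"
    grep -H "crowbar-" "$roles_dir"/*.json 2>/dev/null \
        | grep -i failed \
        | cut -d: -f1 \
        | sort -u
}
```

Example call: failed_roles knife/roles
Empty output means that no role file matched.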

Note: Restoring a Crowbar backup removes the currently uploaded backups. Download them beforehand.

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.