
Since I started working in IT, I have always been fascinated by the philosophy of “No Downtime”, and nowadays there are plenty of technologies to achieve this goal: virtualization, clustering, and containers, amongst others. All these actors are part of a beautiful, complex, and challenging play. I like to think that I, and all the people in this line of work, are the directors of that play: we find the best configuration, we try and experiment, and eventually we fix the problems before going “Live”.

At SUSE, as a Technical Support Engineer, I come across all these amazing technologies, and since my attention has lately been drawn to HA and Docker, I decided to show you how to create a Docker container as a cluster resource: my floating container.

Pre-requisites:

  • Internet connection
  • KVM Host with SLES 12 SP1
  • 2 or 3 nodes with SLES 12 SP1

For this post, I’m not going to go through the basic configuration of a cluster; I will just assume that this documentation has been followed.

Cluster is up and running:

giovanna:/ # crm_mon -1
 Last updated: Sat Apr 15 13:34:49 2017 Last change: Sat Apr 15 13:22:07 2017 by root via crm_resource on giovanna
 Stack: corosync
 Current DC: giovanna (version 1.1.13-20.1-6f22ad7) - partition with quorum
 3 nodes and 2 resources configured

Online: [ giovanna laura sofia ]

stonith-sbd (stonith:external/sbd): Started giovanna
admin_addr (ocf::heartbeat:IPaddr2): Started sofia

My configuration:

giovanna:/ # crm configure show

node 1084777475: giovanna \
 attributes standby=off
node 1084777476: laura \
 attributes standby=off
node 1084777477: sofia \
 attributes standby=off maintenance=off
primitive admin_addr IPaddr2 \
 params ip=192.168.100.101 \
 op monitor interval=10 timeout=20
primitive stonith-sbd stonith:external/sbd \
 params pcmk_delay_max=30s
property cib-bootstrap-options: \
 have-watchdog=true \
 dc-version=1.1.13-20.1-6f22ad7 \
 cluster-infrastructure=corosync \
 cluster-name=hacluster \
 stonith-enabled=true \
 no-quorum-policy=ignore \
 placement-strategy=balanced \
 last-lrm-refresh=1492253688
rsc_defaults rsc-options: \
 resource-stickiness=1 \
 migration-threshold=3
op_defaults op-options: \
 timeout=600 \
 record-pending=true

This is what my cluster looks like: a basic setup with a STONITH device for fencing and a floating IP. I named the nodes after my mother and my two sisters (I know... pretty sweet, right?) and, obviously, Giovanna (my mom) is our DC, so I will start with her.

Install Docker:

Enable the container module:

giovanna:/ # SUSEConnect -p sle-module-containers/12/x86_64 -r ''

Install the packages:

giovanna:/ # zypper in docker

Add user to the “docker” group:

giovanna:/ # usermod -aG docker <user>

Enable and start Docker:

giovanna:/ # systemctl enable docker.service && systemctl start docker.service

NOTE: Repeat the steps above on all nodes.
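The per-node steps above could be scripted rather than retyped. The following is a minimal sketch, assuming passwordless root SSH to each node and that the containers module has already been enabled on every node with SUSEConnect. It only prints the commands so they can be reviewed first; the helper name docker_setup_cmd is hypothetical.

```shell
#!/bin/sh
# Assumption: node names are the ones used in this post.
NODES="giovanna laura sofia"

# Print (do not execute) the Docker setup command for one node.
docker_setup_cmd() {
    printf 'ssh root@%s "zypper --non-interactive in docker && systemctl enable docker.service && systemctl start docker.service"\n' "$1"
}

# One command line per node, ready for review.
for node in $NODES; do
    docker_setup_cmd "$node"
done
```

Once the printed commands look right, the output of the script can be piped to "sh" to run them.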

Create your docker cluster resource:

giovanna:/ # crm configure

crm(live)configure# primitive cont_float_opensuse ocf:heartbeat:docker \
params image=opensuse allow_pull=yes run_opts=-it run_cmd="/bin/bash" \
op start interval=0 timeout=90 \
op stop interval=0 timeout=90 \
op monitor interval=30 timeout=30 \
meta target-role=Started

crm(live)configure#  show

node 1084777475: giovanna \
 attributes standby=off
node 1084777476: laura \
 attributes standby=off
node 1084777477: sofia \
 attributes standby=off maintenance=off
primitive admin_addr IPaddr2 \
 params ip=192.168.100.101 \
 op monitor interval=10 timeout=20
primitive cont_float_opensuse docker \
 params image=opensuse allow_pull=yes run_opts=-it run_cmd="/bin/bash" \
 op start interval=0 timeout=90 \
 op stop interval=0 timeout=90 \
 op monitor interval=30 timeout=30 \
 meta target-role=Started
primitive stonith-sbd stonith:external/sbd \
 params pcmk_delay_max=30s
property cib-bootstrap-options: \
 have-watchdog=true \
 dc-version=1.1.13-20.1-6f22ad7 \
 cluster-infrastructure=corosync \
 cluster-name=hacluster \
 stonith-enabled=true \
 no-quorum-policy=ignore \
 placement-strategy=balanced \
 last-lrm-refresh=1492253688
rsc_defaults rsc-options: \
 resource-stickiness=1 \
 migration-threshold=3
op_defaults op-options: \
 timeout=600 \
 record-pending=true

I deliberately created my resource with the following parameters:

image=opensuse
allow_pull=yes

The “image” parameter tells the cluster which image to use for the Docker resource (a local or a remote image). For the purpose of this post I just wrote “opensuse”, so Docker will automatically pull the latest openSUSE image from Docker Hub.

The parameter “allow_pull=yes” allows the image to be downloaded automatically from the specified registry, in this case Docker Hub, although a local registry can also be set up.

run_opts=-it
run_cmd="/bin/bash"

The last two parameters are optional. “run_opts” passes extra options to the “docker run” command (the “-d” option is embedded in the cluster script that launches this particular resource, so it is always specified), and “run_cmd” sets the command, application, or script that you want to run in your container.
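To make the role of these parameters more concrete, here is a rough sketch of how they could map onto a “docker run” command line. This is an approximation for illustration, not the resource agent's actual code: the helper build_docker_run is hypothetical, and the exact invocation may differ between resource-agents versions. It assumes the container is named after the cluster resource, as the log messages later in this post suggest.

```shell
#!/bin/sh
# Build (but do not run) an approximate "docker run" line for a resource.
# $1 = container name, $2 = image, $3 = run_opts, $4 = run_cmd
build_docker_run() {
    # "-d" is always present; run_opts and run_cmd are appended as given.
    echo "docker run -d --name=$1 $3 $2 $4"
}

build_docker_run cont_float_opensuse opensuse "-it" "/bin/bash"
# prints: docker run -d --name=cont_float_opensuse -it opensuse /bin/bash
```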

Commit and exit the configuration:

crm(live)configure# commit
crm(live)configure# exit
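The interactive session above can also be kept as a file, which is handy for version control or for rebuilding a test cluster. The sketch below writes the same primitive definition to a file; the path in CRM_FILE is an arbitrary assumption, and the script itself does not touch the cluster.

```shell
#!/bin/sh
# Assumption: the file path is arbitrary; override CRM_FILE if desired.
CRM_FILE="${CRM_FILE:-/tmp/cont_float_opensuse.crm}"

# Write the same primitive definition used in the interactive session.
cat > "$CRM_FILE" <<'EOF'
primitive cont_float_opensuse ocf:heartbeat:docker \
    params image=opensuse allow_pull=yes run_opts=-it run_cmd="/bin/bash" \
    op start interval=0 timeout=90 \
    op stop interval=0 timeout=90 \
    op monitor interval=30 timeout=30 \
    meta target-role=Started
EOF

echo "Wrote $CRM_FILE"
# To apply it on a cluster node (not run here):
#   crm configure load update "$CRM_FILE"
```

Loading the file with "crm configure load update" merges it into the running configuration, so the existing STONITH and IP resources are left alone.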

Oplà! My container is up and running on my DC node Giovanna:

giovanna:~ # crm_mon -1
Last updated: Sat Apr 15 14:35:09 2017 Last change: Sat Apr 15 14:35:02 2017 by root via crm_resource on giovanna
Stack: corosync
Current DC: giovanna (version 1.1.13-20.1-6f22ad7) - partition with quorum
3 nodes and 3 resources configured

Online: [ giovanna laura sofia ]

 stonith-sbd (stonith:external/sbd): Started giovanna
 admin_addr (ocf::heartbeat:IPaddr2): Started sofia
 cont_float_opensuse (ocf::heartbeat:docker): Started giovanna
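For scripting, the node that hosts a resource can be picked out of this output. The following is a small sketch that assumes the one-line-per-resource format shown above; the function name resource_node is hypothetical.

```shell
#!/bin/sh
# Print the node a resource is cleanly "Started" on.
# $1 = resource name; "crm_mon -1" output is read from stdin.
resource_node() {
    # Match only lines that end in "Started <node>", so states such as
    # "Starting" or "Started giovanna (UNCLEAN)" print nothing.
    awk -v rsc="$1" '$1 == rsc && $(NF-1) == "Started" { print $NF }'
}

# Example with a line copied from the output above:
echo " cont_float_opensuse (ocf::heartbeat:docker): Started giovanna" |
    resource_node cont_float_opensuse
# prints: giovanna
```

On a live cluster the input would come from "crm_mon -1 | resource_node cont_float_opensuse". Pacemaker's own "crm_resource --resource cont_float_opensuse --locate" gives similar information without any parsing.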

But Giovanna can’t always be present; she may be temporarily unavailable. So what happens if my container needs to rely on Sofia or Laura? Let’s have a look.

Something bad happened…oops:

giovanna:~ # echo c > /proc/sysrq-trigger

My cluster felt it:

Last updated: Sat Apr 15 14:45:05 2017 Last change: Sat Apr 15 14:44:34 2017 by root via crm_resource on giovanna
Stack: corosync
Current DC: sofia (version 1.1.13-20.1-6f22ad7) - partition with quorum
3 nodes and 3 resources configured

Node giovanna: UNCLEAN (offline)
Online: [ laura sofia ]

stonith-sbd (stonith:external/sbd): Started laura
admin_addr (ocf::heartbeat:IPaddr2): Started sofia
cont_float_opensuse (ocf::heartbeat:docker): Started giovanna (UNCLEAN)

But after a few seconds:

cont_float_opensuse (ocf::heartbeat:docker): Starting laura

And finally:

cont_float_opensuse     (ocf::heartbeat:docker):        Started laura
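Instead of watching the status by hand, the wait for the resource to come back can be scripted. This is a minimal sketch, assuming the "crm_mon -1" line format shown in this post; the function name wait_for_failover and the POLL_INTERVAL variable are hypothetical.

```shell
#!/bin/sh
# Poll "crm_mon -1" until a resource is Started on a node other than
# the failed one. Prints the new node and returns 0, or returns 1
# after the given number of attempts.
# $1 = resource, $2 = failed node, $3 = attempts (default 30)
wait_for_failover() {
    local rsc="$1"
    local failed_node="$2"
    local tries="${3:-30}"
    local node
    while [ "$tries" -gt 0 ]; do
        # Only a line ending in "Started <node>" counts as running.
        node=$(crm_mon -1 | awk -v rsc="$rsc" \
            '$1 == rsc && $(NF-1) == "Started" { print $NF }')
        if [ -n "$node" ] && [ "$node" != "$failed_node" ]; then
            echo "$node"
            return 0
        fi
        tries=$((tries - 1))
        sleep "${POLL_INTERVAL:-2}"
    done
    return 1
}

# Usage on a surviving cluster node (not run here):
#   wait_for_failover cont_float_opensuse giovanna
```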

What happened in the background? The cluster lost a member:

Apr 15 14:51:44 laura corosync[1669]: [TOTEM ] A processor failed, forming new configuration.
Apr 15 14:51:45 laura corosync[1669]: [TOTEM ] A new membership (192.168.100.4:1056) was formed. Members left: 1084777475
[..]
Apr 15 14:51:45 laura stonith-ng[1685]: notice: crm_update_peer_proc: Node giovanna[1084777475] - state is now lost (was member)
Apr 15 14:51:45 laura crmd[1689]: notice: crm_reap_unseen_nodes: Node giovanna[1084777475] - state is now lost (was member)

My container can’t run on Giovanna anymore:

Apr 15 14:51:46 laura pengine[1688]: warning: Action cont_float_opensuse_stop_0 on giovanna is unrunnable (offline)

So it gets moved to Laura:

Apr 15 14:51:46 laura pengine[1688]: notice: Move cont_float_opensuse (Started giovanna -> laura)
[..]
Apr 15 14:52:00 laura crmd[1689]: notice: Initiating action 106: start cont_float_opensuse_start_0 on laura

First problem: the image is not present locally on Laura:

Apr 15 14:52:00 laura docker(cont_float_opensuse)[667]: NOTICE: Image (opensuse) does not exist locally but will be pulled during start

First problem solved! The image gets automatically pulled from the registry:

Apr 15 14:52:00 laura docker(cont_float_opensuse)[678]: NOTICE: Beginning pull of image, opensuse
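Pulling at failover time works, but it makes the start depend on the registry being reachable at that moment. One option is to pre-pull the image on every node ahead of time. Here is a minimal sketch; the function name ensure_image is hypothetical, and it assumes the docker CLI is available to the user running it.

```shell
#!/bin/sh
# Pull an image only if it is not already present locally.
# $1 = image name
ensure_image() {
    # "docker images -q" prints image IDs, or nothing if there is no match.
    if [ -n "$(docker images -q "$1")" ]; then
        echo "present: $1"
    else
        docker pull "$1" && echo "pulled: $1"
    fi
}

# Usage on each cluster node (not run here):
#   ensure_image opensuse
```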

After pulling the image, the container can be started on a new node:

Apr 15 14:52:01 laura docker(cont_float_opensuse)[702]: INFO: running container cont_float_opensuse for the first time
Apr 15 14:52:12 laura docker(cont_float_opensuse)[830]: INFO: 5ceb3bd9e4445c5ca078c6bcedb5500a7235c91482d5747552e7fe8f1707ae7b
Apr 15 14:52:12 laura docker(cont_float_opensuse)[845]: NOTICE: Container cont_float_opensuse started successfully
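The timestamps in these excerpts give a rough idea of how long the failover took: the first corosync message is logged at 14:51:44 and the container reports a successful start at 14:52:12. A small sketch of the arithmetic, with a hypothetical helper to_seconds that assumes both timestamps fall on the same day:

```shell
#!/bin/sh
# Convert HH:MM:SS to seconds since midnight.
to_seconds() {
    # Split the argument on ":" into $1 $2 $3.
    old_ifs=$IFS
    IFS=:
    set -- $1
    IFS=$old_ifs
    # Strip one leading zero so values like "08" are not read as octal.
    echo $(( ${1#0} * 3600 + ${2#0} * 60 + ${3#0} ))
}

start=$(to_seconds 14:51:44)   # "A processor failed"
end=$(to_seconds 14:52:12)     # "Container ... started successfully"
echo "failover took $(( end - start )) seconds"
# prints: failover took 28 seconds
```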

My container became a floating bubble, free to fly around my 3-node cluster: Mission No Downtime – Completed.

Obviously, my configuration is very simple and just for educational purposes, but try to imagine the potential of these technologies, bound together for one purpose: having your application(s), your website(s), even your OS(s), always running and flexible, able to move when needed or in case of a problem.

That’s why I love SUSE!


Category: Containers, Enterprise Linux, SUSE Linux Enterprise Server, Technical Solutions
This entry was posted Wednesday, 19 April, 2017 at 3:03 pm