My floating Container
Since I started working in IT, I have always been fascinated by the philosophy of “No Downtime”, and nowadays there are plenty of technologies to achieve this goal: virtualization, clustering, and containers, amongst others. All these actors are part of a beautiful, complex, and challenging play. I like to think that I, and all the people in this line of work, are the directors of that play: we find the best configuration, we try and experiment, and eventually we fix the problems before going “Live”.
At SUSE, as a Technical Support Engineer, I come across all these amazing technologies, and since my attention has lately been drawn to HA and Docker, I decided to show you how to create a Docker container as a cluster resource: My floating Container.
Pre-requisites:
- Internet connection
- KVM Host with SLES 12 SP1
- 2 or 3 nodes with SLES 12 SP1
For this post, I’m not going to go through the basic configuration of a cluster; I will just assume that this documentation has been followed.
Cluster is up and running:
giovanna:/ # crm_mon -1
Last updated: Sat Apr 15 13:34:49 2017 Last change: Sat Apr 15 13:22:07 2017 by root via crm_resource on giovanna
Stack: corosync
Current DC: giovanna (version 1.1.13-20.1-6f22ad7) - partition with quorum
3 nodes and 2 resources configured

Online: [ giovanna laura sofia ]

stonith-sbd (stonith:external/sbd): Started giovanna
admin_addr (ocf::heartbeat:IPaddr2): Started sofia
My configuration:
giovanna:/ # crm configure show
node 1084777475: giovanna \
        attributes standby=off
node 1084777476: laura \
        attributes standby=off
node 1084777477: sofia \
        attributes standby=off maintenance=off
primitive admin_addr IPaddr2 \
        params ip=192.168.100.101 \
        op monitor interval=10 timeout=20
primitive stonith-sbd stonith:external/sbd \
        params pcmk_delay_max=30s
property cib-bootstrap-options: \
        have-watchdog=true \
        dc-version=1.1.13-20.1-6f22ad7 \
        cluster-infrastructure=corosync \
        cluster-name=hacluster \
        stonith-enabled=true \
        no-quorum-policy=ignore \
        placement-strategy=balanced \
        last-lrm-refresh=1492253688
rsc_defaults rsc-options: \
        resource-stickiness=1 \
        migration-threshold=3
op_defaults op-options: \
        timeout=600 \
        record-pending=true
This is what my cluster looks like: a basic setup with a STONITH device for fencing and a floating IP. I named the nodes after my mother and my two sisters (I know... pretty sweet, right?) and, obviously, Giovanna (my mom) is our DC, so I will start with her.
Install Docker:
Enable the container module:
giovanna:/ # SUSEConnect -p sle-module-containers/12/x86_64 -r ''
Install the packages:
giovanna:/ # zypper in docker
Add user to the “docker” group:
giovanna:/ # usermod -aG docker <user>
Enable and start Docker:
giovanna:/ # systemctl enable docker.service && systemctl start docker.service
NOTE: Repeat the steps above on all nodes.
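Since the same steps have to be repeated on every node, they can be looped over SSH. What follows is only a minimal sketch, not something taken from the SUSE documentation: it assumes passwordless root SSH to each node, uses my node names, and the helper names (`docker_install_steps`, `install_docker_on`, `install_all`) are made up for this post. The user to add to the “docker” group is passed as an argument, and by default the script only prints what it would run.

```shell
#!/bin/sh
# Sketch: repeat the Docker installation steps on every cluster node.
# Assumptions: passwordless root SSH; node names taken from this post.
NODES="giovanna laura sofia"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = run them over SSH

# Join the steps from this post into one remote command line.
# $1 is the account to add to the "docker" group.
# zypper gets --non-interactive because nobody can answer prompts over SSH.
docker_install_steps() {
    printf '%s && ' \
        "SUSEConnect -p sle-module-containers/12/x86_64 -r ''" \
        "zypper --non-interactive in docker" \
        "usermod -aG docker $1"
    printf '%s\n' "systemctl enable docker.service && systemctl start docker.service"
}

# Run (or, in dry-run mode, print) the steps for one node.
install_docker_on() {
    target="$1"
    steps="$(docker_install_steps "$2")"
    if [ "$DRY_RUN" = "1" ]; then
        echo "ssh root@$target \"$steps\""
    else
        ssh "root@$target" "$steps"
    fi
}

# install_all USER: loop over all nodes.
install_all() {
    if [ -z "$1" ]; then
        echo "usage: install_all <user>" >&2
        return 1
    fi
    for n in $NODES; do
        install_docker_on "$n" "$1"
    done
}
```

Calling `install_all` with a user name prints one `ssh` line per node; setting `DRY_RUN=0` beforehand actually runs them.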
Create your docker cluster resource:
giovanna:/ # crm configure
crm(live)configure# primitive cont_float_opensuse ocf:heartbeat:docker \
        params image=opensuse allow_pull=yes run_opts=-it run_cmd="/bin/bash" \
        op start interval=0 timeout=90 \
        op stop interval=0 timeout=90 \
        op monitor interval=30 timeout=30 \
        meta target-role=Started
crm(live)configure# show
node 1084777475: giovanna \
        attributes standby=off
node 1084777476: laura \
        attributes standby=off
node 1084777477: sofia \
        attributes standby=off maintenance=off
primitive admin_addr IPaddr2 \
        params ip=192.168.100.101 \
        op monitor interval=10 timeout=20
primitive cont_float_opensuse docker \
        params image=opensuse allow_pull=yes run_opts=-it run_cmd="/bin/bash" \
        op start interval=0 timeout=90 \
        op stop interval=0 timeout=90 \
        op monitor interval=30 timeout=30 \
        meta target-role=Started
primitive stonith-sbd stonith:external/sbd \
        params pcmk_delay_max=30s
property cib-bootstrap-options: \
        have-watchdog=true \
        dc-version=1.1.13-20.1-6f22ad7 \
        cluster-infrastructure=corosync \
        cluster-name=hacluster \
        stonith-enabled=true \
        no-quorum-policy=ignore \
        placement-strategy=balanced \
        last-lrm-refresh=1492253688
rsc_defaults rsc-options: \
        resource-stickiness=1 \
        migration-threshold=3
op_defaults op-options: \
        timeout=600 \
        record-pending=true
I deliberately created my resource with the following parameters:
image=opensuse allow_pull=yes
The “image=” parameter tells the cluster which image (local or remote) the Docker resource should run. For the purpose of this post I just wrote “opensuse”, so Docker will automatically pull the latest openSUSE image from Docker Hub.
The “allow_pull=yes” parameter allows the image to be downloaded automatically from the specified registry, in this case Docker Hub, although a local registry can also be set up.
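To make the interplay of these two parameters concrete, here is a small sketch of the decision as I understand it: use the image if it is already on the node, pull it if pulling is allowed, and give up otherwise. This is my own illustration, not the resource agent's code; `ensure_image` is a hypothetical helper, and the messages and return codes are made up.

```shell
#!/bin/sh
# Sketch of the decision behind image= and allow_pull= (not the agent's code).
# Usage: ensure_image IMAGE ALLOW_PULL
ensure_image() {
    image="$1"
    allow_pull="$2"
    # `docker images -q` prints an image ID only if the image is stored locally.
    if [ -n "$(docker images -q "$image")" ]; then
        echo "image present locally"
        return 0
    fi
    # Not local: download it, but only if that has been allowed.
    if [ "$allow_pull" = "yes" ]; then
        echo "pulling $image"
        docker pull "$image"
        return $?
    fi
    echo "image missing and pulling not allowed" >&2
    return 1
}
```

On a node with Docker installed, `ensure_image opensuse yes` would mirror what my resource is configured to do.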
run_opts=-it run_cmd="/bin/bash"
The last two parameters are optional. “run_opts” adds options to the “docker run” command (the “-d” option is embedded in the cluster script that launches this particular resource, so it is always specified), and “run_cmd” sets the command, application, or script that you want to run in your container.
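Put together, my parameters end up in a command line that looks roughly like the one printed below. Treat it as an approximation under two assumptions of mine: that the container is named after the resource (which is what the logs later in this post suggest) and that the options are simply appended in this order. `docker_run_cmdline` is a hypothetical helper that only prints the command; it does not start anything.

```shell
#!/bin/sh
# Sketch: print the approximate `docker run` command line that results from
# the resource parameters. Assumption: the container is named after the resource.
docker_run_cmdline() {
    name="$1"       # resource name, reused as the container name
    image="$2"      # image= parameter
    run_opts="$3"   # run_opts= parameter, added after the built-in -d
    run_cmd="$4"    # run_cmd= parameter
    echo "docker run -d --name=$name $run_opts $image $run_cmd"
}

docker_run_cmdline cont_float_opensuse opensuse "-it" "/bin/bash"
# prints: docker run -d --name=cont_float_opensuse -it opensuse /bin/bash
```

The “-it” is there for a reason: a detached “/bin/bash” with no terminal attached would exit right away, and the cluster would see the container as stopped.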
Commit and exit the configuration:
crm(live)configure# commit
crm(live)configure# exit
Oplà, my container is up and running on my DC node, Giovanna:
giovanna:~ # crm_mon -1
Last updated: Sat Apr 15 14:35:09 2017 Last change: Sat Apr 15 14:35:02 2017 by root via crm_resource on giovanna
Stack: corosync
Current DC: giovanna (version 1.1.13-20.1-6f22ad7) - partition with quorum
3 nodes and 3 resources configured

Online: [ giovanna laura sofia ]

stonith-sbd (stonith:external/sbd): Started giovanna
admin_addr (ocf::heartbeat:IPaddr2): Started sofia
cont_float_opensuse (ocf::heartbeat:docker): Started giovanna
But Giovanna can’t always be present; she can be temporarily unavailable. So what happens if my container needs to rely on Sofia or Laura? Let’s have a look.
Something bad happened…oops:
giovanna:~ # echo c > /proc/sysrq-trigger
My cluster felt it:
Last updated: Sat Apr 15 14:45:05 2017 Last change: Sat Apr 15 14:44:34 2017 by root via crm_resource on giovanna
Stack: corosync
Current DC: sofia (version 1.1.13-20.1-6f22ad7) - partition with quorum
3 nodes and 3 resources configured

Node giovanna: UNCLEAN (offline)
Online: [ laura sofia ]

stonith-sbd (stonith:external/sbd): Started laura
admin_addr (ocf::heartbeat:IPaddr2): Started sofia
cont_float_opensuse (ocf::heartbeat:docker): Started giovanna (UNCLEAN)
But after a few seconds:
cont_float_opensuse (ocf::heartbeat:docker): Starting laura
And finally:
cont_float_opensuse (ocf::heartbeat:docker): Started laura
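If you would rather follow the move from a script than by eye, the node can be extracted from the “crm_mon -1” output. This is a small sketch that assumes the line layout shown in this post (resource name first, then “Started” followed by the node); `resource_node` is a hypothetical helper of mine.

```shell
#!/bin/sh
# Sketch: print the node on which a resource is reported as "Started".
# Reads `crm_mon -1` output on stdin; $1 is the resource name.
# Prints nothing while the resource is still "Starting" or is stopped.
# Note: a node flagged (UNCLEAN) is still printed, as crm_mon still lists it.
resource_node() {
    awk -v rsc="$1" '
        $1 == rsc {
            # Look for the word "Started"; the node name is the next field.
            for (i = 2; i < NF; i++) {
                if ($i == "Started") { print $(i + 1); exit }
            }
        }'
}

# Usage on a cluster node:
#   crm_mon -1 | resource_node cont_float_opensuse
```

Run in a loop with a short sleep, this shows the moment the container lands on its new node.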
What happened in the background? The cluster lost a member:
Apr 15 14:51:44 laura corosync[1669]: [TOTEM ] A processor failed, forming new configuration.
Apr 15 14:51:45 laura corosync[1669]: [TOTEM ] A new membership (192.168.100.4:1056) was formed. Members left: 1084777475
[..]
Apr 15 14:51:45 laura stonith-ng[1685]: notice: crm_update_peer_proc: Node giovanna[1084777475] - state is now lost (was member)
Apr 15 14:51:45 laura crmd[1689]: notice: crm_reap_unseen_nodes: Node giovanna[1084777475] - state is now lost (was member)
My container can’t run on Giovanna anymore:
Apr 15 14:51:46 laura pengine[1688]: warning: Action cont_float_opensuse_stop_0 on giovanna is unrunnable (offline)
So it gets moved to Laura:
Apr 15 14:51:46 laura pengine[1688]: notice: Move cont_float_opensuse (Started giovanna -> laura)
[..]
Apr 15 14:52:00 laura crmd[1689]: notice: Initiating action 106: start cont_float_opensuse_start_0 on laura
First problem, the image is not locally present on Laura:
Apr 15 14:52:00 laura docker(cont_float_opensuse)[667]: NOTICE: Image (opensuse) does not exist locally but will be pulled during start
First problem solved! The image gets automatically pulled from the registry:
Apr 15 14:52:00 laura docker(cont_float_opensuse)[678]: NOTICE: Beginning pull of image, opensuse
After pulling the image, the container can be started on a new node:
Apr 15 14:52:01 laura docker(cont_float_opensuse)[702]: INFO: running container cont_float_opensuse for the first time
Apr 15 14:52:12 laura docker(cont_float_opensuse)[830]: INFO: 5ceb3bd9e4445c5ca078c6bcedb5500a7235c91482d5747552e7fe8f1707ae7b
Apr 15 14:52:12 laura docker(cont_float_opensuse)[845]: NOTICE: Container cont_float_opensuse started successfully
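The timestamps in Laura's log also tell how long the move took as seen from that node. The sketch below takes the first TOTEM message (14:51:44) as the start and the “started successfully” notice (14:52:12) as the end; that choice of endpoints is mine. It assumes GNU date, and since syslog lines carry no year, the year is passed separately. `elapsed_seconds` is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: seconds between two syslog-style timestamps (GNU date assumed).
# Usage: elapsed_seconds YEAR "Mon DD HH:MM:SS" "Mon DD HH:MM:SS"
elapsed_seconds() {
    year="$1"
    # -u keeps both conversions in the same zone, so the difference is exact.
    start=$(date -u -d "$2 $year" +%s)
    end=$(date -u -d "$3 $year" +%s)
    echo $((end - start))
}

elapsed_seconds 2017 "Apr 15 14:51:44" "Apr 15 14:52:12"
# prints: 28
```

Most of that time goes into pulling the image, so a node that already has it locally should need less.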
My container became a floating bubble, free to fly around my 3-node cluster: Mission No Downtime – Completed.
Obviously, my configuration is very simple and just for educational purposes, but try to imagine the potential of these technologies bound together for one purpose: keeping your application(s), your website(s), even your OS(s) always running and flexible, able to move when needed or in case of a problem.
That’s why I love SUSE!