Step 2 Toward Enhanced Update Infrastructure Access
Earlier this year I had the pleasure of announcing a major upgrade to our update infrastructure that serves updates to on-demand instances in AWS, Azure, and GCP.
In that blog post I promised that we would enable network traffic routing that allows you to send data from your instances in the Public Cloud through your data center and back to the SUSE update infrastructure in the respective framework. The goal was to have that done by the end of 2019. Due to various considerations around timing in general, and some technical snafus that needed to be worked out, the timing slipped and we had to push the transition date into 2020.
The good news is that we have a definite date for the transition. This also requires some action on your side.
June 1st, 2020
This is the date when we will start the cut-over process.
By May 31st, 2020 you must upgrade the package “cloud-regionsrv-client” on your instances to version 9.0.0 or greater. It is best to pull the latest version available:
zypper up cloud-regionsrv-client
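After the update it can be useful to confirm that an instance really meets the 9.0.0 minimum. The following is a minimal sketch, not an official SUSE tool: the function names `version_ge` and `check_client` are made up for illustration, and the comparison assumes GNU `sort -V` is available and that the package version is a plain dotted number.

```shell
#!/bin/bash
# Minimum client version required before the cut-over (taken from this post).
REQUIRED="9.0.0"

# version_ge A B: succeeds if version A is greater than or equal to version B.
# Relies on GNU "sort -V" (version sort); assumes plain dotted version strings.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_client: ask rpm for the installed package version and compare it
# against REQUIRED. Hypothetical helper, for illustration only.
check_client() {
    local installed
    if ! installed=$(rpm -q --queryformat '%{VERSION}' cloud-regionsrv-client); then
        echo "cloud-regionsrv-client is not installed"
        return 2
    fi
    if version_ge "$installed" "$REQUIRED"; then
        echo "OK: cloud-regionsrv-client $installed"
    else
        echo "UPDATE NEEDED: cloud-regionsrv-client $installed is older than $REQUIRED"
        return 1
    fi
}
```

Calling `check_client` on an instance returns a non-zero status when the package is missing or too old, so it can be dropped into whatever fleet management tooling you already use.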
What’s happening on the backend
On June 1st, 2020 we will enable new access checks that are instance bound and relate the instance back to the originating image. This data is only sent along to the update servers by code implemented in version 9.0.0 or greater of the cloud-regionsrv-client package. If this information is not sent along, access to the update infrastructure will be denied.
Once we are done enabling the new checks we will drop the current IP address based access restrictions. At present, based on IP address, access to the update infrastructure is restricted to traffic originating from within the Cloud framework. If the IP address of the originating system is not in the known list of IPs handed out by the framework DHCP servers, access is denied. This is why you have to punch an egress hole into your VPC setups in AWS and your VNET configuration in Azure. The egress hole is also necessary in GCE, but no special instructions are required. Additionally, the current access control prevents updates to systems that take advantage of the AWS “bring your own IP” feature.
The IP address checks are expected to be disabled no later than June 19, 2020. After this date the route Cloud -> DC -> Cloud to access the SUSE update infrastructure is open in all frameworks and all regions. The switch-over will happen by region in no particular order. This means that any time after May 31st, 2020 and before June 19, 2020 you may find regions that are open to the new route while other regions are not yet open to it.
Again, as a reminder, you must update to cloud-regionsrv-client >= 9.0.0 by May 31st, 2020.
The Special Cases
First, those still using SLES 11 SP4 instances. We have re-enabled the migration process that provides an ugly, do-at-your-own-risk online migration from SLES 11 SP4 to SLES 12 SP1. This is your last chance to take advantage of this process. On June 1st, 2020 this migration route will be permanently blocked. It is now or never.
If you are running in AWS and you have created your own images, additional action may be required. During creation of AMIs along a certain route, namely whenever operating directly on the underlying volume, it was possible to lose the billing construct. This effectively turned an on-demand image into a BYOS image. In order to enable the Cloud -> DC -> Cloud traffic routing and to support “bring your own IP” in AWS we need the billing construct to be present. This means that running instances that have access to the update infrastructure but no longer have a billing code will also lose access to the update infrastructure unless they are converted back to on-demand instances. In an effort to help identify such instances we have implemented the “slecompliancereport” utility. This is available in S3 as follows:
for SLES 12 SP1 and later in the SLE 12 release series and
for SLES 15 and later in the SLE 15 release series.
For images created after July 2019 the lost billing construct is no longer an issue.
Now of course you are going to want to know what the slecompliancechecker code does. You can see the details in GitHub. At a high level the code will examine every running instance in a given account in the connected regions, determine if the instance is a SLES instance, and then figure out if the instance is registered to the SUSE operated update infrastructure. If the instance is registered the billing construct will be checked. If there is no billing construct the instance will be reported as out of compliance. For details about the options run
it is also recommended to read the man page
You can also point the compliance checker to a given instance or run the check in only one region.
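The decision flow described above can be sketched as a small shell function. This is an illustration only, not the actual checker code: the function name `classify_instance`, its yes/no arguments, and the result labels are all hypothetical, and how each fact is determined for a real instance is deliberately left out.

```shell
#!/bin/bash
# classify_instance IS_SLES IS_REGISTERED HAS_BILLING
# Each argument is "yes" or "no". Hypothetical helper mirroring the flow
# described in the text; the labels it prints are invented for this sketch.
classify_instance() {
    local is_sles=$1 is_registered=$2 has_billing=$3
    if [ "$is_sles" != "yes" ]; then
        echo "skipped"             # not a SLES instance, nothing to check
    elif [ "$is_registered" != "yes" ]; then
        echo "not-checked"         # not registered to the SUSE operated update infrastructure
    elif [ "$has_billing" = "yes" ]; then
        echo "compliant"           # registered and the billing construct is present
    else
        echo "out-of-compliance"   # registered, but the billing construct is missing
    fi
}
```

For example, `classify_instance yes yes no` describes the problem case from this post: a SLES instance that is registered to the update infrastructure but has lost its billing construct.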
Knowing the state of your instances is great, of course. But what to do if instances are out of compliance?
Unfortunately this requires downtime and careful consideration of your approach. One way to get there is to start a new instance from the SUSE published on-demand images and then transfer your workload installation and all your configuration to the new instance. This may turn out to be tedious.
The second option is to use the so-called “root swap” method. Start a new on-demand instance from a SUSE on-demand image that exactly matches the product and service pack you are running in the out of compliance instance. For example, if you are running a SLES 12 SP3 instance (it’s time to upgrade by the way, SLES 12 SP5 has just been released and SLES 12 SP3 has been out of general support for a while) you want to start a new SLES 12 SP3 on-demand instance. Lucky for you, we are not yet fully enforcing the life-cycle policy in AWS and you can still launch instances from SLES 12 SP3 on-demand images. To find the proper image to use, check out pint.
The new instance must run in the same AZ (Availability Zone) as the instance you want to bring back into compliance. Once the new instance is up and running, stop both instances: the new instance just created from the SUSE on-demand image and the instance you want to bring back into compliance. The next step is to force-detach the root volume on both instances. It is a good idea to label the volumes or write down their IDs somewhere to avoid any confusion or mix-up during this process.
Once the volumes are detached, graft the root volume from the instance that was out of compliance onto the new on-demand instance by attaching it to the stopped instance you just started from the SUSE on-demand image. You can then terminate the old instance and delete the extra root volume, i.e. the root volume of the instance you started from the SUSE on-demand image. Start the new instance and voila, you are done.
While you are doing the root volume swap, make sure you move any additional storage disks to the new instance and fix up your IP addresses, if needed. The new instance will have new IP addresses.
Once all the other administrative tasks are complete, fire up the new instance that now has the root volume from your previously out of compliance instance. For verification you can run slecompliancereport against the new instance and you’ll see that it is reported as compliant.
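The root swap steps can be sketched with the AWS CLI as shown below. Treat this as a starting point under stated assumptions rather than a finished tool: the names `run` and `root_swap` and the `DRY_RUN` switch are invented for this example, the instance IDs, volume IDs, and root device name have to be supplied by you, and moving additional storage disks or IP addresses is not covered. By default the script only prints the commands it would run.

```shell
#!/bin/bash
# Hypothetical helper: graft the root volume of an out of compliance instance
# onto a freshly launched on-demand instance in the same Availability Zone.
# With DRY_RUN=1 (the default) the aws commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "aws $*"
    else
        aws "$@"
    fi
}

# root_swap OLD_INSTANCE NEW_INSTANCE OLD_ROOT_VOLUME NEW_ROOT_VOLUME ROOT_DEVICE
root_swap() {
    local old_inst=$1 new_inst=$2 old_vol=$3 new_vol=$4 root_dev=$5

    # Stop both instances and wait until they are fully stopped.
    run ec2 stop-instances --instance-ids "$old_inst" "$new_inst"
    run ec2 wait instance-stopped --instance-ids "$old_inst" "$new_inst"

    # Force-detach the root volume of both instances.
    run ec2 detach-volume --force --volume-id "$old_vol"
    run ec2 detach-volume --force --volume-id "$new_vol"
    run ec2 wait volume-available --volume-ids "$old_vol" "$new_vol"

    # Attach the old root volume to the new on-demand instance.
    run ec2 attach-volume --volume-id "$old_vol" --instance-id "$new_inst" --device "$root_dev"
    run ec2 wait volume-in-use --volume-ids "$old_vol"

    # Clean up. Move any additional storage disks off the old instance
    # BEFORE this point; they are not handled by this sketch.
    run ec2 terminate-instances --instance-ids "$old_inst"
    run ec2 delete-volume --volume-id "$new_vol"

    # Start the new instance, which now boots from the old root volume.
    run ec2 start-instances --instance-ids "$new_inst"
}
```

A dry run such as `root_swap i-old i-new vol-old vol-new /dev/sda1` (all placeholder values) prints the command sequence for review. The root device name differs between images, so look it up for your instance instead of assuming `/dev/sda1`, and only set `DRY_RUN=0` once the printed commands match what you intend.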
That’s it in a nutshell. The root-swap can be automated and I’d recommend that in case you have a large fleet of out of compliance instances running.
Remember, after May 31st, 2020 any instances running a cloud-regionsrv-client version less than 9.0.0, or instances that are out of compliance, will get cut off from the SUSE operated update infrastructure. With these changes we are finally able to support the Cloud -> DC -> Cloud network path for updates in all frameworks, in addition to supporting the “bring your own IP” feature in AWS.