SUSE Update Infrastructure Access Through the Data Center
In Step 2 Toward Enhanced Update Infrastructure Access the timeline for enabling access to the SUSE update infrastructure in the Public Cloud via routing through the data center was announced. As of June 1, 2020 we have started the work necessary to make this possible for all regions in AWS, Azure, and GCE. This marks the beginning of the final phase of a process that started almost 1 year ago with A New Update Infrastructure For The Public Cloud. We expect to have everything completed no later than the end of June 2020, but will most likely finish much sooner. The change from a global IP based access control mechanism to an instance based access mechanism applies to both SUSE Linux Enterprise Server (SLES) and SUSE Linux Enterprise Server For SAP Applications (SLES For SAP) on-demand instances, and to any images released in the future that might access the update infrastructure.
The good news first. There are conditions where things will just work. But unfortunately the spread of images and migration paths is rather large and this makes things a bit complicated. One universal requirement for any instance is that the cloud-regionsrv-client package with version 9.0.0 or later has to be on the system.
For any instance launched from a SUSE published SLES 15 or SLES 15 SP1 image (this equally applies to SLES For SAP) that already has version 9.0.0 or later of the cloud-regionsrv-client package, no action is required.
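To find out whether your instance already meets the version requirement, you can compare the installed package version against the 9.0.0 minimum. This is a minimal sketch, assuming an RPM-based system (which all SLES instances are); the version comparison relies on sort -V:

```shell
# Compare the installed cloud-regionsrv-client version against the
# required minimum of 9.0.0
required="9.0.0"
installed="$(rpm -q --qf '%{VERSION}' cloud-regionsrv-client 2>/dev/null || echo 0)"
# sort -V orders versions; if the lowest of the two is "required",
# the installed version is at least the minimum
if [ "$(printf '%s\n' "$required" "$installed" | sort -V | head -n1)" = "$required" ]; then
    echo "cloud-regionsrv-client is new enough"
else
    echo "update needed (installed: $installed)"
fi
```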
However, given that you found your way here it is likely that your instance has lost access to the update infrastructure and you would like to get that access back.
Let’s get into the weeds
For those still running SLES 11 based on-demand instances there is no way back, sorry. Access to the update infrastructure ends with this change. There exists an old and ugly, proceed at your own risk approach to get a SLES 11 SP4 instance onto SLES 12 SP1, which at the time, 4.5 years ago, was the best we could do. Since then technology has evolved and we have developed a new upgrade process that addresses SLES 12 SP4/5 to SLES 15 SP1 (also SLES For SAP) migration. The plan is to have this work when the next major distribution release comes about. But I digress. For SLES 11 based instances this is the end of the road with respect to accessing the update infrastructure. However, this does not mean the instances stop running. You can continue to run your instances. SLES 11 series repositories have not received any updates since March 31st, 2019, i.e. over a year ago, as this was the end of general support of SLES 11 SP4 per the distribution life cycle. Bottom line, you are not really losing anything, it is just now obvious that zypper commands are not doing anything. LTSS is available via SUSE Manager, which can carry you for another 2 years at this point, until March 31st, 2022. Hopefully none of this comes as a surprise.
For any instance based on SLE 12 in AWS and Azure launched from an image with a date stamp prior to 20200526 additional packages are needed. But you just lost access to the update infrastructure and thus have a chicken and egg problem. Not to worry, the solution is outlined below.
In AWS there was a way to create private images that would effectively turn PAYG images into BYOS images. Depending on how the private image was created it was possible that access to the update infrastructure was retained due to verification happening at a global (IP based) level. With the switch to the instance based access check this will no longer work. If you have instances that fall into this category you will have to migrate the instance to be a PAYG instance. Some additional comments on this below.
In GCE all instances must have access to the “instance-attributes”. This is the default behavior when an instance gets launched. However, it is possible to create instances using a service account that have this access disabled. For such instances you will need to stop your instance and enable the access.
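From inside a GCE instance you can probe whether instance-attribute access is available by querying the metadata server's attributes endpoint; if the access was disabled at launch, the request fails. A hedged sketch, using the standard GCE metadata server address:

```shell
# Probe the GCE metadata server's instance attributes endpoint;
# the Metadata-Flavor header is required by the metadata server
curl -s --max-time 2 -H "Metadata-Flavor: Google" \
    "http://169.254.169.254/computeMetadata/v1/instance/attributes/" \
    >/dev/null && echo "instance-attributes reachable" \
    || echo "no access - stop the instance and enable it"
```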
This covers the special conditions and caveats to make instance based access controls work and support routing of traffic through the data center.
How do I get access back?
Now to the resolution of the chicken and egg problem which is created by either not having the latest cloud-regionsrv-client package on instances started from SUSE published SLE 15 or SLE 15 SP1 images (SLES & SLES For SAP) prior to the update infrastructure switch, or by missing packages in SLE 12 (SLES & SLES For SAP) based instances.
If “zypper up” or any other command that refreshes the repositories runs without errors, you are done and your instance was already up to date. If the instance does not meet the necessary requirements you will get an error from zypper for each repository following the pattern:
Problem retrieving files from '$REPOSITORY_NAME'. Permission to access '$REPOSITORY_URL' denied. Please see the above error message for a hint. Warning: Skipping repository '$REPOSITORY_NAME' because of the above error.
and then at the end
Some of the repositories have not been refreshed because of an error.
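A quick scripted way to detect this broken state is to check zypper's exit status on a repository refresh, since it exits non-zero when the repositories cannot be reached. A minimal sketch:

```shell
# zypper refresh exits non-zero when repository access is denied,
# which indicates the instance needs the offline update packages
if zypper --non-interactive refresh >/dev/null 2>&1; then
    echo "repositories reachable, nothing to do"
else
    echo "repository access broken, proceed with the offline packages"
fi
```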
This means we must break the chicken and egg problem. With no access to the update infrastructure there is no direct way to get the necessary packages. To solve this issue we have prepared tarballs that provide the necessary packages and the following steps provide guidance to get you back to operational mode.
The example commands below require the substitution of 1 or 2 place holders:
$ARCH is either aarch64 if you are running ARM based instances or x86_64 if you are running, well, x86_64 instances.
$SLEBASE is the base version of the SLES release you are running. Note this may no longer match what the framework tells you is the origination image if at any time you ran “zypper migration” or did a major distribution upgrade. It is best to verify what you are running by looking at /etc/os-release. Use “SLE” followed by the numbers ahead of the decimal point in the “VERSION_ID”. For example, a “12.4” in the “VERSION_ID” would make the $SLEBASE substitution “SLE12”.
Here is an example of an expanded name: late_instance_offline_update_ec2_x86_64_SLE12.tar.gz. All file names are listed below along with the sha1sum for each tarball.
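The two placeholders can be derived on the instance itself. A minimal sketch, reading VERSION_ID from /etc/os-release as described above (the echoed file name uses the AWS naming pattern as an example):

```shell
# Derive the placeholder values from the running system
. /etc/os-release                  # provides VERSION_ID, e.g. "12.4"
SLEBASE="SLE${VERSION_ID%%.*}"     # "12.4" -> "SLE12"
ARCH="$(uname -m)"                 # x86_64 or aarch64
# Example expansion using the AWS file name pattern
echo "late_instance_offline_update_ec2_${ARCH}_${SLEBASE}.tar.gz"
```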
All commands are executed as root and are assumed to be run on the affected instance. Everything is set up such that it can easily be scripted.
For AWS instances:
wget --no-check-certificate https://220.127.116.11/late_instance_offline_update_ec2_${ARCH}_${SLEBASE}.tar.gz
sha1sum late_instance_offline_update_ec2_${ARCH}_${SLEBASE}.tar.gz
tar -xf late_instance_offline_update_ec2_${ARCH}_${SLEBASE}.tar.gz
cd ${ARCH}
zypper --no-refresh --no-remote --non-interactive in *.rpm
For Azure instances:
wget --no-check-certificate https://18.104.22.168/late_instance_offline_update_azure_${SLEBASE}.tar.gz
sha1sum late_instance_offline_update_azure_${SLEBASE}.tar.gz
tar -xf late_instance_offline_update_azure_${SLEBASE}.tar.gz
cd x86_64
zypper --no-refresh --no-remote --non-interactive in *.rpm
For GCE instances:
wget --no-check-certificate https://22.214.171.124/late_instance_offline_update_gce_${SLEBASE}.tar.gz
sha1sum late_instance_offline_update_gce_${SLEBASE}.tar.gz
tar -xf late_instance_offline_update_gce_${SLEBASE}.tar.gz
cd x86_64
zypper --no-refresh --no-remote --non-interactive in *.rpm
Here are the checksums for the tarballs:
AWS x86_64 SLE12: 1912c23c9140e406690160a9bd003a81fc79c14e late_instance_offline_update_ec2_x86_64_SLE12.tar.gz
AWS aarch64 SLE15: 250be20f24a941ef27fdb9727a3a4d34fc525ea3 late_instance_offline_update_ec2_aarch64_SLE15.tar.gz
AWS x86_64 SLE15: d8906af58483f9107146590cb604e1f32629a749 late_instance_offline_update_ec2_x86_64_SLE15.tar.gz
Azure SLE12: 7ec151e96455e629b128e03de073365874ded6b3 late_instance_offline_update_azure_SLE12.tar.gz
Azure SLE15: a64556df71a487c8adf93adca10b06c236030275 late_instance_offline_update_azure_SLE15.tar.gz
GCE SLE12: 497e2c455696db1ae8e3450a07ac8c74f1674fe5 late_instance_offline_update_gce_SLE12.tar.gz
GCE SLE15: fc7e743f6e6bfb6c327faaab0d5e53af483deb7b late_instance_offline_update_gce_SLE15.tar.gz
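To script the checksum verification rather than eyeballing the sha1sum output, you can compare the computed digest against the published value before unpacking. A sketch using the AWS x86_64 SLE12 entry from the list above:

```shell
# Verify a downloaded tarball against its published checksum
# before unpacking it (values from the AWS x86_64 SLE12 entry)
file="late_instance_offline_update_ec2_x86_64_SLE12.tar.gz"
expected="1912c23c9140e406690160a9bd003a81fc79c14e"
actual="$(sha1sum "$file" 2>/dev/null | awk '{print $1}')"
if [ "$actual" = "$expected" ]; then
    echo "checksum OK, safe to unpack"
else
    echo "checksum mismatch, do not install" >&2
fi
```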
The tarballs contain the latest released packages of the code that connects an instance to the SUSE update infrastructure and the related configuration plus the dependencies. The packages are all released and available from the update infrastructure. This means RPM signatures match and once the system is connected to the update infrastructure again there will be no difference and the tools will not know that the packages were installed from the tarball rather than from an update server.
Once the packages are installed, access to the update infrastructure is restored, except for the AWS instances originating from private images that were inadvertently turned into BYOS. It is not required to re-register the instance. All registration data was preserved and your system is still known to the update infrastructure.
A few additional comments on the AWS case
Note first that the inadvertent switch from PAYG to BYOS upon image copy only potentially applies to SLES instances. SLES For SAP is not affected by this condition. If you fall into this category you have no choice but to migrate the instance to an instance that is recognized as PAYG. This process is highly dependent on individual circumstances and therefore it is difficult to document all conditions. It is best to work with AWS support if you are in this situation.
Look for “The Special Cases” in Step 2 Toward Enhanced Update Infrastructure Access to find out how you can create a report for all your instances to check for this condition. This also contains a description about how to potentially migrate instances. We also have a Technical Information Document (TID) for this topic.
By no later than the end of June all regions in AWS, Azure, and GCP will be enabled to route traffic to the SUSE operated update infrastructure through a data center. This allows you to seal off your network egress in your regions, while previously you had to punch holes for the update infrastructure servers. The instance based access check also enables the “bring your own IP” feature in AWS.
You can check whether or not the region you are interested in has been switched over using
wget --no-check-certificate https://$SERVER_IP/smt.crt
and of course you get $SERVER_IP from pint. If this command works, i.e. you do not get a 403 error and it does not take forever (when run from a system that is not an instance in the framework you are interested in), then the update infrastructure in that region has the new access model enabled. The switch from global to instance based access checks is also covered in a TID.
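Putting the two steps together, the check can be scripted roughly as follows. The pint invocation assumes the python3-susepubliccloudinfo package is installed, and the IP below is a documentation placeholder (203.0.113.0/24 is reserved for examples), not a real update server:

```shell
# List the SUSE update servers known to pint, then probe one of them;
# substitute SERVER_IP with an address from the pint output
pint amazon servers                      # update server IPs per region
SERVER_IP="203.0.113.10"                 # placeholder, replace with a pint IP
wget --no-check-certificate --timeout=5 --tries=1 \
    "https://$SERVER_IP/smt.crt" -O /dev/null \
    && echo "new access model enabled in this region" \
    || echo "request failed, region not yet switched or wrong vantage point"
```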