CAPI, Fleet and GitOps: A New Way to Orchestrate Kubernetes Clusters with Rancher

Sunday, 5 November, 2023

Introduction

In this blog post we introduce Rancher Turtles, one of the new features in Rancher 2.8, which helps you deploy clusters using Cluster API (CAPI).

It complements the existing ways of deploying Kubernetes clusters with Rancher. It is currently in a tech preview state, but it is expected to become fully supported in a future release.

With the help of Rancher Turtles and Fleet, Rancher's GitOps tool, you can now easily automate the lifecycle management of clusters on any platform that supports CAPI.

When a provider supports CAPI, you can control the provisioning and management of the resources your clusters need through a common API, without touching platform-specific APIs. CAPI lets you use Kubernetes clusters across hybrid environments with very little customization, so if you ever need to change providers, the move is simple and your work runs more smoothly.

What is CAPI?

CAPI stands for Cluster API, "a Kubernetes sub-project focused on providing declarative APIs and tooling to simplify provisioning, upgrading, and operating multiple Kubernetes clusters" (source: https://cluster-api.sigs.k8s.io/).

It helps with the lifecycle management of Kubernetes clusters and provides uniform, platform-agnostic cluster operations, regardless of whether your clusters are deployed on-premises or in the cloud.

It is not intended to manage the lifecycle of underlying infrastructure that is not needed to run Kubernetes, to manage clusters that span different infrastructure providers, or to configure cluster nodes beyond creation and upgrades.

For more details, we recommend checking The Cluster API Book, especially its introductory concepts.

Setting up Rancher Turtles (optional)

As mentioned at the beginning, Rancher Turtles is the technology that lets Rancher integrate with different CAPI providers. It does not ship by default with older versions of Rancher, but if you want to try it on an older cluster, you can do so as follows.

 

Requirements: Rancher 2.7 or later

From the console, disable the embedded CAPI feature:

kubectl apply -f feature.yaml

---
apiVersion: management.cattle.io/v3
kind: Feature
metadata:
  name: embedded-cluster-api
spec:
  value: false
...

kubectl delete mutatingwebhookconfiguration.admissionregistration.k8s.io mutating-webhook-configuration
kubectl delete validatingwebhookconfigurations.admissionregistration.k8s.io validating-webhook-configuration

Add the Rancher Turtles repository

Switch to the management cluster and add the Rancher Turtles application repository:

helm repo add turtles https://rancher-sandbox.github.io/rancher-turtles

 

Install Rancher Turtles

helm install rancher-turtles turtles/rancher-turtles -n rancher-turtles-system --create-namespace --set cluster-api-operator.cert-manager.enabled=false

Note that Cert Manager is currently a requirement for Rancher Turtles. This example assumes it is already installed; if you want the operator to install it automatically, set cluster-api-operator.cert-manager.enabled=true (the default option).

 

Install an additional CAPI provider

kubectl apply -f capd-provider.yml

---
apiVersion: v1
kind: Namespace
metadata:
  name: capd-system
...
---
apiVersion: operator.cluster.x-k8s.io/v1alpha1
kind: InfrastructureProvider
metadata:
  name: docker
  namespace: capd-system
spec:
  secretName: capi-env-variables
  secretNamespace: capi-system
...

After this, you are ready to provision new clusters following GitOps principles.

Provisioning new clusters the GitOps way with Fleet!

We said that CAPI makes it easy to deploy clusters to different platforms without learning new APIs or applying lots of customization for each of them. Now let's see how to use these CAPI definitions with Fleet to manage Kubernetes clusters following GitOps principles.

The process:

After the repository is configured in Fleet, or whenever someone later makes changes to it:

[Animation: the user commits and pushes changes to Git]

Fleet checks for these changes and, when it finds a CAPI cluster definition, hands it over to Rancher Turtles.

[Animation: Fleet checks Git and sends the definition to Turtles]

Turtles then processes the files and delegates the work to the specified CAPI provider:

[Animation showing Turtles using CAPI to deploy Kubernetes clusters on two different infrastructure providers]


While we are talking about new features and Fleet, we'd like to highlight a particularly exciting feature of the new Fleet version:
drift reconciliation.
With it, you can tell Fleet that whenever a resource no longer matches what is defined in the Git repository, it should overwrite it to bring it back to the desired state. We plan to publish a separate blog post with more details.
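
As a rough sketch of what enabling it can look like, newer Fleet releases add a correctDrift option to the GitRepo resource (field name taken from the Fleet documentation; check the Fleet version shipped with your Rancher before relying on it):

apiVersion: fleet.cattle.io/v1alpha1
kind: GitRepo
metadata:
  name: clusters
  namespace: fleet-local
spec:
  repo: https://github.com/rancher-sandbox/rancher-turtles-fleet-example.git
  branch: main
  paths:
  - clusters
  # When enabled, Fleet overwrites any resource that no longer matches
  # what is defined in Git instead of only reporting it as Modified.
  correctDrift:
    enabled: true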

 

Configuring Fleet

First, add the Git repository to Fleet.

The actual deployment is carried out by the infrastructure provider.

kubectl apply -f myclusters-repo.yaml

---
apiVersion: fleet.cattle.io/v1alpha1
kind: GitRepo
metadata:
  name: clusters
  namespace: fleet-local
spec:
  repo: https://github.com/rancher-sandbox/rancher-turtles-fleet-example.git
  branch: main
  paths:
  - clusters
...

Now, whenever Fleet detects changes under the /clusters path on the main branch of the repository, it automatically applies them to your environment.

Note that this is just an example. You can customize this repository definition to cover more complex scenarios, but the concept stays the same.

Creating the cluster definition and approving the pull request are the only steps needed to get a new cluster created.
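
To give an idea of what such a definition looks like, here is a trimmed sketch of a CAPI cluster for the Docker provider (names are illustrative; a complete definition also needs a KubeadmControlPlane, machine templates and a MachineDeployment, as in the example repository referenced above):

apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: my-capi-cluster
  namespace: default
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
      - 192.168.0.0/16
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: KubeadmControlPlane
    name: my-capi-cluster-control-plane
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: DockerCluster
    name: my-capi-cluster
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: DockerCluster
metadata:
  name: my-capi-cluster
  namespace: default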

Bonus: Did you notice that we are deploying into the fleet-local namespace? If you are curious and want to learn more, check this link.

Configuring Rancher:

Now you can go to Rancher and tell it to automatically import every cluster in the namespace where the CRDs are deployed.
To do so, run the following command to enable the rancher-auto-import feature:

kubectl label namespace <mynamespace> cluster-api.cattle.io/rancher-auto-import=true

Note that in this example, the cluster definitions live in the "default" namespace.

After a short while, you will see the new clusters imported into the "Cluster Management" section of Rancher, displayed just like any other cluster you manage.

 

Exploring the newly deployed CAPI clusters

Click the "Explore" button to the right of the cluster, and you can manage it in Rancher just like any other cluster.

Copy the kubeconfig locally and, for convenience, set it as the default for kubectl and other tools by running:

export KUBECONFIG=<my-new-cluster-kubeconfig-file>

Alternatively, specify it on the command line of your tool of choice.

For example, you can check that everything works by running:

kubectl get pods -A -w --insecure-skip-tls-verify

You should see the pods running on the new cluster.

Summary

We've seen how to do all of this from the command line, but in the future Rancher will include a UI extension to manage Turtles directly from the web UI!

To learn more about Rancher Prime and how we can help you grow your business with container technology while making it more secure and agile, visit our website.

If you'd like to learn more about Rancher, download the free white paper "Why Rancher?" or join a Rancher Rodeo or the Rancher Academy.

👉 For more information about SUSE products and services, feel free to contact us.

SUSE Offers Choice

Tuesday, 11 July, 2023

For more than 25 years, open source has been revolutionizing our world. Many, if not all, of the major advances in technology, from the growth of Linux through virtualization to the move to the cloud, have been driven by open source innovation. To me, the reason is clear.
Isn't it a good thing when as many people as possible work on finding the best solution, within a framework where the results are contributed back and everyone benefits? And when a problem is found, we solve it together.
At the heart of this is the idea that software should be "freely accessible, usable, modifiable and shareable (in modified or unmodified form) by anyone."
Restricting customers from sharing the source provided by a vendor limits their ability, as users, to collectively analyze and audit that software.

 

SUSE supports this view 100%. Going proprietary must not become the basis of competition between open source companies. We all contribute to the open source community, and in the same way, we all benefit from it. Open source is greater than the sum of its parts.

At SUSE, we work actively with the open source community and build enterprise-grade products from open source projects. Our customers do not pay for the software itself; they pay for long-term 24x7 support, security, certified stacks and the ability, as representatives of the open source community, to run it in business-critical environments. We compete on being the most reliable and cost-effective vendor for our customers.

We believe that restricting access to source code shifts the competitive landscape in the wrong direction.

Our top priority is to keep offering our customers choice. Today, SUSE announced that it will build, support and contribute to the community a hard fork of the RHEL codebase. This is something we are good at, and it gives customers long-term compatibility and choice.

Here is a simple way to think about this initiative:

As a mobile phone user, you expect to be able to switch carriers while keeping your phone number.

In the same way, as an Enterprise Linux user, you can switch to SUSE while keeping your existing Linux. SUSE is an expert at delivering enterprise value to open source users in a highly competitive way, without compromising on what matters to customers.

SUSE is in a unique position. We have more than 30 years of engineering expertise contributing to Linux, and we are set up to handle mission-critical workloads. Our teams have extensive experience supporting mixed environments. Last year, we launched SUSE Liberty Linux for customers who need support for CentOS and RHEL. In addition, SUSE Manager has a long-standing reputation for efficiently managing a wide range of Linux distributions, demonstrating our commitment to giving users flexibility and choice. SUSE stands firmly by its commitment to share this work. We will keep the source code freely and openly accessible, and the project will never be restricted.

One more thing. It goes without saying that SUSE remains fully committed not only to the openSUSE Linux distributions but also to the SUSE Linux Enterprise (SLE) and Adaptable Linux Platform (ALP) solutions. We want enterprises and communities to be free to innovate, even in mixed environments.

If you are as excited as we are about making choice a reality, please join us.
We'd love to hear from you: Choice@SUSE.com

SUSE offers you choice!

DP

Navigating Changes in the Open Source Landscape

Thursday, 29 June, 2023

Last week, Red Hat made a significant change to its source code access policy. The impact on vendors, developers and users is considerable, and the move has raised concerns across the open source community. I want to address this decision as directly as I can and reassure the community in general, and SUSE's customers and partners in particular.

What happened?

Red Hat decided to remove public access to the Red Hat Enterprise Linux (RHEL) source code. This is a major change to its source code access policy, and the decision has caused significant concern in the open source community. That concern is justified. RHEL owes its existence in large part to the collaboration of many upstream projects, including the Linux kernel, developed by a wide range of contributors, SUSE among them. Innovating together is at the core of who we are. We all strive to build something greater than the sum of our individual efforts. We all depend on each other.


Protecting the value of open source

At SUSE, we value the principles of open source and the power of collaboration. Changes in the open source landscape may shift the dynamics, but we firmly believe that the freedom to access, modify and distribute software should remain open to everyone. Our commitment to customer satisfaction, stability and reliability is unwavering. We will continue to invest in a robust support infrastructure, deliver timely updates and provide a best-in-class user experience to community users and customers alike.

Our commitment to SUSE Liberty Linux customers

SUSE supports many enterprise customers who run and manage mixed environments with a variety of distributions, including CentOS and RHEL. SUSE Liberty Linux is our solution for those customers. We remain fully committed to delivering a seamless SUSE Liberty Linux experience, and we want our customers to feel reassured: Red Hat's decision does not change that. We will continue to collaborate with our partners in the open source community and draw on decades of expertise to keep delivering binary-compatible updates and security fixes for Red Hat.

Looking ahead

SUSE also recognizes that open source innovation is driven by collaboration, not just by the availability of code. Through active engagement with industry experts, developers and partners, SUSE will continue to foster a vibrant ecosystem around the openSUSE community, including the Adaptable Linux Platform and the SUSE Linux Enterprise suite. We are strengthening the open source movement and promise a rich future for all stakeholders.

To remain truly open and collaborative, we will share more details soon on how we plan to strengthen our existing collaboration with the open source community. Strengthening that collaboration is the best way forward, and the only way forward.

When to Use K3s and RKE2

Wednesday, 14 December, 2022

K3s and Rancher Kubernetes Engine (RKE2) are two Kubernetes distributions from the SUSE Rancher container platform. Either project can be used to run a production-ready cluster; however, they target different use cases and consequently possess unique characteristics.

This article will explain the similarities and differences between the projects. You'll learn when it makes sense to use RKE2 instead of K3s and vice versa. Making the right choice is important because it affects the security and compliance of the containerized workloads you deploy.

K3s and RKE2

K3s provides a production-ready Kubernetes cluster from a single binary that weighs in at under 60MB. Because K3s is so lightweight, it’s a great option for running Kubernetes at the edge on IoT devices, low-power servers and your developer workstations.

Meanwhile, RKE2 is an alternative project that also runs a production-ready cluster. It offers similar simplicity to K3s while adding additional security and conformance layers, including Federal Information Processing Standard (FIPS) 140-2 compliance for use in the U.S. federal government and DISA STIG compliance.

RKE2 has evolved from the original RKE project. It’s also known as RKE Government, reflecting its suitability for the most demanding sectors. It’s not just government agencies that can benefit, though — the distribution is ideal for all organizations that prioritize security and compliance, so it continues to be primarily marketed as RKE2.

Similarities between K3s and RKE2

K3s and RKE2 are both lightweight Cloud Native Computing Foundation (CNCF)-certified Kubernetes distributions that Rancher fully supports. Although they diverge in their target use cases, the two platforms have several intentional similarities in how they’re launched and operated. Both can be deployed with Rancher Manager, and they each run your containers using the industry-standard containerd runtime.

Usability

K3s and RKE2 each offer good usability with a quick setup experience.

Starting a K3s cluster on a new host can be achieved with one command and around 30 seconds of waiting:

$ curl -sfL https://get.k3s.io | sudo sh -

The service is registered and started for you, so you can immediately run kubectl commands to interact with your cluster:

$ k3s kubectl get pods

RKE2 doesn’t fare much worse. A similarly straightforward installation script is available to download its binary:

$ curl -sfL https://get.rke2.io | sudo sh -

RKE2 doesn’t start its service by default. Run the following commands to enable and start RKE2 in server (control plane) mode:

$ sudo systemctl enable rke2-server.service
$ sudo systemctl start rke2-server.service

You can find the bundled kubectl binary at /var/lib/rancher/rke2/bin. It’s not added to your PATH by default; a kubeconfig file is deposited to /etc/rancher/rke2/rke2.yaml:

$ export KUBECONFIG=/etc/rancher/rke2/rke2.yaml
$ /var/lib/rancher/rke2/bin/kubectl get pods

Ease of operation

In addition to their usability, K3s and RKE2 are simple to operate. You can upgrade clusters created with either project by repeating the installation script on each node:

# K3s
$ curl -sfL https://get.k3s.io | sh -

# RKE2
$ curl -sfL https://get.rke2.io | sh -

You should repeat any flags you supplied to the original installation command.

Automated upgrades are supported using Rancher’s system-upgrade-controller. After installing the controller, you can declaratively create Plan objects that describe how to migrate the cluster to a new version. Plan is a custom resource definition (CRD) provided by the controller.
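
For example, a minimal Plan that moves the K3s server nodes of a cluster to a newer release could look roughly like this (version, selector and service account values are illustrative and assume the controller's documented defaults):

apiVersion: upgrade.cattle.io/v1
kind: Plan
metadata:
  name: k3s-server-upgrade
  namespace: system-upgrade
spec:
  concurrency: 1            # upgrade one node at a time
  cordon: true              # cordon each node before upgrading it
  serviceAccountName: system-upgrade
  nodeSelector:
    matchExpressions:
    - key: node-role.kubernetes.io/control-plane
      operator: Exists
  upgrade:
    image: rancher/k3s-upgrade
  version: v1.26.5+k3s1     # target K3s release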

Backing up and restoring data is another common Kubernetes challenge. K3s and RKE2 also mirror each other in this field. Snapshots are automatically written and retained for a configurable period. You can easily restore a cluster from a snapshot by running the following command:

# K3s
$ k3s server \
    --cluster-reset \
    --cluster-reset-restore-path=/var/lib/rancher/k3s/server/db/etcd-old-<BACKUP_DATE>

# RKE2
$ rke2 server \
    --cluster-reset \
    --cluster-reset-restore-path=/var/lib/rancher/rke2/server/db/etcd-old-<BACKUP_DATE>

Deployment model

K3s and RKE2 share their single-binary deployment model. They bundle all their dependencies into one download, allowing you to deploy a functioning cluster with minimal Kubernetes experience.

The projects also support air-gapped environments to accommodate critical machines that are physically separated from your network. Air-gapped images are provided in the release artifacts. Once you’ve transferred the images to your machine, running the regular installation script will bootstrap your cluster.

High availability and multiple nodes

K3s and RKE2 are designed to run in production. K3s is often used in development, too, having gained a reputation as an ideal single-node cluster. It has robust multi-node management and is capable of supporting fleets of IoT devices.

Both projects can run the control plane with high availability, too. You can distribute replicas of control plane components over several server nodes and use external data stores instead of the embedded ones.
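
As a sketch, a highly available K3s control plane with the embedded etcd can be bootstrapped like this (the token and server address are placeholders):

# First server: initialize the embedded etcd cluster
$ curl -sfL https://get.k3s.io | K3S_TOKEN=my-shared-secret sh -s - server --cluster-init

# Additional servers: join the existing control plane
$ curl -sfL https://get.k3s.io | K3S_TOKEN=my-shared-secret sh -s - server \
    --server https://<first-server-ip>:6443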

Differences between K3s and RKE2

K3s and RKE2 both offer single-binary Kubernetes, high availability and easy backups, with many commands interchangeable between the two. However, some key differences affect where and when they should be used. It’s these characteristics that justify the distributions existing as two independent projects.

RKE2 is closer to upstream Kubernetes

K3s is CNCF-certified, but it deviates from upstream Kubernetes in a few ways. It uses SQLite instead of etcd as its default data store, although an embedded etcd instance is available as an option in modern releases. K3s also bundles additional utilities, such as the Traefik Ingress controller.

RKE2 sticks closer to standard Kubernetes, promoting conformance as one of its main features. This gives you confidence that workloads developed for other distributions will run reliably in RKE2. It reduces the risk of inconvenient gotchas that can occur when K3s steps out of alignment with upstream Kubernetes. RKE2 automatically uses etcd for data storage and omits nonstandard components included in K3s.

RKE2 uses embedded etcd

The standard SQLite database in K3s is beneficial for compactness and can optimize performance in small clusters. In contrast, RKE2’s default use of etcd creates a more conformant experience, allowing you to integrate directly with other Kubernetes tools that expect an etcd data store.

While K3s can be configured with etcd, it’s an option you need to turn on. RKE2 is designed around it, which reduces the risk of misconfiguration and subpar performance.

K3s also supports MySQL and PostgreSQL as alternative storage solutions. These let you manage your Kubernetes data using your existing tooling for relational databases, such as backups and maintenance operations. RKE2 only works with etcd, offering no support for SQL-based storage.
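
As a sketch, pointing K3s at an external database is a matter of the --datastore-endpoint flag (the connection string below is a placeholder):

$ curl -sfL https://get.k3s.io | sh -s - server \
    --datastore-endpoint="mysql://username:password@tcp(database.example.com:3306)/k3s"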

RKE2 is more security-minded

RKE2 has a much stronger focus on security. Whereas edge operation is K3s’s specialism, RKE2 prioritizes security as its greatest strength.

Hardened against the CIS benchmark

The distribution comes configured for compatibility with the CIS Kubernetes Hardening Benchmark v1.23 (v1.6 in RKE2 v1.25 and earlier). The defaults that RKE2 applies allow your clusters to reach the standard with minimal manual work.

You’re still responsible for tightening the OS-level controls on your nodes. This includes applying appropriate kernel parameters and ensuring the etcd data directory is correctly protected.

You can enforce that a safe configuration is required by starting RKE2 with the profile flag set to cis-1.23. RKE2 will then exit with a fatal error if the operating system hasn’t been suitably hardened.
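
In practice that can look like the following before starting the service (a sketch based on the RKE2 hardening guide; the etcd user is needed because the hardened configuration runs etcd as an unprivileged user):

# Create the etcd user expected by the hardened configuration
$ sudo useradd -r -c "etcd user" -s /sbin/nologin -M etcd

# Tell RKE2 to enforce the CIS profile
$ sudo tee /etc/rancher/rke2/config.yaml <<EOF
profile: "cis-1.23"
EOF

$ sudo systemctl start rke2-server.service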

Beyond configuring the OS, you must also set up suitable network policies (https://kubernetes.io/docs/concepts/services-networking/network-policies) and Pod security admission rules to secure your cluster’s workloads. The security admission controller can be configured to use profiles which meet the CIS benchmark. This will prevent non-compliant Pods from being deployed to your cluster.

Regularly scanned for threats

The safety of the RKE2 distribution is maintained within its build pipeline. Components are regularly scanned for new common vulnerabilities and exposures (CVEs) using the Trivy container vulnerability tool. This provides confidence that RKE2 itself isn’t harboring threats that could let attackers into your environment.

FIPS 140-2 compliant

K3s lacks any formal security accreditations. RKE2 meets the FIPS 140-2 standard that the U.S. government uses to approve cryptographic modules.

The project’s Go code is compiled using FIPS-validated crypto modules instead of the versions in the Go standard library. Each of the distribution’s components, from the Kubernetes API server through to kubelet and the bundled kubectl binary, are compiled with the FIPS-compatible compiler.

The FIPS mandate means RKE2 can be deployed in government environments and other contexts that mandate verifiable cryptographic performance. The entire RKE2 stack is compliant when you use the built-in components, such as the containerd runtime and etcd data store.

When to use K3s

K3s should be your preferred solution when you’re seeking a performant Kubernetes distribution to run on the edge. It’s also a good choice for single-node development clusters as well as ephemeral environments used in your CI pipelines and other build tools.

This distribution makes the most sense in situations where your primary objective is deploying Kubernetes with all dependencies from a single binary. It’s lightweight, quick to start and easy to manage, so you can focus on writing and testing your application.

When to use RKE2

You should use RKE2 whenever security is critical, such as in government services and other highly regulated industries, including finance and healthcare. As previously stated, the complete RKE2 distribution is FIPS 140-2 compliant and comes hardened with the CIS Kubernetes Benchmark. It’s also the only DISA STIG-certified Kubernetes distribution, meaning it’s approved for use in the most stringent U.S. government environments, including the Department of Defense.

RKE2 is fully certified and tightly aligned with upstream Kubernetes. It omits the K3s components that aren’t standard Kubernetes or that are unstable alpha features. This increases the probability that your deployments will be interoperable across different environments. It also reduces the risk of nonconformance that can occur through oversight when you’re manually hardening Kubernetes clusters.

Near edge computing is another primary use case for RKE2 over K3s. RKE2 ships with support for multiple CNI networking plugins, including Cilium, Calico, and Multus. Multus allows pods to have multiple network interfaces attached, making it ideal for use cases such as telco distribution centers and factories with several different production facilities. In these situations, it’s critical to have robust networking support with different network adapters. K3s bundles Flannel as its built-in CNI plugin; you can install a different provider, but all configuration has to be performed manually. RKE2’s default distribution provides integrated options for common networking solutions.
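
Selecting the CNI in RKE2 is a one-line configuration change, sketched below (canal is the default when nothing is set; listing multus first enables it alongside a primary CNI):

$ sudo tee /etc/rancher/rke2/config.yaml <<EOF
cni:
- multus
- cilium
EOF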

Conclusion

K3s and RKE2 are two popular Kubernetes distributions that overlap each other in several ways. They both offer a simple deployment experience, frictionless long-term maintenance, and high performance and compatibility.

While designed for tiny and far-edge use cases, K3s is not limited to these scenarios. It's also widely used for development, in labs, or in resource-constrained environments. However, K3s is not focused on security, so you'll need to harden and secure your clusters yourself.

RKE2 takes the usability ethos from K3s and applies it to a fully conformant distribution. It’s tailored for security, close alignment with upstream Kubernetes, and compliance in regulated environments such as government agencies. RKE2 is suitable for both data centers and near-edge situations as it offers built-in support for advanced networking plugins, including Multus.

Which you should choose depends on where your cluster will run and the workloads you'll deploy. You should use RKE2 if you want a hardened distribution for security-critical workloads or if you need FIPS 140-2 compliance. It will help you establish and maintain a healthy security baseline. K3s remains an actively supported alternative for less sensitive situations and edge workloads. It's a batteries-included project for when you want to focus on your apps instead of the environment that runs them.

Both distributions can be managed by Rancher and integrated with your DevOps toolbox. You can use solutions like Fleet to deploy applications at scale with GitOps strategies, then head to the Rancher dashboard to monitor your workloads.

Keeping Track of Kubernetes Deprecated Resources

Wednesday, 9 November, 2022

It’s a fact of life: as the Kubernetes API evolves, it’s periodically reorganized or upgraded. This means some Kubernetes resources can be deprecated and later removed. We need an easy way to keep track of those deprecations and removals. For that, we have just released the new deprecated-api-versions policy for Kubewarden, our efficient Kubernetes policy engine that runs policies compiled to Wasm. This policy checks for the usage of Kubernetes resources that have been deprecated or removed from the Kubernetes API.

A look at the deprecated-api-versions policy

This policy has two settings:

  1. kubernetes_version: The Kubernetes version from which to start detecting deprecated or removed resources. This setting is mandatory.
  2. deny_on_deprecation: If true, it will deny the operation on a resource that has been deprecated but not yet removed from the Kubernetes version specified by kubernetes_version. This setting is optional and is set to true by default.

As an example, extensions/v1beta1/Ingress was deprecated in Kubernetes 1.14.0, and removed in v1.22.0.

With the following policy settings, the policy accepts an extensions/v1beta1/Ingress in the cluster, yet the policy logs this result:

kubernetes_version: "1.19.0"
deny_on_deprecation: false

In contrast, with these other settings, the policy blocks the Ingress object:

kubernetes_version: "1.19.0"
deny_on_deprecation: true # (the default)

Don’t live in the past

Kubernetes deprecations evolve; we will update the policy as soon as there are new deprecations. The policy versioning scheme tells you up to what version of Kubernetes the policy knows about, e.g. 0.1.0-k8sv1.26.0 means that the policy knows about deprecations up to Kubernetes v1.26.0.

Back to the future

You are about to update your cluster’s Kubernetes version and wonder, will your workloads keep working? Will you be in trouble because of deprecated or removed resources in the new version? Check before updating! Just instantiate the deprecated-api-versions policy with the targeted Kubernetes version and deny_on_deprecation set to false, and get an overview of future-you problems.

In action

As usual, instantiate a ClusterAdmissionPolicy (cluster-wide) or AdmissionPolicy (namespaced) that makes use of the policy.

For this example, let’s work in a k8s cluster of version 1.24.0.

Here’s a definition of a cluster-wide policy that rejects resources that were deprecated or removed in Kubernetes version 1.23.0 and earlier:

kubectl apply -f - <<EOF
apiVersion: policies.kubewarden.io/v1
kind: ClusterAdmissionPolicy
metadata:
  name: my-deprecated-api-versions-policy
spec:
  module: ghcr.io/kubewarden/policies/deprecated-api-versions:v0.1.0-k8sv1.26.0
  mutating: false
  rules:
  - apiGroups: ["*"]
    apiVersions: ["*"]
    resources: ["*"]
    operations:
    - CREATE
    - UPDATE
  settings:
    kubernetes_version: "1.23.0"
    deny_on_deprecation: true
EOF

Info: In spec.rules we are checking every resource in every apiGroup and apiVersion. We do this for simplicity in this example; the policy's metadata.yaml ships with a long, complete, machine-generated spec.rules that covers only the resources that are deprecated.

You can obtain the right rules by using the kwctl scaffold command.

Our cluster is on version 1.24.0, so, for example, without the policy we could still instantiate an autoscaling/v2beta2/HorizontalPodAutoscaler, even though it has been deprecated since 1.23.0 (and will be removed in 1.26.0).

Now with the policy, trying to instantiate an autoscaling/v2beta2/HorizontalPodAutoscaler resource that is already deprecated will result in its rejection:

kubectl apply -f - <<EOF
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
EOF

Warning: autoscaling/v2beta2 HorizontalPodAutoscaler is deprecated in v1.23+, unavailable in v1.26+; use autoscaling/v2 HorizontalPodAutoscaler
Error from server: error when creating "STDIN":
admission webhook "clusterwide-my-deprecated-api-versions-policy.kubewarden.admission" denied the request:
autoscaling/v2beta2 HorizontalPodAutoscaler cannot be used. It has been deprecated starting from 1.23.0. It has been removed starting from 1.26.0. It has been replaced by autoscaling/v2.

Have ideas for new policies? Would you like more features on existing ones? Drop us a line at #kubewarden on Slack! We look forward to your feedback 🙂

Kubernetes Jobs Deployment Strategies in Continuous Delivery Scenarios

Wednesday, 5 October, 2022

Introduction

Continuous Delivery (CD) frameworks for Kubernetes, like the one created by Rancher with Fleet, are quite robust and easy to implement. Still, there are some rough edges you should pay attention to. Jobs deployment is one of those scenarios where things may not be straightforward, so you may need to stop and think about the best way to process them.

We’ll explain here the challenges you may face and will give some tips about how to overcome them.

While this blog is based on Fleet’s CD implementation, most of what’s discussed here also applies to other tools like ArgoCD or Flux.

The problem

Let’s start with a small recap about how Kubernetes objects work.

There are some elements in Kubernetes objects that are immutable. That means that changes to the immutable fields are not allowed once one of those objects is created.

A Kubernetes Job is a good example as the template field, where the actual code of the Job is defined, is immutable. Once created, the code can’t be changed. If we make any changes, we’ll get an error during the deployment.
This is how the error looks when we try to update the Job:

The Job "update-job" is invalid: spec.template: Invalid value: 
core.PodTemplateSpec{ObjectMeta:v1.ObjectMeta{Name:"", GenerateName:"", Namespace:"", SelfLink:"", 
UID:"", ResourceVersion:"", 
.... 
: field is immutable

And this is how the error is shown on Fleet’s UI:

Job replace error as seen on Rancher Fleet

When we do deployments manually and update code within Jobs, we can delete the previous Job manually and relaunch our deployment. However, when using Continuous Delivery (CD), things should work without manual intervention. It's critical that the CD process can run without errors and update Jobs without anyone having to step in.

Thus we have reached the point where we have a new version of a Job code that needs to be deployed, and the old Job should not stop us from doing that in an automated way.

Things to consider before configuring your CD process

Our first recommendation, even if not directly related to the Fleet or Jobs update problem, is always to try using Helm to distribute applications.

Helm installation packages (called Charts) are easy to build. You can quickly build a Chart by putting together all your Kubernetes deployment files in a folder called templates plus some metadata (name, version, etc.) defined in a file called Chart.yaml.

Using Helm to distribute applications with CD tools like Fleet and ArgoCD offers some additional features that can be useful when dealing with Job updates.

Second, in terms of how Jobs are implemented and how their internal logic behaves, they must be designed to be idempotent. Jobs are meant to be run just once, and if our CD process manages to update and recreate them on each run, we need to be sure that relaunching them doesn't break anything.

Idempotency is an absolute must while designing Kubernetes CronJobs, but all those best practices should also be applied here to avoid undesired side effects.
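
As a purely illustrative sketch (the image and helper commands are hypothetical), an idempotent migration Job only applies work that hasn't been done yet, so recreating it on every CD run is harmless:

apiVersion: batch/v1
kind: Job
metadata:
  name: db-migrate
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: migrate
        image: registry.example.com/myapp-migrations:1.4.0
        command: ["/bin/sh", "-c"]
        # The script checks the current schema version first, so running the
        # Job again after the CD tool recreates it is a no-op.
        args:
        - |
          current=$(get-schema-version)
          if [ "$current" != "1.4.0" ]; then
            apply-migrations --up-to 1.4.0
          fi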

Solutions

Solution 1: Let Jobs self-destroy after execution

The process is well described in Kubernetes documentation:

“A way to clean up finished Jobs automatically is to use a TTL mechanism provided by a TTL controller for finished resources, by specifying the spec.ttlSecondsAfterFinished field of the Job”

Example:

apiVersion: batch/v1
kind: Job
metadata:
  name: job-with-ttlsecondsafterfinished
spec:
  ttlSecondsAfterFinished: 5
  template:
    spec:
...

When we add ttlSecondsAfterFinished to the spec, the TTL controller will delete the Job once it finishes. If the field is not present or the value is empty, the Job won’t be deleted, following the traditional behavior. A value of 0 will fire the deletion just after it finishes its execution. An integer value greater than 0 will define how many seconds will pass after the execution before the Job is deleted. This new feature has been stable since Kubernetes 1.23.

While this seems a pretty elegant way to clean finished jobs (future runs won’t complain about the update of immutable resources), it creates some problems for the CD tools.

The entire CD concepts rely on the fact that our external repos holding our application definition are the source of truth. That means that CD tools are constantly monitoring our deployment and notifying us about changes.

As a result, Fleet detects a change when the Job is deleted, so the deployment is marked as “Modified”:

Rancher Fleet modified Git repo

If we look at the details, we can see how Fleet detected that the Job is missing:

Rancher Fleet modified Git repo details

The change can also be seen in the corresponding Fleet Bundle status:

...
status:
  conditions:
  - lastUpdateTime: "2022-10-02T19:39:09Z"
    message: Modified(2) [Cluster fleet-default/c-ntztj]; job.batch fleet-job-with-ttlsecondsafterfinished/job-with-ttlsecondsafterfinished
      missing
    status: "False"
    type: Ready
  - lastUpdateTime: "2022-10-02T19:39:10Z"
    status: "True"
    type: Processed
  display:
    readyClusters: 0/2
    state: Modified
  maxNew: 50
  maxUnavailable: 2
  maxUnavailablePartitions: 0
  observedGeneration: 1
...

This solution is really easy to implement, with the only drawback of having repositories marked as Modified, which may be confusing over time.

Solution 2: Use a different Job name on each deployment

Here is when Helm comes to our help.

Using Helm’s processing, we can easily generate a random name for the Job each time Fleet performs an update. The old Job will also be removed as it’s no longer in our Git repository.

apiVersion: batch/v1
kind: Job
metadata:
  name: job-with-random-name-{{ randAlphaNum 8 | lower }}
  labels:
    realname: job-with-random-name
spec:
  template:
    spec:
      restartPolicy: Never
...

Summary

We hope what we have shared here helps to understand better how to manage Jobs in CD scenarios and how to deal with changes on immutable objects.

The code examples used in this blog are available on our GitHub repository: https://github.com/SUSE-Technical-Marketing/fleet-job-deployment-examples

If you have other scenarios that you’d want us to cover, we’d love to hear them and discuss ideas that can help improve Fleet in the future.

Please join us at the Rancher’s community Slack channel for Fleet and Continuous Delivery or at the CNCF official Slack Channel for Rancher.


Epinio and Crossplane: the Perfect Kubernetes Fit

Thursday, 18 August, 2022

One of the greatest challenges that operators and developers face is infrastructure provisioning: it should be resilient, reliable, reproducible and even audited. This is where Infrastructure as Code (IaC) comes in.

In the last few years, we have seen many tools that tried to solve this problem, sometimes offered by the cloud providers (AWS CloudFormation) or vendor-agnostic solutions like Terraform and Pulumi. However, Kubernetes is becoming the standard for application deployment, and that’s where Crossplane fits in the picture. Crossplane is an open source Kubernetes add-on that transforms your cluster into a universal control plane.

The idea behind Crossplane is to leverage Kubernetes manifests to build custom control planes that can compose and provision multi-cloud infrastructure from any provider.

If you’re an operator, its highly flexible approach gives you the power to create custom configurations, and the control plane will track any change, trying to keep the state of your infrastructure as you configured it.

On the other side, developers don’t want to bother with the infrastructure details. They want to focus on delivering the best product to their customers, possibly in the fastest way. Epinio is a tool from SUSE that allows you to go from code to URL in just one push without worrying about all the intermediate steps. It will take care of building the application, packaging your image, and deploying it into your cluster.

This is why these two open source projects fit perfectly – provisioning infrastructure and deploying applications inside your Kubernetes platform!

Let’s take a look at how we can use them together:

# Push our app 
-> % epinio push -n myapp -p assets/golang-sample-app 

# Create the service 
-> % epinio service create dynamodb-table mydynamo 

# Bind the two 
-> % epinio service bind mydynamo myapp 

That was easy! With just three commands, we have:

  1. Deployed our application
  2. Provisioned a DynamoDB Table with Crossplane
  3. Bound the service connection details to our application

Ok, probably too easy, but this was just the developer’s perspective. And this is what Epinio is all about: simplifying the developer experience.

Let’s look at how to set up everything to make it work!

Prerequisites

I’m going to assume that we already have a Kubernetes cluster with Epinio and Crossplane installed. To install Epinio, you can refer to our documentation. This was tested with the latest Epinio version v1.1.0, Crossplane v1.9.0 and the provider-aws v0.29.0.

Since we are using the enable-external-secret-stores alpha feature of Crossplane, we need to enable it by providing the args={--enable-external-secret-stores} value during the Helm installation:

-> % helm install crossplane \
    --create-namespace --namespace crossplane-system \
    crossplane-stable/crossplane \
    --set args={--enable-external-secret-stores}

 

Also, provide the same argument to the AWS Provider with a custom ControllerConfig:

apiVersion: pkg.crossplane.io/v1alpha1
kind: ControllerConfig
metadata:
  name: aws-config
spec:
  args:
  - --enable-external-secret-stores
---
apiVersion: pkg.crossplane.io/v1
kind: Provider
metadata:
  name: crossplane-provider-aws
spec:
  package: crossplane/provider-aws:v0.29.0
  controllerConfigRef:
    name: aws-config

 

Epinio services

To use Epinio and Crossplane together, we can leverage the Epinio Services. They provide a flexible way to add custom resources using Helm charts. The operator can prepare a Crossplane Helm chart to claim all resources needed. The Helm chart can then be added to the Epinio Service Catalog. Finally, the developers will be able to consume the service and have all the needed resources provisioned.

 

Prepare our catalog

We must prepare and publish our Helm chart to add our service to the catalog.

In our example, it will contain only a simple DynamoDB Table. In a real scenario, the operator will probably define a claim to a Composite Resource, but for simplicity, we are using a Managed Resource directly.

For a deeper look, I’ll invite you to take a look at the Crossplane documentation about composition.

We can see that this resource will “publish” its connection details to a secret defined with the publishConnectionDetailsTo attribute (this is the alpha feature that we need). The secret and the resource will carry the app.kubernetes.io/instance label with the name of the Epinio service instance; this label is what Epinio uses to correlate services and configurations.

apiVersion: dynamodb.aws.crossplane.io/v1alpha1
kind: Table
metadata:
  name: {{ .Release.Name | quote }}
  namespace: {{ .Release.Namespace | quote }}
  labels:
    app.kubernetes.io/instance: {{ .Release.Name | quote }}
spec:
  publishConnectionDetailsTo:
    name: {{ .Release.Name }}-conn
    metadata:
      labels:
        app.kubernetes.io/instance: {{ .Release.Name | quote }}
      annotations:
        kubed.appscode.com/sync: "kubernetes.io/metadata.name={{ .Release.Namespace }}"
  providerConfigRef:
    name: aws-provider-config
  forProvider:
    region: eu-west-1
    tags:
    - key: env
      value: test
    attributeDefinitions:
    - attributeName: Name
      attributeType: S
    - attributeName: Surname
      attributeType: S
    keySchema:
    - attributeName: Name
      keyType: HASH
    - attributeName: Surname
      keyType: RANGE
    provisionedThroughput:
      readCapacityUnits: 7
      writeCapacityUnits: 7

 

Note: You can see a kubed annotation in this Helm chart. This is because the generated secrets need to be in the same namespace as the services and applications. Since we are using a managed resource directly, the secret ends up in the namespace defined in the default StoreConfig (the crossplane-system namespace). We use kubed to copy this secret into the release namespace.
https://github.com/crossplane/crossplane/blob/master/design/design-doc-external-secret-stores.md#secret-configuration-publishconnectiondetailsto

 

We can now package and publish this Helm chart to a repository and add it to the Epinio Service Catalog by applying a service manifest containing the information on where to fetch the chart.

The application.epinio.io/catalog-service-secret-types annotation defines the list of secret types that Epinio should look for. Crossplane generates its secrets with their own secret type, so we need to declare it explicitly.

apiVersion: application.epinio.io/v1
kind: Service
metadata:
  name: dynamodb-table
  namespace: epinio
  annotations:
    application.epinio.io/catalog-service-secret-types: connection.crossplane.io/v1alpha1
spec:
  name: dynamodb-table
  shortDescription: A simple DynamoDBTable that can be used during development
  description: A simple DynamoDBTable that can be used during development
  chart: dynamodb-test
  chartVersion: 1.0.0
  appVersion: 1.0.0
  helmRepo:
    name: reponame
    url: https://charts.example.com/reponame
  values: ""

 

Now we can see that our custom service is available in the catalog:

-> % epinio service catalog

Create and bind a service

Now that our service is available in the catalog, the developers can use it to provision DynamoDBTables with Epinio:

-> % epinio service create dynamodb-table mydynamo

We can check that a dynamo table resource was created and that the corresponding table is available on AWS:

-> % kubectl get tables.dynamodb.aws.crossplane.io
-> % aws dynamodb list-tables

We can now create an app with the epinio push command. Once deployed, we can bind it to our service with the epinio service bind:

-> % epinio push -n myapp -p assets/golang-sample-app
-> % epinio service bind mydynamo myapp
-> % epinio service list

And that’s it! We can see that our application was bound to our service!

The bind command did a lot of things. It fetched the secrets generated by Crossplane and labeled them as Configurations. It also redeployed the application, mounting these configurations inside the container.

We can check this with some Epinio commands:

-> % epinio configuration list

-> % epinio configuration show x937c8a59fec429c4edeb339b2bb6-conn

The access path shown is available inside the application container. We can exec into the app and look at the contents of those files:

-> % epinio app exec myapp

 

Conclusion

In this blog post, I’ve shown you that it’s possible to create an Epinio Service that uses Crossplane to provide external resources to your Epinio application. We have seen that once the heavy lifting is done, provisioning a resource is just a matter of a couple of commands.

While some of these features are not yet fully ready, the Crossplane team is working hard on them, and I think they will be available soon!

Next Steps: Learn More at the Global Online Meetup on Epinio

Join our Global Online Meetup: Epinio on Wednesday, September 14th, 2022, at 11 AM EST. Dimitris Karakasilis and Robert Sirchia will discuss the Epinio GA 1.0 release and how it delivers applications to Kubernetes faster, along with a live demo! Sign up here.


Secure Supply Chain: Verifying Image Signatures in Kubewarden

Friday, 20 May, 2022

Secure Supply Chain: Verifying image signatures

 

With these latest releases, Kubewarden now supports verifying the integrity and authenticity of artifacts using the Sigstore project. In this post, we shall focus on verifying container image signatures using the new verify-image-signatures policy.

To learn more about how Sigstore works, take a look at our previous post

Verify Image Signatures Policy

This policy validates Pods by checking the images of their containers, init containers and ephemeral containers for signatures.

The policy can inspect all the container images defined inside of a Pod or it can just analyze the ones that are matching a pattern provided by the user.

Container image tags are mutable: they can be changed to point to completely different content. That’s why it’s a good security practice to reference container images by their immutable checksum.

This policy can rewrite the image definitions that are using a tag to instead reference the image by its checksum.

The policy will:

  • Ensure the image referenced by a tag is satisfying the signature requested by the operator
  • Extract the immutable reference of the image from the signatures
  • Rewrite the image reference to be in the form <image ref>@sha256:<digest>

Let’s see it in action!

For this example, a Kubernetes cluster with Kubewarden already installed is required. The installation process is described in the quick start guide.

We need an image with a signature that we can verify. You can use cosign to sign your images. For this example we’ll use the image ghcr.io/viccuad/app-example:v0.1.0 that was signed using keyless verification.

Obtain the issuer and subject using cosign.

COSIGN_EXPERIMENTAL=1 cosign verify ghcr.io/viccuad/app-example:v0.1.0
...
"Issuer": "https://token.actions.githubusercontent.com",
"Subject": "https://github.com/viccuad/app-example/.github/workflows/ci.yml@refs/tags/v0.1.0"
...

Let’s create a cluster-wide policy that will verify all images, and let’s use the issuer and subject for verification:

kubectl apply -f - <<EOF
apiVersion: policies.kubewarden.io/v1alpha2
kind: ClusterAdmissionPolicy
metadata:
  name: verify-image-signatures
spec:
  module: ghcr.io/kubewarden/policies/verify-image-signatures:v0.1.4
  rules:
  - apiGroups: [""]
    apiVersions: ["v1"]
    resources: ["pods"]
    operations:
    - CREATE
    - UPDATE
  mutating: true
  settings:
    signatures:
      - image: "*"
        keyless:
          - issuer: "https://token.actions.githubusercontent.com"
            subject: "https://github.com/viccuad/app-example/.github/workflows/ci.yml@refs/tags/v0.1.0"
EOF

Wait for the policy to be active:

kubectl wait --for=condition=PolicyActive clusteradmissionpolicies verify-image-signatures

Verify we can create pods with containers that are signed:

kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: verify-image-valid
spec:
  containers:
  - name: test-verify-image
    image: ghcr.io/viccuad/app-example:v0.1.0
EOF

Then check that the image was modified with the digest:

kubectl get pod verify-image-valid -o=jsonpath='{.spec.containers[0].image}'

This will produce the following output:

ghcr.io/viccuad/app-example:v0.1.0@sha256:d97d00f668dc5b7f0af65edbff6b37924c8e9b1edfc0ab0f7d2e522cab162d38

Finally, let’s try to create a pod with an image that it is not signed:

kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: verify-image-invalid
spec:
  containers:
  - name: test-verify-image
    image: ghcr.io/kubewarden/test-verify-image-signatures:unsigned
EOF

We will get the following error:

Error from server: error when creating "STDIN": admission webhook "clusterwide-verify-image-signatures.kubewarden.admission" denied the request: Pod verify-image-invalid is not accepted: verification of image ghcr.io/kubewarden/test-verify-image-signatures:unsigned failed: Host error: Callback evaluation failure: no signatures found for image: ghcr.io/kubewarden/test-verify-image-signatures:unsigned 

Recap

This policy is designed to meet all your needs. However, if you prefer you can build your own policy using one of the SDKs Kubewarden provides. We will show how to do this in an upcoming blog! Stay tuned!


Secure Supply Chain: Securing Kubewarden Policies

Wednesday, 4 May, 2022

With recent releases, the Kubewarden stack supports verifying the integrity and authenticity of content using the Sigstore project.

In this post, we focus on Kubewarden Policies and how to create a Secure Supply Chain for them.

Sigstore?

Since a full Sigstore dive is not within the scope of this post, we recommend checking out their nice docs.

In short, Sigstore provides an automatable workflow to match the distributed Open Source development model. The workflow specifies how to digitally sign and verify artifacts, which in our case are Kubewarden Policies. It also provides a transparency log to monitor such signatures. The workflow allows signing artifacts with traditional Public-Private key pairs, or in Keyless mode.

In the keyless mode, signatures are created with short-lived certs using an OpenID Connect (OIDC) service as identity provider. Those short-lived certs are issued by Sigstore’s PKI infrastructure, Fulcio.

Fulcio acts as a Registration Authority, authenticating that you are who you say you are by using an OIDC service (SSO via your own Okta instance, GitHub, Google, etc). Once authenticated, Fulcio acts as a Certificate Authority, issuing the short-lived certificate that you will use to sign artifacts.

These short-lived certificates include the identity information obtained from the OIDC service inside the certificate's extension attributes. The private key associated with the certificate is then used to sign the object, while the certificate itself has a public key that can be used to verify the signatures produced by the private key.

The certificates issued by Fulcio are only valid for a short period of time. This is an interesting property that we will discuss shortly.

Once the artifact is signed, the proof of signature is then sent to an append-only transparency log, Rekor, that allows monitoring of such signatures and protects against timing attacks. The proof of signature is signed by Rekor and this information is stored inside of the signature itself.

By using the timestamp found inside of the proof of signature, the verifier can ensure that the signing action has been performed during the limited lifetime of the certificate.

Because of this, the private key associated with the certificate doesn't need to be safely stored. It can be discarded at the end of the signature process. An attacker could even reuse the private key, but the signature would not be considered valid if used outside of the limited lifetime of the certificate.

Nobody (developers, project leads or sponsors) needs to have access to keys, and Sigstore never obtains your private key. Hence the term keyless. Additionally, you don't need expensive infrastructure for creating and validating signatures.

Since there’s no need for key secrets and the like in Keyless mode, it is easily automated inside CIs and implemented and monitored in the open. This is one of the reasons that makes it so interesting.

Building a Rust Sigstore stack

The policy server and libs within the Kubewarden stack are responsible for instantiating and running policies. They are written in Rust and therefore, we needed a good Rust implementation of Sigstore features. Since there weren’t any available, we are glad to announce that we have created a new crate, sigstore-rs, under the Sigstore org. This was done in an upstream-first manner and we’re happy to report that it is now taking a life of its own.

Securing kubewarden policies

As you may already know, Kubewarden Policies are small wasm-compiled binaries (~1 to ~6 MB) that are distributed via container registries as OCI artifacts. Let us see how Kubewarden protects policies against Secure Supply Chain attacks by signing and verifying them before they run.

Signing your Kubewarden Policy

Signing a Policy is done in the same way as signing a container image. This means adding the signature as a new layer of a dedicated signature object managed by Sigstore. In the Sigstore workflow, one can sign with a Public-Private key pair, or Keyless. Both can also add key=value annotations to the signatures.

The Public-Private key pair signing is straightforward, using sigstore/cosign:

$ COSIGN_PASSWORD=yourpass cosign generate-key-pair

Private key written to cosign.key
Public key written to cosign.pub

$ COSIGN_PASSWORD=yourpass cosign sign \
  --key cosign.key --annotations blog=yes \
  ghcr.io/kubewarden/policies/user-group-psp:v0.2.0

Pushing signature to: ghcr.io/kubewarden/policies/user-group-psp

The Keyless mode is more interesting:

$ COSIGN_EXPERIMENTAL=1 cosign sign \
  --annotations blog=yes \
  ghcr.io/kubewarden/policies/user-group-psp:v0.2.0

Generating ephemeral keys...
Retrieving signed certificate...
Your browser will now be opened to:
https://oauth2.sigstore.dev/auth/auth?access_type=online&client_id=sigstore&code_challenge=(...)
Successfully verified SCT...
tlog entry created with index: (...)
Pushing signature to: ghcr.io/viccuad/policies/volumes-psp

What happened? cosign prompted us for an OpenID Connect provider in the browser, which authenticated us, and instructed Fulcio to generate an ephemeral private key and an x509 certificate with the associated public key.

If this were to happen in a CI, the CI would provide the OIDC identity token in its environment. cosign has support for detecting some automated environments and producing an identity token. Currently that covers GitHub and Google Cloud, but one can always use a flag.
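
For example, in a CI system that cosign does not auto-detect, you could pass the OIDC token explicitly (the token path is a placeholder; check the cosign documentation for the flags supported by your version):

$ COSIGN_EXPERIMENTAL=1 cosign sign \
  --identity-token "$(cat /path/to/oidc-token)" \
  ghcr.io/kubewarden/policies/user-group-psp:v0.2.0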

We shall now detail how it works for policies built by the Kubewarden team in GitHub Actions. First, we call cosign, and sign the policy in keyless mode. The certificate issued by Fulcio includes the following details about the identity of the signer inside of its x509v3 extensions:

  • An issuer, telling you who certified the image:
    https://token.actions.githubusercontent.com
    
  • A subject related to the specific workflow and worker, for example:
    https://github.com/kubewarden/policy-secure-pod-images/.github/workflows/release.yml@refs/heads/main
    

If you are curious, and want to see the contents of one of the certificates issued by Fulcio, install the crane cli tool, jq and openssl and execute the following command:

crane manifest \
  $(cosign triangulate ghcr.io/kubewarden/policies/pod-privileged:v0.1.10) | \
  jq -r '.layers[0].annotations."dev.sigstore.cosign/certificate"' | \
  openssl x509 -noout -text -in -

The end result is the same. A signature is added as a new image layer of a special OCI object that is created and managed by Sigstore. You can view those signatures as added layers, with sha256-<sha>.sig, in the repo.

Even better, you can use tools like crane or the kwctl CLI to perform the same action, as demonstrated below.

kwctl pull <policy_url>; kwctl inspect <policy_url>

If you want to verify policies locally, you now can use kwctl verify:

$ kwctl verify --github-owner kubewarden registry://ghcr.io/kubewarden/policies/pod-privileged:v0.1.10
$ echo $?
0

When testing policies locally with kwctl pull or kwctl run, you can also enable signature verification by using any verification related flag. For example:

$ kwctl pull --github-owner kubewarden registry://ghcr.io/kubewarden/policies/pod-privileged:v0.1.10
$ echo $?
0

All the policies from the Kubewarden team are signed in keyless mode by the workers of the CI job, specifically GitHub Actions. We don't leave certs around, and they are verifiable by third parties.

Enforcing signature verification for instantiated Kubewarden policies

You can now configure PolicyServers to enforce that all policies being run need to be signed. When deploying Kubewarden via Helm charts, you can do so for the default PolicyServer installed by the kubewarden-defaults chart.

For this, the PolicyServers have a new spec.VerificationConfig argument. Here, you can put the name of a ConfigMap containing a “verification config”, to specify the needed signatures.

You can obtain a default verification config for policies from the Kubewarden team with:

$ kwctl scaffold verification-config
# Default Kubewarden verification config
#
# With this config, the only valid policies are those signed by Kubewarden
# infrastructure.
#
# This config can be saved to its default location (for this OS) with:
#   kwctl scaffold verification-config > /home/youruser/.config/kubewarden/verification-config.yml
#
# Providing a config in the default location enables Sigstore verification.
# See https://docs.kubewarden.io for more Sigstore verification options.
---
apiVersion: v1
allOf:
  - kind: githubAction
    owner: kubewarden
    repo: ~
    annotations: ~
anyOf: ~

The verification config format has several niceties, see its reference docs. For example, kind: githubAction with owner and repo, instead of checking the issuer and subject strings blindly. Or anyOf a list of signatures, with anyOf.atLeast a number of them: this allows for accepting at least a specific number of signatures, and makes migration between signatures in your cluster easy. It's the little things 🙂.

If you want support for other CIs (such as GitLab, Jenkins, etc) drop us a note on Slack or file a GitHub issue!

Once you have crafted your verification config, create your ConfigMap:

$ kubectl create configmap my-verification-config \
  --from-file=verification-config=./my-verification-config.yml \
  --namespace=kubewarden

And pass it to your PolicyServers in spec.VerificationConfig, or if using the default PolicyServer from the kubewarden-defaults chart, set it there with for example:

$ helm upgrade --set policyServer.verificationConfig=my-verification-config \
  --wait -n kubewarden kubewarden-defaults ./kubewarden-defaults
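
If you run your own PolicyServer resources instead of the default one, the reference could look like this (a sketch; the apiVersion and image tag may differ depending on your Kubewarden release):

apiVersion: policies.kubewarden.io/v1
kind: PolicyServer
metadata:
  name: verified-policy-server
spec:
  image: ghcr.io/kubewarden/policy-server:latest
  replicas: 1
  # Name of the ConfigMap (in the kubewarden namespace) holding the verification config
  verificationConfig: my-verification-config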

Recap

Using cosign sign policy authors can sign or author their policies. All the policies owned by the Kubewarden team have already been signed in this way.

With kwctl verify, operators can verify them, and with kwctl inspect (and other tools such as crane manifest), operators can inspect the signatures. We can keep using kwctl pull and kwctl run to test policies locally as in the past, plus now verify their signatures too. Once we are satisfied, we can deploy Kubewarden PolicyServers so they enforce those signatures. If we want, the same verification config format can be used for kwctl and the cluster stack.

This way we are sure that the policies come from their stated authors, and have not been tampered with. Phew!

We, the Kubewarden team, are curious about how you approach this. What workflows are you interested in? What challenges do you have? Drop us a word in our Slack channel or file a GitHub issue!

There are more things to secure in the chain and we’re excited for what lays ahead. Stay tuned for more blog entries on how to secure your supply chain with Kubewarden!