5.1 Monitoring Stack On Kubernetes

IMPORTANT: Monitoring Example

This is not an officially supported recommendation and does not claim complete coverage of any use case in a production environment.

The monitoring approach described in this document is a generalized example of one way of monitoring a SUSE CaaS Platform cluster.

Please apply best practices to develop your own monitoring approach using the described examples and available health checking endpoints.

This document aims to describe monitoring in a Kubernetes environment.

The monitoring stack consists of a metrics server, a visualization platform, and an ingress controller for authentication.

Prometheus Server & Alertmanager

Prometheus is an open-source monitoring system with a dimensional data model, flexible query language, efficient time series database and modern alerting approach.

Prometheus Alertmanager handles client alerts, deduplicates and silences noise, and routes them to configurable receivers.

Grafana

Grafana is an open-source system for querying, analyzing and visualizing metrics.

NGINX Ingress Controller

Deploying NGINX Ingress Controller allows us to provide TLS termination to our services and to provide basic authentication to the Prometheus Expression browser/API.

5.1.1 Prerequisites

  1. Monitoring namespace

    We will deploy our monitoring stack in its own namespace, so create one first.

    tux > kubectl create namespace monitoring
      
  2. Create DNS entries

    In this example, we will use a worker node with IP 192.168.1.113 to expose our services.

    You should configure proper DNS names in any production environment. These values are only for example purposes.

    monitoring.example.com                      IN  A       192.168.1.113
    prometheus.example.com                      IN  CNAME   monitoring.example.com
    prometheus-alertmanager.example.com         IN  CNAME   monitoring.example.com
    grafana.example.com                         IN  CNAME   monitoring.example.com
          

    Or add this entry to /etc/hosts

    192.168.1.113 prometheus.example.com prometheus-alertmanager.example.com grafana.example.com
          
  3. Create certificates

    You will need TLS certificates for the shared resources. If you are deploying in a pre-defined network environment, please get proper certificates from your network administrator. In this example, the domains are named after the components they represent: prometheus.example.com, prometheus-alertmanager.example.com and grafana.example.com.

5.1.2 NGINX Ingress Controller

Configure And Deploy NGINX Ingress Controller

  1. Choose which networking configuration the Ingress controller should have. Create a file nginx-ingress-config-values.yaml with one of the following examples as content.

    • NodePort: The services will be publicly exposed on each node of the cluster, including master nodes, at port 30080 for HTTP and 30443 for HTTPS.

      # Enable the creation of pod security policy
      podSecurityPolicy:
        enabled: true
      
      # Create a specific service account
      serviceAccount:
        create: true
        name: nginx-ingress
      
      # Publish services on port HTTP/30080
      # Publish services on port HTTPS/30443
      # These services are exposed on each node
      controller:
        service:
          type: NodePort
          nodePorts:
            http: 30080
            https: 30443
             
    • ClusterIP with external IP(s): The services will be exposed on specific nodes of the cluster, at port 80 for HTTP and port 443 for HTTPS.

      # Enable the creation of pod security policy
      podSecurityPolicy:
        enabled: true
      
      # Create a specific service account
      serviceAccount:
        create: true
        name: nginx-ingress
      
      # Publish services on port HTTP/80
      # Publish services on port HTTPS/443
      # These services are exposed on the node with IP 192.168.1.113
      controller:
        service:
          externalIPs:
            - 192.168.1.113
             
  2. Deploy the upstream helm chart and pass along our configuration values file.

    tux > helm install --name nginx-ingress stable/nginx-ingress \
    --namespace monitoring \
    --values nginx-ingress-config-values.yaml
       

    The result should be two running pods:

    tux > kubectl -n monitoring get po
    NAME                                             READY     STATUS    RESTARTS   AGE
    nginx-ingress-controller-74cffccfc-p8xbb         1/1       Running   0          4s
    nginx-ingress-default-backend-6b9b546dc8-mfkjk   1/1       Running   0          4s
    

5.1.3 TLS

You must configure your certificates for the components as secrets in Kubernetes. Get certificates from your local certificate authority. In this example we are using a single certificate shared by the components prometheus.example.com, prometheus-alertmanager.example.com and grafana.example.com.

NOTE: Create Individual Secrets For Components

Should you choose to secure each service with an individual certificate, you must repeat the step below for each component and adjust the name for the individual secret each time.

In this example the name is monitoring-tls.

IMPORTANT: Note Down Secret Names For Configuration

Please note down the names of the secrets you have created. Later configuration steps require secret names to be specified.

Create TLS secrets in Kubernetes

  1. tux > kubectl create -n monitoring secret tls monitoring-tls  \
    --key  ./monitoring.key \
    --cert ./monitoring.crt
     

Using Self-signed Certificates (optional)

In some cases you will want to create self-signed certificates for testing the stack. This is not recommended. If you are using proper CA-signed certificates, skip this section entirely.

Create Self-signed Certificates

  1. IMPORTANT: Do not use self-signed certificates in production environments. There is a severe risk of man-in-the-middle attacks. Use proper certificates signed by your CA.

  2. Create a file openssl.conf with the appropriate values

    [req]
    distinguished_name = req_distinguished_name
    req_extensions = v3_req
    default_md = sha256
    default_bits = 4096
    prompt=no
    
    [req_distinguished_name]
    C = CZ
    ST = CZ
    L = Prague
    O = example
    OU = monitoring
    CN = example.com
    emailAddress = admin@example.com
    
    [ v3_req ]
    basicConstraints = CA:FALSE
    keyUsage = keyEncipherment, dataEncipherment
    extendedKeyUsage = serverAuth
    subjectAltName = @alt_names
    
    [alt_names]
    DNS.1 = prometheus.example.com
    DNS.2 = prometheus-alertmanager.example.com
    DNS.3 = grafana.example.com
    

    This certificate uses Subject Alternative Names so it can be used for Prometheus, Alertmanager and Grafana.

  3. Generate certificate

    tux > openssl req -x509 -nodes -days 365 -newkey rsa:4096 \
    -keyout ./monitoring.key -out ./monitoring.crt \
    -config ./openssl.conf -extensions 'v3_req'
          
  4. Add TLS secret to Kubernetes

    tux > kubectl create -n monitoring secret tls monitoring-tls  \
    --key  ./monitoring.key \
    --cert ./monitoring.crt
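    To double-check that a certificate actually carries the three host names, inspect its Subject Alternative Name section with openssl x509. The sketch below generates a throwaway certificate with the same three names using the -addext shortcut (requires OpenSSL 1.1.1 or later; demo.key and demo.crt are placeholder file names, not part of the steps above) and prints its SAN section. The same inspection command also works on the monitoring.crt created above.

```shell
# Generate a throwaway self-signed certificate carrying the same three
# SANs as the example, using -addext instead of a config file.
openssl req -x509 -nodes -days 1 -newkey rsa:2048 \
  -keyout ./demo.key -out ./demo.crt \
  -subj "/CN=example.com" \
  -addext "subjectAltName=DNS:prometheus.example.com,DNS:prometheus-alertmanager.example.com,DNS:grafana.example.com"

# Print the Subject Alternative Name section; all three example
# host names should be listed.
openssl x509 -in ./demo.crt -noout -text \
  | grep -A 1 "Subject Alternative Name"
```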
        

5.1.4 Prometheus

NOTE: Prometheus Pushgateway

Deploying Prometheus Pushgateway is out of the scope of this document.

  1. Configure Authentication

    We need to create a basic-auth secret so the NGINX Ingress Controller can perform authentication.

    Install htpasswd on your local workstation

    tux > sudo zypper in apache2-utils
       

    Create the secret file auth

    IMPORTANT: The file must be named auth. When the secret is created, a key named after the filename will hold the file's contents, and the ingress controller expects a key named auth.

    tux > htpasswd -c auth admin
    New password:
    Re-type new password:
    Adding password for user admin
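    If apache2-utils is not available, a compatible auth file can also be produced with openssl alone (a sketch; 'example-password' is a placeholder, and openssl passwd -apr1 emits the same apr1 hash format that htpasswd uses):

```shell
# Create the file "auth" with an apr1-hashed entry for user "admin".
# "openssl passwd -apr1" produces the same hash format as htpasswd.
printf 'admin:%s\n' "$(openssl passwd -apr1 'example-password')" > auth

# Show the resulting entry (admin:$apr1$...).
cat auth
```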
    

    Create secret in Kubernetes

    tux > kubectl create secret generic -n monitoring prometheus-basic-auth --from-file=auth
      
  2. Create a configuration file prometheus-config-values.yaml

    We need to configure the storage for our deployment. Choose among the options and uncomment the line in the config file. In production environments you must configure persistent storage.

    • Use an existing PersistentVolumeClaim

    • Use a StorageClass (preferred)

    # Alertmanager configuration
    alertmanager:
      enabled: true
      ingress:
        enabled: true
        hosts:
        -  prometheus-alertmanager.example.com
        annotations:
          kubernetes.io/ingress.class: nginx
          nginx.ingress.kubernetes.io/auth-type: basic
          nginx.ingress.kubernetes.io/auth-secret: prometheus-basic-auth
          nginx.ingress.kubernetes.io/auth-realm: "Authentication Required"
        tls:
          - hosts:
            - prometheus-alertmanager.example.com
            secretName: monitoring-tls
      persistentVolume:
        enabled: true
        ## Use a StorageClass
        storageClass: my-storage-class
        ## Create a PersistentVolumeClaim of 2Gi
        size: 2Gi
        ## Use an existing PersistentVolumeClaim (my-pvc)
        #existingClaim: my-pvc
    
    ## AlertManager is configured through alertmanager.yml. This file and any others
    ## listed in alertmanagerFiles will be mounted into the alertmanager pod.
    ## See configuration options https://prometheus.io/docs/alerting/configuration/
    #alertmanagerFiles:
    #  alertmanager.yml:
    
    # Create a specific service account
    serviceAccounts:
      nodeExporter:
        name: prometheus-node-exporter
    
    # Allow scheduling of node-exporter on master nodes
    nodeExporter:
      hostNetwork: false
      hostPID: false
      podSecurityPolicy:
        enabled: true
        annotations:
          seccomp.security.alpha.kubernetes.io/allowedProfileNames: 'docker/default'
          apparmor.security.beta.kubernetes.io/allowedProfileNames: 'runtime/default'
          seccomp.security.alpha.kubernetes.io/defaultProfileName: 'docker/default'
          apparmor.security.beta.kubernetes.io/defaultProfileName: 'runtime/default'
      tolerations:
        - key: node-role.kubernetes.io/master
          operator: Exists
          effect: NoSchedule
    
    # Disable Pushgateway
    pushgateway:
      enabled: false
    
    # Prometheus configuration
    server:
      ingress:
        enabled: true
        hosts:
        - prometheus.example.com
        annotations:
          kubernetes.io/ingress.class: nginx
          nginx.ingress.kubernetes.io/auth-type: basic
          nginx.ingress.kubernetes.io/auth-secret: prometheus-basic-auth
          nginx.ingress.kubernetes.io/auth-realm: "Authentication Required"
        tls:
          - hosts:
            - prometheus.example.com
            secretName: monitoring-tls
      persistentVolume:
        enabled: true
        ## Use a StorageClass
        storageClass: my-storage-class
        ## Create a PersistentVolumeClaim of 8Gi
        size: 8Gi
        ## Use an existing PersistentVolumeClaim (my-pvc)
        #existingClaim: my-pvc
    
    ## Prometheus is configured through prometheus.yml. This file and any others
    ## listed in serverFiles will be mounted into the server pod.
    ## See configuration options
    ## https://prometheus.io/docs/prometheus/latest/configuration/configuration/
    #serverFiles:
    #  prometheus.yml:
       
  3. Deploy the upstream helm chart and pass our configuration values file.

    tux > helm install --name prometheus stable/prometheus \
    --namespace monitoring \
    --values prometheus-config-values.yaml
       

    The result should be six running pods (three of them node-exporter pods, one per node in this three-node example).

    tux > kubectl -n monitoring get po | grep prometheus
    NAME                                             READY     STATUS    RESTARTS   AGE
    prometheus-alertmanager-5487596d54-kcdd6         2/2       Running   0          2m
    prometheus-kube-state-metrics-566669df8c-krblx   1/1       Running   0          2m
    prometheus-node-exporter-jnc5w                   1/1       Running   0          2m
    prometheus-node-exporter-qfwp9                   1/1       Running   0          2m
    prometheus-node-exporter-sc4ls                   1/1       Running   0          2m
    prometheus-server-6488f6c4cd-5n9w8               2/2       Running   0          2m
       
  4. At this stage, the Prometheus Expression browser/API should be accessible, depending on your network configuration, at https://prometheus.example.com (ClusterIP with external IP) or https://prometheus.example.com:30443 (NodePort).
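    A quick end-to-end check is to query the Prometheus HTTP API through the ingress with the basic-auth user created earlier (a sketch; add -k if you are using self-signed certificates, and append port 30443 for the NodePort setup):

```shell
# Query the Prometheus HTTP API through the ingress; a healthy setup
# answers with a JSON document listing "up" series for the scrape targets.
curl -u admin 'https://prometheus.example.com/api/v1/query?query=up'
```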

5.1.5 Alertmanager Configuration Example

The configuration sets one "receiver" to get notified by email when a node meets one of these conditions:

  • Node is unschedulable

  • Node runs out of disk space

  • Node has memory pressure

  • Node has disk pressure

The first two conditions are critical because the node cannot accept new pods; the last two are warnings.

The Alertmanager configuration can be added to prometheus-config-values.yaml by adding the alertmanagerFiles section.

For more information on how to configure Alertmanager, refer to Prometheus: Alerting - Configuration.

Configuring Alertmanager

  1. Add the alertmanagerFiles section to your Prometheus configuration.

    alertmanagerFiles:
      alertmanager.yml:
        global:
          # The smarthost and SMTP sender used for mail notifications.
          smtp_from: alertmanager@example.com
          smtp_smarthost: smtp.example.com:587
          smtp_auth_username: admin@example.com
          smtp_auth_password: <password>
          smtp_require_tls: true
    
        route:
          # The labels by which incoming alerts are grouped together.
          group_by: ['node']
    
          # When a new group of alerts is created by an incoming alert, wait at
          # least 'group_wait' to send the initial notification.
          # This way ensures that you get multiple alerts for the same group that start
          # firing shortly after another are batched together on the first
          # notification.
          group_wait: 30s
    
          # When the first notification was sent, wait 'group_interval' to send a batch
          # of new alerts that started firing for that group.
          group_interval: 5m
    
          # If an alert has successfully been sent, wait 'repeat_interval' to
          # resend them.
          repeat_interval: 3h
    
          # A default receiver
          receiver: admin-example
    
        receivers:
        - name: 'admin-example'
          email_configs:
          - to: 'admin@example.com'
     
  2. Replace the empty rule set (rules: {}) in the serverFiles section of the configuration file.

    For more information on how to configure alerts, refer to: Prometheus: Alerting - Notification Template Examples

    serverFiles:
      alerts: {}
      rules:
        groups:
        - name: caasp.node.rules
          rules:
          - alert: NodeIsNotReady
            expr: kube_node_status_condition{condition="Ready",status="false"} == 1
            for: 1m
            labels:
              severity: critical
            annotations:
              description: '{{ $labels.node }} is not ready'
          - alert: NodeIsOutOfDisk
            expr: kube_node_status_condition{condition="OutOfDisk",status="true"} == 1
            labels:
              severity: critical
            annotations:
              description: '{{ $labels.node }} has insufficient free disk space'
          - alert: NodeHasDiskPressure
            expr: kube_node_status_condition{condition="DiskPressure",status="true"} == 1
            labels:
              severity: warning
            annotations:
              description: '{{ $labels.node }} has insufficient available disk space'
          - alert: NodeHasInsufficientMemory
            expr: kube_node_status_condition{condition="MemoryPressure",status="true"} == 1
            labels:
              severity: warning
            annotations:
              description: '{{ $labels.node }} has insufficient available memory'
            
  3. You should now be able to see your Alertmanager at https://prometheus-alertmanager.example.com/.

5.1.6 Grafana

Starting with Grafana 5.0, data sources and dashboards can be provisioned dynamically via files. In Kubernetes, these files are provided through ConfigMaps; editing a ConfigMap updates the configuration without having to delete and recreate the pod.

Configuring Grafana

  1. Configure provisioning

    Create the default data source configuration file grafana-datasources.yaml, which points to our Prometheus server.

    ---
    kind: ConfigMap
    apiVersion: v1
    metadata:
      name: grafana-datasources
      namespace: monitoring
      labels:
         grafana_datasource: "1"
    data:
      datasource.yaml: |-
        apiVersion: 1
        deleteDatasources:
          - name: Prometheus
            orgId: 1
        datasources:
        - name: Prometheus
          type: prometheus
          url: http://prometheus-server.monitoring.svc.cluster.local:80
          access: proxy
          orgId: 1
          isDefault: true
            
  2. Create the ConfigMap in Kubernetes

    tux > kubectl create -f grafana-datasources.yaml
            
  3. Configure storage for the deployment

    Choose among the options and uncomment the line in the config file. In production environments you must configure persistent storage.

    • Use an existing PersistentVolumeClaim

    • Use a StorageClass (preferred)

    • Create a file grafana-config-values.yaml with the appropriate values

    # Configure admin password
    adminPassword: <password>
    
    # Ingress configuration
    ingress:
      enabled: true
      annotations:
        kubernetes.io/ingress.class: nginx
      hosts:
        - grafana.example.com
      tls:
        - hosts:
          - grafana.example.com
          secretName: monitoring-tls
    
    # Configure persistent storage
    persistence:
      enabled: true
      accessModes:
        - ReadWriteOnce
      ## Use a StorageClass
      storageClassName: my-storage-class
      ## Create a PersistentVolumeClaim of 10Gi
      size: 10Gi
      ## Use an existing PersistentVolumeClaim (my-pvc)
      #existingClaim: my-pvc
    
    # Enable sidecar for provisioning
    sidecar:
      datasources:
        enabled: true
        label: grafana_datasource
      dashboards:
        enabled: true
        label: grafana_dashboard
       
  4. Deploy the upstream helm chart and pass our configuration values file.

    tux > helm install --name grafana stable/grafana \
    --namespace monitoring \
    --values grafana-config-values.yaml
       
  5. The result should be a running Grafana pod

    tux > kubectl -n monitoring get po | grep grafana
    NAME                                             READY     STATUS    RESTARTS   AGE
    grafana-dbf7ddb7d-fxg6d                          3/3       Running   0          2m
       

    At this stage, Grafana should be accessible, depending on your network configuration, at https://grafana.example.com (ClusterIP with external IP) or https://grafana.example.com:30443 (NodePort).
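    Reachability can be verified with Grafana's unauthenticated health endpoint (assuming the DNS entry from the prerequisites; add -k for self-signed certificates):

```shell
# Grafana answers on /api/health without authentication; a healthy
# instance reports "database": "ok" in a small JSON document.
curl https://grafana.example.com/api/health
```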

  6. Now you can deploy an existing Grafana dashboard or build your own.

Adding Grafana Dashboards

There are two ways to add dashboards to Grafana:

  • Deploy an existing dashboard from grafana.com

    • Open the deployed Grafana in your browser and log in.

    • On the home page of Grafana, hover your mouse cursor over the + button in the left sidebar and click the Import menu item.

    • Select an existing dashboard for your purpose from https://grafana.com/dashboards. Copy the URL to the clipboard.

    • Paste the URL (for example, https://grafana.com/dashboards/3131) into the first input field to import the "Kubernetes All Nodes" Grafana dashboard. After pasting the URL, the view changes to an import form.

    • Now select the "Prometheus" data source in the prometheus field and click the Import button.

    • The browser will redirect you to your newly created dashboard.

  • Deploy a configuration file containing the dashboard definition.

    • Create your dashboard definition file as a ConfigMap, for example grafana-dashboards-caasp-cluster.yaml.

      ---
      apiVersion: v1
      kind: ConfigMap
      metadata:
        name: grafana-dashboards-caasp-cluster
        namespace: monitoring
        labels:
           grafana_dashboard: "1"
      data:
        caasp-cluster.json: |-
          {
            "__inputs": [
              {
                "name": "DS_PROMETHEUS",
                "label": "Prometheus",
                "description": "",
                "type": "datasource",
                "pluginId": "prometheus",
                "pluginName": "Prometheus"
              }
            ],
            "__requires": [
              {
                "type": "grafana",
      [...]
      continues with definition of dashboard JSON
      [...]
               
    • Apply the ConfigMap to the cluster.

      tux > kubectl apply -f grafana-dashboards-caasp-cluster.yaml
              

    You can find a couple of dashboard examples for SUSE CaaS Platform in the Kubic project on GitHub. This repository provides dashboards to visualize Kubernetes resources.