GitOps with Kubernetes: a better way to deploy?

I recently gave a talk at TechTrain, a monthly event in Mechelen (Belgium), hosted by Cronos. The talk is called “GitOps with Kubernetes: a better way to deploy” and is an introduction to GitOps with Weaveworks Flux as an example.

You can find a re-recording of the presentation on Youtube:

Writing a Kubernetes operator with Kopf

In today’s post, we will write a simple operator with Kopf, which is a Python framework created by Zalando. A Kubernetes operator is a piece of software, running in Kubernetes, that does something application specific. To see some examples of what operators are used for, check out operatorhub.io.

Our operator will do something simple in order to easily grasp how it works:

  • the operator will create a deployment that runs nginx
  • nginx will serve a static website based on a git repository that you specify; we will use an init container to grab the website from git and store it in a volume
  • you can control the number of instances via a replicas parameter

That’s great but how will the operator know when it has to do something, like creating or updating resources? We will use custom resources for that. Read on to learn more…

Note: source files are on GitHub

Custom Resource Definition (CRD)

Kubernetes allows you to define your own resources. We will create a resource of type (kind) DemoWeb. The CRD is created with the YAML below:

# A simple CRD to deploy a demo website from a git repo
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: demowebs.baeke.info
spec:
  scope: Namespaced
  group: baeke.info
  versions:
    - name: v1
      served: true
      storage: true
  names:
    kind: DemoWeb
    plural: demowebs
    singular: demoweb
    shortNames:
      - dweb
  additionalPrinterColumns:
    - name: Replicas
      type: string
      priority: 0
      JSONPath: .spec.replicas
      description: Amount of replicas
    - name: GitRepo
      type: string
      priority: 0
      JSONPath: .spec.gitrepo
      description: Git repository with web content

For more information (and there is a lot) about CRDs, see the documentation.

Once you create the above resource with kubectl apply (or create), you can create a custom resource based on the definition:

apiVersion: baeke.info/v1
kind: DemoWeb
metadata:
  name: demoweb1
spec:
  replicas: 2
  gitrepo: "https://github.com/gbaeke/static-web.git"

Note that we specified our own API and version in the CRD (baeke.info/v1) and that we set the kind to DemoWeb. In the additionalPrinterColumns, we defined some properties that can be set in the spec that will also be printed on screen. When you list resources of kind DemoWeb, you will the see replicas and gitrepo columns:

Custom resources based on the DemoWeb CRD

Of course, creating the CRD and the custom resources is not enough. To actually create the nginx deployment when the custom resource is created, we need to write and run the operator.

Writing the operator

I wrote the operator on a Mac with Python 3.7.6 (64-bit). On Windows, for best results, make sure you use Miniconda instead of Python from the Windows Store. First install Kopf and the Kubernetes package:

pip3 install kopf kubernetes

Verify you can run kopf:

Running kopf

Let’s write the operator. You can find it in full here. Here’s the first part:

Naturally, we import kopf and other necessary packages. As noted before, kopf and kubernetes will have to be installed with pip. Next, we define a handler that runs whenever a resource of our custom type is spotted by the operator (with the @kopf.on.create decorator). The handler has two parameters:

  • spec object: allows us to retrieve our custom properties with spec.get (e.g. spec.get(‘replicas’, 1) – the second parameter is the default value)
  • **kwargs: a dictionary with lots of extra values we can use; we use it to retrieve the name of our custom resource (e.g. demoweb1); we can use that name to derive the name of our deployment and to set labels for our pods

Note: instead of using **kwargs to retrieve the name, you can also define an extra name parameter in the handler like so: def create_fn(spec, name, **kwargs); see the docs for more information

Our deployment is just yaml stored in the doc variable with some help from the Python yaml package. We use spec.get and the name variable to customise it.

After the doc variable, the following code completes the event handler:

The rest of the operator

With kopf.adopt, we make sure the deployment we create is a child of our custom resource. When we delete the custom resource, its children are also deleted.

Next, we simply use the kubernetes client to create a deployment via the apps/v1 api. The method create_namespaced_deployment takes two required parameters: the namespace and the deployment specification. Note there is only minimal error checking here. There is much more you can do with regards to error checking, retries, etc…

Now we can run the operator with:

kopf run operator-filename.py

You can perfectly run this on your local workstation if you have a working kube config pointing at a running cluster with the CRD installed. Kopf will automatically use that for authentication:

Running the operator on your workstation

Running the operator in your cluster

To run the operator in your cluster, create a Dockerfile that produces an image with Python, kopf, kubernetes and your operator in Python. In my case:

FROM python:3.7
RUN mkdir /src
ADD with_create.py /src
RUN pip install kopf
RUN pip install kubernetes
CMD kopf run /src/with_create.py --verbose

We added the verbose parameter for extra logging. Next, run the following commands to build and push the image (example with my image name):

docker build -t gbaeke/kopf-demoweb .
docker push gbaeke/kopf-demoweb

Now you can deploy the operator to the cluster:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: demowebs-operator
spec:
  replicas: 1
  strategy:
    type: Recreate
  selector:
    matchLabels:
      application: demowebs-operator
  template:
    metadata:
      labels:
        application: demowebs-operator
    spec:
      serviceAccountName: demowebs-account
      containers:
      - name: demowebs
        image: gbaeke/kopf-demoweb

The above is just a regular deployment but the serviceAccountName is extremely important. It gives kopf and your operator the required access rights to create the deployment is the target namespace. Check out the documentation to find out more about the creation of the service account and the required roles. Note that you should only run one instance of the operator!

Once the operator is deployed, you will see it running as a normal pod:

The operator is running

To see what is going on, check the logs. Let’s show them with octant:

Your operator logs

At the bottom, you see what happens when a creation event is detected for a resource of type DemoWeb. The spec is shown with the git repository and the number on replicas.

Now you can create resources of kind DemoWeb and see what happens. If you have your own git repository with some HTML in it, try to use that. Otherwise, just use mine at https://github.com/gbaeke/static-web.

Conclusion

Writing an operator is easy to do with the Kopf framework. Do note that we only touched on the basics to get started. We only have an on.create handler, and no on.update handler. So if you want to increase the number of replicas, you will have to delete the custom resource and create a new one. Based on the example though, it should be pretty easy to fix that. The git repo contains an example of an operator that also implements the on.update handler (with_update.py).

A quick tour of Kustomize

Image above from: https://kustomize.io/

When you have to deploy an application to multiple environments like dev, test and production there are many solutions available to you. You can manually deploy the app (Nooooooo! ūüėČ), use a CI/CD system like Azure DevOps and its release pipelines (with or without Helm) or maybe even a “GitOps” approach where deployments are driven by a tool such as Flux or Argo based on a git repository.

In the latter case, you probably want to use a configuration management tool like Kustomize for environment management. Instead of explaining what it does, let’s take a look at an example. Suppose I have an app that can be deployed with the following yaml files:

  • redis-deployment.yaml: simple deployment of Redis
  • redis-service.yaml: service to connect to Redis on port 6379 (Cluster IP)
  • realtime-deployment.yaml: application that uses the socket.io library to display real-time updates coming from a Redis channel
  • realtime-service.yaml: service to connect to the socket.io application on port 80 (Cluster IP)
  • realtime-ingress.yaml: ingress resource that defines the hostname and TLS certificate for the socket.io application (works with nginx ingress controller)

Let’s call this collection of files the base and put them all in a folder:

Base files for the application

Now I would like to modify these files just a bit, to install them in a dev namespace called realtime-dev. In the ingress definition I want to change the name of the host to realdev.baeke.info instead of real.baeke.info for production. We can use Kustomize to reach that goal.

In the base folder, we can add a kustomization.yaml file like so:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- realtime-ingress.yaml
- realtime-service.yaml
- redis-deployment.yaml
- redis-service.yaml
- realtime-deployment.yaml

This lists all the resources we would like to deploy.

Now we can create a folder for our patches. The patches define the changes to the base. Create a folder called dev (next to base). We will add the following files (one file blurred because it’s not relevant to this post):

kustomization.yaml contains the following:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: realtime-dev
resources:
- ./namespace.yaml
bases:
- ../base
patchesStrategicMerge:
- realtime-ingress.yaml
 

The namespace: realtime-dev ensures that our base resource definitions are updated with that namespace. In resources, we ensure that namespace gets created. The file namespace.yaml contains the following:

apiVersion: v1
kind: Namespace
metadata:
  name: realtime-dev 

With patchesStrategicMerge we specify the file(s) that contain(s) our patches, in this case just realtime-ingress.yaml to modify the hostname:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    kubernetes.io/ingress.class: nginx
  name: realtime-ingress
spec:
  rules:
  - host: realdev.baeke.info
    http:
      paths:
      - backend:
          serviceName: realtime
          servicePort: 80
        path: /
  tls:
  - hosts:
    - realdev.baeke.info
    secretName: real-dev-baeke-info-tls

Note that we also use certmanager here to issue a certificate to use on the ingress. For dev environments, it is better to use the Let’s Encrypt staging issuer instead of the production issuer.

We are now ready to generate the manifests for the dev environment. From the parent folder of base and dev, run the following command:

kubectl kustomize dev

The above command generates the patched manifests like so:

apiVersion: v1 
kind: Namespace
metadata:      
  name: realtime-dev
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: realtime
  name: realtime
  namespace: realtime-dev
spec:
  ports:
  - port: 80
    targetPort: 8080
  selector:
    app: realtime
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: redis
  name: redis
  namespace: realtime-dev
spec:
  ports:
  - port: 6379
    targetPort: 6379
  selector:
    app: redis
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: realtime
  name: realtime
  namespace: realtime-dev
spec:
  replicas: 1
  selector:
    matchLabels:
      app: realtime
  template:
    metadata:
      labels:
        app: realtime
    spec:
      containers:
      - env:
        - name: REDISHOST
          value: redis:6379
        image: gbaeke/fluxapp:1.0.5
        name: realtime
        ports:
        - containerPort: 8080
        resources:
          limits:
            cpu: 150m
            memory: 150Mi
          requests:
            cpu: 25m
            memory: 50Mi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: redis
  name: redis
  namespace: realtime-dev
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
      - image: redis:4-32bit
        name: redis
        ports:
        - containerPort: 6379
        resources:
          requests:
            cpu: 200m
            memory: 100Mi
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    kubernetes.io/ingress.class: nginx
  name: realtime-ingress
  namespace: realtime-dev
spec:
  rules:
  - host: realdev.baeke.info
    http:
      paths:
      - backend:
          serviceName: realtime
          servicePort: 80
        path: /
  tls:
  - hosts:
    - realdev.baeke.info
    secretName: real-dev-baeke-info-tls

Note that namespace realtime-dev is used everywhere and that the Ingress resource uses realdev.baeke.info. The original Ingress resource looked like below:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: realtime-ingress
  annotations:
    kubernetes.io/ingress.class: nginx
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
spec:
  tls:
  - hosts:
    - real.baeke.info
    secretName: real-baeke-info-tls
  rules:
  - host: real.baeke.info
    http:
      paths:
      - path: /
        backend:
          serviceName: realtime
          servicePort: 80 

As you can see, Kustomize has updated the host in tls: and rules: and also modified the secret name (which will be created by certmanager).

You have probably seen that Kustomize is integrated with kubectl. It’s also available as a standalone executable.

To directly apply the patched manifests to your cluster, run kubectl apply -k dev. The result:

namespace/realtime-dev created
service/realtime created
service/redis created
deployment.apps/realtime created
deployment.apps/redis created
ingress.extensions/realtime-ingress created

In another post, we will look at using Kustomize with Flux. Stay tuned!

Creating Kubernetes secrets from Key Vault

If you do any sort of development, you often have to deal with secrets. There are many ways to deal with secrets, one of them is retrieving the secrets from a secure system from your own code. When your application runs on Kubernetes and your code (or 3rd party code) cannot be configured to retrieve the secrets directly, you have several options. This post looks at one such solution: Azure Key Vault to Kubernetes from Sparebanken Vest, Norway.

In short, the solution connects to Azure Key Vault and does one of two things:

In my scenario, I just wanted regular secrets to use in a KEDA project that processes IoT Hub messages. The following secrets were required:

  • Connection string to a storage account: AzureWebJobsStorage
  • Connection string to IoT Hub’s event hub: EventEndpoint

In the YAML that deploys the pods that are scaled by KEDA, the secrets are referenced as follows:

env:
 - name: AzureFunctionsJobHost__functions__0
   value: ProcessEvents
 - name: FUNCTIONS_WORKER_RUNTIME
   value: node
 - name: EventEndpoint
   valueFrom:
     secretKeyRef:
       name: kedasample-event
       key: EventEndpoint
 - name: AzureWebJobsStorage
   valueFrom:
     secretKeyRef:
       name: kedasample-storage
       key: AzureWebJobsStorage

Because the YAML above is deployed with Flux from a git repo, we need to get the secrets from an external system. That external system in this case, is Azure Key Vault.

To make this work, we first need to install the controller that makes this happen. This is very easy to do with the Helm chart. By default, this Helm chart will work well on Azure Kubernetes Service as long as you give the AKS security principal read access to Key Vault:

Access policies in Key Vault (azure-cli-2019-… is the AKS service principal here)

Next, define the secrets in Key Vault:

Secrets in Key Vault

With the access policies in place and the secrets defined in Key Vault, the controller installed by the Helm chart can do its work with the following YAML:

apiVersion: spv.no/v1alpha1
kind: AzureKeyVaultSecret
metadata:
  name: eventendpoint
  namespace: default
spec:
  vault:
    name: gebakv
    object:
      name: EventEndpoint
      type: secret
  output:
    secret: 
      name: kedasample-event
      dataKey: EventEndpoint
      type: opaque
---
apiVersion: spv.no/v1alpha1
kind: AzureKeyVaultSecret
metadata:
  name: azurewebjobsstorage
  namespace: default
spec:
  vault:
    name: gebakv
    object:
      name: AzureWebJobsStorage
      type: secret
  output:
    secret: 
      name: kedasample-storage
      dataKey: AzureWebJobsStorage
      type: opaque     

The above YAML defines two objects of kind AzureKeyVaultSecret. In each object we specify the Key Vault secret to read (vault) and the Kubernetes secret to create (output). The above YAML results in two Kubernetes secrets:

Two regular secrets

When you look inside such a secret, you will see:

Inside the secret

To double check the secret, just do echo RW5K… | base64 -d to see the decoded secret and that it matches the secret stored in Key Vault. You can now reference the secret with ValueFrom as shown earlier in this post.

Conclusion

If you want to turn Azure Key Vault secrets into regular Kubernetes secrets for use in your manifests, give the solution from Sparebanken Vest a go. It is very easy to use. If you do not want regular Kubernetes secrets, opt for the Env Injector instead, which injects the environment variables directly in your pod.

Trying out k3sup

k3sup is a utility created by Alex Ellis to easily deploy k3s to any local or remote VM. In this post, I am giving the tool a try on a Civo cloud Ubuntu VM. You can of course pick any cloud provider you want or use a local system.

Deploying a VM on Civo Cloud

There’s not much to say here. Civo cloud is super simple to use and deploys VMs very fast. Just get an account and launch a new instance. Make sure you can access the VM over SSH. I deployed a simple Ubuntu 18.04 VM with 2 GBs of RAM:

VM deployed on Civo Cloud

Note: make sure you enable SSH via private/public key pair; use ssh-keygen to create the key pair and upload the contents of id_rsa.pub to Civo (SSH Keys section)

After deployment, check that you can access the VM with ssh chosen-user@IP-of-VM

Getting k3sup

On my Windows box, I used the Ubuntu shell to install k3sup:

curl -sLS https://get.k3sup.dev | sh 
sudo install k3sup /usr/local/bin/ 

You can now run the k3sup command as follows:

k3sup install --ip PUBLIC-IP-OF-CLOUD-VM --user root

And off it goes…

k3s installation via k3sup over SSH

At the end of the installation, you will see:

Saving file to: /home/gbaeke/kubeconfig

This means you can now use kubectl to interact with k3s. Just make sure kubectl knows where to find your kubeconfig file with (in my case in /home/gbaeke):

export KUBECONFIG=/home/gbaeke/kubeconfig

Before continuing, make sure your cloud VM allows access to TCP port 6443!

Now you can run something like kubectl get nodes:

kubectl running in Ubuntu shell on laptop to access k3s on remote VM

Installing applications

k3sup allows you to install the following applications to k3s via k3sup app install:

To install OpenFaas, just run k3sup app install openfaas. And off it goes….

Installing OpenFaas via k3sup

To install other applications, just use YAML files or any other method you prefer. It’s still Kubernetes! ūüėä

Conclusion

This was just a quick post (or note to self ūüėä) about k3sup which allows you to install k3s to any VM over SSH. It really is a great and simple to use tool so highly recommended. Note that Civo has a k3s service as well which is currently in beta. That service makes it easy to provision k3s from the Civo portal, similar to how you deploy AKS or GKE!

Giving linkerd a spin

A while ago, I gave linkerd a spin. Due to vacations and a busy schedule, I was not able to write about my experience. I will briefly discuss how to setup linkerd and then deploy a sample service to illustrate what it can do out of the box. Let’s go!

Wait! What is linkerd?

linkerd basically is a network proxy for your Kubernetes pods that’s designed to be deployed as a service mesh. When the pods you care about have been infused with linkerd, you will automatically get metrics like latency and requests per second, a web portal to check these metrics, live inspection of traffic and much more. Below is an example of a Kubernetes namespace that has been meshed:

A meshed namespace; all deployments in this particular namespace are meshed which means all pods get the linkerd network proxy that provides the metrics and features such as encryption

Installation

I can be very brief about this: installation is about as simple as it gets. Simply navigate to https://linkerd.io/2/getting-started to get started. Here are the simplified steps:

  • Download the linkerd executable as described in the Getting Started guide; I used WSL for this
  • Create a Kubernetes cluster with AKS (or another provider); for AKS, use the Azure CLI to get your credentials (az aks get-credentials); make sure the Azure CLI is installed in WSL and that you connected to your Azure subscription with az login
  • Make sure you can connect to your cluster with kubectl
  • Run linkerd check –pre to check if prerequisites are fulfilled
  • Install linkerd with linkerd install | kubectl apply -f –
  • Check the installation with linkerd check

The last step will nicely show its progress and end when the installation is complete:

linkerd check output

Exploring linkerd with the dashboard

linkerd automatically installs a dashboard. The dashboard is exposed as a Kubernetes service called linkerd-web. The service is of type ClusterIP. Although you could expose the service using an ingress, you can easily tunnel to the service with the following linkerd command (first line is the command; other lines are the output):

linkerd dashboard

Linkerd dashboard available at:
http://127.0.0.1:50750
Grafana dashboard available at:
http://127.0.0.1:50750/grafana
Opening Linkerd dashboard in the default browser
Failed to open Linkerd dashboard automatically
Visit http://127.0.0.1:50750 in your browser to view the dashboard

From WSL, the dashboard can not open automatically but you can manually browse to it. Note that linkerd also installs Prometheus and Grafana.

Out of the box, the linkerd deployment is meshed:

Adding linkerd to your own service

In this section, we will deploy a simple service that can add numbers and add linkerd to it. Although there are many ways to do this, I chose to create a separate namespace and enable auto-injection via an annotation. Here’s the yaml to create the namespace (add-ns.yaml):

apiVersion: v1
kind: Namespace
metadata:
  name: add
  annotations:
    linkerd.io/inject: enabled

Just run kubectl create -f add-ns.yaml to create the namespace. The annotation ensures that all pods added to the namespace get the linkerd proxy in the pod. All traffic to and from the pod will then pass through the proxy.

Now, let’s install the add service and deployment:

apiVersion: v1
kind: Service
metadata:
  name: add-svc
spec:
  ports:
  - port: 80
    name: http
    protocol: TCP
    targetPort: 8000
  - port: 8080
    name: grpc
    protocol: TCP
    targetPort: 8080
  selector:
    app: add
    version: v1
  type: LoadBalancer
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: add
spec:
  replicas: 2
  selector:
    matchLabels:
      app: add
  template:
    metadata:
      labels:
        app: add
        version: v1
    spec:
      containers:
      - name: add
        image: gbaeke/adder

The deployment deploys to two pods with the gbaeke/adder image. To deploy the above, save it to a file (add.yaml) and use the following command to deploy:

kubectl create -f add-yaml -n add

Because the deployment uses the add namespace, the linkerd proxy will be added to each pod automatically. When you list the pods in the deployment, you see:

Each add pod has two containers: the actual add container based on gbaeke/adder and the proxy

To see more details about one of these pods, I can use the following command:

k get po add-5b48fcc894-2dc97 -o yaml -n add

You will clearly see the two containers in the output:

Two containers in the pod: actual service (gbaeke/adder) and the linkerd proxy

Generating some traffic

Let’s deploy a client that continuously uses the calculator service:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: add-cli
spec:
  replicas: 1
  selector:
    matchLabels:
      app: add-cli
  template:
    metadata:
      labels:
        app: add-cli
    spec:
      containers:
      - name: add-cli
        image: gbaeke/adder-cli
        env:
        - name: SERVER
          value: "add-svc"

Save the above to add-cli.yaml and deploy with the below command:

kubectl create -f add-cli.yaml -n add

The deployment uses another image called gbaeke/adder-cli that continuously makes requests to the server specified in the SERVER environment variable.

Checking the deployment in the linkerd portal

When you now open the add namespace in the linked portal, you should see something similar to the below screenshot (note: I deployed 5 servers and 5 clients):

A view on the add namespace; linkerd has learned how the deployments talk to eachother

The linkerd proxy in all pods sees all traffic. From the traffic, it can infer that the add-cli deployment talks to the add deployment. The add deployment receives about 150 requests per second. The 99th percentile latency is relatively high because the cluster nodes are very small, I deployed more instances and the client is relatively inefficient.

When I click the deployment called add, the following screen is shown:

A view on the deployment

The deployment clearly shows where traffic is coming from plus relevant metrics such as RPS and P99 latency. You also get a view on the live calls now. Note that the client is using GRPC which uses a HTTP POST. When you scroll down on this page, you get more information about the caller and a view on the individual pods:

A view on the inbound calls to the deployment plus a view on the pods

To see live calls in more detail, you can click the Tap icon:

A live view on the calls with Tap

For each call, details can be requested:

Request details

Conclusion

This was just a brief look at linkerd. It is trivially easy to install and with auto-injection, very simple to add it to your own services. Highly recommended to give it a spin to see where it can add value to your projects!

Deploy AKS and Traefik with an Azure DevOps YAML pipeline

This post is a companion to the following GitHub repository: https://github.com/gbaeke/aks-traefik-azure-deploy. The repository contains ARM templates to deploy an AD integrated Kubernetes cluster and an IP address plus a Helm chart to deploy Traefik. Traefik is configured to use the deployed IP address. In addition to those files, the repository also contains the YAML pipeline, ready to be imported in Azure DevOps.

Let’s take a look at the different building blocks!

AKS ARM Template

The aks folder contains the template and a parameters file. You will need to modify the parameters file because it requires settings to integrate the AKS cluster with Azure AD. You will need to specify:

  • clientAppID: the ID of the client app registration
  • serverAppID: the ID of the server app registration
  • tenantID: the ID of your AD tenant

Also specify clientId, which is the ID of the service principal for your cluster. Both the serverAppID and the clientID require a password. The passwords have been set via a pipeline secret variable.

The template configures a fairly standard AKS cluster that uses Azure networking (versus kubenet). It also configures Log Analytics for the cluster (container insights).

Deploying the template from the YAML file is done with the task below. You will need to replace YOUR SUBSCRIPTION with an authorized service connection:

 # DEPLOY AKS IN TEST   
 - task: AzureResourceGroupDeployment@2
   inputs:
     azureSubscription: 'YOUR SUBSCRIPTION'
     action: 'Create Or Update Resource Group'
     resourceGroupName: '$(aksTestRG)'
     location: 'West Europe'
     templateLocation: 'Linked artifact'
     csmFile: 'aks/deploy.json'
     csmParametersFile: 'aks/deployparams.t.json'
     overrideParameters: '-serverAppSecret $(serverAppSecret) -clientIdsecret $(clientIdsecret) -clusterName $(aksTest)'
       deploymentMode: 'Incremental'
       deploymentName: 'CluTest' 

The task uses several variables like $(aksTestRG) etc… If you check azure-pipelines.yaml, you will notice that most are configured at the top of the file in the variables section:

variables:
  aksTest: 'clu-test'
  aksTestRG: 'rg-clu-test'
  aksTestIP: 'clu-test-ip' 

The two secrets are the secret ūüĒź vaiables. Naturally, they are configured in the Azure DevOps UI. Note that there are other means to store and obtain secrets, such as Key Vault. In Azure DevOps, the secret variables can be found here:

Azure DevOps secret variables

IP Address Template

The ip folder contains the ARM template to deploy the IP address. We need to deploy the IP address resource to the resource group that holds the AKS agents. With the names we have chosen, that name is MC_rg-clu-test_clu-test_westeurope. It is possible to specify a custom name for the resource group.

Because we want to obtain the IP address after deployment, the ARM template contains an output:

 "outputs": {
        "ipaddress": {
            "type": "string",
            "value": "[reference(concat('Microsoft.Network/publicIPAddresses/', parameters('ipName')), '2017-10-01').ipAddress]"
        }
     } 

The output ipaddress is of type string. Via the reference template function we can extract the IP address.

The ARM template is deployed like the AKS template but we need to capture the ARM outputs. The last line of the AzureResourceGroupDeployment@2 that deploys the IP address contains:

deploymentOutputs: 'armoutputs'

Now we need to extract the IP address and set it as a variable in the pipeline. One way of doing this is via a bash script:

 - task: Bash@3
      inputs:
        targetType: 'inline'
        script: |
          echo "##vso[task.setvariable variable=test-ip;]$(echo '$(armoutputs)' | jq .ipaddress.value -r)" 

You can set a variable in Azure DevOps with echo ##vso[task.setvariable variable=variable_name;]value. In our case, the “value” should be the raw string of the IP address output. The $(armoutputs) variable contains the output of the IP address ARM template as follows:

{"ipaddress":{"type":"String","value":"IP ADDRESS"}}

To extract IP ADDRESS, we pipe the output of “echo $(armoutputs)” to js .ipaddress.value -r which extracts the IP ADDRESS from the JSON. The -r parameter removes double quotes from the IP ADDRESS to give us the raw string. For more info about jq, check https://stedolan.github.io/jq/ .

We now have the IP address in the test-ip variable, to be used in other tasks via $(test-ip).

Taking care of the prerequisites

In a later phase, we install Traefik via Helm. So we need kubectl and helm on the build agent. In addition, we need to install tiller on the cluster. Because the cluster is RBAC-enabled, we need a cluster account and a role binding as well. The following tasks take care of all that:

- task: KubectlInstaller@0
   inputs:
     kubectlVersion: '1.13.5'


- task: HelmInstaller@1
   inputs:
     helmVersionToInstall: '2.14.1'

- task: AzureCLI@1
  inputs:
    azureSubscription: 'YOUR SUB'
    scriptLocation: 'inlineScript'
    inlineScript: 'az aks get-credentials -g $(aksTestRG) -n $(aksTest) --admin'

 - task: Bash@3
   inputs:
     filePath: 'tiller/tillerconfig.sh'
     workingDirectory: 'tiller/' 

Note that we use the AzureCLI built-in task to easily obtain the cluster credentials for kubectl on the build agent. We use the –admin flag to gain full access. Note that this downloads sensitive information to the build agent temporarily.

The last task just runs a shell script to configure the service account and role binding and install tiller. Check the repository to see the contents of this simple script. Note that this is the quick and easy way to install tiller, not the most secure way! ūüôá‚Äć‚ôāÔłŹ

Install Traefik and use the IP address

The repository contains the downloaded chart (helm fetch stable/traefik –untar). The values.yaml file was modified to set the ingressClass to traefik-ext. We could have used the chart from the Helm repository but I prefer having the chart in source control. Here’s the pipeline task:

 - task: HelmDeploy@0
   inputs:
     connectionType: 'None'
     namespace: 'kube-system'
     command: 'upgrade'
     chartType: 'FilePath'
     chartPath: 'traefik-ext/.'
     releaseName: 'traefik-ext'
     overrideValues: 'loadBalancerIP=$(test-ip)'
     valueFile: 'traefik-ext/values.yaml' 

kubectl is configured to use the cluster so connectionType can be set to ‘None’. We simply specify the IP address we created earlier by setting loadBalancerIP to $(test-ip) with the overrides for values.yaml. This sets the loadBalancerIP setting in Traefik’s service definition (in the templates folder). Service.yaml in the templates folder contains the following section:

 spec:
  type: {{ .Values.serviceType }}
  {{- if .Values.loadBalancerIP }}
  loadBalancerIP: {{ .Values.loadBalancerIP }}
  {{- end }} 

Conclusion

Deploying AKS together with one or more public IP addresses is a common scenario. Hopefully, this post together with the GitHub repo gave you some ideas about automating these deployments with Azure DevOps. All you need to do is create a pipeline from the repo. Azure DevOps will read the azure-pipelines.yml file automatically.

Quick Tip: deploying multiple Traefik ingresses

For a customer that is developing a microservices application, the proposed architecture contains two Kubernetes ingresses:

  • internal ingress: exposed via an Azure internal load balancer, deployed in a separate subnet in the customer’s VNET; no need for SSL
  • external ingress: exposed via an external load balancer; SSL via Let’s Encrypt

The internal ingress exposes API endpoints via Azure API Management and its ability to connect to internal subnets. The external ingress exposes web applications via Azure Front Door.

The Ingress Controller of choice is Traefik. We use the Helm chart to deploy Traefik in the cluster. The example below uses Azure Kubernetes Service so I will refer to Azure objects such as VNETs, subnets, etc… Let’s get started!

Internal Ingress

In values.yaml, use ingressClass to set a custom class. For example:

 kubernetes:
  ingressClass: traefik-int 

When you do not set this value, the default ingressClass is traefik. When you define the ingress object, you refer to this class in your manifest via the annotation below:

 annotations:
    kubernetes.io/ingress.class: traefik-int

When we deploy the internal ingress, we need to tell Traefik to create an internal load balancer. Optionally, you can specify a subnet to deploy to. You can add these options under the service section in values.yaml:

service:
  annotations:
    service.beta.kubernetes.io/azure-load-balancer-internal: "true"
    service.beta.kubernetes.io/azure-load-balancer-internal-subnet: "traefik" 

The above setting makes sure that the annotations are set on the service that the Helm chart creates to expose Traefik to the “outside” world. The settings are not Traefik specific.

Above, we want Kubernetes to deploy the Azure internal load balancer to a subnet called traefik. That subnet needs to exist in the VNET that contains the Kubernetes subnet. Make sure that the AKS service principal has the necessary access rights to deploy the load balancer in the subnet. If it takes a long time to deploy the load balancer, use kubectl get events in the namespace where you deploy Traefik (typically kube-system).

If you want to provide an static IP address to the internal load balancer, you can do so via the loadBalancerIp setting near the top of values.yaml. You can use any free address in the subnet where you deploy the load balancer.

loadBalancerIP: 172.20.3.10 

All done! You can now deploy the internal ingress with:

helm install . --name traefik-int --namespace kube-system

Note that we install the Helm chart from our local file system and that we are in the folder that contains the chart and values.yaml. Hence the dot (.) in the command.

TIP: if you want to use a private DNS zone to resolve the internal services, see the private DNS section in Azure API Management and Azure Kubernetes Service. Private DNS zones are still in preview.

External ingress

The external ingress is simple now. Just set the ingressClass to traefik-ext (or leave it at the default of traefik although that’s not very clear) and remove the other settings. If you want a static public IP address, you can create such an address first and specify it in values.yaml. In an Azure context, you would create a public IP object in the resource group that contains your Kubernetes nodes.

Conclusion

If you need multiple ingresses of the same type or brand, use distinct values for ingressClass and reference the class in your ingress manifest file. Naturally, when you use two different solutions, say Kong for APIs and Traefik for web sites, you do not need to do that since they use different ingressClass values by default (kong and traefik). Hope this quick tip was useful!

Azure Front Door and multi-region deployments

In the previous post, we looked at publishing and securing an API with Azure Front Door and Azure Web Application Firewall. The API ran on Kubernetes, exposed by Kong and Kong Ingress Controller. Kong was configured to require an API key to call the /users API, allowing us to identify the consumer of the API. The traffic flow was as follows:

Consumer -- HTTPS --> Azure Front Door with WAF policy -- HTTPS --> Kong (exposed with Azure Load Balancer) -- HTTP --> API Kubernetes service --> API pods 

Although Kubernetes makes the API(s) highly available, you might want to take extra precautions such as deploying the API in multiple regions. In this post, we will take a look at doing so. That means we will deploy the API in both West and North Europe, in two distinct Kubernetes clusters:

  • we-clu: Kubernetes cluster in West Europe
  • ne-clu: Kubernetes cluster in North Europe

The flow is very similar of course:

Consumer -- HTTPS --> Azure Front Door with WAF policy -- HTTPS --> Kong (exposed with Azure Load Balancer; region to connect to depends on Front Door configuration and health probes) -- HTTP --> API Kubernetes service --> API pods  

Let’s take a look at the configuration! By the way, the supporting files to deploy the Kubernetes objects are here: https://github.com/gbaeke/api-kong/tree/master. To deploy Kong, check out this post.

Kubernetes

We deploy a Kubernetes cluster in each region, install Helm, deploy Kong, deploy our API and configure ingresses and related Kong custom resource definitions (CRDs). The result is an external IP address in each region that leads to the Kong proxy. Search for “kong” on this blog to find posts with more details about this deployment.

Note that the API deployment specifies an environment variable that will contain the string WE or NE. This environment variable will be displayed in the output when we call the API. Here is the API deployment for West Europe:

apiVersion: v1
kind: Service
metadata:
  name: func
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: func
  type: ClusterIP
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: func
spec:
  replicas: 2
  selector:
    matchLabels:
      app: func
  template:
    metadata:
      labels:
        app: func
    spec:
      containers:
      - name: func
        image: gbaeke/ingfunc
        env:
        - name: REGION
          value: "WE"
        ports:
        - containerPort: 80 

When we call the API and Azure Front Door uses the backend in West Europe, the result will be:

WE included in the response from the West Europe cluster

Origin APIs

The origin APIs need to be exposed on the public Internet using a DNS name. Azure Front Door requires a public backend to connect to. Naturally, the backend can be configured to only accept incoming requests from Front Door. In our case, the APIs are available on the public IP of the Kong proxy. The following names were used:

  • api-o-we.baeke.info: Kong proxy in West Europe
  • api-o-ne.baeke.info; Kong proxy in North Europe

Both endpoints are configured to accept TLS connections only, and use a Let’s Encrypt wildcard certificate for *.baeke.info.

Front Door Configuration

The Front Door designer looks the same as in the previous post:

Front Door Designer

However, the backend pool api-o now has two backends:

Two backend hosts, both enabled with same priority and weight

To determine the health of the backend, Front Door needs to be configured with a health probe that returns status code 200. If we were to specify the probe below, the health probe would fail:

Errrrrrr, this won’t work

The health probe would hit Kong’s proxy and return a 404 (Not Found). We did not create a route for /, only for /users. With Azure Front Door, when all health probes fail, all backends are considered healthy. Yes, you read that right.

Although we could create a route called /health that returns a 200, we will use the following probe just to make it work:

Fixing the health probe (quick and dirty fix); just can the /users API

If you are exposing multiple APIs on each cluster, the health probe above would not make sense. Also note that the purpose of the health probe is to determine if the cluster is up or not. It will not fix one API behaving badly or being removed accidentally!

You can check the health probes from the portal:

Yep! Health probes in West and North Europe are at 100%

Connection test

When I connect from my home laptop in Belgium, I get the following response:

Connection to West Europe cluster

When I connect from my second home in Dublin ūü§∑‚Äć‚ôÄÔłŹ I get:

Connection from a VM in North Europe (I was kidding about the second home)

If you enable logging to Log Analytics, you can check this in the FrontDoorAccessLog:

Connection from home via Brussels (West Europe would show DB in Tenant_x)

When I remove the /users API in West Europe (kubectl delete deploy func), my home laptop will connect to North Europe as expected:

I didn’t fake this! It’s 100% real, my laptop now connects to the North Europe cluster via Front Door as expected

Note that the calls will not fail from the moment you delete the /users API (the health probe here). That depends on the following setting (backends in Front Door designer):

When should the backend be determined healthy or unhealthy; decrease sample size and or samples to make it go faster

The backend health percentage graph indicates the probe failure as well:

We’ve lost West Europe folks!

Conclusion

When you are going for a multi-region deployment of services, Azure Front Door is one of the options. Of course, there is much more to a multi-region deployment than the “front-end stuff” described in this post. What do you do with databases for instance? Can you use active-active write regions (e.g. Cosmos DB) or does your database only support active/passive with read replicas?

As in other load balancing and fail over solutions, proper health probes are crucial in the design. Think about what a good health probe can be and what it means when it is not available. One option is to just write a health probe exposed via an endpoint such as /health that merely returns a 200 status code. But your health probe could also be designed to connect to backend systems such as databases or queues to determine the health of the system.

Hopefully, this post gives you some ideas to start! Follow me on Twitter for updates.

Publishing and securing your API with Kong and Azure Front Door

In the post, Securing your API with Kong and CloudFlare, I exposed a dummy API on Kubernetes with Kong and published it securely with CloudFlare. The breadth of features and its ease of use made CloudFlare a joy to work with. It didn’t take long before I got the question: “can’t you do that with Azure only?”. The answer is obvious: “Of course you can!”

In this post, the traffic flow is as follows:

Consumer -- HTTPS --> Azure Front Door with WAF policy -- HTTPS --> Kong (exposed with Azure Load Balancer) -- HTTP --> API Kubernetes service --> API pods

Similarly to CloudFlare, Azure Front Door provides a fully trusted certificate for consumers of the API. In contrast to CloudFlare, Azure Front Door does not provide origin certificates which are trusted by Front Door. That’s easy to solve though by using a fully trusted Let’s Encrypt certificate which is stored as a Kubernetes secret and used in the Kubernetes Ingress definition. For this post, I requested a wildcard certificate for *.baeke.info via https://www.sslforfree.com/

Let’s take it step-by-step, starting at the API and Kong level.

APIs and Kong

Just like in the previous posts, we have a Kubernetes service called func and back-end pods that host the API implemented via Azure Functions in a container. Below you see the API pods in the default namespace. For convenience, Kong is also deployed in that namespace (not recommended in production):

A view on the API pods and Kong via k9s

The ingress definition is shown below:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: func
  namespace: default
  annotations:
    kubernetes.io/ingress.class: kong
    plugins.konghq.com: http-auth
spec:
  tls:
  - hosts:
    - api-o.baeke.info
    secretName: wildcard-baeke.info.tls
  rules:
    - host: api-o.baeke.info
      http:
        paths:
        - path: /users
          backend:
            serviceName: func
            servicePort: 80 

Kong will pick up the above definition and configure itself accordingly.

The API is exposed publicly via https://api-o.baeke.info where the o stands for origin. The secret wildcard-baeke.info.tls refers to a secret which contains the wildcard certificate for *.baeke.info:

apiVersion: v1
kind: Secret
metadata:
  name: wildcard-baeke.info.tls
  namespace: default
type: kubernetes.io/tls
data:
  tls.crt: certificate
  tls.key: key

Naturally, certificate and key should be replaced with the base64-encoded strings of the certificate and key you have obtained (in this case from https://www.sslforfree.com).

At the DNS level, api-o.baeke.info should refer to the external IP address of the exposed Kong Ingress Controller (proxy):

The service kong-kong-proxy is exposed via a public IP address (service of type LoadBalancer)

For the rest, the Kong configuration is not very different from the configuration in Securing your API with Kong and CloudFlare. I did remove the whitelisting configuration, which needs to be updated for Azure Front Door.

Great, we now have our API listening on https://api-o.baeke.info but it is not exposed via Azure Front Door and it does not have a WAF policy. Let’s change that.

Web Application Firewall (WAF) Policy

You can create a WAF policy from the portal:

WAF Policy

The above policy is set to detection only. No custom rules have been defined, but a managed rule set is activated:

Managed rule set for OWASP

The WAF policy was saved as baekeapiwaf. It will be attached to an Azure Front Door frontend. When a policy is attached to a frontend, it will be shown in the policy:

Associated frontends (Front Door front-ends)

Azure Front Door

We will now add Azure Front Door to obtain the following flow:

Consumer ---> https://api.baeke.info (Front Door + WAF) --> https://api-o.baeke.info

The final configuration in Front Door Designer looks like this:

Front Door Designer

When a request comes in for api.baeke.info, the response from api-o.baeke.info is served. Caching was not enabled. The frontend and backend are tied together via the routing rule.

The first thing you need to do is to add the azurefd.net frontend which is baeke-api.azurefd.net in the above config. There’s not much to say about that. Just click the blue plus next to Frontend hosts and follow the prompts. I did not attach a WAF policy to that frontend because it will not forward requests to the backend. We will use a custom domain for that.

Next, click the blue plus again to add the custom domain (here api.baeke.info). In your DNS zone, create a CNAME record that maps api.yourdomain.com to the azurefd.net name:

Mapping of custom domain to azurefd.net domain in CloudFlare DNS

I attached the WAF policy baekeapiwaf to the front-end domain:

WAF policy with OWASP rules to protect the API

Next, I added a certificate. When you select Front Door managed, you will get a Digicert managed image. If the CNAME mapping is not complete, you will get an e-mail from Digicert to approve certificate issuance. Make sure you check your e-mails if it takes long to issue the certificate. It will take a long time either way so be patient! ūüí§ūüí§ūüí§

Now that we have the frontend, specify the backend that Front Door needs to connect to:

Backend pool

The backend pool uses the API exposed at api-o.baeke.info as defined earlier. With only one backend, priority and weight are of no importance. It should be clear that you can add multiple backends, potentially in different regions, and load balance between them.

You will also need a health probe to check for healthy and unhealthy backends:

Health probes of the backend

Note that the above health check does NOT return a 200 OK status code. That is the only status code that would result in a healthy endpoint. With the above config, Kong will respond with a “no Route matched” 404 Not Found error instead. That does not mean that Front Door will not route to this endpoint though! When all endpoints are in a failed state, Front Door considers them healthy anyway ūüė≤ūüė≤ūüė≤ and routes traffic using round-robin. See the documentation for more info.

Now that we have the frontend and the backend, let’s tie the two together with a rule:

First part of routing rule

In the first part of the rule, we specify that we listen for requests to api.baeke.info (and not the azurefd.net domain) and that we only accept https. The pattern /* basically forwards everything to the backend.

In the route details, we specify the backend to route to:

Backend to route to

Clearly, we want to route to the api-o backend we defined earlier. We only connect to the backend via HTTPS. It only accepts HTTPS anyway, as defined at the Kong level via a KongIngress resource.

Note that it is possible to create a HTTP to HTTPS redirect rule. See the post Azure Front Door Revisited for more information. Without the rule, you will get the following warning:

Please disregard this warning ūüėé

Test, test, test

Let’s call the API via the http tool:

Clearly, Azure Front Door has served this request as indicated by the X-Azure-Ref header. Let’s try http:

Azure Front Door throws the above error because the routing rule only accepts https on api.baeke.info!

White listing Azure Front Door

To restrict calls to the backend to Azure Front Door, I used the following KongPlugin definition:

apiVersion: configuration.konghq.com/v1
kind: KongPlugin
metadata:
  name: whitelist-fd
  namespace: default
config:
  whitelist: 
  - 147.243.0.0/16
plugin: ip-restriction 

The IP range is documented here. Note that the IP range can and probably will change in the future.

In the ingress definition, I added the plugin via the annotations:

annotations:
  kubernetes.io/ingress.class: kong
  plugins.konghq.com: http-auth, whitelist-fd 

Calling the backend API directly will now fail:

That’s a no no! Please use the Front Door!

Conclusion

Publishing APIs (or any web app), whether they are running on Kubernetes or other systems, is easy to do with the combination of Azure Front Door and Web Application Firewall policies. Do take pricing into account though. It’s a mixture of relatively low fixed prices with variable pricing per GB and requests processed. In general, CloudFlare has the upper hand here, from both a pricing and features perspective. On the other hand, Front Door has advantages when it comes to automating its deployment together with other Azure resources. As always: plan, plan, plan and choose wisely! ūü¶Č