Trying out k3sup

k3sup is a utility created by Alex Ellis to easily deploy k3s to any local or remote VM. In this post, I am giving the tool a try on a Civo cloud Ubuntu VM. You can of course pick any cloud provider you want or use a local system.

Deploying a VM on Civo Cloud

There’s not much to say here. Civo cloud is super simple to use and deploys VMs very fast. Just get an account and launch a new instance. Make sure you can access the VM over SSH. I deployed a simple Ubuntu 18.04 VM with 2 GBs of RAM:

VM deployed on Civo Cloud

Note: make sure you enable SSH via private/public key pair; use ssh-keygen to create the key pair and upload the contents of id_rsa.pub to Civo (SSH Keys section)

After deployment, check that you can access the VM with ssh chosen-user@IP-of-VM

Getting k3sup

On my Windows box, I used the Ubuntu shell to install k3sup:

curl -sLS https://get.k3sup.dev | sh 
sudo install k3sup /usr/local/bin/ 

You can now run the k3sup command as follows:

k3sup install --ip PUBLIC-IP-OF-CLOUD-VM --user root

And off it goes…

k3s installation via k3sup over SSH

At the end of the installation, you will see:

Saving file to: /home/gbaeke/kubeconfig

This means you can now use kubectl to interact with k3s. Just make sure kubectl knows where to find your kubeconfig file with (in my case in /home/gbaeke):

export KUBECONFIG=/home/gbaeke/kubeconfig

Before continuing, make sure your cloud VM allows access to TCP port 6443!

Now you can run something like kubectl get nodes:

kubectl running in Ubuntu shell on laptop to access k3s on remote VM

Installing applications

k3sup allows you to install the following applications to k3s via k3sup app install:

To install OpenFaas, just run k3sup app install openfaas. And off it goes….

Installing OpenFaas via k3sup

To install other applications, just use YAML files or any other method you prefer. It’s still Kubernetes! 😊

Conclusion

This was just a quick post (or note to self 😊) about k3sup which allows you to install k3s to any VM over SSH. It really is a great and simple to use tool so highly recommended. Note that Civo has a k3s service as well which is currently in beta. That service makes it easy to provision k3s from the Civo portal, similar to how you deploy AKS or GKE!

GitOps with Weaveworks Flux – Installing and Updating Applications

In a previous post, we installed Weaveworks Flux. Flux synchronizes the contents of a git repository with your Kubernetes cluster. Flux can easily be installed via a Helm chart. As an example, we installed Traefik by adding the following yaml to the synced repository:

apiVersion: helm.fluxcd.io/v1
kind: HelmRelease
metadata:
  name: traefik
  namespace: default
  annotations:
    fluxcd.io/ignore: "false"
spec:
  releaseName: traefik
  chart:
    repository: https://kubernetes-charts.storage.googleapis.com/
    name: traefik
    version: 1.78.0
  values:
    serviceType: LoadBalancer
    rbac:
      enabled: true
    dashboard:
      enabled: true   

It does not matter where you put this file because Flux scans the complete repository. I added the file to a folder called traefik.

If you look more closely at the YAML file, you’ll notice its kind is HelmRelease. You need an operator that can handle this type of file, which is this one. In the previous post, we installed the custom resource definition and the operator manually.

Adding a custom application

Now it’s time to add our own application. You do not need to use Helm packages or the Helm operator to install applications. Regular yaml will do just fine.

The application we will deploy needs a Redis backend. Let’s deploy that first. Add the following yaml file to your repository:

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
  labels:
    app: redis       
spec:
  selector:
    matchLabels:     
      app: redis
  replicas: 1        
  template:          
    metadata:
      labels:        
        app: redis
    spec:            
      containers:
      - name: redis
        image: redis
        resources:
          requests:
            cpu: 200m
            memory: 100Mi
        ports:
        - containerPort: 6379
---        
apiVersion: v1
kind: Service        
metadata:
  name: redis
  labels:            
    app: redis
spec:
  ports:
  - port: 6379       
    targetPort: 6379
  selector:          
    app: redis

After committing this file, wait a moment or run fluxctl sync. When you run kubectl get pods for the default namespace, you should see the Redis pod:

Redis is running — yay!!!

Now it’s time to add the application. I will use an image, based on the following code: https://github.com/gbaeke/realtime-go (httponly branch because master contains code to automatically request a certificate with Let’s Encrypt). I pushed the image to Docker Hub as gbaeke/fluxapp:1.0.0. Now let’s deploy the app with the following yaml:

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: realtime
  labels:
    app: realtime       
spec:
  selector:
    matchLabels:     
      app: realtime
  replicas: 1        
  template:          
    metadata:
      labels:        
        app: realtime
    spec:            
      containers:
      - name: realtime
        image: gbaeke/fluxapp:1.0.0
        env:
        - name: REDISHOST
          value: "redis:6379"
        resources:
          requests:
            cpu: 50m
            memory: 50Mi
          limits:
            cpu: 150m
            memory: 150Mi
        ports:
        - containerPort: 8080
---        
apiVersion: v1
kind: Service        
metadata:
  name: realtime
  labels:            
    app: realtime
spec:
  ports:
  - port: 80       
    targetPort: 8080
  selector:          
    app: realtime
---
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: realtime-ingress
spec:
  rules:
  - host: realtime.IP.xip.io
    http:
      paths:
      - path: /
        backend:
          serviceName: realtime
          servicePort: 80

In the above yaml, replace IP in the Ingress specification to the IP of the external load balancer used by your Ingress Controller. Once you add the yaml to the git repository and you run fluxctl sync the application should be deployed. You see the following page when you browse to http://realtime.IP.xip.io:

Web app deployed via Flux and standard yaml

Great, v1.0.0 of the app is deployed using the gbaeke/fluxapp:1.0.0 image. But what if I have a new version of the image and the yaml specification does not change? Read on…

Upgrading the application

If you have been following along, you can now run the following command:

fluxctl list-workloads -a

This will list all workloads on the cluster, including the ones that were not installed by Flux. If you check the list, none of the workloads are automated. When a workload is automated, it can automatically upgrade the application when a new image appears. Let’s try to automate the fluxapp. To do so, you can either add annotations to your yaml or use fluxctl. Let’s use the yaml approach by adding the following to our deployment:

annotations:
    flux.weave.works/automated: "true"
    flux.weave.works/tag.realtime: semver:~1.0

Note: Flux only works with immutable tags; do not use latest

After committing the file and running fluxctl sync, you can run fluxctl list-workloads -a again. The deployment should now be automated:

fluxapp is now automated

Now let’s see what happens when we add a new version of the image with tag 1.0.1. That image uses a different header color to show the difference. Flux monitors the repository for changes. When it detects a new version of the image that matches the semver filter, it will modify the deployment. Let’s check with fluxctl list-workloads -a:

new image deployed

And here’s the new color:

New color in version 1.0.1. Exciting! 😊

But wait… what about the git repo?

With the configuration of a deploy key, Flux has access to the git repository. When a deployment is automated and the image is changed, that change is also reflected in the git repo:

Weave Flux updated the realtime yaml file

In the yaml, version 1.0.1 is now used:

Flux updated the yaml file

What if I don’t like this release? With fluxctl, you can rollback to a previous version like so:

Rolling back a release – will also update the git repo

Although this works, the deployment will be updated to 1.0.1 again since it is automated. To avoid that, first lock the deployment (or workload) and then force the release of the old image:

fluxctl lock -w=deployment/realtime

fluxctl release -n default --workload=deployment/realtime --update-image=gbaeke/fluxapp:1.0.0 --force

In your yaml, there will be an additional annotation: fluxcd.io/locked: ‘true’ and the image will be set to 1.0.0.

Conclusion

In this post, we looked at deploying and updating an application via Flux automation. You only need a couple of annotations to make this work. This was just a simple example. For an example with dev, staging and production branches and promotion from staging to production, be sure to look at https://github.com/fluxcd/helm-operator-get-started as well.

GitOps with Weaveworks Flux

If you have ever deployed applications to Kubernetes or other platforms, you are probably used to the following approach:

  • developers check in code which triggers CI (continuous integration) and eventually results in deployable artifacts
  • a release process deploys the artifacts to one or more environments such as a development and a production environment

In the case of Kubernetes, the artifact is usually a combination of a container image and a Helm chart. The release process then authenticates to the Kubernetes cluster and deploys the artifacts. Although this approach works, I have always found this deployment process overly complicated with many release pipelines configured to trigger on specific conditions.

What if you could store your entire cluster configuration in a git repository as the single source of truth and use simple git operations (is there such a thing? 😁) to change your configuration? Obviously, you would need some extra tooling that synchronizes the configuration with the cluster, which is exactly what Weaveworks Flux is designed to do. Also check the Flux git repo.

In this post, we will run through a simple example to illustrate the functionality. We will do the following over two posts:

Post one:

  • Create a git repo for our configuration
  • Install Flux and use the git repo as our configuration source
  • Install an Ingress Controller with a Helm chart

Post two:

  • Install an application using standard YAML (including ingress definition)
  • Update the application automatically when a new version of the application image is available

Let’s get started!

Create a git repository

To keep things simple, make sure you have an account on GitHub and create a new repository. You can also clone my demo repository. To clone it, use the following command:

git clone https://github.com/gbaeke/gitops-sample.git

Note: if you clone my repo and use it in later steps, the resources I defined will get created automatically; if you want to follow the steps, use your own empty repo

Install Flux

Flux needs to be installed on Kubernetes, so make sure you have a cluster at your disposal. In this post, I use Azure Kubernetes Services (AKS). Make sure kubectl points to that cluster. If you have kubectl installed, obtain the credentials to the cluster with the Azure CLI and then run kubectl get nodes or kubectl cluster-info to make sure you are connected to the right cluster.

az aks get-credentials -n CLUSTER_NAME -g RESOURCE_GROUP

It is easy to install Flux with Helm and in this post, I will use Helm v3 which is currently in beta. You will need to install Helm v3 on your system. I installed it in Windows 10’s Ubuntu shell. Use the following command to download and unpack it:

curl -sSL "https://get.helm.sh/helm-v3.0.0-beta.3-linux-amd64.tar.gz" | tar xvz

This results in a folder linux-amd64 which contains the helm executable. Make the file executable with chmod +x and copy it to your path as helmv3. Next, run helmv3. You should see the help text:

The Kubernetes package manager
 
Common actions for Helm:

- helm search:    search for charts
- helm fetch:     download a chart to your local directory to view
- helm install:   upload the chart to Kubernetes
- helm list:      list releases of charts 
...

Now you are ready to install Flux. First, add the FLux Helm repository to allow helmv3 to find the chart:

helmv3 repo add fluxcd https://charts.fluxcd.io

Create a namespace for Flux:

kubectl create ns flux

Install Flux in the namespace with Helm v3:

helmv3 upgrade -i flux fluxcd/flux --wait \
 --namespace flux \
 --set registry.pollInterval=1m \
 --set git.pollInterval=1m \
 --set git.url=git@github.com:GITHUBUSERNAME/gitops-sample

The above command upgrades Flux but installs it if it is missing (-i). The chart to install is fluxcd/flux. With –wait, we wait until the installation is finished. We will not go into the first two –set options for now. The last option defines the git repository Flux should use to sync the configuration to the cluster. Currently, Flux supports one repository. Because we use a public repository, Flux can easily read its contents. At times, Flux needs to update the git repository. To support that, you can add a deploy key to the repository. First, install the fluxctl tool:

curl -sL https://fluxcd.io/install | sh
export PATH=$PATH:$HOME/.fluxcd/bin

Now run the following commands to obtain the public key to use as deploy key:

export FLUX_FORWARD_NAMESPACE=flux
fluxctl identity

The output of the command is something like:

ssh-rsa AAAAB3NzaC1yc2EAAAA...

Copy and paste this key as a deploy key for your github repo:

git repo deploy key

Phew… Flux should now be installed on your cluster. Time to install some applications to the cluster from the git repo.

Note: Flux also supports private repos; it just so happens I used a public one here

Install an Ingress Controller

Let’s try to install Traefik via its Helm chart. Since I am not using traditional CD with pipelines that run helm commands, we will need something else. Luckily, there’s a Flux Helm Operator that allows us to declaratively install Helm charts. The Helm Operator installs a Helm chart when it detects a custom resource definition (CRD) of type helm.fluxcd.io/v1. Let’s first create the CRD for Helm v3:

kubectl apply -f https://raw.githubusercontent.com/fluxcd/helm-operator/master/deploy/flux-helm-release-crd.yaml

Next, install the operator:

helmv3 upgrade -i helm-operator flux/helm-operator --wait \
 --namespace fluxcd \
 --set git.ssh.secretName=flux-git-deploy \
 --set git.pollInterval=1m \
 --set chartsSyncInterval=1m \
 --set configureRepositories.enable=true \
 --set configureRepositories.repositories[0].name=stable \
 --set configureRepositories.repositories[0].url=https://kubernetes-charts.storage.googleapis.com \
 --set extraEnvs[0].name=HELM_VERSION \
 --set extraEnvs[0].value=v3 \
 --set image.repository=docker.io/fluxcd/helm-operator-prerelease \
 --set image.tag=helm-v3-71bc9d62

You didn’t think I found the above myself did you? 😁 It’s from an excellent tutorial here.

When the operator is installed, you should be able to install Traefik with the following YAML:

apiVersion: helm.fluxcd.io/v1
kind: HelmRelease
metadata:
  name: traefik
  namespace: default
  annotations:
    fluxcd.io/ignore: "false"
spec:
  releaseName: traefik
  chart:
    repository: https://kubernetes-charts.storage.googleapis.com/
    name: traefik
    version: 1.78.0
  values:
    serviceType: LoadBalancer
    rbac:
      enabled: true
    dashboard:
      enabled: true   

Just add the above YAML to the GitHub repository. I added it to the ingress folder:

traefik.yaml added to the GitHub repo

If you wait a while, or run fluxctl sync, the repo gets synced and the resources created. When the helm.fluxcd.io/v1 object is created, the Helm Operator will install the chart in the default namespace. Traefik will be exposed via an Azure Load Balancer. You can check the release with the following command:

kubectl get helmreleases.helm.fluxcd.io

NAME      RELEASE   STATUS     MESSAGE                  AGE
traefik   traefik   deployed   helm install succeeded   15m

Also check that the Traefik pod is created in the default namespace (only 1 replica; the default):

kubectl get po

NAME                       READY   STATUS    RESTARTS   AGE
traefik-86f4c5f9c9-gcxdb   1/1     Running   0          21m

Also check the public IP of Traefik:

kubectl get svc
 
NAME                TYPE           CLUSTER-IP     EXTERNAL-IP 
traefik             LoadBalancer   10.0.8.59      41.44.245.234   

We will later use that IP when we define the ingress for our web application.

Conclusion

In this post, you learned a tiny bit about GitOps with WeaveWorks Flux. The concept is simple enough: store your cluster config in a git repo as the single source of truth and use git operations to initiate (or rollback) cluster operations. To start, we simply installed Traefik via the Flux Helm Operator. In a later post, we will add an application and look at image management. There’s much more you can do so stay tuned!

The basics of meshing Traefik 2.0 with Linkerd

A while ago, I blogged about Linkerd 2.x. In that post, I used a simple calculator API, reachable via an Azure Load Balancer. When you look at that traffic in Linkerd, you see the following:

Incoming load balancer traffic to a meshed deployment (in this case Traefik 2.0)

Above, you do not see this is Azure Load Balancer traffic. The traffic reaches the meshed service via the Azure CNI pods.

In this post, we will install Traefik 2.0, mesh the Traefik deployment and make the calculator service reachable via Traefik and the new IngressRoute. Let’s get started!

Install Traefik 2.0

We will install Traefik 2.0 with http support only. There’s an excellent blog that covers the installation over here. In short, you do the following:

  • deploy prerequisites such as custom resource definitions (CRDs), ClusterRole, ClusterRoleBinding, ServiceAccount
  • deploy Traefik 2.0: it’s just a Kubernetes deployment
  • deploy a service to expose the Traefik HTTP endpoint via a Load Balancer; I used an Azure Load Balancer automatically deployed via Azure Kubernetes Service (AKS)
  • deploy a service to expose the Traefik admin endpoint via an IngressRoute

Here are the prerequisites for easy copy and pasting:

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: ingressroutes.traefik.containo.us

spec:
  group: traefik.containo.us
  version: v1alpha1
  names:
    kind: IngressRoute
    plural: ingressroutes
    singular: ingressroute
  scope: Namespaced

---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: ingressroutetcps.traefik.containo.us

spec:
  group: traefik.containo.us
  version: v1alpha1
  names:
    kind: IngressRouteTCP
    plural: ingressroutetcps
    singular: ingressroutetcp
  scope: Namespaced

---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: middlewares.traefik.containo.us

spec:
  group: traefik.containo.us
  version: v1alpha1
  names:
    kind: Middleware
    plural: middlewares
    singular: middleware
  scope: Namespaced

---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: tlsoptions.traefik.containo.us

spec:
  group: traefik.containo.us
  version: v1alpha1
  names:
    kind: TLSOption
    plural: tlsoptions
    singular: tlsoption
  scope: Namespaced

---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: traefik-ingress-controller

rules:
  - apiGroups:
      - ""
    resources:
      - services
      - endpoints
      - secrets
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - extensions
    resources:
      - ingresses
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - extensions
    resources:
      - ingresses/status
    verbs:
      - update
  - apiGroups:
      - traefik.containo.us
    resources:
      - middlewares
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - traefik.containo.us
    resources:
      - ingressroutes
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - traefik.containo.us
    resources:
      - ingressroutetcps
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - traefik.containo.us
    resources:
      - tlsoptions
    verbs:
      - get
      - list
      - watch

---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: traefik-ingress-controller

roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: traefik-ingress-controller
subjects:
  - kind: ServiceAccount
    name: traefik-ingress-controller
    namespace: default

---
apiVersion: v1
kind: ServiceAccount
metadata:
  namespace: default
  name: traefik-ingress-controller

Save this to a file and then use kubectl apply -f filename.yaml. Here’s the deployment:

kind: Deployment
apiVersion: extensions/v1beta1
metadata:
  namespace: default
  name: traefik
  labels:
    app: traefik

spec:
  replicas: 2
  selector:
    matchLabels:
      app: traefik
  template:
    metadata:
      labels:
        app: traefik
    spec:
      serviceAccountName: traefik-ingress-controller
      containers:
        - name: traefik
          image: traefik:v2.0
          args:
            - --api
            - --accesslog
            - --entrypoints.web.Address=:8000
            - --entrypoints.web.forwardedheaders.insecure=true
            - --providers.kubernetescrd
            - --ping
            - --accesslog=true
            - --log=true
          ports:
            - name: web
              containerPort: 8000
            - name: admin
              containerPort: 8080

Here’s the service to expose Traefik’s web endpoint. This is different from the post I referred to because that post used DigitalOcean. I am using Azure here.

apiVersion: v1
kind: Service
metadata:
  name: traefik
spec:
  type: LoadBalancer
  ports:
    - protocol: TCP
      name: web
      port: 80
      targetPort: 8000
  selector:
    app: traefik

The above service definition will give you a public IP. Traffic destined to port 80 on that IP goes to the Traefik pods on port 8000.

Now we can expose the Traefik admin interface via Traefik itself. Note that I am not using any security here. Check the original post for basic auth config via middleware.

apiVersion: v1
kind: Service
metadata:
  name: traefik-admin
spec:
  type: ClusterIP
  ports:
    - protocol: TCP
      name: admin
      port: 8080
  selector:
    app: traefik
---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: traefik-admin
spec:
  entryPoints:
    - web
  routes:
  - match: Host(`somehost.somedomain.com`) && PathPrefix(`/`)
    kind: Rule
    priority: 1
    services:
    - name: traefik-admin
      port: 8080

Traefik’s admin site is first exposed as a ClusterIP service on port 8080. Next, an object of kind IngressRoute is defined, which is new for Traefik 2.0. You don’t need to create standard Ingress objects and configure Traefik with custom annotations. This new approach is cleaner. Of course, substitute the host with a host that points to the public IP of the load balancer. Or use the IP address with the xip.io domain. If your IP would be 1.1.1.1 then you could use something like admin.1.1.1.1.xip.io. That name automatically resolves to the IP in the name.

Let’s see if we can reach the admin interface:

The new Traefik 2 admin UI

Traefik 2.0 is now installed in a basic way and working properly. We exposed the admin interface but now it is time to expose the calculator API.

Exposing the calculator API

The API is deployed as 5 pods in the add namespace:

Calculator API exposed

The API is exposed as a service of type ClusterIP with only an internal Kubernetes IP. To expose it via Traefik, we create the following object in the add namespace:

apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: calc-svc
  namespace: add  
spec:
  entryPoints:
    - web
  routes:
  - match: Host(`calc.1.1.1.1.xip.io`) && PathPrefix(`/`)
    kind: Rule
    priority: 1
    middlewares:
      - name: calcheader
    services:
    - name: add-svc
      port: 80

I am using xip.io above. Change 1.1.1.1 to the public IP of Traefik’s Azure Load Balancer. The add-svc that exposes the calculator API on port 80 is exposed via Traefik. We can easily call the service via:

curl http://calc.1.1.1.1.xip.io/add/10/10

20

Great! But what is that calcheader middleware? Middlewares modify the requests and responses to and from Traefik 2.0. There are all sorts of middelwares as explained here. You can set headers, configure authentication, perform rate limiting and much much more. In this case we create the following middleware object in the add namespace:

apiVersion: traefik.containo.us/v1alpha1
kind: Middleware
metadata:
  name: calcheader
  namespace: add
spec:
  headers:
    customRequestHeaders:
      l5d-dst-override: "add-svc.add.svc.cluster.local:80"

This middleware adds a header to the request before it comes in to Traefik. The header overrides the destination and sets it to the internal DNS name of the add-svc service that exposes the calculator API. This requirement is documented by Linkerd here.

Meshing the Traefik deployment

Because we want to mesh Traefik to get Linkerd metrics and more, we need to inject the Linkerd proxy in the Traefik pods. In my case, Traefik is deployed in the default namespace so the command below can be used:

kubectl get deploy -o yaml | linkerd inject - | kubectl apply -f - 

Make sure you run the command on a system with the linkerd executable in your path and kubectl homed to the cluster that has Linkerd installed.

Checking the traffic in the Linkerd dashboard

With some traffic generated, this is what you should see when you check the meshed deployment that runs the calculator API (deploy/add):

Both the traffic generator (add-cli) and Traefik are meshed which results in a more detailed view of the traffic

If you are wondering what these services are and do, check this post. In the above diagram, we can clearly see we are receiving traffic to the calculator API from Traefik. When I click on Traefik, I see the following:

A view on the meshed Traefik deployment

From the above, we see Traefik receives traffic via the Azure Load Balancer and that it forwards traffic to the calculator service. The live calls are coming from the admin UI which refreshes regularly.

In Grafana, we can get more information about the Traefik deployment:

Linkerd metrics for Traefik in the Grafana dashboard that comes with Linkerd
More metrics

Conclusion

This was just a brief look at both Traefik 2 and “meshing” Traefik with Linkerd. There is much more to say and I have much more to explore. Hopefully, this can get you started!

Giving linkerd a spin

A while ago, I gave linkerd a spin. Due to vacations and a busy schedule, I was not able to write about my experience. I will briefly discuss how to setup linkerd and then deploy a sample service to illustrate what it can do out of the box. Let’s go!

Wait! What is linkerd?

linkerd basically is a network proxy for your Kubernetes pods that’s designed to be deployed as a service mesh. When the pods you care about have been infused with linkerd, you will automatically get metrics like latency and requests per second, a web portal to check these metrics, live inspection of traffic and much more. Below is an example of a Kubernetes namespace that has been meshed:

A meshed namespace; all deployments in this particular namespace are meshed which means all pods get the linkerd network proxy that provides the metrics and features such as encryption

Installation

I can be very brief about this: installation is about as simple as it gets. Simply navigate to https://linkerd.io/2/getting-started to get started. Here are the simplified steps:

  • Download the linkerd executable as described in the Getting Started guide; I used WSL for this
  • Create a Kubernetes cluster with AKS (or another provider); for AKS, use the Azure CLI to get your credentials (az aks get-credentials); make sure the Azure CLI is installed in WSL and that you connected to your Azure subscription with az login
  • Make sure you can connect to your cluster with kubectl
  • Run linkerd check –pre to check if prerequisites are fulfilled
  • Install linkerd with linkerd install | kubectl apply -f –
  • Check the installation with linkerd check

The last step will nicely show its progress and end when the installation is complete:

linkerd check output

Exploring linkerd with the dashboard

linkerd automatically installs a dashboard. The dashboard is exposed as a Kubernetes service called linkerd-web. The service is of type ClusterIP. Although you could expose the service using an ingress, you can easily tunnel to the service with the following linkerd command (first line is the command; other lines are the output):

linkerd dashboard

Linkerd dashboard available at:
http://127.0.0.1:50750
Grafana dashboard available at:
http://127.0.0.1:50750/grafana
Opening Linkerd dashboard in the default browser
Failed to open Linkerd dashboard automatically
Visit http://127.0.0.1:50750 in your browser to view the dashboard

From WSL, the dashboard can not open automatically but you can manually browse to it. Note that linkerd also installs Prometheus and Grafana.

Out of the box, the linkerd deployment is meshed:

Adding linkerd to your own service

In this section, we will deploy a simple service that can add numbers and add linkerd to it. Although there are many ways to do this, I chose to create a separate namespace and enable auto-injection via an annotation. Here’s the yaml to create the namespace (add-ns.yaml):

apiVersion: v1
kind: Namespace
metadata:
  name: add
  annotations:
    linkerd.io/inject: enabled

Just run kubectl create -f add-ns.yaml to create the namespace. The annotation ensures that all pods added to the namespace get the linkerd proxy in the pod. All traffic to and from the pod will then pass through the proxy.

Now, let’s install the add service and deployment:

apiVersion: v1
kind: Service
metadata:
  name: add-svc
spec:
  ports:
  - port: 80
    name: http
    protocol: TCP
    targetPort: 8000
  - port: 8080
    name: grpc
    protocol: TCP
    targetPort: 8080
  selector:
    app: add
    version: v1
  type: LoadBalancer
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: add
spec:
  replicas: 2
  selector:
    matchLabels:
      app: add
  template:
    metadata:
      labels:
        app: add
        version: v1
    spec:
      containers:
      - name: add
        image: gbaeke/adder

The deployment deploys to two pods with the gbaeke/adder image. To deploy the above, save it to a file (add.yaml) and use the following command to deploy:

kubectl create -f add-yaml -n add

Because the deployment uses the add namespace, the linkerd proxy will be added to each pod automatically. When you list the pods in the deployment, you see:

Each add pod has two containers: the actual add container based on gbaeke/adder and the proxy

To see more details about one of these pods, I can use the following command:

k get po add-5b48fcc894-2dc97 -o yaml -n add

You will clearly see the two containers in the output:

Two containers in the pod: actual service (gbaeke/adder) and the linkerd proxy

Generating some traffic

Let’s deploy a client that continuously uses the calculator service:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: add-cli
spec:
  replicas: 1
  selector:
    matchLabels:
      app: add-cli
  template:
    metadata:
      labels:
        app: add-cli
    spec:
      containers:
      - name: add-cli
        image: gbaeke/adder-cli
        env:
        - name: SERVER
          value: "add-svc"

Save the above to add-cli.yaml and deploy with the below command:

kubectl create -f add-cli.yaml -n add

The deployment uses another image called gbaeke/adder-cli that continuously makes requests to the server specified in the SERVER environment variable.

Checking the deployment in the linkerd portal

When you now open the add namespace in the linked portal, you should see something similar to the below screenshot (note: I deployed 5 servers and 5 clients):

A view on the add namespace; linkerd has learned how the deployments talk to eachother

The linkerd proxy in all pods sees all traffic. From the traffic, it can infer that the add-cli deployment talks to the add deployment. The add deployment receives about 150 requests per second. The 99th percentile latency is relatively high because the cluster nodes are very small, I deployed more instances and the client is relatively inefficient.

When I click the deployment called add, the following screen is shown:

A view on the deployment

The deployment clearly shows where traffic is coming from plus relevant metrics such as RPS and P99 latency. You also get a view on the live calls now. Note that the client is using GRPC which uses a HTTP POST. When you scroll down on this page, you get more information about the caller and a view on the individual pods:

A view on the inbound calls to the deployment plus a view on the pods

To see live calls in more detail, you can click the Tap icon:

A live view on the calls with Tap

For each call, details can be requested:

Request details

Conclusion

This was just a brief look at linkerd. It is trivially easy to install and with auto-injection, very simple to add it to your own services. Highly recommended to give it a spin to see where it can add value to your projects!

Deploy AKS and Traefik with an Azure DevOps YAML pipeline

This post is a companion to the following GitHub repository: https://github.com/gbaeke/aks-traefik-azure-deploy. The repository contains ARM templates to deploy an AD integrated Kubernetes cluster and an IP address plus a Helm chart to deploy Traefik. Traefik is configured to use the deployed IP address. In addition to those files, the repository also contains the YAML pipeline, ready to be imported in Azure DevOps.

Let’s take a look at the different building blocks!

AKS ARM Template

The aks folder contains the template and a parameters file. You will need to modify the parameters file because it requires settings to integrate the AKS cluster with Azure AD. You will need to specify:

  • clientAppID: the ID of the client app registration
  • serverAppID: the ID of the server app registration
  • tenantID: the ID of your AD tenant

Also specify clientId, which is the ID of the service principal for your cluster. Both the serverAppID and the clientID require a password. The passwords have been set via a pipeline secret variable.

The template configures a fairly standard AKS cluster that uses Azure networking (versus kubenet). It also configures Log Analytics for the cluster (container insights).

Deploying the template from the YAML file is done with the task below. You will need to replace YOUR SUBSCRIPTION with an authorized service connection:

 # DEPLOY AKS IN TEST   
 - task: AzureResourceGroupDeployment@2
   inputs:
     azureSubscription: 'YOUR SUBSCRIPTION'
     action: 'Create Or Update Resource Group'
     resourceGroupName: '$(aksTestRG)'
     location: 'West Europe'
     templateLocation: 'Linked artifact'
     csmFile: 'aks/deploy.json'
     csmParametersFile: 'aks/deployparams.t.json'
     overrideParameters: '-serverAppSecret $(serverAppSecret) -clientIdsecret $(clientIdsecret) -clusterName $(aksTest)'
       deploymentMode: 'Incremental'
       deploymentName: 'CluTest' 

The task uses several variables like $(aksTestRG) etc… If you check azure-pipelines.yaml, you will notice that most are configured at the top of the file in the variables section:

variables:
  aksTest: 'clu-test'
  aksTestRG: 'rg-clu-test'
  aksTestIP: 'clu-test-ip' 

The two secrets are the secret πŸ” vaiables. Naturally, they are configured in the Azure DevOps UI. Note that there are other means to store and obtain secrets, such as Key Vault. In Azure DevOps, the secret variables can be found here:

Azure DevOps secret variables

IP Address Template

The ip folder contains the ARM template to deploy the IP address. We need to deploy the IP address resource to the resource group that holds the AKS agents. With the names we have chosen, that name is MC_rg-clu-test_clu-test_westeurope. It is possible to specify a custom name for the resource group.

Because we want to obtain the IP address after deployment, the ARM template contains an output:

 "outputs": {
        "ipaddress": {
            "type": "string",
            "value": "[reference(concat('Microsoft.Network/publicIPAddresses/', parameters('ipName')), '2017-10-01').ipAddress]"
        }
     } 

The output ipaddress is of type string. Via the reference template function we can extract the IP address.

The ARM template is deployed like the AKS template but we need to capture the ARM outputs. The last line of the AzureResourceGroupDeployment@2 that deploys the IP address contains:

deploymentOutputs: 'armoutputs'

Now we need to extract the IP address and set it as a variable in the pipeline. One way of doing this is via a bash script:

 - task: Bash@3
      inputs:
        targetType: 'inline'
        script: |
          echo "##vso[task.setvariable variable=test-ip;]$(echo '$(armoutputs)' | jq .ipaddress.value -r)" 

You can set a variable in Azure DevOps with echo ##vso[task.setvariable variable=variable_name;]value. In our case, the “value” should be the raw string of the IP address output. The $(armoutputs) variable contains the output of the IP address ARM template as follows:

{"ipaddress":{"type":"String","value":"IP ADDRESS"}}

To extract IP ADDRESS, we pipe the output of “echo $(armoutputs)” to js .ipaddress.value -r which extracts the IP ADDRESS from the JSON. The -r parameter removes double quotes from the IP ADDRESS to give us the raw string. For more info about jq, check https://stedolan.github.io/jq/ .

We now have the IP address in the test-ip variable, to be used in other tasks via $(test-ip).

Taking care of the prerequisites

In a later phase, we install Traefik via Helm. So we need kubectl and helm on the build agent. In addition, we need to install tiller on the cluster. Because the cluster is RBAC-enabled, we need a cluster account and a role binding as well. The following tasks take care of all that:

- task: KubectlInstaller@0
   inputs:
     kubectlVersion: '1.13.5'


- task: HelmInstaller@1
   inputs:
     helmVersionToInstall: '2.14.1'

- task: AzureCLI@1
  inputs:
    azureSubscription: 'YOUR SUB'
    scriptLocation: 'inlineScript'
    inlineScript: 'az aks get-credentials -g $(aksTestRG) -n $(aksTest) --admin'

 - task: Bash@3
   inputs:
     filePath: 'tiller/tillerconfig.sh'
     workingDirectory: 'tiller/' 

Note that we use the AzureCLI built-in task to easily obtain the cluster credentials for kubectl on the build agent. We use the –admin flag to gain full access. Note that this downloads sensitive information to the build agent temporarily.

The last task just runs a shell script to configure the service account and role binding and install tiller. Check the repository to see the contents of this simple script. Note that this is the quick and easy way to install tiller, not the most secure way! πŸ™‡β€β™‚οΈ

Install Traefik and use the IP address

The repository contains the downloaded chart (helm fetch stable/traefik –untar). The values.yaml file was modified to set the ingressClass to traefik-ext. We could have used the chart from the Helm repository but I prefer having the chart in source control. Here’s the pipeline task:

 - task: HelmDeploy@0
   inputs:
     connectionType: 'None'
     namespace: 'kube-system'
     command: 'upgrade'
     chartType: 'FilePath'
     chartPath: 'traefik-ext/.'
     releaseName: 'traefik-ext'
     overrideValues: 'loadBalancerIP=$(test-ip)'
     valueFile: 'traefik-ext/values.yaml' 

kubectl is configured to use the cluster so connectionType can be set to ‘None’. We simply specify the IP address we created earlier by setting loadBalancerIP to $(test-ip) with the overrides for values.yaml. This sets the loadBalancerIP setting in Traefik’s service definition (in the templates folder). Service.yaml in the templates folder contains the following section:

 spec:
  type: {{ .Values.serviceType }}
  {{- if .Values.loadBalancerIP }}
  loadBalancerIP: {{ .Values.loadBalancerIP }}
  {{- end }} 

Conclusion

Deploying AKS together with one or more public IP addresses is a common scenario. Hopefully, this post together with the GitHub repo gave you some ideas about automating these deployments with Azure DevOps. All you need to do is create a pipeline from the repo. Azure DevOps will read the azure-pipelines.yml file automatically.

Inspecting Web Application Firewall logs

In some of my previous posts, I talked about Azure Front Door and Web Application Firewall policies to protect a workload like one or more APIs running on Kubernetes or App Service. Although I enabled the Web Application Firewall policies, I did not show what happens when the rules are triggered. Let’s take a look at that! πŸ•Ά

Before we get started though, take the following diagram into account:

Azure web application firewall
From: https://docs.microsoft.com/en-us/azure/frontdoor/waf-overview

WAF for Front Door is a global solution. You create a WAF policy in the portal or via other means and attach it to a Front Door frontend. Rules are evaluated and acted upon at the edge versus on your application server.

Azure WAF supports custom rules and Azure-managed rule sets (based on OWASP). The custom rules are interesting because they allow you to restrict IP addresses, configure geographic based access control and more.

There’s an additional rule type called bot protection rule as well. At the time of this writing (beginning June 2019) this feature is in public preview. It uses the Microsoft Intelligent Security Graph to do its magic, similarly to Azure Firewall when you enable Threat Intelligence.

WAF Logs

Let’s first use a tool that can scan an endpoint for vulnerabilities to trigger the WAF rules. One such tool is OWASP ZAP, which you need to install on your workstation.

OWASP ZAP tool

Before we check the logs, note we have set the policy to Detection:

WAF policy set to Detection; start with detection to learn what the rules might block in your app

Now let’s take a look at the logs. Use the following query in Log Analytics and modify it for your own host (host_s field):

AzureDiagnostics
| where ResourceType == "FRONTDOORS" and Category == "FrontdoorWebApplicationFirewallLog"
| where action_s == "Block" 
| where host_s == "api.baeke.info"  

The result:

Blocked requests (if the policy were set at Prevention at the global level)

Let’s look at a SQL Injection block:

Typical SQL Injection

The decoded requestUri_s is https://api.baeke.info:443/users?apikey=theapikey’ AND ‘1’=’1′ –. Typical! It was blocked at the edge. This request went via the BRU location.

Like with any Log Analytics query, you can place alerts on log occurrences. You will need to be in the Log Analytics workspace, and not in the Logs section of Azure Front Door:

Conclusion

Azure Web Application Firewall policies for Azure Front Door integrate with Azure Monitor and Log Analytics, like most other Azure services. With some KQL, the query language for Log Analytics, it is straightforward to request the logs and set alerts on them.