If you have ever deployed an application to Kubernetes, even a simple one, you are probably familiar with deployments. A deployment describes the pods to run, how many of them to run and how they should be upgraded. That last point is especially important because the strategy you select has an impact on the availability of the deployment. A deployment supports the following two strategies:
Recreate: all existing pods are killed and new ones are created; this obviously leads to some downtime
RollingUpdate: pods are gradually replaced which means there is a period when old and new pods coexist; this can result in issues for stateful pods or if there is no backward compatibility
But what if you want to use other methods such as BlueGreen or Canary? Although you could do that with a custom approach that uses deployments, there are some solution that provide a more automated approach. Below, I discuss two of them briefly. Videos provide a more in depth look.
One of the solutions out there is Argo Rollouts. It is very easy to use. If you want to start slowly, with BlueGreen deployments and manual approval for instance, Argo Rollouts is recommended. It has a nice kubectl plugin and integration with Argo CD, a GitOps solution.
The following video demonstrates BlueGreen deployments:
This video discusses a canary deployment with Argo Rollouts albeit a simple one without metric analysis:
This video shows the integration between Argo Rollouts and Argo CD:
One thing to note is that, instead of a deployment, you will create a rollout object. It is easy to convert an existing deployment into a rollout. Other tools such as Flagger (see below), provide their functionality on top of an existing deployment.
For traffic splitting and metrics analysis, Argo Rollouts does not support Linkerd. More information about traffic splitting and management can be found here.
Flagger, by Weaveworks, is another solution that provides BlueGreen and Canary deployment support to Kubernetes. In the video below, I demonstrate the basic look and feel of doing a canary deployment that includes metric analysis. Linkerd is used for gradual traffic shifting to the canary based on the built-in success rate metric of Linkerd:
If you want to get started with canary releases and easy traffic splitting and metrics, I suggest using the Flagger and Linkerd combination. This is based simply on the fact that Linkerd is much easier to install and use than Istio. Argo Rollouts in combination with Istio and Prometheus could be used to achieve exactly the same result.
Which one to use?
If you just want BlueGreen deployments with manual approvals, I would suggest using Argo Rollouts. When you integrate it with Argo CD, you can even use the Argo CD UI to promote your deployment. If you are comfortable with Istio and Prometheus, you can go a step further and add metrics analysis to automatically progress your deployment. You can also use a simple Kubernetes job to validate your deployment. Also, note that other metrics providers are supported.
Flagger supports more options for traffic splitting and metrics, due to its support for both Linkerd and Istio. Because Linkerd is so easy to use, Flagger is simpler to get started with canary releases and metrics analysis.
I recently gave a talk at TechTrain, a monthly event in Mechelen (Belgium), hosted by Cronos. The talk is called “GitOps with Kubernetes: a better way to deploy” and is an introduction to GitOps with Weaveworks Flux as an example.
You can find a re-recording of the presentation on Youtube:
In today’s post, we will write a simple operator with Kopf, which is a Python framework created by Zalando. A Kubernetes operator is a piece of software, running in Kubernetes, that does something application specific. To see some examples of what operators are used for, check out operatorhub.io.
Our operator will do something simple in order to easily grasp how it works:
the operator will create a deployment that runs nginx
nginx will serve a static website based on a git repository that you specify; we will use an init container to grab the website from git and store it in a volume
you can control the number of instances via a replicas parameter
That’s great but how will the operator know when it has to do something, like creating or updating resources? We will use custom resources for that. Read on to learn more…
Note that we specified our own API and version in the CRD (baeke.info/v1) and that we set the kind to DemoWeb. In the additionalPrinterColumns, we defined some properties that can be set in the spec that will also be printed on screen. When you list resources of kind DemoWeb, you will the see replicas and gitrepo columns:
Of course, creating the CRD and the custom resources is not enough. To actually create the nginx deployment when the custom resource is created, we need to write and run the operator.
Writing the operator
I wrote the operator on a Mac with Python 3.7.6 (64-bit). On Windows, for best results, make sure you use Miniconda instead of Python from the Windows Store. First install Kopf and the Kubernetes package:
pip3 install kopf kubernetes
Verify you can run kopf:
Let’s write the operator. You can find it in full here. Here’s the first part:
Naturally, we import kopf and other necessary packages. As noted before, kopf and kubernetes will have to be installed with pip. Next, we define a handler that runs whenever a resource of our custom type is spotted by the operator (with the @kopf.on.create decorator). The handler has two parameters:
spec object: allows us to retrieve our custom properties with spec.get (e.g. spec.get(‘replicas’, 1) – the second parameter is the default value)
**kwargs: a dictionary with lots of extra values we can use; we use it to retrieve the name of our custom resource (e.g. demoweb1); we can use that name to derive the name of our deployment and to set labels for our pods
Note: instead of using **kwargs to retrieve the name, you can also define an extra name parameter in the handler like so: def create_fn(spec, name, **kwargs); see the docs for more information
Our deployment is just yaml stored in the doc variable with some help from the Python yaml package. We use spec.get and the name variable to customise it.
After the doc variable, the following code completes the event handler:
With kopf.adopt, we make sure the deployment we create is a child of our custom resource. When we delete the custom resource, its children are also deleted.
Next, we simply use the kubernetes client to create a deployment via the apps/v1 api. The method create_namespaced_deployment takes two required parameters: the namespace and the deployment specification. Note there is only minimal error checking here. There is much more you can do with regards to error checking, retries, etc…
Now we can run the operator with:
kopf run operator-filename.py
You can perfectly run this on your local workstation if you have a working kube config pointing at a running cluster with the CRD installed. Kopf will automatically use that for authentication:
Running the operator in your cluster
To run the operator in your cluster, create a Dockerfile that produces an image with Python, kopf, kubernetes and your operator in Python. In my case:
RUN mkdir /src
ADD with_create.py /src
RUN pip install kopf
RUN pip install kubernetes
CMD kopf run /src/with_create.py --verbose
We added the verbose parameter for extra logging. Next, run the following commands to build and push the image (example with my image name):
The above is just a regular deployment but the serviceAccountName is extremely important. It gives kopf and your operator the required access rights to create the deployment is the target namespace. Check out the documentation to find out more about the creation of the service account and the required roles. Note that you should only run one instance of the operator!
Once the operator is deployed, you will see it running as a normal pod:
To see what is going on, check the logs. Let’s show them with octant:
At the bottom, you see what happens when a creation event is detected for a resource of type DemoWeb. The spec is shown with the git repository and the number on replicas.
Now you can create resources of kind DemoWeb and see what happens. If you have your own git repository with some HTML in it, try to use that. Otherwise, just use mine at https://github.com/gbaeke/static-web.
Writing an operator is easy to do with the Kopf framework. Do note that we only touched on the basics to get started. We only have an on.create handler, and no on.update handler. So if you want to increase the number of replicas, you will have to delete the custom resource and create a new one. Based on the example though, it should be pretty easy to fix that. The git repo contains an example of an operator that also implements the on.update handler (with_update.py).
If you have followed my blog a little, you have seen a few posts about GitOps with Flux CD. This time, I am taking a look at Argo CD which, like Flux CD, is a GitOps tool to deploy applications from manifests in a git repository.
Don’t want to read this whole thing?
There are several differences between the two tools:
At first glance, Flux appears to use a single git repo for your cluster where Argo immediately introduces the concept of apps. Each app can be connected to a different git repo. However Flux can also use multiple git repositories in the same cluster. See https://github.com/fluxcd/multi-tenancy for more information
Flux has the concept of workloads which can be automated. This means that image repositories are scanned for updates. When an update is available (say from tag v1.0.0 to v1.0.1), Flux will update your application based on filters you specify. As far as I can see, Argo requires you to drive the update from your CI process, which might be preferred.
By default, Argo deploys an administrative UI (next to a CLI) with a full view on your deployment and its dependencies
Argo supports RBAC and integrates with external identity providers (e.g. Azure Active Directory)
The Argo CD admin interface is shown below:
Let’s take a look at how to deploy Argo and deploy the app you see above. The app is deployed using a single yaml file. Nothing fancy yet such as kustomize or jsonnet.
The getting started guide is pretty clear, so do have a look over there as well. To install, just run (with a deployed Kubernetes cluster and kubectl pointing at the cluster):
Great! You are all set now to deploy an application.
Deploying an application
We will deploy an application that has a couple of dependencies. Normally, you would install those dependencies with Argo CD as well but since I am using a cluster that has these dependencies installed via Azure DevOps, I will just list what you need (Helm commands):
To know more about these dependencies and use an Azure DevOps YAML pipeline to deploy them, see this post. If you want, you can skip the externaldns installation and create a DNS record yourself that resolves to the public IP address of Nginx Ingress. If you do not want to use an Azure static IP address, you can remove the loadBalancerIP parameter from the first command.
Two YAML files that create a certificate cluster issuer based on custom resource definitions (CRDs) from cert-manager
realtime.yaml: Redis deployment, Redis service (ClusterIP), realtime web app deployment (based on this), realtime web app service (ClusterIP), ingress resource for https://real.baeke.info (record automatically created by externaldns)
It’s best that you fork my repo and modify realtime.yaml’s ingress resource with your own DNS name.
Create the Argo app
Now you can create the Argo app based on my forked repo. I used the following command with my original repo:
The command above creates an app called realtime based on the specified repo. The app should use the manifests folder and apply (kubectl apply) all the manifests in that folder. The manifests are deployed to the cluster that Argo CD runs in. Note that you can run Argo CD in one cluster and deploy to totally different clusters.
The above command does not configure the repository to be synced automatically, although that is an option. To sync manually, use the following command:
argocd app sync realtime
The application should now be synced and viewable in the UI:
Let’s set up this app to automatically sync with the repo (default = every 3 minutes). This can be done from both the CLI and the UI. Let’s do it from the UI. Click on the app and then click App Details. You will find a Sync Policy in the app details where you can enable auto-sync
You can now make changes to the git repo like changing the image tag for gbaeke/fluxapp (yes, I used this image with the Flux posts as well 😊 ) to 1.0.6 and wait for the sync to happen. Or sync manually from the CLI or the UI.
This was a quick tour of Argo CD. There is much more you can do but the above should get you started quickly. I must say I quite like the solution and am eager to see what the collaboration of Flux CD, Argo CD and Amazon comes up with in the future.
A while ago, I blogged about an Azure YAML pipeline to deploy AKS together with Traefik. As a variation on that theme, this post talks about deploying AKS together with Nginx, External DNS, a Helm Operator and Flux CD. I blogged about Flux before if you want to know what it does.
Let’s break the pipeline down a little. In what follows, replace AzureMPN with a reference to your own subscription. The first two tasks, AKS deployment and IP address deployment are ARM templates that deploy these resources in Azure. Nothing too special there. Note that the AKS cluster is one with default networking, no Azure AD integration and without VMSS (so no multiple node pools either).
Note: I modified the pipeline to deploy a VMSS-based cluster with a standard load balancer, which is recommended instead of a cluster based on an availability set with a basic load balancer.
The third task takes the output of the IP address deployment and parses out the IP address using jq (last echo statement on one line):
For External DNS to work, I found I had to set controller.publishService.enabled=true. As you can see, the Nginx service is configured to use the IP we created earlier. Azure will create a load balancer with a front end IP configuration that uses this address. This all happens automatically.
Note: controller.metrics.enabled enables a Prometheus scraping endpoint; that is not discussed further in this blog
External DNS can automatically add DNS records for ingresses and services you add to Kubernetes. For instance, if I create an ingress for test.baeke.info, External DNS can create this record in the baeke.info zone and use the IP address of the Ingress Controller (nginx here). Installation is pretty straightforward but you need to provide credentials to your DNS provider. In my case, I use CloudFlare. Many others are available. Here is the task:
On CloudFlare, I created a token that has the required access rights to my zone (read, edit). I provide that token to the chart via the CFAPIToken variable defined as a secret on the pipeline. The valueFile looks like this:
In the beginning, it’s best to set the logLevel to debug in case things go wrong. With interval 1m, External DNS checks for ingresses and services every minute and syncs with your DNS zone. Note that External DNS only touches the records it created. It does so by creating TXT records that provide a record that External DNS is indeed the owner.
With External DNS in place, you just need to create an ingress like below to have the A record real.baeke.info created:
This installs the latest version of the operator at the time of this writing (image.repository and image.tag) and also sets Helm to v3. With this installed, you can install a Helm chart by submitting files like below:
The gitURL variable should be set to a git repo that contains your cluster configuration. For instance: gbaeke/demo-clu-flux. Flux will check the repo for changes every minute. Note that we are using a public repo here. Private repos and systems other than GitHub are supported.
Add a simple app that uses a Go socket.io implementation to provide realtime updates based on Redis channel content; this app is published via nginx and real.baeke.info is created in DNS (by External DNS)
Adds a ConfigMap that is used to configure Azure Monitor to enable Prometheus endpoint scraping (to show this can be used for any object you need to add to Kubernetes)
Note that the ingress of the Go app has an annotation (in realtime.yaml, in the git repo) to issue a certificate via cert-manager. If you want to make that work, add an extra task to the pipeline that installs cert-manager:
You will also need to create another namespace, cert-manager, just like we created the fluxcd namespace.
In order to make the above work, you will need Issuers or ClusterIssuers. The repo used by Flux CD contains two ClusterIssuers, one for Let’s Encrypt staging and one for production. The ingress resource uses the production issuer due to the following annotation:
In a previous post, we installed Weaveworks Flux. Flux synchronizes the contents of a git repository with your Kubernetes cluster. Flux can easily be installed via a Helm chart. As an example, we installed Traefik by adding the following yaml to the synced repository:
It does not matter where you put this file because Flux scans the complete repository. I added the file to a folder called traefik.
If you look more closely at the YAML file, you’ll notice its kind is HelmRelease. You need an operator that can handle this type of file, which is this one. In the previous post, we installed the custom resource definition and the operator manually.
Adding a custom application
Now it’s time to add our own application. You do not need to use Helm packages or the Helm operator to install applications. Regular yaml will do just fine.
The application we will deploy needs a Redis backend. Let’s deploy that first. Add the following yaml file to your repository:
After committing this file, wait a moment or run fluxctl sync. When you run kubectl get pods for the default namespace, you should see the Redis pod:
Now it’s time to add the application. I will use an image, based on the following code: https://github.com/gbaeke/realtime-go (httponly branch because master contains code to automatically request a certificate with Let’s Encrypt). I pushed the image to Docker Hub as gbaeke/fluxapp:1.0.0. Now let’s deploy the app with the following yaml:
In the above yaml, replace IP in the Ingress specification to the IP of the external load balancer used by your Ingress Controller. Once you add the yaml to the git repository and you run fluxctl sync the application should be deployed. You see the following page when you browse to http://realtime.IP.xip.io:
Great, v1.0.0 of the app is deployed using the gbaeke/fluxapp:1.0.0 image. But what if I have a new version of the image and the yaml specification does not change? Read on…
Upgrading the application
If you have been following along, you can now run the following command:
fluxctl list-workloads -a
This will list all workloads on the cluster, including the ones that were not installed by Flux. If you check the list, none of the workloads are automated. When a workload is automated, it can automatically upgrade the application when a new image appears. Let’s try to automate the fluxapp. To do so, you can either add annotations to your yaml or use fluxctl. Let’s use the yaml approach by adding the following to our deployment:
Note: Flux only works with immutable tags; do not use latest
After committing the file and running fluxctl sync, you can run fluxctl list-workloads -a again. The deployment should now be automated:
Now let’s see what happens when we add a new version of the image with tag 1.0.1. That image uses a different header color to show the difference. Flux monitors the repository for changes. When it detects a new version of the image that matches the semver filter, it will modify the deployment. Let’s check with fluxctl list-workloads -a:
And here’s the new color:
But wait… what about the git repo?
With the configuration of a deploy key, Flux has access to the git repository. When a deployment is automated and the image is changed, that change is also reflected in the git repo:
In the yaml, version 1.0.1 is now used:
What if I don’t like this release? With fluxctl, you can rollback to a previous version like so:
Although this works, the deployment will be updated to 1.0.1 again since it is automated. To avoid that, first lock the deployment (or workload) and then force the release of the old image:
In your yaml, there will be an additional annotation: fluxcd.io/locked: ‘true’ and the image will be set to 1.0.0.
In this post, we looked at deploying and updating an application via Flux automation. You only need a couple of annotations to make this work. This was just a simple example. For an example with dev, staging and production branches and promotion from staging to production, be sure to look at https://github.com/fluxcd/helm-operator-get-started as well.
Traefik’s admin site is first exposed as a ClusterIP service on port 8080. Next, an object of kind IngressRoute is defined, which is new for Traefik 2.0. You don’t need to create standard Ingress objects and configure Traefik with custom annotations. This new approach is cleaner. Of course, substitute the host with a host that points to the public IP of the load balancer. Or use the IP address with the xip.io domain. If your IP would be 188.8.131.52 then you could use something like admin.184.108.40.206.xip.io. That name automatically resolves to the IP in the name.
Let’s see if we can reach the admin interface:
Traefik 2.0 is now installed in a basic way and working properly. We exposed the admin interface but now it is time to expose the calculator API.
Exposing the calculator API
The API is deployed as 5 pods in the add namespace:
The API is exposed as a service of type ClusterIP with only an internal Kubernetes IP. To expose it via Traefik, we create the following object in the add namespace:
I am using xip.io above. Change 220.127.116.11 to the public IP of Traefik’s Azure Load Balancer. The add-svc that exposes the calculator API on port 80 is exposed via Traefik. We can easily call the service via:
Great! But what is that calcheader middleware? Middlewares modify the requests and responses to and from Traefik 2.0. There are all sorts of middelwares as explained here. You can set headers, configure authentication, perform rate limiting and much much more. In this case we create the following middleware object in the add namespace:
This middleware adds a header to the request before it comes in to Traefik. The header overrides the destination and sets it to the internal DNS name of the add-svc service that exposes the calculator API. This requirement is documented by Linkerd here.
Meshing the Traefik deployment
Because we want to mesh Traefik to get Linkerd metrics and more, we need to inject the Linkerd proxy in the Traefik pods. In my case, Traefik is deployed in the default namespace so the command below can be used:
Make sure you run the command on a system with the linkerd executable in your path and kubectl homed to the cluster that has Linkerd installed.
Checking the traffic in the Linkerd dashboard
With some traffic generated, this is what you should see when you check the meshed deployment that runs the calculator API (deploy/add):
If you are wondering what these services are and do, check this post. In the above diagram, we can clearly see we are receiving traffic to the calculator API from Traefik. When I click on Traefik, I see the following:
From the above, we see Traefik receives traffic via the Azure Load Balancer and that it forwards traffic to the calculator service. The live calls are coming from the admin UI which refreshes regularly.
In Grafana, we can get more information about the Traefik deployment:
This was just a brief look at both Traefik 2 and “meshing” Traefik with Linkerd. There is much more to say and I have much more to explore. Hopefully, this can get you started!
A while ago, I gave linkerd a spin. Due to vacations and a busy schedule, I was not able to write about my experience. I will briefly discuss how to setup linkerd and then deploy a sample service to illustrate what it can do out of the box. Let’s go!
Wait! What is linkerd?
linkerd basically is a network proxy for your Kubernetes pods that’s designed to be deployed as a service mesh. When the pods you care about have been infused with linkerd, you will automatically get metrics like latency and requests per second, a web portal to check these metrics, live inspection of traffic and much more. Below is an example of a Kubernetes namespace that has been meshed:
Download the linkerd executable as described in the Getting Started guide; I used WSL for this
Create a Kubernetes cluster with AKS (or another provider); for AKS, use the Azure CLI to get your credentials (az aks get-credentials); make sure the Azure CLI is installed in WSL and that you connected to your Azure subscription with az login
Make sure you can connect to your cluster with kubectl
Run linkerd check –pre to check if prerequisites are fulfilled
Install linkerd with linkerd install | kubectl apply -f –
Check the installation with linkerd check
The last step will nicely show its progress and end when the installation is complete:
Exploring linkerd with the dashboard
linkerd automatically installs a dashboard. The dashboard is exposed as a Kubernetes service called linkerd-web. The service is of type ClusterIP. Although you could expose the service using an ingress, you can easily tunnel to the service with the following linkerd command (first line is the command; other lines are the output):
Linkerd dashboard available at:
Grafana dashboard available at:
Opening Linkerd dashboard in the default browser
Failed to open Linkerd dashboard automatically
Visit http://127.0.0.1:50750 in your browser to view the dashboard
From WSL, the dashboard can not open automatically but you can manually browse to it. Note that linkerd also installs Prometheus and Grafana.
Out of the box, the linkerd deployment is meshed:
Adding linkerd to your own service
In this section, we will deploy a simple service that can add numbers and add linkerd to it. Although there are many ways to do this, I chose to create a separate namespace and enable auto-injection via an annotation. Here’s the yaml to create the namespace (add-ns.yaml):
Just run kubectl create -f add-ns.yaml to create the namespace. The annotation ensures that all pods added to the namespace get the linkerd proxy in the pod. All traffic to and from the pod will then pass through the proxy.
Now, let’s install the add service and deployment:
Save the above to add-cli.yaml and deploy with the below command:
kubectl create -f add-cli.yaml -n add
The deployment uses another image called gbaeke/adder-cli that continuously makes requests to the server specified in the SERVER environment variable.
Checking the deployment in the linkerd portal
When you now open the add namespace in the linked portal, you should see something similar to the below screenshot (note: I deployed 5 servers and 5 clients):
The linkerd proxy in all pods sees all traffic. From the traffic, it can infer that the add-cli deployment talks to the add deployment. The add deployment receives about 150 requests per second. The 99th percentile latency is relatively high because the cluster nodes are very small, I deployed more instances and the client is relatively inefficient.
When I click the deployment called add, the following screen is shown:
The deployment clearly shows where traffic is coming from plus relevant metrics such as RPS and P99 latency. You also get a view on the live calls now. Note that the client is using GRPC which uses a HTTP POST. When you scroll down on this page, you get more information about the caller and a view on the individual pods:
To see live calls in more detail, you can click the Tap icon:
For each call, details can be requested:
This was just a brief look at linkerd. It is trivially easy to install and with auto-injection, very simple to add it to your own services. Highly recommended to give it a spin to see where it can add value to your projects!
In the previous post, we looked at API Management with Kong and the Kong Ingress Controller. We did not care about security and exposed a sample toy API over a public HTTP endpoint that also required an API key. All in the clear, no firewall, no WAF, nothing… 👎👎👎
In this post, we will expose the API over TLS and configure Kong to use a CloudFlare origin certificate. An origin certificate is issued and trusted by CloudFlare to connect to the origin, which in our case is an API hosted on Kubernetes.
The API consumer will not connect directly to the Kubernetes-hosted API exposed via Kong. Instead, the consumer connects to CloudFlare over TLS and uses a certificate issued by CloudFlare that is fully trusted by browsers and other clients.
The traffic flow is as follows:
Consumer --> CloudFlare (TLS with fully trusted cert, WAF, ...) --> Kong Ingress (TLS with origin cert) --> API (HTTP)
Refer to the previous post for installation instructions. The YAML files to configure the Ingress, KongIngress, Consumer, etc… are almost the same. The Ingress resource has the following changes:
We use a new hostname api.baeke.info
We configure TLS for api.baeke.info by referring to a secret called baeke.info.tls which contains the CloudFlare origin certificate.
We use an additional Kong plugin which provides whitelisting of CloudFlare addresses; only CloudFlare is allowed to connect to the Ingress
Here is the plugin definition for whitelisting with the current (June 15th, 2019) list of IP ranges used by CloudFlare. Note that you have to supply the addresses and ranges as an array. The documentation shows a comma-separated list! 🤷♂️
In the previous post, the protocols array contained the http value.
Note: for whitelisting to work, the Kong proxy service needs externalTrafficPolicy set to Local. Use kubectl edit svc kong-kong-proxy to modify that setting. You can set this value at deployment time as well. This might or might not work for you. I used AKS where this produces the desired outcome.
Get the external IP of the kong-kong-proxy service and create a DNS entry for it. I created a A record for api.baeke.info:
Make sure the orange cloud is active. In this case, this means that requests for api.baeke.info are proxied by CloudFlare. That allows us to cache, enable WAF (web application firewall), rate limiting and more!
In the Firewall section, WAF is turned on. Note that this is a paying feature!
In Crypto, Universal SSL is turned on and set to Full (strict).
Full (strict) means that CloudFlare connects to your origin over HTTPS and that it expects a valid certificate, which is checked. An origin certificate, issued by CloudFlare but not trusted by your operating system is also valid. As stated above, I use such an origin certificate at the Ingress level.
The origin certificate can be issued and/or downloaded from the Crypto section:
I created an origin certificate for *.baeke.info and baeke.info and downloaded the certificate and private key in PEM format. I then encoded the contents of the certificate and key in base64 format and used them in a secret:
As you have seen in the Ingress definition, it referred to this secret via its name, baeke.info.tls.
When a consumer connects to the API, the fully trusted certificate issued by CloudFlare is used:
We also make sure consumers of the API need to use TLS:
With the above configuration, consumers need to securely connect to https://api.baeke.info at CloudFlare. CloudFlare connects securely to the origin, which is the external IP of the ingress. Only CloudFlare is allowed to connect to that external IP because of the whitelisting configuration.
Testing the API
Let’s try the API with the http tool:
All sorts of headers are added by CloudFlare which makes it clear that CloudFlare is proxying the requests. When we don’t add a key or specify a wrong one:
The key is now securely sent from consumer to CloudFlare to origin. Phew! 😎
In this post, we hosted an API on Kubernetes, exposed it with Kong and secured it with CloudFlare. This example can easily be extended with multiple Kong proxies for high availability and multiple APIs (/users, /orders, /products, …) that are all protected by CloudFlare with end-to-end encryption and WAF. CloudFlare lends an extra helping hand by automatically generating both the “front-end” and origin certificates.
We will install Traefik with Helm and I assume the cluster has rbac enabled. If you deploy clusters with AKS, that is the default although you can turn it off. With rbac enabled, you need to install the server-side component of Helm, tiller, using the following commands:
The above command uses Helm to install the stable/traefik chart. Note that the chart is maintained by the community and not by the folks at Traefik. Traefik itself is exposed via a service of type LoadBalancer, which results in a public IP address. Use kubectl get svc traefik -n kube-system to check. There are ways to make sure the service uses a static IP but that is not discussed in this post. Check out this doc for AKS. The other settings do the following:
ssl.enabled: yes, SSL 😉
ssl.enforced: redirect to https when user uses http
acme.enabled: enable Let’s Encrypt
acme.email: set the e-mail address to use with Let’s Encrypt; you will get certificate expiry mails on that address
onHostRule: issue certificates based on the host setting in the ingress definition
acme.challengeType: method used by Let’s Encrypt to issue the certificate; use this one for regular certs; use DNS verification for wildcard certs
acme.staging: set to false to issue fully trusted certs; beware of rate limiting
dashboard.enabled: enable the Traefik dashboard; you can expose the service via an ingress object as well
Note: to specify a specific version of Traefik, use the imageTag parameter as part of –set; for instance imageTag=1.7.12
When the installation is finished, run the following commands:
# check installation
# check traefik service
kubectl get svc traefik --namespace kube-system -w
The first command should show that Traefik is installed. The second command returns the traefik service, which we configured with serviceType LoadBalancer. The external IP of the service will be pending for a while. When you have an address and you browse it, you should get a 404. Result from curl -v below:
Rebuilt URL to: http://IP/
Connected to 18.104.22.168 (IP) port 80 (#0)
GET / HTTP/1.1
< HTTP/1.1 404 Not Found
< Content-Type: text/plain; charset=utf-8
< Vary: Accept-Encoding
< X-Content-Type-Options: nosniff
< Date: Fri, 24 May 2019 17:00:29 GMT
< Content-Length: 19
404 page not found
Next, install nginx just to have a simple website to securely publish. Yes I know, kubectl run… 🤷
kubectl run nginx --image nginx --expose --port 80
The above command installs nginx but also creates an nginx service of type ClusterIP. We can expose that service via an ingress definition:
Replace your.domain.com with a host that resolves to the external IP address of the Traefik service. The annotation is not technically required if Traefik is the only Ingress Controller in your cluster. I prefer being explicit though. Save the above contents to a file and then run:
kubectl apply -f yourfile.yaml
Now browse to whatever you used as domain. The result should be:
To expose the Traefik dashboard, use the yaml below. Note that we explicitly installed the dashboard by setting dashboard.enabled to true.
Put the above contents in a file and create the ingress object in the same namespace as the traefik-dashboard service. Use kubectl apply -f yourfile.yaml -n kube-system. You should then be able to access the dashboard with the host name you provided:
Note: if you do not want to mess with DNS records that map to the IP address of the Ingress Controller, just use a xip.io address. In the ingress object’s host setting, use something like web.w.x.y.z.xip.io where web is just something you choose and w.x.y.z is the IP address of the Ingress Controller. Traefik will also request a certificate for such a name. For more information, check xip.io. Simple for testing purposes!