In a previous post, I talked about installing Consul on Kubernetes and using some of its features. In that post, I did not look at the service mesh functionality. Before looking at that, it is beneficial to try out the service mesh features on your local machine.
You can easily install Consul on your local machine with Chocolatey for Windows or Homebrew for Mac. On Windows, a simple choco install consul is enough. Since Consul is just a single executable, you can start it from the command line with all the options you need.
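For a quick local test, a dev-mode agent is typically all you need; something like:

consul agent -dev

This starts a single in-memory server/client agent with the HTTP API and UI on http://localhost:8500, which is perfect for experimenting but obviously not for production.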
In the video below, I walk through configuring two services running as containers on my local machine: a web app that talks to Redis. We will “mesh” both services and then use an intention to deny service-to-service traffic.
Consul Service Mesh on your local machine… speed it up! ☺
In a later post and video, we will look at Consul Connect on Kubernetes. Stay tuned!
Although I have heard a lot about HashiCorp's Consul, I have not had the opportunity to work with it and get acquainted with the basics. In this post, I will share some of the basics I have learned, hopefully giving you a bit of a head start when you embark on this journey yourself.
Want to watch a video about this instead?
What is Consul?
Basically, Consul is a networking tool. It provides service discovery and allows you to store and retrieve configuration values. On top of that, it provides service-mesh capability by controlling and encrypting service-to-service traffic. Although that looks simple enough, in complex and dynamic infrastructure spanning multiple locations such as on-premises and cloud, this can become extremely complicated. Let’s stick to the basics and focus on three things:
Installation on Kubernetes
Using the key-value store for configuration
Using the service catalog to retrieve service information
We will use a small Go program to illustrate the use of the Consul API. Let’s get started… 🚀🚀🚀
Installation of Consul
I will install Consul using the provided Helm chart. Note that the installation I will perform is great for testing but should not be used for production. In production, there are many more things to think about. Look at the configuration values for hints: certificates, storage size and class, options to enable/disable, etc… That being said, the chart does install multiple servers and clients to provide high availability.
I installed Consul with Pulumi and Python. You can check the code here. You can use that code on Azure to deploy both Kubernetes and Consul in one step. The section in the code that installs Consul is shown below:
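As a rough sketch only (the linked repository is the reference; resource name and chart path are illustrative), the Pulumi code boils down to a Helm Chart resource with the same values as the Helm command further below:

import pulumi_kubernetes as k8s

# install the consul-helm chart from a local clone (path is illustrative)
consul_chart = k8s.helm.v3.Chart(
    "consul",
    k8s.helm.v3.LocalChartOpts(
        path="./consul-helm",
        namespace="consul",
        values={
            "connectInject": {"enabled": True},
            "client": {"enabled": True, "grpc": True},
            "syncCatalog": {"enabled": True},
        },
    ),
)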
The code above would be equivalent to this Helm chart installation (Helm v3):
helm install consul -f consul-helm/values.yaml \
--namespace consul ./consul-helm \
--set connectInject.enabled=true \
--set client.enabled=true --set client.grpc=true \
--set syncCatalog.enabled=true
Connecting to the Consul UI
The chart installs Consul in the consul namespace. You can run the following command to get to the UI:
kubectl port-forward services/consul-consul-ui 8888:80 -n consul
You will see the screen below. The list of services depends on the Kubernetes services in your system.
Consul UI with list of services
The services above include consul itself. The consul service also has health checks configured. The other services in the screenshot are Kubernetes services that were discovered by Consul. I have installed Redis in the default namespace and exposed Redis via a service called redisapp. This results in a Consul service called redisapp-default. Later, we will query this service from our Go application.
When you click Key/Value, you can see the configured keys. I have created one key called REDISPATTERN which is later used in the Go program to know the Redis channels to subscribe to. It’s just a configuration value that is retrieved at runtime.
A simple key/value pair: REDISPATTERN=*
The key/value pair can be created via the Consul CLI, the HTTP API, or the UI (Create button in the main Key/Value screen). I created the REDISPATTERN key via the Create button.
Querying the Key/Value store
Let’s turn our attention to writing some code that retrieves a Consul key at runtime. The question of course is: “how does your application find Consul?”. Look at the diagram below:
Simplified diagram of Consul installation on Kubernetes via the Helm chart
Above, you see the Consul server agents, implemented as a Kubernetes StatefulSet. Each server pod has a volume (Azure disk in this case) to store data such as key/value pairs.
Your application will not connect to these servers directly. Instead, it will connect via the client agents. The client agents are implemented as a DaemonSet resulting in a client agent per Kubernetes node. The client agent pods expose a static port on the Kubernetes host (yes, you read that right). This means that your app can connect to the IP address of the host it is running on. Your app can discover that IP address via the Downward API.
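A sketch of the relevant part of a pod spec (the rest of the container definition is omitted):

env:
  - name: HOST_IP
    valueFrom:
      fieldRef:
        fieldPath: status.hostIP
  - name: CONSUL_HTTP_ADDR
    value: http://$(HOST_IP):8500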
The HOST_IP will be set to the IP of the Kubernetes host via a reference to status.hostIP. Next, the environment variable CONSUL_HTTP_ADDR is set to the full HTTP address including port 8500. In your code, you will need to read that environment variable.
Retrieving a key/value pair
Here is some code to read a Consul key/value pair with Go. Full source code is here.
// return a Consul client based on given address
func getConsul(address string) (*consulapi.Client, error) {
    config := consulapi.DefaultConfig()
    config.Address = address
    consul, err := consulapi.NewClient(config)
    return consul, err
}

// get key/value pair from Consul client and passed key name
func getKvPair(client *consulapi.Client, key string) (*consulapi.KVPair, error) {
    kv := client.KV()
    keyPair, _, err := kv.Get(key, nil)
    return keyPair, err
}
func main() {
    // retrieve address of Consul set via downward API in spec
    consulAddress := getEnv("CONSUL_HTTP_ADDR", "")
    if consulAddress == "" {
        log.Fatalf("CONSUL_HTTP_ADDR environment variable not set")
    }

    // get Consul client
    consul, err := getConsul(consulAddress)
    if err != nil {
        log.Fatalf("Error connecting to Consul: %s", err)
    }

    // get key/value pair with Consul client
    redisPattern, err := getKvPair(consul, "REDISPATTERN")
    if err != nil || redisPattern == nil {
        log.Fatalf("Could not get REDISPATTERN: %s", err)
    }

    log.Printf("KV: %v %s\n", redisPattern.Key, redisPattern.Value)

    // ... func main() continued...
The comments in the code should be self-explanatory. When the REDISPATTERN key is not set or another error occurs, the program will exit. If REDISPATTERN is set, we can use the value later:
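The catalog lookup itself is only a few lines; a sketch based on the Consul API client (error handling trimmed, the full source has the real code):

// query the Consul catalog for the Kubernetes service synced as redisapp-default
catalog := consul.Catalog()
services, _, err := catalog.Service("redisapp-default", "", nil)
if err != nil || len(services) == 0 {
    log.Fatalf("Could not find service redisapp-default: %s", err)
}
redisHost := services[0].ServiceAddress
redisPort := services[0].ServicePort
log.Printf("Connecting to Redis at %s:%d\n", redisHost, redisPort)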
Here, consul is the *consulapi.Client obtained earlier. You use the Catalog() function to obtain access to catalog functionality. In this case, we simply retrieve the address and port of the Kubernetes service redisapp in the default namespace. We can use that information to connect to our Redis back-end.
Conclusion
It's easy to get started with Consul on Kubernetes and to write some code that takes advantage of it. Be aware though that we only scratched the surface here and that this is both a sample deployment (without TLS, RBAC, etc…) and sample code. In addition, Consul only really pays off in more complex application landscapes with many services to discover, traffic to secure and more. If you do think you need it, also take a look at managed Consul on Azure. It runs in your subscription but is fully managed by HashiCorp! It can be integrated with Azure Kubernetes Service as well.
In a later post, I will take a look at the service mesh capabilities with Connect.
If you have ever deployed an application to Kubernetes, even a simple one, you are probably familiar with deployments. A deployment describes the pods to run, how many of them to run and how they should be upgraded. That last point is especially important because the strategy you select has an impact on the availability of the deployment. A deployment supports the following two strategies:
Recreate: all existing pods are killed and new ones are created; this obviously leads to some downtime
RollingUpdate: pods are gradually replaced which means there is a period when old and new pods coexist; this can result in issues for stateful pods or if there is no backward compatibility
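For reference, the strategy is simply part of the deployment spec; a minimal fragment:

spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0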
But what if you want to use other methods such as BlueGreen or Canary? Although you could do that with a custom approach that uses deployments, there are some solutions that provide a more automated approach. Below, I discuss two of them briefly. The videos provide a more in-depth look.
Argo Rollouts
One of the solutions out there is Argo Rollouts. It is very easy to use. If you want to start slowly, with BlueGreen deployments and manual approval for instance, Argo Rollouts is recommended. It has a nice kubectl plugin and integration with Argo CD, a GitOps solution.
The following video demonstrates BlueGreen deployments:
BlueGreen deployments with Argo Rollouts
This video discusses a canary deployment with Argo Rollouts albeit a simple one without metric analysis:
Canary deployments with Argo Rollouts
This video shows the integration between Argo Rollouts and Argo CD:
Argo CD and Argo Rollouts integration
One thing to note is that, instead of a deployment, you will create a rollout object. It is easy to convert an existing deployment into a rollout. Other tools, such as Flagger (see below), provide their functionality on top of an existing deployment.
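To give you an idea, a BlueGreen rollout object looks roughly like the sketch below (the image and the two services are illustrative; the services have to exist in your cluster):

apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: myapp
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: gbaeke/fluxapp:1.0.5   # any image will do
  strategy:
    blueGreen:
      activeService: myapp-active
      previewService: myapp-preview
      autoPromotionEnabled: false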
For traffic splitting and metrics analysis, Argo Rollouts does not support Linkerd. More information about traffic splitting and management can be found here.
Flagger
Flagger, by Weaveworks, is another solution that provides BlueGreen and Canary deployment support to Kubernetes. In the video below, I demonstrate the basic look and feel of doing a canary deployment that includes metric analysis. Linkerd is used for gradual traffic shifting to the canary based on the built-in success rate metric of Linkerd:
Canary release with Flagger and Linkerd
If you want to get started with canary releases and easy traffic splitting and metrics, I suggest using the Flagger and Linkerd combination. This is based simply on the fact that Linkerd is much easier to install and use than Istio. Argo Rollouts in combination with Istio and Prometheus could be used to achieve exactly the same result.
Which one to use?
If you just want BlueGreen deployments with manual approvals, I would suggest using Argo Rollouts. When you integrate it with Argo CD, you can even use the Argo CD UI to promote your deployment. If you are comfortable with Istio and Prometheus, you can go a step further and add metrics analysis to automatically progress your deployment. You can also use a simple Kubernetes job to validate your deployment. Also, note that other metrics providers are supported.
Flagger supports more options for traffic splitting and metrics, due to its support for both Linkerd and Istio. Because Linkerd is so easy to use, Flagger makes it simpler to get started with canary releases and metrics analysis.
I recently gave a talk at TechTrain, a monthly event in Mechelen (Belgium), hosted by Cronos. The talk is called “GitOps with Kubernetes: a better way to deploy” and is an introduction to GitOps with Weaveworks Flux as an example.
You can find a re-recording of the presentation on Youtube:
In today’s post, we will write a simple operator with Kopf, which is a Python framework created by Zalando. A Kubernetes operator is a piece of software, running in Kubernetes, that does something application specific. To see some examples of what operators are used for, check out operatorhub.io.
Our operator will do something simple in order to easily grasp how it works:
the operator will create a deployment that runs nginx
nginx will serve a static website based on a git repository that you specify; we will use an init container to grab the website from git and store it in a volume
you can control the number of instances via a replicas parameter
That’s great but how will the operator know when it has to do something, like creating or updating resources? We will use custom resources for that. Read on to learn more…
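The CRD for DemoWeb could look roughly like this (a sketch; check the repo that accompanies this post for the exact definition):

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: demowebs.baeke.info
spec:
  group: baeke.info
  version: v1
  scope: Namespaced
  names:
    plural: demowebs
    singular: demoweb
    kind: DemoWeb
  additionalPrinterColumns:
    - name: Replicas
      type: integer
      JSONPath: .spec.replicas
    - name: GitRepo
      type: string
      JSONPath: .spec.gitrepo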
Note that we specified our own API and version in the CRD (baeke.info/v1) and that we set the kind to DemoWeb. In additionalPrinterColumns, we defined some properties from the spec that are also printed on screen. When you list resources of kind DemoWeb, you will see the replicas and gitrepo columns:
Custom resources based on the DemoWeb CRD
Of course, creating the CRD and the custom resources is not enough. To actually create the nginx deployment when the custom resource is created, we need to write and run the operator.
Writing the operator
I wrote the operator on a Mac with Python 3.7.6 (64-bit). On Windows, for best results, make sure you use Miniconda instead of Python from the Windows Store. First install Kopf and the Kubernetes package:
pip3 install kopf kubernetes
Verify you can run kopf:
Running kopf
Let’s write the operator. You can find it in full here. Here’s the first part:
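A sketch of that first part, based on the description that follows (the linked source is the reference; the namespace handling is my own simplification):

import kopf
import yaml
import kubernetes

@kopf.on.create('baeke.info', 'v1', 'demowebs')
def create_fn(spec, **kwargs):
    # custom properties from the DemoWeb spec, with defaults
    replicas = spec.get('replicas', 1)
    gitrepo = spec.get('gitrepo', 'https://github.com/gbaeke/static-web')

    # name of the custom resource (e.g. demoweb1); used for the deployment name and labels
    name = kwargs['name']
    namespace = kwargs.get('namespace', 'default')  # assumption: deploy in the namespace of the resource

    # deployment manifest as YAML in the doc variable, customised with the values above
    doc = yaml.safe_load(f"""
        apiVersion: apps/v1
        kind: Deployment
        metadata:
          name: {name}-deployment
          labels:
            app: {name}
        spec:
          replicas: {replicas}
          selector:
            matchLabels:
              app: {name}
          template:
            metadata:
              labels:
                app: {name}
            spec:
              initContainers:
                - name: clone-website
                  image: alpine/git
                  args: ["clone", "{gitrepo}", "/site"]
                  volumeMounts:
                    - name: website
                      mountPath: /site
              containers:
                - name: nginx
                  image: nginx:alpine
                  volumeMounts:
                    - name: website
                      mountPath: /usr/share/nginx/html
              volumes:
                - name: website
                  emptyDir: {{}}
    """)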
Naturally, we import kopf and other necessary packages. As noted before, kopf and kubernetes will have to be installed with pip. Next, we define a handler that runs whenever a resource of our custom type is spotted by the operator (with the @kopf.on.create decorator). The handler has two parameters:
spec object: allows us to retrieve our custom properties with spec.get (e.g. spec.get(‘replicas’, 1) – the second parameter is the default value)
**kwargs: a dictionary with lots of extra values we can use; we use it to retrieve the name of our custom resource (e.g. demoweb1); we can use that name to derive the name of our deployment and to set labels for our pods
Note: instead of using **kwargs to retrieve the name, you can also define an extra name parameter in the handler like so: def create_fn(spec, name, **kwargs); see the docs for more information
Our deployment is just yaml stored in the doc variable with some help from the Python yaml package. We use spec.get and the name variable to customise it.
After the doc variable, the following code completes the event handler:
The rest of the operator
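In rough form (again a sketch that continues the handler above):

    # make the DemoWeb resource the owner of the deployment,
    # so deleting the DemoWeb also deletes the deployment
    kopf.adopt(doc)

    # create the deployment via the apps/v1 API
    api = kubernetes.client.AppsV1Api()
    try:
        depl = api.create_namespaced_deployment(namespace=namespace, body=doc)
        return {'message': f"deployment {depl.metadata.name} created"}
    except kubernetes.client.rest.ApiException as e:
        # minimal error handling: give up and report the error
        raise kopf.PermanentError(f"Creating the deployment failed: {e}")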
With kopf.adopt, we make sure the deployment we create is a child of our custom resource. When we delete the custom resource, its children are also deleted.
Next, we simply use the kubernetes client to create a deployment via the apps/v1 api. The method create_namespaced_deployment takes two required parameters: the namespace and the deployment specification. Note there is only minimal error checking here. There is much more you can do with regards to error checking, retries, etc…
Now we can run the operator with:
kopf run operator-filename.py
You can perfectly run this on your local workstation if you have a working kube config pointing at a running cluster with the CRD installed. Kopf will automatically use that for authentication:
Running the operator on your workstation
Running the operator in your cluster
To run the operator in your cluster, create a Dockerfile that produces an image with Python, kopf, kubernetes and your operator in Python. In my case:
FROM python:3.7
RUN mkdir /src
ADD with_create.py /src
RUN pip install kopf
RUN pip install kubernetes
CMD kopf run /src/with_create.py --verbose
We added the verbose parameter for extra logging. Next, run the following commands to build and push the image (example with my image name):
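The build and push itself is plain Docker; with a hypothetical image name:

docker build -t gbaeke/demoweb-operator:1.0.0 .
docker push gbaeke/demoweb-operator:1.0.0

The operator is then deployed to the cluster with a manifest along these lines (a sketch; the service account name is whatever you created for the operator):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: demoweb-operator
spec:
  replicas: 1
  selector:
    matchLabels:
      app: demoweb-operator
  template:
    metadata:
      labels:
        app: demoweb-operator
    spec:
      serviceAccountName: demoweb-operator
      containers:
        - name: operator
          image: gbaeke/demoweb-operator:1.0.0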
The above is just a regular deployment but the serviceAccountName is extremely important. It gives kopf and your operator the required access rights to create the deployment in the target namespace. Check out the documentation to find out more about the creation of the service account and the required roles. Note that you should only run one instance of the operator!
Once the operator is deployed, you will see it running as a normal pod:
The operator is running
To see what is going on, check the logs. Let’s show them with octant:
Your operator logs
At the bottom, you see what happens when a creation event is detected for a resource of type DemoWeb. The spec is shown with the git repository and the number of replicas.
Now you can create resources of kind DemoWeb and see what happens. If you have your own git repository with some HTML in it, try to use that. Otherwise, just use mine at https://github.com/gbaeke/static-web.
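For example (field names match the CRD sketch above; the name demoweb1 is just an example):

apiVersion: baeke.info/v1
kind: DemoWeb
metadata:
  name: demoweb1
spec:
  replicas: 2
  gitrepo: https://github.com/gbaeke/static-web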
Conclusion
Writing an operator is easy to do with the Kopf framework. Do note that we only touched on the basics to get started. We only have an on.create handler, and no on.update handler. So if you want to increase the number of replicas, you will have to delete the custom resource and create a new one. Based on the example though, it should be pretty easy to fix that. The git repo contains an example of an operator that also implements the on.update handler (with_update.py).
If you have followed my blog a little, you have seen a few posts about GitOps with Flux CD. This time, I am taking a look at Argo CD which, like Flux CD, is a GitOps tool to deploy applications from manifests in a git repository.
Don’t want to read this whole thing?
Here’s the video version of this post
There are several differences between the two tools:
At first glance, Flux appears to use a single git repo for your cluster, whereas Argo immediately introduces the concept of apps. Each app can be connected to a different git repo. However, Flux can also use multiple git repositories in the same cluster. See https://github.com/fluxcd/multi-tenancy for more information
Flux has the concept of workloads which can be automated. This means that image repositories are scanned for updates. When an update is available (say from tag v1.0.0 to v1.0.1), Flux will update your application based on filters you specify. As far as I can see, Argo requires you to drive the update from your CI process, which might be preferred.
By default, Argo deploys an administrative UI (next to a CLI) with a full view on your deployment and its dependencies
Argo supports RBAC and integrates with external identity providers (e.g. Azure Active Directory)
The Argo CD admin interface is shown below:
Argo CD admin interface… not too shabby
Let’s take a look at how to deploy Argo and deploy the app you see above. The app is deployed using a single yaml file. Nothing fancy yet such as kustomize or jsonnet.
Deployment
The getting started guide is pretty clear, so do have a look over there as well. To install, just run (with a deployed Kubernetes cluster and kubectl pointing at the cluster):
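At the time of writing, that came down to the two commands from the getting started guide:

kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml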
Next, install the CLI. On a Mac, that is simple (with Homebrew):
brew tap argoproj/tap
brew install argoproj/tap/argocd
You will need access to the API server, which is not exposed over the Internet by default. For testing, port forwarding is easiest. In a separate shell, run the following command:
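That is the standard port forward to the argocd-server service:

kubectl port-forward svc/argocd-server -n argocd 8080:443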
You can now connect to https://localhost:8080 to get to the UI. You will need the admin password, which you can retrieve by running:
kubectl get pods -n argocd -l app.kubernetes.io/name=argocd-server -o name | cut -d'/' -f 2
You can now login to the UI with the user admin and the displayed password. You should also login from the CLI and change the password with the following commands:
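With the argocd CLI that is:

argocd login localhost:8080
argocd account update-password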
Great! You are all set now to deploy an application.
Deploying an application
We will deploy an application that has a couple of dependencies. Normally, you would install those dependencies with Argo CD as well but since I am using a cluster that has these dependencies installed via Azure DevOps, I will just list what you need (Helm commands):
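The dependencies are an Nginx Ingress controller, cert-manager and externaldns. As a rough sketch only (chart names, repositories and values change over time; the linked post has the pipeline I actually used):

helm install nginx-ingress stable/nginx-ingress \
  --set controller.service.loadBalancerIP=<your-static-ip>
helm install cert-manager jetstack/cert-manager --namespace cert-manager
helm install externaldns stable/external-dns --set provider=azure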
To know more about these dependencies and use an Azure DevOps YAML pipeline to deploy them, see this post. If you want, you can skip the externaldns installation and create a DNS record yourself that resolves to the public IP address of Nginx Ingress. If you do not want to use an Azure static IP address, you can remove the loadBalancerIP parameter from the first command.
The manifests we will deploy with Argo CD can be found in the following public git repository: https://github.com/gbaeke/argo-demo. The application is in three YAML files:
Two YAML files that create a certificate cluster issuer based on custom resource definitions (CRDs) from cert-manager
realtime.yaml: Redis deployment, Redis service (ClusterIP), realtime web app deployment (based on this), realtime web app service (ClusterIP), ingress resource for https://real.baeke.info (record automatically created by externaldns)
It’s best that you fork my repo and modify realtime.yaml’s ingress resource with your own DNS name.
Create the Argo app
Now you can create the Argo app based on my forked repo. I used the following command with my original repo:
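Reconstructed from the app settings described below (the destination namespace is an assumption):

argocd app create realtime \
  --repo https://github.com/gbaeke/argo-demo.git \
  --path manifests \
  --dest-server https://kubernetes.default.svc \
  --dest-namespace default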
The command above creates an app called realtime based on the specified repo. The app should use the manifests folder and apply (kubectl apply) all the manifests in that folder. The manifests are deployed to the cluster that Argo CD runs in. Note that you can run Argo CD in one cluster and deploy to totally different clusters.
The above command does not configure the repository to be synced automatically, although that is an option. To sync manually, use the following command:
argocd app sync realtime
The application should now be synced and viewable in the UI:
Not Secure because we use Let’s Encrypt staging for this app
Set up auto-sync
Let's set up this app to automatically sync with the repo (default = every 3 minutes). This can be done from both the CLI and the UI. Let's do it from the UI. Click on the app and then click App Details. You will find a Sync Policy section in the app details where you can enable auto-sync.
Setting up auto-sync from the UI
You can now make changes to the git repo like changing the image tag for gbaeke/fluxapp (yes, I used this image with the Flux posts as well 😊 ) to 1.0.6 and wait for the sync to happen. Or sync manually from the CLI or the UI.
Conclusion
This was a quick tour of Argo CD. There is much more you can do but the above should get you started quickly. I must say I quite like the solution and am eager to see what the collaboration of Flux CD, Argo CD and Amazon comes up with in the future.
Flux has a feature called manifest generation that works together with Kustomize. Instead of just picking YAML files from a git repo and applying them, customisation is performed with the kustomize build command. The resulting YAML then gets applied to your cluster.
If you don’t know how customisation works (without Flux), take a look at the article I wrote earlier. Or look at the core docs.
You need to be aware of a few things before you get started. In order for Flux to use this method, you need to turn on manifest generation. With the Flux Helm chart, just pass the following parameter:
--set manifestGeneration=true
In my case, I have plain YAML files without customisation in a config folder. I want the files that use customisation in a different folder, say kustomize, like so:
Two folders to pass as git.path
To pass these folders to the Helm chart, use the following parameter:
--set git.path="config\,kustomize"
The kustomize folder contains the following files:
base files with environments dev and prod
There is nothing special about the base folder here. It is as explained in my previous post. The dev and prod folders are similar so I will focus only on dev.
The dev folder contains a .flux.yaml file, which is required by Flux. In this simple example, it contains the following:
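Something along these lines (the patch file name is whatever you choose):

version: 1
patchUpdated:
  generators:
    - command: kustomize build .
  patchFile: flux-patch.yaml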
The file specifies the generator to use, in this case Kustomize. The kustomize executable is in the Flux image. I specify one patchFile which contains patches for several resources separated by ---:
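As an illustration only (resource, container and secret names follow the rest of this post; the repo has the real patch file), such a patch could look like:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: realtime
  namespace: realtime-dev
  annotations:
    fluxcd.io/automated: "true"
    fluxcd.io/tag.realtime: semver:~1
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: realtime-ingress
  namespace: realtime-dev
spec:
  tls:
    - hosts:
        - realdev.baeke.info
      secretName: realdev-baeke-info-tls
  rules:
    - host: realdev.baeke.info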
Above, you see the patches for the dev environment:
the workload should be automated by Flux, installing new images based on the semantic version filter ~1
the ingress should use host realdev.baeke.info with a different name for the secret as well (the secret will be created by cert-manager)
The prod folder contains a similar configuration. Perhaps naively, I thought that specifying the kustomize folder in git.path was sufficient for Flux to scan the folders and run customisation wherever a .flux.yaml file was found. Sadly, that is not the case. ☹️ With just the kustomize folder specified, Flux finds conflicts between the base, dev and prod folders because they contain similar files. That is expected behaviour for regular YAML files but, in my opinion, should not happen in this case. There is a bit of a clunky way to make this work though. Just specify the following as git.path:
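Most likely something like this, matching the folder layout above:

--set git.path="config\,kustomize/dev\,kustomize/prod"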
With the above parameter, Flux will find no conflicts and will happily apply the customisations.
As a side note, you should also specify the namespace in the patch file explicitly. It is not added automatically even though kustomization.yaml contains the namespace.
Let’s look at the cluster when Flux has applied the changes.
Namespaces for dev and prod created via Flux & Kustomize
And here is the deployed “production app”:
Who chose that ugly colour!
The way customisations are handled could be improved. It's unwieldy to specify every "customisation" folder in the git.path parameter. Just give me a --git-kustomize-path parameter and scan the paths in that parameter for .flux.yaml files. On the other hand, maybe I am missing something here so remarks are welcome.
When you have to deploy an application to multiple environments like dev, test and production there are many solutions available to you. You can manually deploy the app (Nooooooo! 😉), use a CI/CD system like Azure DevOps and its release pipelines (with or without Helm) or maybe even a “GitOps” approach where deployments are driven by a tool such as Flux or Argo based on a git repository.
In the latter case, you probably want to use a configuration management tool like Kustomize for environment management. Instead of explaining what it does, let’s take a look at an example. Suppose I have an app that can be deployed with the following yaml files:
redis-deployment.yaml: simple deployment of Redis
redis-service.yaml: service to connect to Redis on port 6379 (Cluster IP)
realtime-deployment.yaml: application that uses the socket.io library to display real-time updates coming from a Redis channel
realtime-service.yaml: service to connect to the socket.io application on port 80 (Cluster IP)
realtime-ingress.yaml: ingress resource that defines the hostname and TLS certificate for the socket.io application (works with nginx ingress controller)
Let’s call this collection of files the base and put them all in a folder:
Base files for the application
Now I would like to modify these files just a bit, to install them in a dev namespace called realtime-dev. In the ingress definition I want to change the name of the host to realdev.baeke.info instead of real.baeke.info for production. We can use Kustomize to reach that goal.
In the base folder, we can add a kustomization.yaml file like so:
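Something like this, referencing the YAML files from above:

resources:
  - redis-deployment.yaml
  - redis-service.yaml
  - realtime-deployment.yaml
  - realtime-service.yaml
  - realtime-ingress.yaml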
This lists all the resources we would like to deploy.
Now we can create a folder for our patches. The patches define the changes to the base. Create a folder called dev (next to base). We will add the following files (one file blurred because it’s not relevant to this post):
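The kustomization.yaml in the dev folder could look roughly like this (the patch file name is illustrative):

namespace: realtime-dev
bases:
  - ../base
resources:
  - namespace.yaml
patchesStrategicMerge:
  - realtime-ingress.yaml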
The namespace: realtime-dev ensures that our base resource definitions are updated with that namespace. In resources, we ensure that namespace gets created. The file namespace.yaml contains the following:
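That is just:

apiVersion: v1
kind: Namespace
metadata:
  name: realtime-dev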
Note that we also use certmanager here to issue a certificate to use on the ingress. For dev environments, it is better to use the Let’s Encrypt staging issuer instead of the production issuer.
We are now ready to generate the manifests for the dev environment. From the parent folder of base and dev, run the following command:
kubectl kustomize dev
The above command generates the patched manifests like so:
Note that namespace realtime-dev is used everywhere and that the Ingress resource uses realdev.baeke.info. The original Ingress resource looked like below:
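For reference, the base Ingress looked roughly like this (annotations and secret name are illustrative):

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: realtime-ingress
  annotations:
    kubernetes.io/ingress.class: nginx
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  tls:
    - hosts:
        - real.baeke.info
      secretName: real-baeke-info-tls
  rules:
    - host: real.baeke.info
      http:
        paths:
          - path: /
            backend:
              serviceName: realtime
              servicePort: 80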
As you can see, Kustomize has updated the host in tls: and rules: and also modified the secret name (which will be created by certmanager).
You have probably seen that Kustomize is integrated with kubectl. It’s also available as a standalone executable.
To directly apply the patched manifests to your cluster, run kubectl apply -k dev. The result:
namespace/realtime-dev created
service/realtime created
service/redis created
deployment.apps/realtime created
deployment.apps/redis created
ingress.extensions/realtime-ingress created
In another post, we will look at using Kustomize with Flux. Stay tuned!
If you do any sort of development, you often have to deal with secrets. There are many ways to deal with secrets, one of them is retrieving the secrets from a secure system from your own code. When your application runs on Kubernetes and your code (or 3rd party code) cannot be configured to retrieve the secrets directly, you have several options. This post looks at one such solution: Azure Key Vault to Kubernetes from Sparebanken Vest, Norway.
In short, the solution connects to Azure Key Vault and does one of two things:
Create a regular Kubernetes secret with the controller
Inject the secrets in the pod with the Env Injector
In my scenario, I just wanted regular secrets to use in a KEDA project that processes IoT Hub messages. The following secrets were required:
Connection string to a storage account: AzureWebJobsStorage
Connection string to IoT Hub’s event hub: EventEndpoint
In the YAML that deploys the pods that are scaled by KEDA, the secrets are referenced as follows:
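Roughly like this, with valueFrom and secretKeyRef (secret and key names are illustrative; they just have to match the secrets created further below):

env:
  - name: AzureWebJobsStorage
    valueFrom:
      secretKeyRef:
        name: azurewebjobsstorage
        key: azurewebjobsstorage
  - name: EventEndpoint
    valueFrom:
      secretKeyRef:
        name: eventendpoint
        key: eventendpoint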
Because the YAML above is deployed with Flux from a git repo, we need to get the secrets from an external system. That external system in this case, is Azure Key Vault.
To make this work, we first need to install the controller that makes this happen. This is very easy to do with the Helm chart. By default, this Helm chart will work well on Azure Kubernetes Service as long as you give the AKS security principal read access to Key Vault:
Access policies in Key Vault (azure-cli-2019-… is the AKS service principal here)
Next, define the secrets in Key Vault:
Secrets in Key Vault
With the access policies in place and the secrets defined in Key Vault, the controller installed by the Helm chart can do its work with the following YAML:
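A sketch of the two objects (API version, vault name and output names are illustrative; check the project docs for the exact schema):

apiVersion: spv.no/v1alpha1
kind: AzureKeyVaultSecret
metadata:
  name: azurewebjobsstorage
spec:
  vault:
    name: my-key-vault
    object:
      name: AzureWebJobsStorage
      type: secret
  output:
    secret:
      name: azurewebjobsstorage
      dataKey: azurewebjobsstorage
---
apiVersion: spv.no/v1alpha1
kind: AzureKeyVaultSecret
metadata:
  name: eventendpoint
spec:
  vault:
    name: my-key-vault
    object:
      name: EventEndpoint
      type: secret
  output:
    secret:
      name: eventendpoint
      dataKey: eventendpoint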
The above YAML defines two objects of kind AzureKeyVaultSecret. In each object we specify the Key Vault secret to read (vault) and the Kubernetes secret to create (output). The above YAML results in two Kubernetes secrets:
Two regular secrets
When you look inside such a secret, you will see:
Inside the secret
To double check the secret, just do echo RW5K… | base64 -d to see the decoded secret and that it matches the secret stored in Key Vault. You can now reference the secret with ValueFrom as shown earlier in this post.
Conclusion
If you want to turn Azure Key Vault secrets into regular Kubernetes secrets for use in your manifests, give the solution from Sparebanken Vest a go. It is very easy to use. If you do not want regular Kubernetes secrets, opt for the Env Injector instead, which injects the environment variables directly in your pod.
I have always wanted to create a Kubernetes operator with the operator framework and tried to give that a go on my Windows 10 system. Note that the emphasis is on creating an operator, not necessarily writing a useful one 😁. All I am doing is using the boilerplate that is generated by the framework. If you have never even seen how this is done, then this post is for you. 👍
An operator is an application-specific controller. A controller is a piece of software that implements a control loop, watching the state of the Kubernetes cluster via the API. It makes changes to the state to drive it towards the desired state.
An operator uses Kubernetes to create and manage complex applications. Many operators can be found here: https://operatorhub.io/. The Cassandra operator for instance, has domain-specific knowledge embedded in it, that knows how to deploy and configure this database. That’s great because that means some of the burden is shifted from you to the operator.
Installation
I installed the Operator SDK CLI from the GitHub releases in WSL, Windows Subsystem for Linux. I am using WSL 1, not WSL 2 as I am not running a Windows Insiders release. The commands to run:
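Roughly the following (the version and asset name will differ; check the releases page):

RELEASE_VERSION=v0.15.1
curl -LO https://github.com/operator-framework/operator-sdk/releases/download/${RELEASE_VERSION}/operator-sdk-${RELEASE_VERSION}-x86_64-linux-gnu
chmod +x operator-sdk-${RELEASE_VERSION}-x86_64-linux-gnu
sudo mv operator-sdk-${RELEASE_VERSION}-x86_64-linux-gnu /usr/local/bin/operator-sdk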
You should now be able to run operator-sdk in WSL 1.
Creating an operator
In WSL, you should have installed Go. I am using version 1.13.5. Although not required, I used my Go path on Windows to generate the operator and not the %GOPATH set in WSL. My working directory was:
/mnt/c/Users/geert/go/src/github.com/baeke.info
To create the operator, I ran the following commands (one line):
export GO111MODULE=on
operator-sdk new fun-operator --repo github.com/baeke.info/fun-operator
This creates a folder, fun-operator, under baeke.info and sets up the project:
Project structure in VS Code
Before continuing, cd into fun-operator and run go mod tidy. Now we can run the following command:
operator-sdk add api --api-version=fun.baeke.info/v1alpha1 --kind FunOp
This creates a new CRD (Custom Resource Definition) API called FunOp. The API version is fun.baeke.info/v1alpha1 which you choose yourself. With the above you can create CRDs like below that the operator acts upon:
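For example (a minimal custom resource; the name is just an example):

apiVersion: fun.baeke.info/v1alpha1
kind: FunOp
metadata:
  name: example-funop

The matching controller is added with a similar command:

operator-sdk add controller --api-version=fun.baeke.info/v1alpha1 --kind FunOp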
The above will generate a file, funop_controller.go, that contains some boilerplate code that creates a busybox pod. The Reconcile function is responsible for doing this work:
Reconcile function in the controller (incomplete)
As stated above, I will just use the boilerplate code and build the project:
operator-sdk build gbaeke/fun-operator
In WSL 1, you cannot run Docker so the above command will build the operator from the Go code but fail while building the container image. Can’t wait for WSL 2! The build creates the following artifact:
fun-operator in _output/bin
The supplied Dockerfile can be used to build the container images in Windows. In Windows, copy the Dockerfile from the build folder to the root of the operator project (in my case C:\Users\geert\go\src\github.com\baeke.info\fun-operator) and run docker build and push:
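That is just:

docker build -t gbaeke/fun-operator .
docker push gbaeke/fun-operator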
The project folder structure contains a bunch of yaml in the deploy folder:
Great! Some YAML to deploy
The service account, role and role binding make sure your code can create (or delete/update) resources in the cluster. The operator.yaml actually deploys the operator on your cluster. You just need to update the container spec with the name of your image (here gbaeke/fun-operator).
Before you deploy the operator, make sure you deploy the CRD manifest (here fun.baeke.info_funops_crd.yaml).
As always, just use kubectl apply -f with the above YAML files.
Testing the operator
With the operator deployed, create a resource based on the CRD. For instance:
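A minimal resource is enough for the boilerplate code:

apiVersion: fun.baeke.info/v1alpha1
kind: FunOp
metadata:
  name: example-funop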
From the moment you create this resource with kubectl apply, a pod will be created by the operator.
pod created upon submitting the custom resource
When you delete example-funop, the pod will be removed by the operator.
That’s it! We created a Kubernetes operator with the boilerplate code supplied by the operator-sdk cli. Another time, maybe we’ll create an operator that actually does something useful! 😉