In one on my videos on my YouTube channel, I talked about Kubernetes authentication and used the image below:
To secure access to the Kubernetes API server, you need to be authenticated and properly authorized to do what you need to do. The third mechanism to secure access is admission control. Simply put, admission control allows you to inspect requests to the API server and accept or deny the request based on rules you set. You will need an admission controller, which is just code that intercepts the request after authentication and authorization.
There is a list of admission controllers that are compiled-in with two special ones (check the docs):
With the two admission controllers above, you can develop admission plugins as extensions and configure them at runtime. In this post, we will look at a ValidatingAdmissionWebhook that is used together with Azure Policy to inspect requests to the AKS API Server and either deny or audit these requests.
Note that I already have a post about Azure Policy and pod security policies here. There is some overlap between that post and this one. In this post, we will look more closely at what happens on the cluster.
Want a video instead?
Azure has its own policy engine to control the Azure Resource Manager (ARM) requests you can make. A common rule in many organizations for instance is the prohibition of creation of expensive resources or even creating resources in unapproved regions. For example, a European company might want to only create resources in West Europe or North Europe. Azure Policy is the engine that can enforce such a rule. For more information, see Overview of Azure Policy. In short, you select from an ever growing list of policies or you create your own. Policies can be grouped in policy initiatives. A single policy or an initiative gets assigned to a scope, which can be a management group, a subscription or a resource group. In the portal, you then check for compliance:
Besides checking for compliance, you can deny the requests in real time. There are also policies that can create resources when they are missing.
Azure Policy for Kubernetes
Although Azure Policy works great with Azure Resource Manager (ARM), which is basically the API that allows you to interact with Azure resources, it does not work with Kubernetes out of the box. We will need an admission controller (see above) that understands how to interpret Kubernetes API requests in addition to another component that can sync policies in Azure Policy to Kubernetes for the admission controller to pick up. There is a built-in list of supported Kubernetes policies.
For the admission controller, Microsoft uses Gatekeeper v3. There is a lot, and I do mean a LOT, to say about Gatekeeper and its history. We will not go down that path here. Check out this post for more information if you are truly curious. For us it’s enough to know that Gatekeeper v3 needs to be installed on AKS. In order to do that, we can use an AKS add-on. In fact, you should use the add-on if you want to work with Azure Policy. Installing Gatekeeper v3 on its own will not work.
Note: there are ways to configure Azure Policy to work with Azure Arc for Kubernetes and AKS Engine. In this post, we only focus on the managed Azure Kubernetes Service (AKS)
So how do we install the add-on? It is very easy to do with the portal or the Azure CLI. For all details, check out the docs. With the Azure CLI, it is as simple as:
az aks enable-addons --addons azure-policy --name CLUSTERNAME --resource-group RESOURCEGROUP
If you want to do it from an ARM template, just add the add-on to the template as shown here.
What happens after installing the add-on?
I installed the add-on without active policies. In kube-system, you will find the two pods below:
The above pods are part of the add-on. I am not entirely sure what the azure-policy-webhook does, but the azure-policy pod is responsible for checking Azure Policy for new assignments and translating that to resources that Gatekeeper v3 understands (hint: constraints). It also checks policies on the cluster and reports results back to Azure Policy. In the logs, you will see things like:
- No audit results found
- Schedule running
- Creating constraint
The last line creates a constraint but what exactly is that? Constraints tell GateKeeper v3 what to check for when a request comes to the API server. An example of a constraint is that a container should not run privileged. Constraints are backed by constraint templates that contain the schema and logic of the constraint. Good to know, but where are the Gatekeeper v3 pods?
Gatekeeper was automatically installed by the Azure Policy add-on and will work with the constraints created by the add-on, synced from Azure Policy. When you remove these pods, the add-on will install them again.
Creating a policy
Although you normally create policy initiatives, we will create a single policy and see what happens on the cluster. In Azure Policy, choose Assign Policy and scope the policy to the resource group of your cluster. In Policy definition, select Kubernetes cluster should not allow privileged containers. As discussed, that is one of the built-in policies:
In the next step, set the effect to deny. This will deny requests in real time. Note that the three namespaces in Namespace exclusions are automatically added. You can add extra namespaces there. You can also specifically target a policy to one or more namespaces or even use a label selector.
You can now select Review and create and then select Create to create the policy assignment. This is the result:
Now we have to wait a while for the change to be picked up by the add-on on the cluster. This can take several minutes. After a while, you will see the following log entry in the azure-policy pod:
You can see the constraint when you run k get constraints. The constraint is based on a constraint template. You can list the templates with k get constrainttemplates. This is the result:
With k get constrainttemplates k8sazurecontainernoprivilege -o yaml, you will find that the template contains some logic:
The block of rego contains the logic of this template. Without knowing rego, which is the policy language used by Open Policy Agent (OPA) which is used by Gatekeeper v3 on our cluster, you can actually guess that the privileged field inside securityContext is checked. If that field is true, that’s a violation of policy. Although it is useful to understand more details about OPA and rego, Azure Policy hides the complexity for you.
Does it work?
Let’s try to deploy the following deployment.yaml:
apiVersion: apps/v1 kind: Deployment metadata: name: nginx-deployment labels: app: nginx spec: replicas: 3 selector: matchLabels: app: nginx template: metadata: labels: app: nginx spec: containers: - name: nginx image: nginx:1.14.2 ports: - containerPort: 80 securityContext: privileged: true
After running kubectl apply -f deployment.yaml, everything seems fine. But when we run kubectl get deploy:
Let’s run kubectl get events:
Notice that validation.gatekeeper.sh denied the request because privileged was set to true.
Adding more policies
Azure Security Center comes with a large initiative, Azure Security Benchmark, that also includes many Kubernetes policies. All of these policies are set to audit for compliance. On my system, the initiative is assigned at the subscription level:
The Azure Policy add-on on our cluster will pick up the Kubernetes policies and create the templates and constraints:
Now we have two constraints for k8sazurecontainernoprivilege:
The new constraint comes from the larger initiative. In the spec, the enforcementAction is set to dryrun (audit). Although I do not have pods that violate k8sazurecontainernoprivilege, I do have pods that violate another policy that checks for host path mapping. That is reported back by the add-on in the compliance report:
In this post, you have seen what happens when you install the AKS policy add-on and enable a Kubernetes policy in Azure Policy. The add-on creates constraints and constraint templates that Gatekeeper v3 understands. The rego in a constraint template contains logic used to define the policy. When the policy is set to deny, Gatekeeper v3, which is an admission controller denies the request in real-time. When the policy is set to audit (or dry run at the constraint level), audit results are reported by the add-on to Azure Policy.