Quick Guide to the Secret Store CSI driver for Azure Key Vault on AKS

Yesterday, I posted the Quick Guide to Kubernetes Workload Identity on AKS. This post contains a similar guide to enabling and using the Secret Store CSI driver for Azure Key Vault on AKS.

All commands assume bash. You should have the Azure CLI installed and be logged in to a subscription on which you are an Owner, because the scripts below create role assignments.

Step 1: Enable the driver

The command to enable the driver on an existing cluster is below. Please set the variables to point to your cluster and resource group:

RG=YOUR_RESOURCE_GROUP
CLUSTER=YOUR_CLUSTER_NAME

az aks enable-addons --addons=azure-keyvault-secrets-provider --name=$CLUSTER --resource-group=$RG

If the driver is already enabled, you will simply get a message stating that.
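
If you are not sure whether the add-on is active, you can also query the cluster; this uses the same addonProfiles path we use later to retrieve the identity's client Id:

az aks show -g $RG -n $CLUSTER --query addonProfiles.azureKeyvaultSecretsProvider.enabled -o tsv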

Step 2: Create a Key Vault

In this step, we create a Key Vault and configure RBAC. We will also add a sample secret.

# replace <SOMETHING> with a unique value, such as your initials
KV=<SOMETHING>$RANDOM

# name of the key vault secret
SECRET=demosecret

# value of the secret
VALUE=demovalue

# create the key vault and turn on Azure RBAC; we will grant a managed identity access to this key vault below
az keyvault create --name $KV --resource-group $RG --location westeurope --enable-rbac-authorization true

# get the subscription id
SUBSCRIPTION_ID=$(az account show --query id -o tsv)

# get your user object id
USER_OBJECT_ID=$(az ad signed-in-user show --query objectId -o tsv)

# grant yourself access to key vault
az role assignment create --assignee-object-id $USER_OBJECT_ID --role "Key Vault Administrator" --scope /subscriptions/$SUBSCRIPTION_ID/resourceGroups/$RG/providers/Microsoft.KeyVault/vaults/$KV

# add a secret to the key vault
az keyvault secret set --vault-name $KV --name $SECRET --value $VALUE

You can use the portal to check the Key Vault and see the secret:

Key Vault created and secret added

If you go to Access Policies, you will notice that the Key Vault uses Azure RBAC:

Key Vault uses RBAC permission model

Step 3: Grant a managed identity access to Key Vault

In the previous step, your account was granted access to Key Vault. In this step, we grant the same access to the managed identity that the Secret Store CSI provider will use. We will reference that managed identity again when we configure the SecretProviderClass in a later step.

This guide uses the managed identity created by the Secret Store provider. It lives in the resource group associated with your cluster; by default, that group's name starts with MC_. The identity is called azurekeyvaultsecretsprovider-<CLUSTER-NAME>.

# grab the managed identity principalId assuming it is in the default
# MC_ group for your cluster and resource group
IDENTITY_ID=$(az identity show -g MC_${RG}_${CLUSTER}_westeurope --name azurekeyvaultsecretsprovider-$CLUSTER --query principalId -o tsv)

# grant access rights on Key Vault
az role assignment create --assignee-object-id $IDENTITY_ID --role "Key Vault Administrator" --scope /subscriptions/$SUBSCRIPTION_ID/resourceGroups/$RG/providers/Microsoft.KeyVault/vaults/$KV

Above, we grant the Key Vault Administrator role. In production, you should assign a role with fewer privileges.
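
For example, if the provider only needs to read secrets, the built-in Key Vault Secrets User role is enough. A minimal sketch:

az role assignment create --assignee-object-id $IDENTITY_ID --role "Key Vault Secrets User" --scope /subscriptions/$SUBSCRIPTION_ID/resourceGroups/$RG/providers/Microsoft.KeyVault/vaults/$KV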

Step 4: Create a SecretProviderClass

Let’s create and apply the SecretProviderClass in one step.

AZURE_TENANT_ID=$(az account show --query tenantId -o tsv)
CLIENT_ID=$(az aks show -g $RG -n $CLUSTER --query addonProfiles.azureKeyvaultSecretsProvider.identity.clientId -o tsv)

cat <<EOF | kubectl apply -f -
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: demo-secret
  namespace: default
spec:
  provider: azure
  secretObjects:
  - secretName: demosecret
    type: Opaque
    data:
    - objectName: "demosecret"
      key: demosecret
  parameters:
    usePodIdentity: "false"
    useVMManagedIdentity: "true"
    userAssignedIdentityID: "$CLIENT_ID"
    keyvaultName: "$KV"
    objects: |
      array:
        - |
          objectName: "demosecret"
          objectType: secret
    tenantId: "$AZURE_TENANT_ID"
EOF

After retrieving the Azure AD tenant Id and managed identity client Id, the SecretProviderClass is created. Pay special attention to the following fields:

  • userAssignedIdentityID: the clientId (⚠️ not the principalId we retrieved earlier) of the managed identity used by the secret store provider; you can use other user-assigned managed identities or even a system-assigned managed identity assigned to the virtual machine scale set that runs your agent pool; I recommend using user-assigned identity
    • above, the clientId is retrieved via the az aks command
  • keyvaultName: the name you assigned your Key Vault
  • tenantId: the Azure AD tenant Id where your identities live
  • usePodIdentity: not recommended because pod identity will be replaced by workload identity
  • useVMManagedIdentity: set to true even if you use user-assigned managed identity
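
After applying the manifest above, you can quickly confirm that the SecretProviderClass exists in the default namespace:

kubectl get secretproviderclass demo-secret -n default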

Step 5: Mount the secrets in pods

Create pods that use the secrets.

cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: secretpods
  name: secretpods
spec:
  replicas: 1
  selector:
    matchLabels:
      app: secretpods
  template:
    metadata:
      labels:
        app: secretpods
    spec:
      containers:
      - image: nginx
        name: nginx
        env:
          - name:  demosecret
            valueFrom:
              secretKeyRef:
                name:  demosecret
                key:  demosecret
        volumeMounts:
          - name:  secret-store
            mountPath:  "/mnt/secret-store"
            readOnly: true
      volumes:
        - name:  secret-store
          csi:
            driver: secrets-store.csi.k8s.io
            readOnly: true
            volumeAttributes:
              secretProviderClass: "demo-secret"
EOF

The above command creates a deployment that runs nginx. The Key Vault secret is mounted in a volume at /mnt/secret-store. It is also available as the environment variable demosecret.

Step 6: Verify

Issue the commands below to get a shell to the pods of the nginx deployment and check the mount path and environment variable:

export POD_NAME=$(kubectl get pods -l "app=secretpods" -o jsonpath="{.items[0].metadata.name}")

# if this does not work, check the status of the pod
# if still in ContainerCreating there might be an issue
kubectl exec -it $POD_NAME -- sh

cd /mnt/secret-store
ls # the file containing the secret is listed
cat demosecret; echo # demovalue is revealed

# echo the value of the environment variable
echo $demosecret # demovalue is revealed

Important: the secret store CSI provider always mounts secrets in a volume. A Kubernetes secret (here used to populate the environment variable) is not created by default. It is created here because of the secretObjects field in the SecretProviderClass.
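
Because of that secretObjects field, a Kubernetes secret named demosecret exists for as long as a pod mounts the CSI volume. From outside the pod, you can check and decode it as a quick sanity check (assuming the deployment above is still running):

kubectl get secret demosecret -o jsonpath='{.data.demosecret}' | base64 -d; echo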

Conclusion

The above commands should make it relatively straightforward to try the secret store CSI provider and understand what it does. It works especially well in GitOps scenarios where you cannot store secrets in Git and you do not have pipelines that can retrieve Azure Key Vault secrets at deploy time.

If you spot errors in the above commands, please let me know!

Quick Guide to Kubernetes Workload Identity on AKS

I recently had to do a demo about Workload Identity on AKS and threw together some commands to enable and verify the setup. It contains bits and pieces from the documentation plus some extras. I wrote another post some time ago with more background.

All commands are for bash and should be run sequentially in the same shell to re-use the variables.

Step 1: Enable OIDC issuer on AKS

You need an existing AKS cluster for this. You can quickly deploy one from the portal. Note that workload identity is not exclusive to AKS.

CLUSTER=<AKS_CLUSTER_NAME>
RG=<AKS_CLUSTER_RESOURCE_GROUP>

az aks update -n $CLUSTER -g $RG --enable-oidc-issuer

After enabling OIDC, retrieve the issuer URL and echo it to check:

ISSUER_URL=$(az aks show -n $CLUSTER -g $RG --query oidcIssuerProfile.issuerUrl -o tsv)
echo $ISSUER_URL

The issuer URL looks like https://oidc.prod-aks.azure.com/GUID/. You can issue the command below to obtain the OpenID configuration. It lists other URLs that Azure AD can use to retrieve the keys it needs to verify tokens it receives from Kubernetes.

curl $ISSUER_URL/.well-known/openid-configuration
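
If you have jq installed (an assumption, not a requirement), you can extract just the jwks_uri field, which points to the signing keys Azure AD downloads to verify those tokens:

curl -s $ISSUER_URL/.well-known/openid-configuration | jq -r .jwks_uri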

Step 2: Install the webhook on AKS

Use the Helm chart to install the webhook. First, save the Azure AD tenant Id to a variable. The tenantId will be retrieved with the Azure CLI so make sure you are properly logged in. You also need Helm installed and a working Kube config for your cluster.

AZURE_TENANT_ID=$(az account show --query tenantId -o tsv)
 
helm repo add azure-workload-identity https://azure.github.io/azure-workload-identity/charts
 
helm repo update
 
helm install workload-identity-webhook azure-workload-identity/workload-identity-webhook \
   --namespace azure-workload-identity-system \
   --create-namespace \
   --set azureTenantID="${AZURE_TENANT_ID}"
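
Before moving on, you can check that the webhook pods are running in the namespace created by the chart:

kubectl get pods -n azure-workload-identity-system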

Step 3: Create an Azure AD application

Although you can create the application directly in the portal or with the Azure CLI, workload identity has a CLI to make the whole process less error-prone and easier to script. Install azwi with brew: brew install Azure/azure-workload-identity/azwi.

Run the following commands. First, we save the application name in a variable. Use any name you like.

APPLICATION_NAME=WorkloadDemo
azwi serviceaccount create phase app --aad-application-name $APPLICATION_NAME

You can now check the application registrations in Azure AD. In my case, WorkloadDemo was created.

App registration in Azure AD

If you want to grant this application access rights to resources in Azure, first grab the appId:

APPLICATION_CLIENT_ID="$(az ad sp list --display-name $APPLICATION_NAME --query '[0].appId' -otsv)"

Now you can use commands such as az role assignment create to grant access rights. For example, here is how to grant the Reader role to your current Azure CLI subscription:

SUBSCRIPTION_ID=$(az account show --query id -o tsv)

# --assignee accepts the appId and resolves it to the service principal
az role assignment create --assignee $APPLICATION_CLIENT_ID --role "Reader" --scope /subscriptions/$SUBSCRIPTION_ID

Step 4: Create a Kubernetes service account

Although you can create the service account with kubectl or via a YAML manifest, azwi can help here as well:

SERVICE_ACCOUNT_NAME=sademo
SERVICE_ACCOUNT_NAMESPACE=default

azwi serviceaccount create phase sa \
  --aad-application-name "$APPLICATION_NAME" \
  --service-account-namespace "$SERVICE_ACCOUNT_NAMESPACE" \
  --service-account-name "$SERVICE_ACCOUNT_NAME"

This creates a service account that looks like the below YAML manifest:

apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    azure.workload.identity/client-id: <value of APPLICATION_CLIENT_ID>
  labels:
    azure.workload.identity/use: "true"
  name: sademo
  namespace: default

This is a regular Kubernetes service account. Later, you will configure your pod to use the service account.

The label is important because the webhook we installed earlier acts on service accounts with this label to perform all the behind-the-scenes magic! 😉
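
If you want to confirm that azwi created the account with the expected label and annotation, you can dump it with kubectl:

kubectl get serviceaccount sademo -n default -o yaml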

Note that workload identity does not use the regular Kubernetes service account token; that token is used to authenticate to the Kubernetes API server. The webhook ensures there is another token, whose path is stored in $AZURE_FEDERATED_TOKEN_FILE, and that is the token sent to Azure AD.

Step 5: Configure the Azure AD app for token federation

The application created in step 3 needs to be configured to trust specific tokens issued by your Kubernetes cluster. When AAD receives such a token, it returns an Azure AD token that your application in Kubernetes can use to authenticate to Azure.

Although you can manually configure the Azure AD app, azwi can be used here as well:

SERVICE_ACCOUNT_NAMESPACE=default

azwi serviceaccount create phase federated-identity \
  --aad-application-name "$APPLICATION_NAME" \
  --service-account-namespace "$SERVICE_ACCOUNT_NAMESPACE" \
  --service-account-name "$SERVICE_ACCOUNT_NAME" \
  --service-account-issuer-url "$ISSUER_URL"

In the AAD app, you will see:

Azure AD app federated credentials config

You find the above by clicking Certificates & Secrets and then Federated credentials.

Step 6: Deploy a workload

Run the following command to create a deployment and apply it in one step:

cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: azcli-deployment
  namespace: default
  labels:
    app: azcli
spec:
  replicas: 1
  selector:
    matchLabels:
      app: azcli
  template:
    metadata:
      labels:
        app: azcli
    spec:
      serviceAccount: sademo
      containers:
        - name: azcli
          image: mcr.microsoft.com/azure-cli:latest
          command:
            - "/bin/bash"
            - "-c"
            - "sleep infinity"
EOF

This runs the latest version of the Azure CLI in Kubernetes.

Grab the first pod name (there is only one) and exec into the pod’s container:

POD_NAME=$(kubectl get pods -l "app=azcli" -o jsonpath="{.items[0].metadata.name}")

kubectl exec -it $POD_NAME -- bash

Step 7: Test the setup

In the container, issue the following commands:

echo $AZURE_CLIENT_ID
echo $AZURE_TENANT_ID
echo $AZURE_FEDERATED_TOKEN_FILE
cat $AZURE_FEDERATED_TOKEN_FILE
echo $AZURE_AUTHORITY_HOST

# list the standard Kubernetes service account secrets
cd /var/run/secrets/kubernetes.io/serviceaccount
ls 

# check the folder containing the AZURE_FEDERATED_TOKEN_FILE
cd /var/run/secrets/azure/tokens
ls

# you can use the AZURE_FEDERATED_TOKEN_FILE with the Azure CLI
# together with $AZURE_CLIENT_ID and $AZURE_TENANT_ID
# a password is not required since we are doing federated token exchange

az login --federated-token "$(cat $AZURE_FEDERATED_TOKEN_FILE)" \
--service-principal -u $AZURE_CLIENT_ID -t $AZURE_TENANT_ID

# list resource groups
az group list

If the last command works, that means you successfully logged on with workload identity on AKS. You can list resource groups because you granted the Azure AD app the Reader role on your subscription.

Note that the option to use token federation was added to the Azure CLI quite recently. At the time of this writing, May 2022, the image mcr.microsoft.com/azure-cli:latest certainly includes that capability.

Conclusion

I hope the above commands are useful if you want to quickly test or demo Kubernetes workload identity on AKS. If you spot errors, be sure to let me know!

A look at some of Azure Container App’s new features

A while ago, I created a YouTube playlist about Azure Container Apps. The videos were based on the first public preview. At the time, several features were missing or still needed to be improved (as expected with a preview release):

  • An easy way to create a container app, similar to az webapp up
  • Managed Identity support (system and user assigned)
  • Authentication support with identity providers like Microsoft, Google, Twitter
  • An easy way to follow the logs of a container from your terminal (instead of using Log Analytics queries)
  • Getting a shell to your container for troubleshooting purposes

Let’s take a look at some of these features.

az containerapp up

To manage Container Apps, you can use the containerapp Azure CLI extension. Add it with the following command:

az extension add --name containerapp --upgrade

One of the commands of this extension is up. It lets you create a container app from local source files or from GitHub. With your sources in the current folder, the simplest form of this command is:

az containerapp up --name YOURAPPNAME --source .

The command above creates the following resources:

  • a resource group: mine was called geert_baeke_rg_3837
  • a Log Analytics workspace
  • a Container Apps environment: its name is YOURAPPNAME-env
  • an Azure Container Registry: used to build the container image from a Dockerfile in your source folder
  • the container app: its name is YOURAPPNAME

The great thing here is that you do not need Docker on your local machine for this to work. Building and pushing the container image is done by an ACR task. You only need a Dockerfile in your source folder.

When you change your source code, simply run the same command to deploy your changes. A new image build and push will be started by ACR and a revision of your container app will be published.

⚠️ TIP: by default, the container app does not enable ingress from the Internet. To enable it, include an EXPOSE instruction in your Dockerfile.

If you want to try az containerapp up, you can use my super-api sample from GitHub: https://github.com/gbaeke/super-api

Use the following commands to clone the source code and create the container app:

git clone https://github.com/gbaeke/super-api.git
cd super-api
az containerapp up --name super-api --source . --ingress external --target-port 8080

Above, we added the --ingress and --target-port parameters to enable ingress. You will get a URL like https://super-api.livelyplant-fa0ceet5.eastus.azurecontainerapps.io to access the app. In your browser, you will just get: Hello from Super API. If you want a different message, you can run this command:

az containerapp up --name super-api --source . --ingress external --target-port 8080 --env-vars WELCOME=YOURMESSAGE

Running the above command will result in a new revision. Use az containerapp revision list -n super-api -g RESOURCEGROUP -o table to see the revisions of your container app.

There is much more you can do with az containerapp up:

  • Deploy directly from a container image in a registry (with the option to supply registry authentication if the registry is private)
  • Deploy to an existing container app environment
  • Deploy to an existing resource group
  • Use a GitHub repo instead of local sources which uses a workflow to deploy changes as you push them

Managed Identity

You can now easily enable managed identity on a container app. Both System assigned and User assigned are supported. Below, system assigned managed identity was enabled on super-api:

System assigned identity on super-api

Next, I granted the managed identity Reader role on my subscription:

Enabling managed identity is easy enough. In your code, however, you need to obtain a token to do the things you want to do. At a low level, you can use an HTTP call to fetch the token to access a resource like Azure Key Vault. Let’s try that and introduce a new command to get a shell to a container app:

az containerapp exec  -n super-api -g geert_baeke_rg_3837 --command sh

The above command gets a shell to the super-api container. If you want to try this, first modify the Dockerfile and remove the USER command. Otherwise, you are not root and will not be able to install curl. You will also need to use an alpine base image in the second stage instead of scratch (the scratch image does not offer a shell).

In the shell, run the following commands:

apk add curl
curl -H "X-IDENTITY-HEADER: $IDENTITY_HEADER" \
  "$IDENTITY_ENDPOINT?resource=https://vault.azure.net&api-version=2019-08-01"

The response to the above curl command will include an access token for the Azure Key Vault resource.
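
If you also install jq in the container, you can pull the token out of the JSON response. This is a sketch that assumes the response follows the usual App Service managed identity format with an access_token field:

apk add jq
curl -s -H "X-IDENTITY-HEADER: $IDENTITY_HEADER" \
  "$IDENTITY_ENDPOINT?resource=https://vault.azure.net&api-version=2019-08-01" | jq -r .access_token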

A container app with managed identity has several environment variables:

  • IDENTITY_ENDPOINT: http://localhost:42356/msi/token (the endpoint to request the token from)
  • IDENTITY_HEADER: used to protect against server-side request forgery (SSRF) attacks

Instead of using these values to craft raw HTTP requests, you can use an SDK. The documentation provides information for .NET, JavaScript, Python, Java, and PowerShell. To try something different, I used the Azure SDK for Go. Here’s a code snippet:

func (s *Server) authHandler(w http.ResponseWriter, r *http.Request) {
	// parse subscription id from request
	subscriptionId := r.URL.Query().Get("subscriptionId")
	if subscriptionId == "" {
		s.logger.Infow("Failed to get subscriptionId from request")
		w.WriteHeader(http.StatusBadRequest)
		return
	}

	client := resources.NewGroupsClient(subscriptionId)
	authorizer, err := auth.NewAuthorizerFromEnvironment()
	if err != nil {
		s.logger.Error("Error: ", zap.Error(err))
		return
	}
	client.Authorizer = authorizer

Although the NewAuthorizerFromEnvironment() call above supports managed identity, it seems it does not support the endpoint used in Container Apps and Azure Web App. The code above works fine on a virtual machine and even pod identity (v1) on AKS.

We can use another feature of az containerapp to check the logs:

az containerapp logs show -n super-api -g geert_baeke_rg_3837 --follow

"TimeStamp":"2022-05-05T10:49:59.83885","Log":"Connected to Logstream. Revision: super-api--0yp202c, Replica: super-api--0yp202c-64746cc57b-pf8xh, Container: super-api"}
{"TimeStamp":"2022-05-04T22:02:10.4278442+00:00","Log":"to super api"}
{"TimeStamp":"2022-05-04T22:02:10.427863+00:00","Log":""}
{"TimeStamp":"2022-05-04T22:02:10.4279478+00:00","Log":"read config error Config File "config" Not Found in "[/config]""}
{"TimeStamp":"2022-05-04T22:02:10.4280241+00:00","Log":"logger"}"}
{"TimeStamp":"2022-05-04T22:02:10.4282641+00:00","Log":"client initializing for: 127.0.0.1:50001"}
{"TimeStamp":"2022-05-04T22:02:10.4282792+00:00","Log":"values","welcome":"Hello from Super API","port":8080,"log":false,"timeout":15}"}
...

When I try to execute the code that’s supposed to get the token, I get the following error:

{"TimeStamp":"2022-05-05T10:51:58.9469835+00:00","Log":"{error 26 0  MSI not available}","stacktrace":"..."}

As always, it is easy to enable managed identity but tricky to do from code (sometimes 😉). With the new feature that lets you easily grab the logs, it is simpler to check the errors you get back at runtime. Using Log Analytics queries was just not intuitive.

Conclusion

The az containerapp up command makes it extremely simple to deploy a container app from your local machine or GitHub. It greatly enhances the inner loop experience before you start deploying your app to other environments.

The tooling now makes it easy to exec into containers and troubleshoot. Checking runtime errors from logs is now much easier as well.

Managed Identity is something we all were looking forward to. As always, it is easy to implement but do check if the SDKs you use support it. When all else fails, you can always use HTTP! 😉

Kubernetes Workload Identity with AKS

When you run a workload, no matter how simple or complex, you often need to access protected resources in both a secure and manageable way. Often, a resource’s security is integrated with an identity store. Azure resources, for instance, can be secured with roles assigned to Azure Active Directory (AAD) users, groups, or service principals.

Although it is tempting to simply store a credential with your code, it makes your code less secure and makes tasks such as credential rotation or updates a burden. In Azure, the solution to these issues is straightforward: just use managed identity if the service that runs your code supports it. Most do! That’s also the case for Azure Kubernetes Service (AKS). It supports a feature called pod-managed identities that associates a pod with such a managed identity. From the containers running in the pod, a developer can easily request a token to access Azure resources securely. I have written about pod-managed identities before so take a look at that post to understand the concepts. The post contains some sample code for illustration purposes.

The pod-managed identity feature has been in preview forever. The current version, v1, actually will not leave the preview phase. It will be replaced by v2, which uses workload identity federation. It is important to realize that AAD workload identity federation is not limited to Kubernetes. It also works with other workloads, like GitHub workflows or even Google cloud. This also means that workload identity for Kubernetes works on other distributions, both in the cloud and on-premises. It’s not just for AKS.

Although pod-managed identities and workload identity federation achieve the same goals, they work entirely differently. Pod-managed identity is somewhat more complex because it uses Kubernetes custom resource definitions (CRDs) and requires pods that intercept IMDS traffic. Intercepting that traffic can cause issues for other pods, which means you have extra configuration work to exclude those pods.

At the time of this writing, January 2022, workload identity federation is in preview!

How does it work?

As mentioned above, workload identity federation on AKS is very different from pod-managed identity. At a basic level, all it does is token exchange. Your pod will have access to a token that your code will present to AAD. In turn, AAD, which is configured to trust that token, will issue an AAD token to access the resource protected by AAD. These tokens are JWT tokens (JSON Web Tokens).

A couple of things need to be done for this to work:

  • AKS must be configured with an OIDC issuer URL. That public URL will present information that allows AAD to verify the JWT token it receives from your app. You will need to register the feature on your subscription and add or update the aks-preview extension for Azure CLI.
  • You need to create an app registration in AAD for your service principal. We will use the Azure Portal for this. The portal has been updated to add federated credentials that work with Kubernetes. Currently, workload identity federation does not work with managed identities. Managed identities are basically a wrapper around app registrations so that you do not have to create and maintain these registrations. Managed identity support is on the roadmap.
  • You install the workload-identity-webhook chart on AKS. This is a mutating webhook that makes it easy for the developer to associate a pod with the service principal and automate the token creation.
  • You create a Kubernetes service account and configure your pod(s) to use it. The mutating webhook will spot this and configure the containers in your pod with environment variables and the federation token.

Let’s go through the steps to make this a bit clearer.

Configuring the app registration

Create an app registration and navigate to Certificates and Secrets. Click Add credential in the Federated credentials section:

Adding a federated credential

At the time of this writing, there were three supported scenarios: GitHub Actions, Kubernetes, and other. Select Kubernetes and specify the three required properties:

  • Cluster issuer URL: in the form of https://oidc.prod-aks.azure.com/SOMEGUID. Use az aks show -n CLUSTERNAME -g RESOURCEGROUP and look for oidcIssuerProfile.issuerUrl in the output
  • Namespace: the namespace that contains the service account; we will create it below
  • Service account name: the name of the Kubernetes service account

The namespace and service account name are used to create the subject identifier. The token your code presents to AAD will need that in the sub field.
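
For a Kubernetes service account, that subject takes the form system:serviceaccount:<namespace>:<service account name>. With the values used in this example, that becomes:

system:serviceaccount:default:fed-sa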

In the example below, I use the default namespace and a service account called fed-sa:

The federated credential’s properties

Azure Active Directory, in particular this application, is now configured to trust tokens coming from our Kubernetes app. The token will need to contain the subject identifier in the sub field. The token will be signed and AAD can verify the signature from the information presented by the AKS OIDC issuer URL.

When you configure the app registration, a service principal is created with the same name. You can use it with Azure role-based access control. I gave this service principal (or app) Contributor access on my subscription (temporarily 😉):

Service principal with access to the subscription

App, service principal, …? It’s confusing, I know. Never mind though and read on! 😉

Installing the webhook

On your AKS cluster with the configured issuer URL, install the workload identity mutating webhook with Helm:

AZURE_TENANT_ID=YOURTENANTID 

helm repo add azure-workload-identity https://azure.github.io/azure-workload-identity/charts

helm repo update

helm install workload-identity-webhook azure-workload-identity/workload-identity-webhook \
   --namespace azure-workload-identity-system \
   --create-namespace \
   --set azureTenantID="${AZURE_TENANT_ID}"

Above, replace YOURTENANTID with the id of your Azure Active Directory tenant:

Azure AD Tenant ID in the portal

Creating a service account

In a later step, to test the setup, we will run the Azure CLI in a Kubernetes pod. To associate that pod with the AAD application and service principal, we need to create a service account and provide specific labels and annotations:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: fed-sa
  namespace: default
  annotations:
    azure.workload.identity/client-id: APPID
    azure.workload.identity/tenant-id: YOURTENANTID
  labels:
    azure.workload.identity/use: "true"

Above, replace APPID with the ID of the application registration you created earlier:

Application ID of the app registration in which you configured the federated token trust

The labels and annotations for the service account and for pods are discussed here. The label on the service account is required for the webhook to know that this is a service account used with federated tokens. The annotations are optional. The tenant-id annotation defaults to the tenant id passed to the webhook Helm chart. I left it in to be explicit and to have all the environment variables I need for the Azure CLI login test.

If your pod has multiple containers, and you do not want to configure all containers with federated tokens, use the annotation azure.workload.identity/skip-containers at the pod level.

Configure a container in a pod with a federated token

We can now run a container to verify if the configuration works. The deployment below deploys an Azure CLI container. I use the latest tag which, at the time of this writing, resulted in Azure CLI version 2.32.0. Make sure you use 2.30.0 or higher. That version integrates the Microsoft Authentication Library (MSAL) as the underlying authentication library and supports logging in with a federated token.

Here is the deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: azcli-deployment
  labels:
    app: azcli
spec:
  replicas: 1
  selector:
    matchLabels:
      app: azcli
  template:
    metadata:
      labels:
        app: azcli
    spec:
      serviceAccount: fed-sa
      containers:
        - name: azcli
          image: mcr.microsoft.com/azure-cli:latest
          command:
            - "/bin/bash"
            - "-c"
            - "sleep infinity"

There is nothing special about this deployment. Instead of using the default service account, this pod is configured with the fed-sa service account. This is a normal Kubernetes service account. Because the service account has the label azure.workload.identity/use: "true", the containers in the pod are modified by the webhook for token federation. The webhook adds several environment variables and mounts a volume based on a secret that contains the federation token. This is similar to, and in addition to, the token that is already mounted to access the Kubernetes API from the pod.

Here are the environment variables:

  • AZURE_AUTHORITY_HOST=https://login.microsoftonline.com/
  • AZURE_CLIENT_ID=client-id from service account annotation
  • AZURE_TENANT_ID=tenant-id from service account annotation or default from webhook
  • AZURE_FEDERATED_TOKEN_FILE=/var/run/secrets/tokens/azure-identity-token

The AZURE_FEDERATED_TOKEN_FILE contains the path to the file that contains the token (JWT) that will be presented to AAD by your application. In our case, we will configure the Azure CLI to use this token. You can get a shell to the container and cat the token:
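
For example, reusing the app=azcli label from the deployment above, you could run:

POD_NAME=$(kubectl get pods -l "app=azcli" -o jsonpath="{.items[0].metadata.name}")
kubectl exec -it $POD_NAME -- bash -c 'cat $AZURE_FEDERATED_TOKEN_FILE'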

The token (a JWT) in the token file

You can paste this token into the https://jwt.io debugger and see its content:

Token in jwt.io debugger

The token contains the issuer URL and the sub field contains a reference to the namespace and service account that we configured in the AAD app registration. Make sure there is a match!

Now we can use the Azure CLI (version >= 2.30.0) to log in using this token. Get a shell to the container and use the following command (--debug will give a lot of output):

az login --federated-token "$(cat $AZURE_FEDERATED_TOKEN_FILE)" --debug \
--service-principal -u $AZURE_CLIENT_ID -t $AZURE_TENANT_ID

We do not need to specify a password or certificate because the federated token will be used. Near the end of the output, you will see something like:

{
    "cloudName": "AzureCloud",
    "homeTenantId": "YOURTENANTID",
    "id": "...",
    "isDefault": true,
    "managedByTenants": [],
    "name": "subscription id",
    "state": "Enabled",
    "tenantId": "...",
    "user": {
      "name": "AADAPPID",
      "type": "servicePrincipal"
    }
  }

The above output shows that the user you are logged on with is the service principal associated with the app id. Let’s see if I can list AKS clusters:

Yep, I can list AKS clusters (and even create new ones 😉)

If you are interested in developer-oriented examples, check out the Azure AD Workload Identity documentation.

Conclusion

Workload Identity Overview

Azure AD workload identity for Kubernetes is relatively easy to configure. The diagram above summarizes all the bits and pieces you need: AKS OIDC config, the webhook (to configure containers in pods), and the AAD app.

An operator can easily use the Azure CLI to verify the configuration is correct. At the time of this writing, you have to create and manage an application registration. That will change once managed identities are supported.

Compared to pod-managed identities for AKS, the architecture is cleaner. On top of that, this feature works with other Kubernetes distributions as well, giving you the same technique to access AAD-protected resources. I am looking forward to seeing this evolve and becoming GA so customers can deploy this with confidence.

Taking Azure Container Apps for a spin

At Ignite November 2021, Microsoft released Azure Container Apps as a public preview. It allows you to run containerized applications on a serverless platform, in the sense that you do not have to worry about the underlying infrastructure.

The underlying infrastructure is Kubernetes (AKS) as the control plane with additional software such as:

  • Dapr: distributed application runtime to easily work with state, pub/sub and other Dapr building blocks
  • KEDA: Kubernetes event-driven autoscaler so you can use any KEDA supported scaler, in addition to scaling based on HTTP traffic, CPU and memory
  • Envoy: used to provide ingress functionality and traffic splitting for blue-green deployment, A/B testing, etc…

Your apps actually run on Azure Container Instances (ACI). ACI was always meant to be used as raw compute to build platforms with and this is a great use case.

Note: there is some discussion in the community whether ACI (via AKS virtual nodes) is used or not; I will leave it in for now but in the end, it does not matter too much as the service is meant to hide this complexity anyway

Azure Container Apps does not care about the runtime or programming model you use. Just use whatever feels most comfortable and package it as a container image.

In this post, we will deploy an application that uses Dapr to save state to Cosmos DB. Along the way, we will explain most of the concepts you need to understand to use Azure Container Apps in your own scenarios. The code I am using is on GitHub and written in Go.

Configure the Azure CLI

In this post, we will use the Azure CLI exclusively to perform all the steps. Instead of the Azure CLI, you can also use ARM templates or Bicep. If you want to play with a sample that deploys multiple container apps and uses Bicep, be sure to check out this great Azure sample.

You will need to have the Azure CLI installed and also add the Container Apps extension:

az extension add \
  --source https://workerappscliextension.blob.core.windows.net/azure-cli-extension/containerapp-0.2.0-py2.py3-none-any.whl

The extension allows you to use commands like az containerapp create and az containerapp update.

Create an environment

An environment runs one or more container apps. A container app can run multiple containers and can have revisions. If you know how Kubernetes works, each revision of a container app is actually a scaled collection of Kubernetes pods, using the scalers discussed above. Each revision can be thought of as a separate Kubernetes Deployment/ReplicaSet that runs a specific version of your app. Whenever you modify your app, depending on the type of modification, you get a new revision. You can have multiple active revisions and set traffic weights to distribute traffic as you wish.

Container apps, revisions, pods, and containers

Note that above, although you see multiple containers in a pod in a revision, that is not the most common use case. Most of the time, a pod will have only one application container. That is entirely up to you and the rationale behind using one or more containers is similar to multi-container pods in Kubernetes.

To create an environment, be sure to register or re-register the Microsoft.Web provider. That provider has the kubeEnvironments resource type, which represents a Container App environment.

az provider register --namespace Microsoft.Web

Next, create a resource group:

az group create --name rg-dapr --location northeurope

I have chosen North Europe here, but the location of the resource group does not really matter. What does matter is that you create the environment in either North Europe or Canada Central at this point in time (November 2021).

Every environment needs to be associated with a Log Analytics workspace. You can use that workspace later to view the logs of your container apps. Let’s create such a workspace in the resource group we just created:

az monitor log-analytics workspace create \
  --resource-group rg-dapr \
  --workspace-name dapr-logs

Next, we want to retrieve the workspace client id and secret. We will need that when we create the Container Apps environment. Commands below expect the use of bash:

LOG_ANALYTICS_WORKSPACE_CLIENT_ID=`az monitor log-analytics workspace show --query customerId -g rg-dapr -n dapr-logs --out tsv`
LOG_ANALYTICS_WORKSPACE_CLIENT_SECRET=`az monitor log-analytics workspace get-shared-keys --query primarySharedKey -g rg-dapr -n dapr-logs --out tsv`

Now we can create the environment in North Europe:

az containerapp env create \
  --name dapr-ca \
  --resource-group rg-dapr \
  --logs-workspace-id $LOG_ANALYTICS_WORKSPACE_CLIENT_ID \
  --logs-workspace-key $LOG_ANALYTICS_WORKSPACE_CLIENT_SECRET \
  --location northeurope

The Container App environment shows up in the portal like so:

Container App Environment in the portal

There is not a lot you can do in the portal, besides listing the apps in the environment. Provisioning an environment is extremely quick, in my case a matter of seconds.

Deploying Cosmos DB

We will deploy a container app that uses Dapr to write key/value pairs to Cosmos DB. Let’s deploy Cosmos DB:

uniqueId=$RANDOM
az cosmosdb create \
  --name dapr-cosmosdb-$uniqueId \
  --resource-group rg-dapr \
  --locations regionName='northeurope'

az cosmosdb sql database create \
    -a dapr-cosmosdb-$uniqueId \
    -g rg-dapr \
    -n dapr-db

az cosmosdb sql container create \
    -a dapr-cosmosdb-$uniqueId \
    -g rg-dapr \
    -d dapr-db \
    -n statestore \
    -p '/partitionKey' \
    --throughput 400

The above commands create the following resources:

  • A Cosmos DB account in North Europe: note that this uses session-level consistency (remember that for later in this post 😉)
  • A Cosmos DB database that uses the SQL API
  • A Cosmos DB container in that database, called statestore (can be anything you want)

In Cosmos DB Data Explorer, you should see:

statestore collection will be used as a State Store in Dapr

Deploying the Container App

We can use the following command to deploy the container app and enable Dapr on it:

az containerapp create \
  --name daprstate \
  --resource-group rg-dapr \
  --environment dapr-ca \
  --image gbaeke/dapr-state:1.0.0 \
  --min-replicas 1 \
  --max-replicas 1 \
  --enable-dapr \
  --dapr-app-id daprstate \
  --dapr-components ./components-cosmosdb.yaml \
  --target-port 8080 \
  --ingress external

Let’s unpack what happens when you run the above command:

  • A container app daprstate is created in environment dapr-ca
  • The container app will have an initial revision (revision 1) that runs one container in its pod; the container uses image gbaeke/dapr-state:1.0.0
  • We turn off scaling by setting min and max replicas to 1
  • We enable ingress with the type set to external. That configures a public IP address and DNS name to reach our container app on the Internet; Envoy proxy is used under the hood to achieve this; TLS is automatically configured but we do need to tell the proxy the port our app listens on (--target-port 8080)
  • Dapr is enabled and requires that our app gets a Dapr id (--enable-dapr and --dapr-app-id daprstate)

Because this app uses the Dapr SDK to write key/value pairs to a state store, we need to configure this. That is where the --dapr-components parameter comes in. The component is actually defined in a file components-cosmosdb.yaml:

- name: statestore
  type: state.azure.cosmosdb
  version: v1
  metadata:
    - name: url
      value: YOURURL
    - name: masterkey
      value: YOURMASTERKEY
    - name: database
      value: YOURDB
    - name: collection
      value: YOURCOLLECTION

In the file, the name of our state store is statestore but you can choose any name. The type has to be state.azure.cosmosdb which requires the use of several metadata fields to specify the URL to your Cosmos DB account, the key to authenticate, the database, and collection.

In the Go code, the name of the state store is configurable via environment variables or arguments and, by total coincidence, defaults to statestore 😉.

func main() {
	fmt.Printf("Welcome to super api\n\n")

	// flags
	... code omitted for brevity
	// State store name
	f.String("statestore", "statestore", "State store name")

The flag is used in the code that writes to Cosmos DB with the Dapr SDK (s.config.Statestore in the call to daprClient.SaveState below):

// write data to Dapr statestore
	ctx := r.Context()
	if err := s.daprClient.SaveState(ctx, s.config.Statestore, state.Key, []byte(state.Data)); err != nil {
		w.WriteHeader(http.StatusInternalServerError)
		fmt.Fprintf(w, "Error writing to statestore: %v\n", err)
		return
	} else {
		w.WriteHeader(http.StatusOK)
		fmt.Fprintf(w, "Successfully wrote to statestore\n")
	}

After running the az containerapp create command, you should see the following output (redacted):

{
  "configuration": {
    "activeRevisionsMode": "Multiple",
    "ingress": {
      "allowInsecure": false,
      "external": true,
      "fqdn": "daprstate.politegrass-37c1a51f.northeurope.azurecontainerapps.io",
      "targetPort": 8080,
      "traffic": [
        {
          "latestRevision": true,
          "revisionName": null,
          "weight": 100
        }
      ],
      "transport": "Auto"
    },
    "registries": null,
    "secrets": null
  },
  "id": "/subscriptions/SUBID/resourceGroups/rg-dapr/providers/Microsoft.Web/containerApps/daprstate",
  "kind": null,
  "kubeEnvironmentId": "/subscriptions/SUBID/resourceGroups/rg-dapr/providers/Microsoft.Web/kubeEnvironments/dapr-ca",
  "latestRevisionFqdn": "daprstate--6sbsmip.politegrass-37c1a51f.northeurope.azurecontainerapps.io",
  "latestRevisionName": "daprstate--6sbsmip",
  "location": "North Europe",
  "name": "daprstate",
  "provisioningState": "Succeeded",
  "resourceGroup": "rg-dapr",
  "tags": null,
  "template": {
    "containers": [
      {
        "args": null,
        "command": null,
        "env": null,
        "image": "gbaeke/dapr-state:1.0.0",
        "name": "daprstate",
        "resources": {
          "cpu": 0.5,
          "memory": "1Gi"
        }
      }
    ],
    "dapr": {
      "appId": "daprstate",
      "appPort": null,
      "components": [
        {
          "metadata": [
            {
              "name": "url",
              "secretRef": "",
              "value": "https://ACCOUNTNAME.documents.azure.com:443/"
            },
            {
              "name": "masterkey",
              "secretRef": "",
              "value": "MASTERKEY"
            },
            {
              "name": "database",
              "secretRef": "",
              "value": "dapr-db"
            },
            {
              "name": "collection",
              "secretRef": "",
              "value": "statestore"
            }
          ],
          "name": "statestore",
          "type": "state.azure.cosmosdb",
          "version": "v1"
        }
      ],
      "enabled": true
    },
    "revisionSuffix": "",
    "scale": {
      "maxReplicas": 1,
      "minReplicas": 1,
      "rules": null
    }
  },
  "type": "Microsoft.Web/containerApps"
}

The output above gives you a hint on how to define the Container App in an ARM template. Note the template section. It defines the containers that are part of this app. We have only one container with default resource allocations. It is possible to set environment variables for your containers but there are none in this case. We will set one later.

Also note the dapr section. It defines the app’s Dapr id and the components it can use.

Note: it is not a good practice to enter secrets in configuration files as we did above. To fix that:

  • add a secret to the Container App in the az containerapp create command via the --secrets flag. E.g. --secrets cosmosdb='YOURCOSMOSDBKEY'
  • in components-cosmosdb.yaml, replace value: YOURMASTERKEY with secretRef: cosmosdb

The URL for the app is https://daprstate.politegrass-37c1a51f.northeurope.azurecontainerapps.io. When I browse to it, I just get a welcome message: Hello from Super API on Container Apps.

Every revision also gets a URL. The revision URL is https://daprstate--6sbsmip.politegrass-37c1a51f.northeurope.azurecontainerapps.io. Of course, this revision URL gives the same result. Our app has only one revision.

Save state

The application has a /state endpoint you can post a JSON payload to in the form of:

{
  "key": "keyname",
  "data": "datatostoreinkey"
}

We can use curl to try this:

curl -v -H "Content-type: application/json" -d '{ "key": "cool","data": "somedata"}' 'https://daprstate.politegrass-37c1a51f.northeurope.azurecontainerapps.io/state'

Trying the curl command will result in an error because Dapr wants to use strong consistency with Cosmos DB and we configured it for session-level consistency. That is not very relevant for now as that is related to Dapr and not Container Apps. Switching the Cosmos DB account to strong consistency will fix the error.
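
If you want to fix it right away, a command like the one below should switch the account to strong consistency; note that this changes the default consistency level of the entire Cosmos DB account:

az cosmosdb update \
  --name dapr-cosmosdb-$uniqueId \
  --resource-group rg-dapr \
  --default-consistency-level Strong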

Update the container app

Let’s see what happens when we update the container app. We will add an environment variable WELCOME to change the welcome message that the app displays. Run the following command:

az containerapp update \
  --name daprstate \
  --resource-group rg-dapr \
  --environment-variables WELCOME='Hello from new revision'

The template section in the JSON output is now:

"template": {
    "containers": [
      {
        "args": null,
        "command": null,
        "env": [
          {
            "name": "WELCOME",
            "secretRef": null,
            "value": "Hello from new revision"
          }
        ],
        "image": "gbaeke/dapr-state:1.0.0",
        "name": "daprstate",
        "resources": {
          "cpu": 0.5,
          "memory": "1Gi"
        }
      }
    ]

It is important to realize that, when the template changes, a new revision will be created. We now have two revisions, reflected in the portal as below:

Container App with two revisions

The new revision is active and receives 100% of the traffic. When we hit the / endpoint, we get Hello from new revision.

The idea here is that you deploy a new revision and test it before you make it active. Another option is to send a small part of the traffic to the new revision and see how that goes. It’s not entirely clear to me how you can automate this, including automated tests, similar to how progressive delivery controllers like Argo Rollouts and Flagger work. Tip to the team to include this! 😉

The az containerapp create and update commands can take a lot of parameters. Use az containerapp update --help to check what is supported. You will also see several examples.

Check the logs

Let’s check the container app logs that are sent to the Log Analytics workspace attached to the Container App environment. Make sure you still have the log analytics id in $LOG_ANALYTICS_WORKSPACE_CLIENT_ID:

az monitor log-analytics query \
  --workspace $LOG_ANALYTICS_WORKSPACE_CLIENT_ID \
  --analytics-query "ContainerAppConsoleLogs_CL | where ContainerAppName_s == 'daprstate' | project ContainerAppName_s, Log_s, TimeGenerated | take 50" \
  --out table

This will display both logs from the application container and the Dapr logs. One of the log entries shows that the statestore was successfully initialized:

... msg="component loaded. name: statestore, type: state.azure.cosmosdb/v1"

Conclusion

We have only scratched the surface here but I hope this post gave you some insights into concepts such as environments, container apps, revisions, ingress, the use of Dapr and logging. There is much more to look at such as virtual network integration, setting up scale rules (e.g. KEDA), automated deployments, and much more… Stay tuned!

DNS Options for Private Azure Kubernetes Service

When you deploy Azure Kubernetes Service (AKS), the API server is made publicly available by default. That means it has a public IP address and an Azure-assigned name that’s resolvable by public DNS servers. To secure access, you can use authorized IP ranges.

As an alternative, you can deploy a private AKS cluster. That means the AKS API server gets an IP address in a private Azure virtual network. Most customers I work with use this option to comply with security policies. When you deploy a private AKS cluster, you still need a fully qualified domain name (FQDN) that resolves to the private IP address. There are several options you can use:

  • System (the default option): AKS creates a Private DNS Zone in the Node Resource Group; any virtual network that is linked to that Private DNS Zone can resolve the name; the virtual network used by AKS is automatically linked to the Private DNS Zone
  • None: default to public DNS; AKS creates a name for your cluster in a public DNS zone that resolves to the private IP address
  • Custom Private DNS Zone: AKS uses a Private DNS Zone that you or another team has created beforehand; this is mostly used in enterprise scenarios when the Private DNS Zones are integrated with custom DNS servers (e.g., on AD domain controllers, Infoblox, …)

The first two options, System and None, are discussed in the video below:

Overview of the 3 DNS options with a discussion of the first two: System and None

The third option, custom Private DNS Zone, is discussed in a separate video:

Private AKS with a custom Private DNS Zone

With the custom DNS option, you cannot use any name you like. The Private DNS Zone has to be like: privatelink.<region>.azmk8s.io. For instance, if you deploy your AKS cluster in West Europe, the Private DNS Zone’s name should be privatelink.westeurope.azmk8s.io. There is an option to use a subdomain as well.

When you use the custom DNS option, you also need to use a user-assigned Managed Identity for the AKS control plane. To make the registration of the A record in the Private DNS Zone work, in addition to linking the Private DNS Zone to the virtual network, the managed identity needs the following roles (at least):

  • Private DNS Zone Contributor role on the Private DNS Zone
  • Network Contributor role on the virtual network used by AKS
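
As a sketch, the two role assignments could be created like below. RGNAME and IDENTITYNAME are placeholders for the resource group and name of the user-assigned managed identity, and the scopes are the resource IDs of the Private DNS Zone and the virtual network:

IDENTITY_PRINCIPAL_ID=$(az identity show -g RGNAME -n IDENTITYNAME --query principalId -o tsv)

az role assignment create --assignee-object-id $IDENTITY_PRINCIPAL_ID --role "Private DNS Zone Contributor" --scope "resourceId of Private DNS Zone"

az role assignment create --assignee-object-id $IDENTITY_PRINCIPAL_ID --role "Network Contributor" --scope "resourceId of AKS virtual network"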

To deploy a private AKS cluster with a custom Private DNS Zone, you can use the following Azure CLI command which also sets the network plugin to azure (as an example). Private cluster also works with kubenet if you prefer that model. For other examples, see Create a private Azure Kubernetes Service cluster – Azure Kubernetes Service | Microsoft Docs.

az aks create \
    --resource-group RGNAME \
    --name aks-private \
    --network-plugin azure \
    --vnet-subnet-id "resourceId of AKS subnet" \
    --docker-bridge-address 172.17.0.1/16 \
    --dns-service-ip 10.3.0.10 \
    --service-cidr 10.3.0.0/24 \
    --enable-managed-identity \
    --assign-identity "resourceId of user-assigned managed identity" \
    --enable-private-cluster \
    --load-balancer-sku standard \
    --private-dns-zone "resourceId of Private DNS Zone"

The option that is easiest to use is the None option. You do not have to worry about Private DNS Zones and it just works. That option, at the time of this writing (June 2021), is still in preview and needs to be enabled on your subscription. In most cases, though, I see enterprises go for the third option, where the Private DNS Zones are created beforehand and integrated with custom DNS.

Approving a private endpoint connection with Azure CLI

In my previous post, I wrote about App Services with Private Link and used Azure Front Door to publish the web app. Azure Front Door Premium (in preview) can create a Private Endpoint and link it to your web app via Azure Private Link. When that happens, you need to approve the pending connection in Private Link Center.

The pending connection would be shown here, ready for approval

Although this is easy to do, you might want to automate this approval. Automation is possible via a REST API but it is easier via Azure CLI.

To do so, first list the private endpoint connections of your resource, in my case that is a web app:

az network private-endpoint-connection list --id /subscriptions/SUBID/resourceGroups/RGNAME/providers/Microsoft.Web/sites/APPSERVICENAME

The above command will return all private endpoint connections of the resource. For each connection, you get the following information:

 {
    "id": "PE CONNECTION ID",
    "location": "East US",
    "name": "NAME",
    "properties": {
      "ipAddresses": [],
      "privateEndpoint": {
        "id": "PE ID",
        "resourceGroup": "RESOURCE GROUP NAME OF PE"
      },
      "privateLinkServiceConnectionState": {
        "actionsRequired": "None",
        "description": "Please approve this connection.",
        "status": "Pending"
      },
      "provisioningState": "Pending"
    },
    "resourceGroup": "RESOURCE GROUP NAME OF YOUR RESOURCE",
    "type": "YOUR RESOURCE TYPE"
  }

To approve the above connection, use the following command:

az network private-endpoint-connection approve --id  PE CONNECTION ID --description "Approved"

The --id in the approve command refers to the private endpoint connection ID, which looks like below for a web app:

/subscriptions/YOUR SUB ID/resourceGroups/YOUR RESOURCE GROUP/providers/Microsoft.Web/sites/YOUR APP SERVICE NAME/privateEndpointConnections/YOUR PRIVATE ENDPOINT CONNECTION NAME

After running the above command, the connection should show as approved:

Approved private endpoint connection

When you automate this in a pipeline, you can first list the private endpoint connections of your resource and filter on provisioningState="Pending" to find the ones you need to approve.
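
A rough sketch of that loop is shown below. The JMESPath filter matches the JSON shape shown earlier; depending on your CLI version the nesting may differ, so adjust the query if needed:

RESOURCE_ID=/subscriptions/SUBID/resourceGroups/RGNAME/providers/Microsoft.Web/sites/APPSERVICENAME

for PEC_ID in $(az network private-endpoint-connection list --id $RESOURCE_ID --query "[?properties.provisioningState=='Pending'].id" -o tsv); do
  az network private-endpoint-connection approve --id $PEC_ID --description "Approved by pipeline"
done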

Hope it helps!

Azure App Services with Private Link

In one of my videos on my YouTube channel, I discuss Azure App Services with Private Link. The video describes how it works and provides an example of deploying the infrastructure with Bicep. The Bicep templates are on GitHub.

If you want to jump straight to the video, here it is:

In the rest of this blog post, I provide some more background information on the different pieces of the solution.

Azure App Service

Azure App Service is a great way to host web applications and APIs on Azure. It’s PaaS (platform as a service), so you do not have to deal with the underlying Windows or Linux servers; they are managed by the platform. I often see AKS (Azure Kubernetes Service) implementations that host just a couple of web APIs and web apps. In most cases, that is overkill: you still have to deal with Kubernetes upgrades, node patching or image replacements, draining and rebooting the nodes, etc… And that is before you even consider controlling ingress and egress traffic. Even if you standardize on packaging your app in a container, Azure App Service will gladly accept the container and serve it for you.

By default, Azure App Service gives you a public IP address and FQDN (Fully Qualified Domain Name) to reach your app securely over the Internet. The default name ends with azurewebsites.net but you can easily add custom domains and certificates.

Things get a bit more complicated when you want a private IP address for your app, reachable from Azure virtual networks and on-premises networks. One solution is to use an App Service Environment. It provides a fully isolated and dedicated environment to run App Service apps such as web apps and APIs, Docker containers and Functions. You can create an internal ASE which results in an Internal Load Balancer in front of your apps that is configured in a subnet of your choice. There is no need to configure Private Endpoints to make use of Private Link. This is often called native virtual network integration.

At the network level, an App Service Environment v2 works as follows:

External ASE
ASE networking (from Microsoft website)

Looking at the diagram above, an ILB ASE (and an External ASE as well) makes it easy to connect to back-end systems such as on-premises databases. The outbound connection to internal resources originates from an IP in the chosen integration subnet.

The downside to ASE is that its isolated instances (I1, I2, I3) are rather expensive. It also takes a long time to provision an ASE, but that is less of an issue. In reality though, I would like to see App Service Environments go away and be replaced by “regular” App Services with toggles that give you the options you require. In any case, native virtual network integration should not depend on dedicated or shared compute. One can only dream, right? 😉

Note: App Service Environment v3, in preview at the time of this writing, provides a simplified deployment experience and also costs less. See App Service Environment v3 public preview – Azure App Service

As an alternative to an ASE for a private app, consider a non-ASE App Service that, in production, uses Premium V2 or V3 instances. The question then becomes: “How do you get a private IP address?” That’s where Private Link comes in…

Azure Private Link with App Service

Azure Private Link provides connectivity to Azure services (such as App Service) via a Private Endpoint. The Private Endpoint creates a virtual network interface card (NIC) in a subnet of your choice. Connections to the NIC's IP address end up at the Private Link service the Private Endpoint is connected to. Below is an example with Azure SQL Database where one Private Endpoint is mapped, via Azure Private Link, to one database. The other databases are not reachable via that endpoint.

Private Endpoint connected to Azure SQL Database (PaaS) via Private Link (source: Microsoft website)

To create a regular App Service that is accessible via a private IP, we can do the same thing:

  • create a private endpoint in the subnet of your choice
  • connect the private endpoint to your App Service using Private Link

Both actions can be performed at the same time from the portal. In the Networking section of your App Service, click Configure your private endpoint connections. You will see the following screen:

Private Endpoint connection of App Service

Now click Add to create the Private Endpoint:

Creating the private endpoint

The above creates the private endpoint in the default subnet of the selected VNET. When the creation is finished, the private endpoint will be connected to App Service and automatically approved. There are scenarios, such as connecting private endpoints from other tenants, that require you to approve the connection first:

Automatically approved connection

When you click on the private endpoint, you will see the subnet and NIC that was created:

Private Endpoint

From the above, you can click the link to the network interface (NIC):

Network interface created by the private endpoint

Note that when you delete the Private Endpoint, the interface gets deleted as well.

Great! Now we have an IP address that we can use to reach the App Service. If you use the default name of the web app, in my case https://web-geba.azurewebsites.net, you will get:

Oops, no access on the public name (resolves to public IP)

Indeed, when you enable Private Link on App Service, you cannot access the website using its public IP. To solve this, you will need to do something at the DNS level. For the default domain, azurewebsites.net, it is recommended to use Azure Private DNS. During the creation of my Private Endpoint, I turned on that feature which resulted in:

Private DNS Zone for privatelink.azurewebsites.net

You might wonder why this is a private DNS zone for privatelink.azurewebsites.net. From the moment you enable Private Link on your web app, Microsoft modifies the response to the DNS query for the public name of your app. For example, if the app is web-geba.azurewebsites.net and you query DNS for that name, it will respond with a CNAME of web-geba.privatelink.azurewebsites.net. If that name cannot be resolved, you will still get the public IP, but connecting to it will result in a 403.
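As a quick illustration, here is a sketch with nslookup, using my app name web-geba (substitute your own):

# from outside the VNet: the answer is a CNAME to web-geba.privatelink.azurewebsites.net
# that still resolves to a public IP; browsing to it gives a 403
nslookup web-geba.azurewebsites.net

# from a VM in a VNet linked to the privatelink.azurewebsites.net Private DNS zone:
# the same CNAME now resolves to the private endpoint IP (10.240.0.4 in my case)
nslookup web-geba.azurewebsites.net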

In my case, as long as the DNS servers I use can resolve web-geba.privatelink.azurewebsites.net and I can connect to 10.240.0.4, I am good to go. Note however that the DNS story, including Private DNS and your own DNS servers, is a bit more complex than just checking a box! That is not the focus of this blog post though, so moving on… 😉

Note: you still need to connect to the website using https://web-geba.azurewebsites.net in your browser

Outbound connections to internal resources

One of the features of App Service Environments is the ability to connect to back-end systems in Azure VNets or on-premises. That is the result of native VNet integration.

When you enable Private Link on a regular App Service, you do not get that. Private Link only enables private inbound connectivity but does nothing for outbound. You will need to configure something else to make outbound connections from the Web App to resources such as internal SQL Servers work.

In the network configuration of your App Service, there is another option for outbound connectivity to internal resources: VNet integration.

VNET Integration

In the Networking section of App Service, find the VNet integration section and click Click here to configure. From there, you can add a VNet to integrate with. You will need to select a subnet in that VNet for this integration to work:

Outbound connectivity for App Service to Azure VNets

There is quite a lot to know about VNet integration for App Service, so be sure to check the docs.
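If you prefer the CLI over the portal, a sketch of the same VNet integration looks like this (resource names are placeholders):

az webapp vnet-integration add --resource-group RGNAME --name APPSERVICENAME \
  --vnet VNETNAME --subnet INTEGRATIONSUBNET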

Private Link with Azure Front Door

Often, a web app is made private because you want to put a Web Application Firewall (WAF) in front of it. Typically, that goal is achieved by putting Azure Application Gateway (AG) with WAF in front of an internal App Service Environment. As an alternative to AG, you can also use virtual appliances such as the Barracuda WAF for Azure. This works because the App Service Environment is a first-class citizen of your Azure virtual network.

There are multiple ways to put a WAF in front of a (non-ASE) App Service. You can use Front Door with the App Service as the origin, as long as you restrict direct access to the origin. To that end, App Services support access restrictions.
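As a hedged sketch, locking the origin down to Front Door could look like this with the CLI (names are placeholders; double-check the parameters against the current CLI reference):

# allow only traffic that comes in via the Azure Front Door backends
az webapp config access-restriction add --resource-group RGNAME --name APPSERVICENAME \
  --rule-name FrontDoorOnly --action Allow --service-tag AzureFrontDoor.Backend --priority 100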

With Azure Front Door Premium, in preview at the time of this writing (June 2021), you can use Private Link as well. In that case, Azure Front Door creates a private endpoint. You cannot control or see that private endpoint because it is managed by Front Door. Because the private endpoint is not in your tenant, you will need to approve the connection from the private endpoint to your App Service. You can do that in multiple ways. One way is Private Link Center Pending Connections:

Pending Connections

If you check the video at the top of this post, you can see this step in action.

Conclusion

The combination of Azure networking with App Services Environments (ASE) and “regular” App Services (non-ASE) can be pretty confusing. You have native network integration for ASE, private access with private link and private endpoints for non-ASE, private DNS for private link domains, virtual network service endpoints, VNet outbound configuration for non-ASE etc… Most of the time, when I am asked for the easiest and most cost-effective option for a private web app in PaaS, I go for a regular non-ASE App Service and use Private Link to make the app accessible from the internal network.

Azure Policy for Kubernetes: Constraints and ConstraintTemplates

In one of my videos on my YouTube channel, I talked about Kubernetes authentication and used the image below:

Securing access to the Kubernetes API Server

To secure access to the Kubernetes API server, you need to be authenticated and properly authorized to do what you need to do. The third mechanism to secure access is admission control. Simply put, admission control allows you to inspect requests to the API server and accept or deny the request based on rules you set. You will need an admission controller, which is just code that intercepts the request after authentication and authorization.

There is a list of admission controllers that are compiled in, with two special ones (check the docs):

  • MutatingAdmissionWebhook
  • ValidatingAdmissionWebhook

With the two admission controllers above, you can develop admission plugins as extensions and configure them at runtime. In this post, we will look at a ValidatingAdmissionWebhook that is used together with Azure Policy to inspect requests to the AKS API Server and either deny or audit these requests.

Note that I already have a post about Azure Policy and pod security policies here. There is some overlap between that post and this one. In this post, we will look more closely at what happens on the cluster.

Want a video instead?

Azure Policy

Azure has its own policy engine to control the Azure Resource Manager (ARM) requests you can make. A common rule in many organizations, for instance, is prohibiting the creation of expensive resources or the creation of resources in unapproved regions. For example, a European company might want to create resources only in West Europe or North Europe. Azure Policy is the engine that can enforce such a rule. For more information, see Overview of Azure Policy. In short, you select from an ever-growing list of policies or you create your own. Policies can be grouped into policy initiatives. A single policy or an initiative gets assigned to a scope, which can be a management group, a subscription or a resource group. In the portal, you then check for compliance:

Compliancy? What do I care? It’s just my personal subscription 😁

Besides checking for compliance, you can deny the requests in real time. There are also policies that can create resources when they are missing.

Azure Policy for Kubernetes

Although Azure Policy works great with Azure Resource Manager (ARM), which is basically the API that allows you to interact with Azure resources, it does not work with Kubernetes out of the box. We will need an admission controller (see above) that understands how to interpret Kubernetes API requests in addition to another component that can sync policies in Azure Policy to Kubernetes for the admission controller to pick up. There is a built-in list of supported Kubernetes policies.

For the admission controller, Microsoft uses Gatekeeper v3. There is a lot, and I do mean a LOT, to say about Gatekeeper and its history. We will not go down that path here. Check out this post for more information if you are truly curious. For us it’s enough to know that Gatekeeper v3 needs to be installed on AKS. In order to do that, we can use an AKS add-on. In fact, you should use the add-on if you want to work with Azure Policy. Installing Gatekeeper v3 on its own will not work.

Note: there are ways to configure Azure Policy to work with Azure Arc for Kubernetes and AKS Engine. In this post, we only focus on the managed Azure Kubernetes Service (AKS)

So how do we install the add-on? It is very easy to do with the portal or the Azure CLI. For all details, check out the docs. With the Azure CLI, it is as simple as:

az aks enable-addons --addons azure-policy --name CLUSTERNAME --resource-group RESOURCEGROUP

If you want to do it from an ARM template, just add the add-on to the template as shown here.
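To verify that the add-on is enabled, something like the command below should do; the azurepolicy key under addonProfiles is an assumption on my part, so adjust if your output differs:

az aks show --name CLUSTERNAME --resource-group RESOURCEGROUP \
  --query addonProfiles.azurepolicy.enabled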

What happens after installing the add-on?

I installed the add-on without active policies. In kube-system, you will find the two pods below:

azure-policy and azure-policy-webhook

The above pods are part of the add-on. I am not entirely sure what the azure-policy-webhook does, but the azure-policy pod is responsible for checking Azure Policy for new assignments and translating that to resources that Gatekeeper v3 understands (hint: constraints). It also checks policies on the cluster and reports results back to Azure Policy. In the logs, you will see things like:

  • No audit results found
  • Schedule running
  • Creating constraint

The last line creates a constraint, but what exactly is that? Constraints tell Gatekeeper v3 what to check for when a request comes to the API server. An example of a constraint is that a container should not run privileged. Constraints are backed by constraint templates that contain the schema and logic of the constraint. Good to know, but where are the Gatekeeper v3 pods?

Gatekeeper pods in the gatekeeper-system namespace

Gatekeeper was automatically installed by the Azure Policy add-on and will work with the constraints created by the add-on, synced from Azure Policy. When you remove these pods, the add-on will install them again.
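A quick way to check both sets of pods from the command line:

# pods installed by the Azure Policy add-on
kubectl get pods -n kube-system | grep azure-policy

# Gatekeeper v3 pods managed by the add-on
kubectl get pods -n gatekeeper-system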

Creating a policy

Although you normally create policy initiatives, we will create a single policy and see what happens on the cluster. In Azure Policy, choose Assign Policy and scope the policy to the resource group of your cluster. In Policy definition, select Kubernetes cluster should not allow privileged containers. As discussed, that is one of the built-in policies:

Creating a policy that does not allow privileged containers

In the next step, set the effect to deny. This will deny requests in real time. Note that the three namespaces in Namespace exclusions are automatically added. You can add extra namespaces there. You can also specifically target a policy to one or more namespaces or even use a label selector.

Policy parameters

You can now select Review and create and then select Create to create the policy assignment. This is the result:

Policy assigned
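As an aside, the same assignment can be scripted. The sketch below is not from the original setup; the display name lookup and the effect parameter are assumptions, so verify them against the built-in definition before relying on this:

# find the built-in definition by display name
POLICY_ID=$(az policy definition list \
  --query "[?displayName=='Kubernetes cluster should not allow privileged containers'].id" -o tsv)

# assign it to the resource group of the cluster with the deny effect
az policy assignment create --name k8s-no-privileged \
  --policy $POLICY_ID \
  --scope /subscriptions/SUBSCRIPTIONID/resourceGroups/RESOURCEGROUP \
  --params '{"effect": {"value": "deny"}}'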

Now we have to wait a while for the change to be picked up by the add-on on the cluster. This can take several minutes. After a while, you will see the following log entry in the azure-policy pod:

Creating constraint: azurepolicy-container-no-privilege-blablabla

You can see the constraint when you run k get constraints (k is an alias for kubectl). The constraint is based on a constraint template. You can list the templates with k get constrainttemplates. This is the result:

constraint templates

With k get constrainttemplates k8sazurecontainernoprivilege -o yaml, you will find that the template contains some logic:

the template’s logic

The block of Rego contains the logic of this template. Without knowing Rego, the policy language used by Open Policy Agent (OPA) and thus by Gatekeeper v3 on our cluster, you can actually guess that the privileged field inside securityContext is checked. If that field is true, that is a violation of the policy. Although it is useful to understand more about OPA and Rego, Azure Policy hides that complexity from you.

Does it work?

Let’s try to deploy the following deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.14.2
          ports:
            - containerPort: 80
          securityContext:
            privileged: true

After running kubectl apply -f deployment.yaml, everything seems fine. But when we run kubectl get deploy:

Pods are not coming up

Let’s run kubectl get events:

Oops…

Notice that validation.gatekeeper.sh denied the request because privileged was set to true.
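If you want to zoom in on the denial in a busy cluster, filtering the events helps (just a sketch):

# the ReplicaSet cannot create its pods; the denial shows up as FailedCreate events
kubectl get events --field-selector reason=FailedCreate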

Adding more policies

Azure Security Center comes with a large initiative, Azure Security Benchmark, that also includes many Kubernetes policies. All of these policies are set to audit for compliance. On my system, the initiative is assigned at the subscription level:

Azure Security Benchmark assigned at subscription level with name Security Center

The Azure Policy add-on on our cluster will pick up the Kubernetes policies and create the templates and constraints:

Several new templates created

Now we have two constraints for k8sazurecontainernoprivilege:

Two constraints: one deny and the other audit

The new constraint comes from the larger initiative. In the spec, the enforcementAction is set to dryrun (audit). Although I do not have pods that violate k8sazurecontainernoprivilege, I do have pods that violate another policy that checks for host path mapping. That is reported back by the add-on in the compliance report:

Yes, akv2k8s maps to /etc/kubernetes on the host
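You can also inspect the audit results directly on the cluster. A sketch, using the constraint we already know; the exact names will differ per cluster:

# audit (dryrun) constraints report what they found in their status
kubectl get constraints

# status.totalViolations and status.violations show the offending resources
kubectl get k8sazurecontainernoprivilege -o yaml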

Conclusion

In this post, you have seen what happens when you install the AKS policy add-on and enable a Kubernetes policy in Azure Policy. The add-on creates the constraints and constraint templates that Gatekeeper v3 understands. The Rego in a constraint template contains the logic that defines the policy. When the policy is set to deny, Gatekeeper v3, which is an admission controller, denies the request in real time. When the policy is set to audit (or dryrun at the constraint level), audit results are reported by the add-on to Azure Policy.

AKS Pod Identity with the Azure SDK for Go


In an earlier post, I wrote about the use of AKS Pod Identity (Preview) in combination with the Azure SDK for Python. Although that works fine, there are some issues with that solution:

Vulnerabilities as detected by SNYK

In order to reduce the size of the image and reduce/remove the vulnerabilities, I decided to rewrite the solution in Go. Just like the Python app (with FastAPI), we will expose an HTTP endpoint that displays all resource groups in a subscription. We will use a specific pod identity that has the Contributor role at the subscription level.

If you are more into videos, here’s the video version:

The code

The code is on GitHub @ https://github.com/gbaeke/go-msi in main.go. The code is kept as simple as possible. It uses the following packages:

github.com/Azure/azure-sdk-for-go/profiles/latest/resources/mgmt/resources
github.com/Azure/go-autorest/autorest/azure/auth

The resources package is used to create a GroupsClient to work with resource groups (check the samples):

groupsClient := resources.NewGroupsClient(subID)

subID contains the subscription ID, which is retrieved via the SUBSCRIPTION_ID environment variable. The container requires that environment variable to be set.

To authenticate to Azure and obtain proper authorization, the auth package is used with the NewAuthorizerFromEnvironment() method. That method supports several authentication mechanisms, one of which is managed identities. When we run this code on AKS, the pods can use a pod identity as explained in my previous post, if the pod identity addon is installed and configured. To obtain the authorization:

authorizer, err := auth.NewAuthorizerFromEnvironment()

authorizer is then passed to groupsClient via:

groupsClient.Authorizer = authorizer

Now we can use groupsClient to iterate through the resource groups:

ctx := context.Background()
log.Println("Getting groups list...")

// retrieve the resource groups in the subscription as a paged result
groups, err := groupsClient.ListComplete(ctx, "", nil)
if err != nil {
	log.Println("Error getting groups", err)
}

// groupList is a string slice declared elsewhere in main.go
log.Println("Enumerating groups...")
for groups.NotDone() {
	groupList = append(groupList, *groups.Value().Name)
	log.Println(*groups.Value().Name)
	// advance the iterator to the next resource group
	err := groups.NextWithContext(ctx)
	if err != nil {
		log.Println("error getting next group")
	}
}

Note that the groups are printed and added to the groupList slice. We can now serve the groupz endpoint that lists the groups (yes, the groups are only read at startup 😀):

log.Println("Serving on 8080...")
http.HandleFunc("/groupz", groupz)
http.ListenAndServe(":8080", nil)

The result of the call to /groupz is shown below:

My resource groups mess in my test subscription 😀
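If you want to try the code locally before containerizing it, NewAuthorizerFromEnvironment() can also pick up a service principal from environment variables. A sketch, with placeholder values and assuming the service principal has the same Azure role:

export AZURE_TENANT_ID=<tenant id>
export AZURE_CLIENT_ID=<service principal app id>
export AZURE_CLIENT_SECRET=<service principal secret>
export SUBSCRIPTION_ID=<subscription id>   # read by the app itself

go run main.go
curl http://localhost:8080/groupz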

Running the code in a container

We can now build a single statically linked executable with go build and package it in a scratch container. If you want to know if your executable is statically linked, run file on it (e.g. file myapp). The result should be like:

myapp: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, not stripped

Here is the multi-stage Dockerfile:

# argument for Go version
ARG GO_VERSION=1.14.5

# STAGE 1: building the executable
FROM golang:${GO_VERSION}-alpine AS build

# git required for go mod
RUN apk add --no-cache git

# certs
RUN apk --no-cache add ca-certificates

# Working directory will be created if it does not exist
WORKDIR /src

# We use go modules; copy go.mod and go.sum
COPY ./go.mod ./go.sum ./
RUN go mod download

# Import code
COPY ./ ./


# Build the statically linked executable
RUN CGO_ENABLED=0 go build \
	-installsuffix 'static' \
	-o /app .

# STAGE 2: build the container to run
FROM scratch AS final

# copy compiled app
COPY --from=build /app /app

# copy ca certs
COPY --from=build /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/

# run binary
ENTRYPOINT ["/app"]

In the above Dockerfile, it is important to add the ca certificates to the build container and later copy them to the scratch container. The code will need to connect to https://management.azure.com and requires valid root CA certificates to do so.

When you build the container with the Dockerfile, it will result in a docker image of about 8.7MB. SNYK will not report any known vulnerabilities. Great success!
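A sketch of building and checking the result locally (the image name is just an example):

# multi-stage build based on the Dockerfile above
docker build -t go-msi:local .

# the scratch-based image should be in the single-digit MB range
docker images go-msi:local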

Note: the container will still run as root though, which is bad! 😀 Nico Meisenzahl has a great post on containerizing .NET Core apps which also shows how to configure the image to not run as root.

Let’s add some YAML

The GitHub repo contains a workflow that builds and pushes a container to GitHub container registry. The most recent version at the time of this writing is 0.1.1. The YAML file to deploy this container as part of a deployment is below:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mymsi-deployment
  namespace: mymsi
  labels:
    app: mymsi
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mymsi
  template:
    metadata:
      labels:
        app: mymsi
        aadpodidbinding: mymsi
    spec:
      containers:
        - name: mymsi
          image: ghcr.io/gbaeke/go-msi:0.1.1
          env:
            - name: SUBSCRIPTION_ID
              value: SUBSCRIPTION ID
            - name: AZURE_CLIENT_ID
              value: APP ID OF YOUR MANAGED IDENTITY
            - name: AZURE_AD_RESOURCE
              value: "https://management.azure.com"
          ports:
            - containerPort: 8080

It’s possible to retrieve the subscription ID at runtime (as in the Python code) but I chose to just supply it via an environment variable.

For the above manifest to work, you need to have done the following (see earlier post):

  • install AKS with the pod identity add-on
  • create a managed identity that has the necessary Azure roles (in this case, enumerate resource groups)
  • create a pod identity that references the managed identity

In this case, the created pod identity is mymsi. The aadpodidbinding label does the trick to match the identity with the pods in this deployment.
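For completeness, here is a sketch of what that setup could look like with the Azure CLI; all names are illustrative and the pod identity add-on must already be enabled (see the earlier post):

# the managed identity that already has the required Azure roles
IDENTITY_RESOURCE_ID=$(az identity show --resource-group RGNAME --name mymsi --query id -o tsv)

# create the pod identity mymsi in the mymsi namespace and link it to the managed identity
az aks pod-identity add --resource-group RGNAME --cluster-name CLUSTERNAME \
  --namespace mymsi --name mymsi --identity-resource-id $IDENTITY_RESOURCE_ID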

Note that, although you can specify the AZURE_CLIENT_ID as shown above, this is not really required. The managed identity linked to the mymsi pod identity will be automatically matched. In any case, the logs of the nmi pod will reflect this.

In the YAML, AZURE_AD_RESOURCE is also specified. In this case, this is not required either because the default is https://management.azure.com. We need that resource to enumerate resource groups.

Conclusion

In this post, we looked at using the Azure SDK for Go together with a managed identity on AKS, via the AAD pod identity add-on. Similar to the Azure SDK for Python, the Azure SDK for Go supports managed identities natively. The difference with the Python solution is the much smaller image and the improved security. Of course, that advantage stems from the use of a language like Go in combination with the scratch image.
