AKS Workload Identity Revisited

A while ago, I blogged about Workload Identity. Since then, Microsoft simplified the configuration steps and enabled Managed Identity, in addition to app registrations.

But first, let’s take a step back. Why do you need something like workload identity in the first place? Take a look at the diagram below.

Workloads (deployed in a container or not) often need to access Azure AD protected resources. In the diagram, the workload in the container wants to read secrets from Azure Key Vault. The recommended option is to use managed identity and grant that identity the required role in Azure Key Vault. Now your code just needs to obtain credentials for that managed identity.

In Kubernetes, that last part presents a challenge. There needs to be a mechanism to map such a managed identity to a pod and allow code in the container to obtain an Azure AD authentication token. The Azure AD Pod Identity project was a way to solve this but as of 24/10/2022, AAD Pod Identity is deprecated. It is now replaced by Workload Identity. It integrates with native Kubernetes capabilities to federate with external identity providers such as Azure AD. It has the following advantages:

  • Not an AKS feature, it’s a Kubernetes feature (other cloud, on-premises, edge); similar functionality exists for GKE for instance
  • Scales better than AAD Pod Identity
  • No need for custom resource definitions
  • No need to run pods that intercept IMDS (instance metadata service) traffic; instead, there are webhook pods that run when pods are created/updated

If the above does not make much sense, check https://learn.microsoft.com/en-us/azure/aks/use-azure-ad-pod-identity. But don’t use it OK? 😉

At a basic level, Workload Identity works as follows:

  • Your AKS cluster is configured to issue tokens. Via an OIDC (OpenID Connect) discovery document, published by AKS, Azure AD can validate the tokens it receives from the cluster.
  • A Kubernetes service account is created and properly annotated and labeled. Pods are configured to use the service account via the serviceAccount field.
  • The Azure Managed Identity is configured with Federated credentials. The federated credential contains a link to the OIDC discovery document (Cluster Issuer URL) and configures the namespace and service account used by the Kubernetes pod. That generates a subject identifier like system:serviceaccount:namespace_name:service_account_name.
  • Tokens can now be generated for the configured service account and swapped for an Azure AD token that can be picked up by your workload.
  • A Kubernetes mutating webhook is the glue that makes all of this work. It ensures the token is mapped to a file in your container and sets needed environment variables.

Creating a cluster with OIDC and Workload Identity

Create a basic cluster with one worker node and both features enabled. You need an Azure subscription and the Azure CLI. Ensure the prerequisites are met and that you are logged in with az login. Run the following in a Linux shell:

RG=your_resource_group
CLUSTER=your_cluster_name

az aks create -g $RG -n $CLUSTER --node-count 1 --enable-oidc-issuer \
  --enable-workload-identity --generate-ssh-keys

After deployment, find the OIDC Issuer URL with:

export AKS_OIDC_ISSUER="$(az aks show -n $CLUSTER -g $RG --query "oidcIssuerProfile.issuerUrl" -otsv)"

When you add /.well-known/openid-configuration to that URL, you will see something like:

OIDC discovery document

The field jwks_uri contains a link to key information, used by AAD to verify the tokens issued by Kubernetes.

In earlier versions of Workload Identity, you had to install a mutating admission webhook to project the Kubernetes token to a volume in your workload. In addition, the webhook also injected several environment variables:

  • AZURE_CLIENT_ID: client ID of an AAD application or user-assigned managed identity
  • AZURE_TENANT_ID: tenant ID of Azure subscription
  • AZURE_FEDERATED_TOKEN_FILE: the path to the federated token file; you can do cat $AZURE_FEDERATED_TOKEN_FILE to see the token. Note that this is the token issued by Kubernetes, not the exchanged AAD token (exchanging the token happens in your code). The token is a jwt. You can use https://jwt.io to examine it:
Decoded jwt issued by Kubernetes

But I am digressing… In the current implementation, you do not have to install the mutating webhook yourself. When you enable workload identity with the CLI, the webhook is installed automatically. In kube-system, you will find pods starting with azure-wi-webhook-controller-manager. The webhook kicks in whenever you create or update a pod. The end result is the same. You get the projected token + the environment variables.

Creating a service account

Ok, now we have a cluster with OIDC and workload identity enabled. We know how to retrieve the issuer URL and we learned we do not have to install anything else to make this work.

You will have to configure the pods you want a token for. Not every pod has containers that need to authenticate to Azure AD. To configure your pods, you first create a Kubernetes service account. This is a standard service account. To learn about service accounts, check my YouTube video.

apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    azure.workload.identity/client-id: CLIENT ID OF MANAGED IDENTITY
  labels:
    azure.workload.identity/use: "true"
  name: sademo
  namespace: default

The label ensures that the mutating webhook will do its thing when a pod uses this service account. We also indicate the managed identity we want a token for by specifying its client ID in the annotation.

Note: you need to create the managed identity yourself and grab its client id. Use the following commands:

RG=your_resource_group
IDENTITY=your_chosen_identity_name
LOCATION=your_azure_location (e.g. westeurope)

export SUBSCRIPTION_ID="$(az account show --query "id" -otsv)"

az identity create --name $IDENTITY --resource-group $RG \
  --location $LOCATION --subscription $SUBSCRIPTION_ID

export USER_ASSIGNED_CLIENT_ID="$(az identity show -n $IDENTITY -g $RG --query "clientId" -otsv)"

echo $USER_ASSIGNED_CLIENT_ID

The last command prints the id to use in the service account azure.workload.identity/client-id annotation.

Creating a pod that uses the service account

Let’s create a deployment that deploys pods with an Azure CLI image:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: azcli-deployment
  namespace: default
  labels:
    app: azcli
spec:
  replicas: 1
  selector:
    matchLabels:
      app: azcli
  template:
    metadata:
      labels:
        app: azcli
    spec:
      # needs to refer to service account used with federation
      serviceAccount: sademo
      containers:
        - name: azcli
          image: mcr.microsoft.com/azure-cli:latest
          command:
            - "/bin/bash"
            - "-c"
            - "sleep infinity"

Above, the important line is serviceAccount: sademo. When the pod is created or modified, the mutating webhook will check the service account and its annotations. If it is configured for workload identity, the webhook will do its thing: projecting the Kubernetes token file and setting the environment variables:

The webhook did its work 😉

How to verify it works?

We can use the Azure CLI support for federated tokens as follows:

az login --federated-token "$(cat $AZURE_FEDERATED_TOKEN_FILE)" \
--service-principal -u $AZURE_CLIENT_ID -t $AZURE_TENANT_ID

After running the command, the error below appears:

Oh no…

Clearly, something is wrong and there is. We have forgotten to configure the managed identity for federation. In other words, when we present our Kubernetes token, Azure AD needs information to validate it and return an AAD token.

Use the following command to create a federated credential on the user-assigned managed identity you created earlier:

RG=your_resource_group
IDENTITY=your_chosen_identity_name
AKD_OIDC_ISSUER=your_oidc_issuer
SANAME=sademo

az identity federated-credential create --name fic-sademo \
  --identity-name $IDENTITY \
  --resource-group $RG --issuer ${AKS_OIDC_ISSUER} \
  --subject system:serviceaccount:default:$SANAME

After running the above command, the Azure Managed Identity has the following configuration:

Federated credentials on the Managed Identity

More than one credential is possible. Click on the name of the federated credential. You will see:

Details of the federated credential

Above, the OIDC Issuer URL is set to point to our cluster. We expect a token with a subject identifier (sub) of system:serviceaccount:default:sademo. You can check the decoded jwt earlier in this post to see that the sub field in the token issued by Kubernetes matches the one above. It needs to match or the process will fail.

Now we can run the command again:

az login --federated-token "$(cat $AZURE_FEDERATED_TOKEN_FILE)" \
--service-principal -u $AZURE_CLIENT_ID -t $AZURE_TENANT_ID

You will be logged in to the Azure CLI with the managed identity credentials:

But what about your own apps?

Above, we used the Azure CLI. The most recent versions (>= 2.30.0) support federated credentials and use MSAL. But what about your custom code?

The code below is written in Python and uses the Python Azure identity client library with DefaultAzureCredential. This code works with managed identity in Azure Container Apps or Azure App Service and was not modified. Here’s the code for reference:

import threading
import os
import logging
import time
import signal
from azure.keyvault.secrets import SecretClient
from azure.identity import DefaultAzureCredential

from azure.appconfiguration.provider import (
    AzureAppConfigurationProvider,
    SettingSelector,
    AzureAppConfigurationKeyVaultOptions
)

logging.basicConfig(encoding='utf-8', level=logging.WARNING)

def get_config(endpoint):
  selects = {SettingSelector(key_filter=f"myapp:*", label_filter="prd")}
  trimmed_key_prefixes = {f"myapp:"}
  key_vault_options = AzureAppConfigurationKeyVaultOptions(secret_resolver=retrieve_secret)
  app_config = {}
  try:
    app_config = AzureAppConfigurationProvider.load(
            endpoint=endpoint, credential=CREDENTIAL, selects=selects, key_vault_options=key_vault_options, 
            trimmed_key_prefixes=trimmed_key_prefixes)
  except Exception as ex:
    logging.error(f"error loading app config: {ex}")

  return app_config

def run():
    try:
      global CREDENTIAL 
      CREDENTIAL = DefaultAzureCredential(exclude_visual_studio_code_credential=True)
    except Exception as ex:
      logging.error(f"error setting credentials: {ex}")

    endpoint = os.getenv('AZURE_APPCONFIGURATION_ENDPOINT')

    if not endpoint:
        logging.error("Environment variable 'AZURE_APPCONFIGURATION_ENDPOINT' not set")

    app_config =  {}
    while True:
        if not app_config:
            logging.warning("trying to load app config")
            app_config = get_config(endpoint)
        else:
            config_value=app_config['appkey']
            logging.warning(f"doing useful work with {config_value}")
            # if key exists in app_config, do something with it
            if 'mysecret' in app_config:
                logging.warning(f"and hush hush, there's a secret: {app_config['mysecret']}")
        time.sleep(5)


class GracefulKiller:
  kill_now = False
  def __init__(self):
    signal.signal(signal.SIGINT, self.exit_gracefully)
    signal.signal(signal.SIGTERM, self.exit_gracefully)

  def exit_gracefully(self, *args):
    self.kill_now = True


def retrieve_secret(uri):
    try:
        # uri is in format: https://<keyvaultname>.vault.azure.net/secrets/<secretname>
        # retrieve key vault uri and secret name from uri
        vault_uri = "https://" + uri.split('/')[2]
        secret_name = uri.split('/')[-1]
        logging.warning(f"Retrieving secret {secret_name} from {vault_uri}...")

        # retrieve the secret from Key Vault; CREDENTIAL was set globally
        secret_client = SecretClient(vault_url=vault_uri, credential=CREDENTIAL)

        # get secret value from Key Vault
        secret_value = secret_client.get_secret(secret_name).value

    except Exception as ex:
        print(f"retrieving secret: {ex}")
    
    return secret_value

# main function
def main():
    # create a Daemon tread
    t = threading.Thread(daemon=True, target=run, name="worker")
    t.start()
    

    killer = GracefulKiller()
    while not killer.kill_now:
        time.sleep(1)

    logging.info("Doing some important cleanup before exiting")
    logging.info("Gracefully exiting")


if __name__ == "__main__":
    main()

On Docker Hub, the gbaeke/worker:1.0.0 image runs this code. The following manifest runs the code on Kubernetes with the same managed identity as the Azure CLI example (same service account):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: worker
  namespace: default
  labels:
    app: worker
spec:
  replicas: 1
  selector:
    matchLabels:
      app: worker
  template:
    metadata:
      labels:
        app: worker
    spec:
      # needs to refer to service account used with federation
      serviceAccount: sademo
      containers:
        - name: worker
          image: gbaeke/worker:1.0.0
          env:
            - name: AZURE_APPCONFIGURATION_ENDPOINT
              value: https://ac-appconfig-vr6774lz3bh4i.azconfig.io

Note that the code tries to connect to Azure App Configuration. The managed identity has been given the App Configuration Data Reader role on a specific instance. The code tries to read the value of key myapp:appkey with label prd from that instance:

App Config key and values

To make the code work, the environment variable AZURE_APPCONFIGURATION_ENDPOINT is set to the URL of the App Config instance.

In the container logs, we can see that the value was successfully retrieved:

Log stream of worker

And yes, the code just works! It successfully connected to App Config and retrieved the value. The environment variables, set by the webhook discussed earlier, make this work, together with the Python Azure identity library!

Conclusion

Workload Identity works like a charm and is relatively easy to configure. At the time of writing (end of November 2022), I guess we are pretty close to general availability and we finally will have a fully supported managed identity solution for AKS and beyond!

Learn to use the Dapr authorization middleware

Based on a customer conversation, I decided to look into the Dapr middleware components. More specifically, I wanted to understand how the OAuth 2.0 middleware works that enables the Authorization Code flow.

In the Authorization Code flow, an authorization code is a temporary code that a client obtains after being redirected to an authorization URL (https://login.microsoftonline.com/{tenant}/oauth2/authorize) where you provide your credentials interactively (not useful for service-service non-interactive scenarios). That code is then handed to your app which exchanges it for an access token. With the access token, the authenticated user can access your app.

Instead of coding this OAuth flow in your app, we will let the Dapr middleware handle all of that work. Our app can then pickup the token from an HTTP header. When there is a token, access to the app is granted. Otherwise, Dapr (well, the Dapr sidecar next to your app) redirects your client to the authorization server to get a code.

Let’s take a look how this all works with Azure Active Directory. Other authorization servers are supported as well: Facebook, GitHub, Google, and more.

What we will build

Some experience with Kubernetes, deployments, ingresses, Ingress Controllers and Dapr is required.

If you think the explanation below can be improved, or I have made errors, do let me know. Let’s go…

Create an app registration

Using Azure AD means we need an app registration! Other platforms have similar requirements.

First, create an app registration following this quick start. In the first step, give the app a name and, for this demo, just select Accounts in this organizational directory only. The redirect URI will be configured later so just click Register.

After following the quick start, you should have:

  • the client ID and client secret: will be used in the Dapr component
  • the Azure AD tenant ID: used in the auth and token URLs in the Dapr component; Dapr needs to know where to redirect to and where to exchange the authorization code for an access token
App registration in my Azure AD Tenant

There is no need for your app to know about these values. All work is done by Dapr and Dapr only!

We will come back to the app registration later to create a redirect URI.

Install an Ingress Controller

We will use an Ingress Controller to provide access to our app’s Dapr sidecar from the Internet, using HTTP.

In this example, we will install ingress-nginx. Use the following commands (requires Helm):

helm upgrade --install ingress-nginx ingress-nginx \
  --repo https://kubernetes.github.io/ingress-nginx \
  --namespace ingress-nginx --create-namespace

Although you will find articles about daprizing your Ingress Controller, we will not do that here. We will use the Ingress Controller simply as a way to provide HTTP access to the Dapr sidecar of our app. We do not want Dapr-to-Dapr gRPC traffic between the Ingress Controller and our app.

When ingress-nginx is installed, grab the public IP address of the service that it uses. Use kubectl get svc -n ingress-nginx. I will use the IP address with nip.io to construct a host name like app.11.12.13.14.nip.io. The nip.io service resolves such a host name to the IP address in the name automatically.

The host name will be used in the ingress and the Dapr component. In addition, use the host name to set the redirect URI of the app registration: https://app.11.12.13.14.nip.io. For example:

Added a platform configuration for a web app and set the redirect URI

Note that we are using https here. We will configure TLS on the ingress later.

Install Dapr

Install the Dapr CLI on your machine and run dapr init -k. This requires a working Kubernetes context to install Dapr to your cluster. I am using a single-node AKS cluster in Azure.

Create the Dapr component and configuration

Below is the Dapr middleware component we need. The component is called myauth. Give it any name you want. The name will later be used in a Dapr configuration that is, in turn, used by the app.

apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: myauth
spec:
  type: middleware.http.oauth2
  version: v1
  metadata:
  - name: clientId
    value: "CLIENTID of your app reg"
  - name: clientSecret
    value: "CLIENTSECRET that you created on the app reg"
  - name: authURL
    value: "https://login.microsoftonline.com/TENANTID/oauth2/authorize"
  - name: tokenURL
    value: "https://login.microsoftonline.com/TENANTID/oauth2/token"
  - name: redirectURL
    value: "https://app.YOUR-IP.nip.io"
  - name: authHeaderName
    value: "authorization"
  - name: forceHTTPS
    value: "true"
scopes:
- super-api

Replace YOUR-IP with the public IP address of the Ingress Controller. Also replace the TENANTID.

With the information above, Dapr can exchange the authorization code for an access token. Note that the client secret is hard coded in the manifest. It is recommended to use a Kubernetes secret instead.

The component on its own is not enough. We need to create a Dapr configuration that references it:

piVersion: dapr.io/v1alpha1
kind: Configuration
metadata:
  name: auth
spec:
  tracing:
    samplingRate: "1"
  httpPipeline:
    handlers:
    - name: myauth # reference the oauth component here
      type: middleware.http.oauth2    

Note that the configuration is called auth. Our app will need to use this configuration later, via an annotation on the Kubernetes pods.

Both manifests can be submitted to the cluster using kubectl apply -f. It is OK to use the default namespace for this demo. Keep the configuration and component in the same namespace as your app.

Deploy the app

The app we will deploy is super-api, which has a /source endpoint to dump all HTTP headers. When authentication is successful, the authorization header will be in the list.

Here is deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: super-api-deployment
  labels:
    app: super-api
spec:
  replicas: 1
  selector:
    matchLabels:
      app: super-api
  template:
    metadata:
      labels:
        app: super-api
      annotations:
        dapr.io/enabled: "true"
        dapr.io/app-id: "super-api"
        dapr.io/app-port: "8080"
        dapr.io/config: "auth" # refer to Dapr config
        dapr.io/sidecar-listen-addresses: "0.0.0.0" # important
    spec:
      securityContext:
        runAsUser: 10000
        runAsNonRoot: true
      containers:
        - name: super-api
          image: ghcr.io/gbaeke/super:1.0.7
          securityContext:
            readOnlyRootFilesystem: true
            capabilities:
              drop:
                - all
          args: ["--port=8080"]
          ports:
            - name: http
              containerPort: 8080
              protocol: TCP
          env:
            - name: IPADDRESS
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: WELCOME
              value: "Hello from the Super API on AKS!!! IP is: $(IPADDRESS)"
            - name: LOG
              value: "true"       
          resources:
              requests:
                memory: "64Mi"
                cpu: "50m"
              limits:
                memory: "64Mi"
                cpu: "50m"
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 15
          readinessProbe:
              httpGet:
                path: /readyz
                port: 8080
              initialDelaySeconds: 5
              periodSeconds: 15

Note the annotations in the manifest above:

  • dapr.io/enabled: injects the Dapr sidecar in the pods
  • dapr.io/app-id: a Dapr app needs an id; a service will automatically be created with that id and -dapr appended; in our case the name will be super-api-dapr; our ingress will forward traffic to this service
  • dapr.io/app-port: Dapr will need to call endpoints in our app (after authentication in this case) so it needs the port that our app container uses
  • dapr.io/config: refers to the configuration we created above, which enables the http middleware defined by our OAuth component
  • dapr.io/sidecar-listen-addresses: ⚠️ needs to be set to “0.0.0.0”; without this setting, we will not be able to send requests to the Dapr sidecar directly from the Ingress Controller

Submit the app manifest with kubectl apply -f.

Check that the pod has two containers: the Dapr sidecar and your app container. Also check that there is a service called super-api-dapr. There is no need to create your own service. Our ingress will forward traffic to this service.

Create an ingress

In the same namespace as the app (default), create an ingress. This requires the ingress-nginx Ingress Controller we installed earlier:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: super-api-ingress
  namespace: default
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  ingressClassName: nginx
  tls:
    - hosts:
      - app.YOUR-IP.nip.io
      secretName: tls-secret 
  rules:
  - host: app.YOUR-IP.nip.io
    http:
      paths:
      - pathType: Prefix
        path: "/"
        backend:
          service:
            name: super-api-dapr
            port: 
              number: 80

Replace YOUR-IP with the public IP address of the Ingress Controller.

For this to work, you also need a secret with a certificate. Use the following commands:

openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout tls.key -out tls.crt -subj "/CN=app.YOUR-IP.nip.io"
kubectl create secret tls tls-secret --key tls.key --cert tls.crt

Replace YOUR-IP as above.

Testing the configuration

Let’s use the browser to connect to the /source endpoint. You will need to use the Dapr invoke API because the request will be sent to the Dapr sidecar. You need to speak a language that Dapr understands! The sidecar will just call http://localhost:8080/source and send back the response. It will only call the endpoint when authentication has succeeded, otherwise you will be redirected.

Use the following URL in the browser. It’s best to use an incognito session or private window.

https://app.20.103.17.249.nip.io/v1.0/invoke/super-api/method/source

Your browser will warn you of security risks because the certificate is not trusted. Proceed anyway! 😉

Note: we could use some URL rewriting on the ingress to avoid having to use /v1.0/invoke etc… You can also use different URL formats. See the docs.

You should get an authentication screen which indicates that the Dapr configuration is doing its thing:

Redirection to the authorize URL

After successful authentication, you should see the response from the /source endpoint of super-api:

Response from /source

The response contains an Authorization header. The header contains a JWT after the word Bearer. You can paste that JWT in https://jwt.io to see its content. We can only access the app with a valid token. That’s all we do in this case, ensuring only authenticated users can access our app.

Conclusion

In this article, we used Dapr to secure access to an app without having to modify the app itself. The source code of super-api was not changed in any way to enable this functionality. Via a component and a configuration, we instructed our app’s Dapr sidecar to do all this work for us. App endpoints such as /source are only called when there is a valid token. When there is such a token, it is saved in a header of your choice.

It is important to note that we have to send HTTP requests to our app’s sidecar for this to work. To enable this, we instructed the sidecar to listen on all IP addresses of the pod, not just 127.0.0.1. That allows us to send HTTP requests to the service that Dapr creates for the app. The ingress forwards requests to the Dapr service directly. That also means that you have to call your endpoint via the Dapr invoke API. I admit that can be confusing in the beginning. 😉

Note that, at the time of this writing (June 2022), the OAuth2 middleware in Dapr is in an alpha state.

%d bloggers like this: