Dapr Service Invocation between an HTTP Python client and a GRPC Go server

Recently, I published several videos about Dapr on my Youtube channel. The videos cover the basics of state management, PubSub and service invocation.

The Getting Started with state management and service invocation:

Let’s take a closer look at service invocation with HTTP, Python and Node.

Service Invocation with HTTP

Service Invocation Diagram
Service invocation (image from Dapr docs)

The services you write (here Service A and B) talk to each other using the Dapr runtime. On Kubernetes, you talk to a Dapr sidecar deployed alongside your service container. On your development machine, you run your services via dapr run.

If you want to expose a method on Service B and you use HTTP, you just need to expose an HTTP handler or route. For example, with Express in Node you would use something like:

const express = require('express');
const app = express();
app.post('/neworder', (req, res) => { your code }

You then run your service and annotate it with the proper Dapr annotations (Kubernetes):

annotations:
        dapr.io/enabled: "true"
        dapr.io/id: "node"
        dapr.io/port: "3000"

On your local machine, you would just run the service via dapr run:

dapr run --app-id node --app-port 3000 node app.js

In the last example, the Dapr id is node and we indicate that the service is listening on port 3000. To invoke the method from service A, it can use the following code (Python example shown):

dapr_port = os.getenv("DAPR_HTTP_PORT", 3500)
dapr_url = "http://localhost:{}/v1.0/invoke/node/method/neworder".format(dapr_port)
message = {"data": {"orderId": 1234}}
response = requests.post(dapr_url, json=message, timeout=5)

As you can see, service A does not contact service B directly. It just talks to its Dapr sidecar on localhost (or Dapr on your dev machine) and asks it to invoke the neworder method via a service that uses Dapr id node. It is also clear that both service A and B use HTTP only. Because you just use HTTP to expose and invoke methods, you can use any language or framework.

You can find a complete example here with Node and Python.

Service Invocation with HTTP and GRPC

Dapr has SDKs available for C#, Go and other languages. You might prefer those over the generic HTTP approach. In the case of Go, the SDK uses GRPC to interface with the Dapr runtime. With Dapr in between, one service can use HTTP while another uses GRPC.

Let’s take a look at a service that exposes a method (HelloFromGo) from a Go application. The full example is here. Instead of creating an HTTP route with the name of your method, you use an OnInvoke handler that looks like this (only the start is shown, see the full code):

func (s *server) OnInvoke(ctx context.Context, in *commonv1pb.InvokeRequest) (*commonv1pb.InvokeResponse, error) {
	var response string

	switch in.Method {
	case "HelloFromGo":

		response = s.HelloFromGo()

Naturally, you also have to implement an HelloFromGo() method as well:

// HelloFromGo is a simple demo method to invoke
func (s *server) HelloFromGo() string {
	return "Hello"

}

Another service can use any language or framework and invoke the above method with a POST to the following URL if the Dapr id of the Go service is goserver:

http://localhost:3500/v1.0/invoke/goserver/method/HelloFromGo

A POST to the above URL tells Dapr to execute the OnInvoke method via GRPC which will run the HelloFromGo function. It is perfectly possible to include a payload in your POST and have the OnInvoke handler to process that payload. The full example is here which also includes sending and processing a JSON payload and sending back a text response. You will need to somewhat understand how GRPC works and also understand protocol buffers. A good book on GRPC is the following one: https://learning.oreilly.com/library/view/grpc-up-and/9781492058328/.

Conclusion

Dapr allows you to choose between HTTP and GRPC interfaces to interact with the runtime. You can choose whatever is most comfortable to you. One team can use HTTP with Python, JavaScript etc… while other teams use GRPC with their language of choice. Whatever you choose, the Dapr runtime will make sure service invocation just works allowing you to focus on the code.

Integrate an Azure Storage Account with Active Directory

A customer with a Windows Virtual Desktop deployment needed access to several file shares for one of their applications. The integration of Azure Storage Accounts with Active Directory allows us to provide this functionality without having to deploy and maintain file services on a virtual machine.

A sketch of the environment looks something like this:

Azure File Share integration with Active Directory

Based on the sketch above, you should think about the requirements to make this work:

  • Clients that access the file share need to be joined to a domain. This can be an Azure Active Directory Domain Services (AADDS) managed domain or just plain old Active Directory Domain Services (ADDS). The steps to integrate the storage account with AADDS or AADS are different as we will see later. I will only look at ADDS integration via a PowerShell script. In this case, the WVD virtual machines are joined to ADDS and domain controllers are available on Azure.
  • Users that access the file share need to have an account in ADDS that is synced to Azure AD (AAD). This is required because users or groups are given share-level permissions at the storage account level to their AAD identity. The NTFS-level permissions are given to the ADDS identity. Since this is a Windows Virtual Desktop deployment, that is already the case.
  • You should think about how the clients (WVD here) connect to the file share. If you only need access from Azure subnets, then VNET Service Endpoints are a good choice. This will configure direct routing to the storage account in the subnet’s route table and also provides the necessary security as public access to the storage account is blocked. You could also use Private Link or just access the storage account via public access. I do not recommend the latter so either use service endpoints or private link.

Configuring the integration

In the configuration of the storage account, you will see the following options:

Storage account AD integration options

Integration with AADDS is just a click on Enabled. For ADDS integration however, you need to follow another procedure from a virtual machine that is joined to the ADDS domain.

On that virtual machine, log on with an account that can create a computer object in ADDS in an OU that you set in the script. For best results, the account you use should be synced to AAD and should have at least the Contributor role on the storage account.

Next, download the Microsoft provided scripts from here and unzip them in a folder like C:\scripts. You should have the following scripts and modules in there:

Scripts and PowerShell module for Azure Storage Account integration

Next, add a script to the folder with the following contents and replace the <PLACEHOLDERS>:

Set-ExecutionPolicy -ExecutionPolicy Unrestricted -Scope CurrentUser

.\CopyToPSPath.ps1 

Import-Module -Name AzFilesHybrid

Connect-AzAccount

$SubscriptionId = "<YOUR SUB ID>"
$ResourceGroupName = "<YOUR RESOURCE GROUP>"
$StorageAccountName = "<YOUR STORAGE ACCOUNT NAME>"

Select-AzSubscription -SubscriptionId $SubscriptionId 

Join-AzStorageAccountForAuth `
        -ResourceGroupName $ResourceGroupName `
        -Name $StorageAccountName `
        -DomainAccountType "ComputerAccount" ` 
        -OrganizationalUnitName "<OU DISTINGUISHED NAME>

Debug-AzStorageAccountAuth -StorageAccountName $StorageAccountName -ResourceGroupName $ResourceGroupName -Verbose

Run the script from the C:\scripts folder so it can execute CopyToPSPath.ps1 and import the AzFilesHybrid module. The Join-AzStorageAccountForAuth cmdlet does the actual work. When you are asked to rerun the script, do so!

The result of the above script should be a computer account in the OU that you chose. The computer account has the name of the storage account.

In the storage account configuration, you should see the following:

The blurred section will show the domain name

Now we can proceed to granting “share-level” access rights, similar to share-level rights on a Windows file server.

Granting share-level access

Navigate to the file share and click IAM. You will see the following:

IAM on the file share level

Use + Add to add AAD users or groups. You can use the following roles:

  • Storage File Data SMB Share Reader: read access
  • Storage File Data SMB Share Contributor: read, write and delete
  • Storage File Data SMB Share Elevated Contributor: read, write and delete + modify ACLs at the NTFS level

For example, if I needed to grant read rights to the group APP_READERS in ADDS, I would grant the Storage File Data SMB Share Reader role to the APP_READERS group in Azure AD (synced from ADDS).

Like on a Windows File Server, share permissions are not sufficient. Let’s add some good old NTFS rights…

Granting NTFS Access Rights

For a smooth experience, log on to a domain-joined machine with an ADDS account that is synced to an AAD account that has at least the Contributor role on the storage account.

To grant the NTFS right, map a drive with the storage account key. Use the following command:

net use <desired-drive-letter>: \\<storage-account-name>.file.core.windows.net\<share-name> /user:Azure\<storage-account-name> <storage-account-key>

Get the storage account key from here:

Storage account access keys

Now you can open the mapped drive in Explorer and set NTFS rights. Alternatively, you can use icacls.exe or other tools.

Mapping the drive for users

Now that the storage account is integrated with ADDS, a user can log on to a domain-joined machine and mount the share without having to provide credentials. As long as the user has the necessary share and NTFS rights, she can access the data.

Mapping the drive can be done in many ways but a simple net use Z: \\storageaccountname.file.core.windows.net\share will suffice.

Securing the connection

You should configure the storage account in such a way that it only allows access from selected clients. In this case, because the clients are Windows Virtual Desktops in a specific Azure subnet, we can use Virtual Network Service Endpoints. They can be easily configured from Firewalls and Virtual Networks:

Access from selected networks only: 3 subnets in this case

Granting access to specific subnets results in the configuration of virtual network service endpoints and a modification of the subnet route table with a direct route to the storage account on the Microsoft network. Note that you are still connecting to the public IP of the storage account.

If you decide to use Private Link instead, you would get a private IP in your subnet that is mapped to the storage account. In that case, even on-premises clients could connect to the storage account over the VPN or ExpressRoute private peerings. Of course, using private link would require extra DNS configuration as well.

Some extra notes

  • when you cannot configure the integration with the PowerShell script, follow the steps to enable the integration manually; do not forget the set the Kerberos password properly
  • it is recommended to put the AD computer accounts that represent the storage accounts in a separate OU; enable a Group Policy on that OU that prevents password resets on the computer accounts

Conclusion

Although there is some work to be done upfront and there are requirements such as Azure AD and Azure AD Connect, using an Azure Storage Account to host Active Directory integrated file shares is recommended. Remember that it works with both AADDS and ADDS. In this post, we looked at ADDS only and integration via the Microsoft-provided PowerShell scripts.

Adding SMB1 protocol support to Windows Server 2019

I realize this is not a very exciting post, especially compared to my other wonderful musing on this site, but I felt I really had to write it to share the pain!

A colleague I work with needed to enable this feature on an Azure Windows Server 2019 machine to communicate with some old system that only supports Server Message Block version 1 (SMB1). Easy enough to add that right?

Trying the installation

Let’s first get some information about the feature:

Get-WindowsOptionalFeature -Online -FeatureName SMB1Protocol

The output:

Notice the State property? The feature is disabled and the payload (installation files) are not on the Azure virtual machine.

When you try to install the feature, you get:

I guess I need the Windows Server 2019 sources!

Downloading Windows Server 2019

I downloaded Windows Server 2019 (November 2019 version) from https://my.visualstudio.com/Downloads?q=SQL%20Server%202019. I am not sure if you can use the evaluation version of Windows Server 2019 because I did not try that. I downloaded the ISO to the Azure virtual machine.

Mount ISO and copy install.wim

On recent versions of Windows, you can right click an ISO and mount it. In the mounted ISO, search for install.wim and copy that file to a folder on your C: disk like c:\wim. Under c:\wim, create a folder called mount and run the following command:

dism /mount-wim /wimfile:c:\wim\install.wim /index:4 /mountdir:c:\wim\mount /readonly

The contents of install.wim is now available in c:\wim\mount. Now don’t try to enable the feature by pointing to the sources with the -source parameter of Enable-WindowsOptionalFeature. It will not work yet!

Patch the mounted files

The Azure Windows Server 2019 image (at time of writing, June 2020) has a cumulative update installed: https://support.microsoft.com/en-us/help/4551853/windows-10-update-kb4551853. For this installatin to work, I needed to download the update from https://www.catalog.update.microsoft.com/home.aspx and put it somewhere like c:\patches.

Now we can update the mounted files offline with the following command:

Dism /Add-Package /Image:"C:\wim\mount" /PackagePath="c:\patches\windows10.0-kb4551853-x64_ce1ea7def481ee2eb8bba6db49ddb42e45cba54f.msu" 

It will take a while to update! You need to do this because the files mounted from the downloaded ISO do not match the version of the Windows Server 2019 image. Without this update, the installation of the SMB1 feature will not succeed.

Enable the SMB1 feature

Now we can enable the feature with the following command:

dism /online /Enable-Feature /FeatureName:SMB1Protocol /All /source:c:\wim\mount\windows\winsxs /limitaccess 

You will need to reboot. After rebooting, from a PowerShell prompt run the command below to check the installation:

Get-WindowsOptionalFeature -Online -FeatureName SMB1Protocol

The State property should say: Enabled

Conclusion

Something this trivial took me way too long. Is there a simpler way? Let me know! πŸ‘

And by the way, don’t enable SMB1. It’s not secure at all. But in some cases, there’s just no way around it.

Adding Authentication and Authorization to an Azure Static Web App

In a previous post, we created a static web app that retrieves documents from Cosmos DB via an Azure Function. The Azure Function got deployed automatically and runs off the same domain as your app. In essence, that frees you from having to setup Azure Functions separately and configuring CORS in the process.

Instead of allowing anonymous users to call the api at https://yourwebapp/api/device, I only want to allow specific users to do so. In this post, we will explore how that works.

You can find the source code of the static web app and the API on GitHub: https://github.com/gbaeke/az-static-web-app.

More into video tutorials? Then check out the video below. I recommend 1.2x speed! πŸ˜‰

Full version about creating the app and protecting the API

Create a routes.json

To define the protected routes, you need routes.json in the root of your project:

routes.json to protect /api/*

The routes.json file serves multiple purposes. Check out how it works here. In my case, I just want to protect the /api/* routes and allow the Authenticated users role. The Authenticated role is a built-in role but you should create custom roles to protect sensitive data (more info near the end of this post). For our purposes, the platform error override is not needed and be removed. These overrides are useful though as they allow you to catch errors and act accordingly.

Push the above change to your repository for routes.json to go in effect. Once you do, access to /api/* requires authentication. Without it, you will get a 401 Unauthorized error. To fix that, invite your users and define roles.

Inviting Users

In Role Management, you can invite individual users to your app:

User gbaeke (via GitHub) user identity added

Just click Invite and fill in the blanks. Inviting a user results in an invitation link you should send the user. Below is an example for my Twitter account:

Let’s invite myself via my Twitter account

When I go to the invite link, I can authorize the app:

Authorizing Static Web Apps to access my account

After this, you will also get a Consent screen:

Granting Consent (users can always remove their data later; yeah right πŸ˜‰)

When consent is given, the application will open with authentication. I added some code to the HTML page to display when the user is authenticated. The user name can be retrieved with a call to .auth/me (see later).

App with Twitter handle shown

In the Azure Portal, the Twitter account is now shown as well.

User added to roles of the web app

Note: anyone can actually authenticate to your app; you do not have to invite them; you invite users only when you want to assign them custom roles

Simple authentication code

The HTML code in index.html contains some links to login and logout:

  • To login: a link to /.auth/login/github
  • To logout: a link to /.auth/logout

Microsoft provides these paths under /.auth automatically to support the different authentication scenarios. In my case, I only have a GitHub login. To support Twitter or Facebook logins, I would need to provide some extra logic for the user to choose the provider.

In the HTML, the buttons are shown/hidden depending on the existence of user.UserDetails. The user information is retrieved via a call to the system-provided /.auth/me with the code below that uses fetch:

async  getUser() {
     const response = await fetch("/.auth/me");
     const payload = await response.json();
     const { clientPrincipal } = payload;
     this.user = clientPrincipal;

user.UserDetails is just the username on the platform: gbaeke on GitHub, geertbaeke on Twitter, etc…

The combination of the routes.json file that protects /api/* and the authentication logic above results in the correct retrieval of the Cosmos DB documents. Note that when you are not authorized, the list is just empty with a 401 error in the console. In reality, you should catch the error and ask the user to authenticate.

One way of doing so is redirecting to a login page. Just add logic to routes.json that serves the path you want to use when the errorType is Unauthenticated as shown below:

"platformErrorOverrides": [
    {
      "errorType": "NotFound",
      "serve": "/custom-404.html"
    },
    {
      "errorType": "Unauthenticated",
      "serve": "/login"
    }
  ]

The danger of the Authenticated role

Above, we used the Authenticated role to provide access to the /api/* routes. That is actually not a good idea once you realize that non-invited users can authenticate to your app as well. As a general rule: always use a custom role to allow access to sensitive resources. Below, I changed the role in routes.json to reader. Now you can invite users and set their role to reader to make sure that only invited users can access the API!

"routes": [
      {
        "route": "/api/*",
        "allowedRoles": ["reader"]
      }

      
    ]

Below you can clearly see the effect of this. I removed GitHub user gbaeke from the list of users but I can still authenticate with the account. Because I am missing the reader role, the drop down list is not populated and a 401 error is shown:

Authenticated but not in the reader role

Conclusion

In this post, we looked at adding authentication and authorization to protect calls to our Azure Functions API. Azure Static Web Apps tries to make that process as easy as possible and we all now how difficult authentication and authorization can be in reality! And remember: protect sensitive API calls with custom roles instead of the built-in Authenticated role.

First Look at Azure Static Web Apps

Note: part 2 looks at the authentication and authorization part.

At Build 2020, Microsoft announced Azure Static Web Apps, a new way to host static web apps on Azure. In the past, static web apps, which are just a combination of HTML, JavaScript and CSS, could be hosted in a Storage Account or a regular Azure Web App.

When you compare Azure Static Web Apps with the Storage Account approach, you will notice there are many more features. Some of those features are listed below (also check the docs):

  • GitHub integration: GitHub actions are configured for you to easily deploy your app from your GitHub repository to Azure Static Web Apps
  • Integrated API support: APIs are provided by Azure Functions with an HTTP Trigger
  • Authentication support for Azure Active Directory, GitHub and other providers
  • Authorization role definitions via the portal and a roles.json file in your repository
  • Staging versions based on a pull request

It all works together as shown below:

SWAdiagram.png
Azure Static Web Apps (from https://techcommunity.microsoft.com/t5/azure-app-service/introducing-app-service-static-web-apps/ba-p/1394451)

As a Netlify user, this type of functionality is not new to me. Next to static site hosting, they also provide serverless functions, identity etc…

If you are more into video tutorials…

Creating the app and protecting calls to the API

Let’s check out an example to see how it works on Azure…

GitHub repository

The GitHub repo I used is over at https://github.com/gbaeke/az-static-web-app. You will already see the .github/workflows folder that contains the .yml file that defines the GitHub Actions. That folder will be created for you when you create the Azure Static Web App.

The static web app in this case is a simple index.html that contains HTML, JavaScript and some styling. Vue.js is used as well. When you are authenticated, the application reads a list of devices from Cosmos DB. When you select a device, the application connects to a socket.io server, waiting for messages from the chosen device. The backend for the messages come from Redis. Note that the socket.io server and Redis configuration are not described in this post. Here’s a screenshot from the app with a message from device01. User gbaeke is authenticated via GitHub. When authenticated, the device list is populated. When you log out, the device list is empty. There’s no error checking here so when the device list cannot be populated, you will see a 404 error in the console. πŸ˜‰

Azure Static Web App in action

Note: Azure Static Web Apps provides a valid certificate for your app, whether it uses a custom domain or not; in the above screenshot, Not secure is shown because the application connects to the socket.io server over HTTP and Mixed Content is allowed; that is easy to fix with SSL for the socket.io server but I chose to not configure that

The API

Although API is probably too big a word for it, the devices drop down list obtains its data from Cosmos DB, via an Azure Function. It was added from Visual Studio Code as follows:

  • add the api folder to your project
  • add a new Function Project and choose the api folder: simply use F1 in Visual Studio Code and choose Azure Functions: Create New Project… You will be asked for the folder. Choose api.
  • modify the code of the Function App to request data from Cosmos DB

To add an Azure Function in Visual Studio Code, make sure you install the Azure Functions extension and the Azure Function Core Tools. I installed the Linux version of Core Tools in WSL 2.

Adding the function (JavaScript; HTTP Trigger, anonymous, name of GetDevice) should result in the following structure:

Function app as part of the static web app (api folder)

Next, I modified function.json to include a Cosmos DB input next to the existing HTTP input and output:

{
  "bindings": [
    {
      "authLevel": "anonymous",
      "type": "httpTrigger",
      "direction": "in",
      "name": "req",
      "methods": [
        "get",
        "post"
      ],
      "route": "device"
    },
    {
      "type": "http",
      "direction": "out",
      "name": "res"
    },
    {
      "name": "devices",
      "type": "cosmosDB",
      "direction": "in",
      "databaseName": "geba",
      "collectionName": "devices",
      "sqlQuery": "SELECT c.id, c.room FROM c",
      "connectionStringSetting": "CosmosDBConnection"    
    }
  ]
}

In my case, I have a Cosmos DB database geba with a devices collection. Device documents contain an id and room field which simply get selected with the query: SELECT c.id, c.room FROM c.

Note: with route set to device, the API will need to be called with /api/device instead of /api/GetDevice.

The actual function in index.js is kept as simple as possible:

module.exports = async function (context, req) {
    context.log('Send devices from Cosmos');
  
    context.res = {
        // status: 200, /* Defaults to 200 */
        body: context.bindings.devices
    };
    
};

Yes, the above code is all that is required to retrieve the JSON output of the Cosmos DB query and set is as the HTTP response.

Note that local.settings.json contains the Cosmos DB connection string in CosmosDBConnection:

{
  "IsEncrypted": false,
  "Values": {
    "AzureWebJobsStorage": "",
    "FUNCTIONS_WORKER_RUNTIME": "node",
    "CosmosDBConnection": "AccountEndpoint=https://geba-cosmos.documents.a...;"
  }
}

You will have to make sure the Cosmos DB connection string is made known to Azure Static Web App later. During local testing, local.settings.json is used to retrieve it. local.settings.json is automatically added to .gitignore to not push it to the remote repository.

Local Testing

We can test the app locally with the Live Server extension. But first, modify .vscode/settings.json and add a proxy for your api:

"liveServer.settings.proxy": {
        "enable": true,
        "baseUri": "/api",
        "proxyUri": "http://172.28.242.32:7071/api"
    }

With the above setting, a call to /api via Live Server will be proxied to Azure Functions on your local machine. Note that the IP address refers to the IP address of WSL 2 on my Windows 10 machine. Find it by running ifconfig in WSL 2.

Before we can test the application locally, start your function app by pressing F5. You should see:

Function App started locally

Now go to index.html, right click and select Open with Live Server. The populated list of devices shows that the query to Cosmos DB works and that the API is working locally:

Test the static web app and API locally

Notes on using WSL 2:

  • for some reason, http://localhost:5500/index.html (Live Server running in WSL 2) did not work from the Windows session although it should; in the screenshot above, you see I replaced localhost with the IP address of WSL 2
  • time skew can be an issue with WSL 2; if you get an error during the Cosmos DB query of authorization token is not valid at the current time, perform a time sync with ntpdate time.windows.com from your WSL 2 session

Deploy the Static Web App

Create a new Static Web App in the portal. The first screen will be similar to the one below:

Static Web App wizard first screen

You will need to authenticate to GitHub and choose your repository and branch as shown above. Click Next. Fill in the Build step as follows:

Static Web App wizard second screen

Our app will indeed run off the root. We are not using a framework that outputs a build to a folder like dist so you can leave the artifact location blank. We are just serving index.html off the root.

Complete the steps for the website to be created. You GitHub Action will be created and run for the first time. You can easily check the GitHub Action runs from the Overview screen:

Checking the GitHub Action runs

Here’s an example of a GitHub action run:

A GitHub Action run

When the GitHub Action is finished, your website will be available on a URL provided by Azure Static Web Apps. In my case: https://polite-cliff-01b6ab303.azurestaticapps.net.

To make sure the connection to Cosmos DB works, add an Application Setting via Configuration:

Adding the Cosmos DB connection string

The Function App that previously obtained the Cosmos DB connection string from local.settings.json can now retrieve the value from Application Settings. Note that you can also change these settings via Azure CLI.

Conclusion

In this post, we created a simple web app in combination with an function app that serves as the API. You can easily create and test the web app and function app locally with the help of Live Server and a Live Server proxy. Setting up the web app is easy via the Azure Portal, which also creates a GitHub Action that takes care of deployment for you. In a next post, we will take a look at enabling authentication via the GitHub identity provider and only allowing authorized users to retrieve the list of devices.

Trying Consul Connect on your local machine

In a previous post, I talked about installing Consul on Kubernetes and using some of its features. In that post, I did not look at the service mesh functionality. Before looking at that, it is beneficial to try out the service mesh features on your local machine.

You can easily install Consul on your local machine with Chocolatey for Windows or Homebrew for Mac. On Windows, a simple choco install consul is enough. Since Consul is just a single executable, you can start it from the command line with all the options you need.

In the video below, I walk through configuring two services running as containers on my local machine: a web app that talks to Redis. We will “mesh” both services and then use an intention to deny service-to-service traffic.

Consul Service Mesh on your local machine… speed it up! ☺

In a later post and video, we will look at Consul Connect on Kubernetes. Stay tuned!

Getting started with Consul on Kubernetes

Although I have heard a lot about Hashicorp’s Consul, I have not had the opportunity to work with it and get acquainted with the basics. In this post, I will share some of the basics I have learned, hopefully giving you a bit of a head start when you embark on this journey yourself.

Want to watch a video about this instead?

What is Consul?

Basically, Consul is a networking tool. It provides service discovery and allows you to store and retrieve configuration values. On top of that, it provides service-mesh capability by controlling and encrypting service-to-service traffic. Although that looks simple enough, in complex and dynamic infrastructure spanning multiple locations such as on-premises and cloud, this can become extremely complicated. Let’s stick to the basics and focus on three things:

  • Installation on Kubernetes
  • Using the key-value store for configuration
  • Using the service catalog to retrieve service information

We will use a small Go program to illustrate the use of the Consul API. Let’s get started… πŸš€πŸš€πŸš€

Installation of Consul

I will install Consul using the provided Helm chart. Note that the installation I will perform is great for testing but should not be used for production. In production, there are many more things to think about. Look at the configuration values for hints: certificates, storage size and class, options to enable/disable, etc… That being said, the chart does install multiple servers and clients to provide high availability.

I installed Consul with Pulumi and Python. You can check the code here. You can use that code on Azure to deploy both Kubernetes and Consul in one step. The section in the code that installs Consul is shown below:

consul = v3.Chart("consul",
    config=v3.LocalChartOpts(
        path="consul-chart",
        namespace="consul",
        values={ 
            "connectInject": {
                "enabled": "true"
            },
            "client": {
                "enabled": "true",
                "grpc": "enabled"
            },
            "syncCatalog": {
                "enabled": "true"
            } 
        }        
    ),
    opts=pulumi.ResourceOptions(
        depends_on=[ns_consul],
        provider=k8s
    )    
)

The code above would be equivalent to this Helm chart installation (Helm v3):

helm install consul -f consul-helm/values.yaml \
--namespace consul ./consul-helm \
--set connectInject.enabled=true  \
--set client.enabled=true --set client.grpc=true  \
--set syncCatalog.enabled=true

Connecting to the Consul UI

The chart installs Consul in the consul namespace. You can run the following command to get to the UI:

kubectl port-forward services/consul-consul-ui 8888:80 -n consul8:80 -n consul

You will see the screen below. The list of services depends on the Kubernetes services in your system.

Consul UI with list of services

The services above include consul itself. The consul service also has health checks configured. The other services in the screenshot are Kubernetes services that were discovered by Consul. I have installed Redis in the default namespace and exposed Redis via a service called redisapp. This results in a Consul service called redisadd-default. Later, we will query this service from our Go application.

When you click Key/Value, you can see the configured keys. I have created one key called REDISPATTERN which is later used in the Go program to know the Redis channels to subscribe to. It’s just a configuration value that is retrieved at runtime.

A simple key/value pair: REDISPATTERn=*

The Key/Value pair can be created via the consul CLI, the HTTP API or via the UI (Create button in the main Key/Value screen). I created the REDISPATTERN key via the Create button.

Querying the Key/Value store

Let’s turn our attention to writing some code that retrieves a Consul key at runtime. The question of course is: “how does your application find Consul?”. Look at the diagram below:

Simplifgied diagram of Consul installation on Kubernetes via the Helm chart

Above, you see the Consul server agents, implemented as a Kubernetes StatefulSet. Each server pod has a volume (Azure disk in this case) to store data such as key/value pairs.

Your application will not connect to these servers directly. Instead, it will connect via the client agents. The client agents are implemented as a DaemonSet resulting in a client agent per Kubernetes node. The client agent pods expose a static port on the Kubernetes host (yes, you read that right). This means that your app can connect to the IP address of the host it is running on. Your app can discover that IP address via the Downward API.

The container spec contains the following code:

      containers:
      - name: realtimeapp
        image: gbaeke/realtime-go-consul:1.0.0
        env:
        - name: HOST_IP
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: status.hostIP
        - name: CONSUL_HTTP_ADDR
          value: http://$(HOST_IP):8500

The HOST_IP will be set to the IP of the Kubernetes host via a reference to status.hostIP. Next, the environment variable CONSUL_HTTP_ADDR is set to the full HTTP address including port 8500. In your code, you will need to read that environment variable.

Retrieving a key/value pair

Here is some code to read a Consul key/value pair with Go. Full source code is here.

// return a Consul client based on given address
func getConsul(address string) (*consulapi.Client, error) {
	config := consulapi.DefaultConfig()
	config.Address = address
	consul, err := consulapi.NewClient(config)
	return consul, err
}

// get key/value pair from Consul client and passed key name
func getKvPair(client *consulapi.Client, key string) (*consulapi.KVPair, error) {
	kv := client.KV()
	keyPair, _, err := kv.Get(key, nil)
	return keyPair, err
}

func main() {
        // retrieve address of Consul set via downward API in spec
	consulAddress := getEnv("CONSUL_HTTP_ADDR", "")
	if consulAddress == "" {
		log.Fatalf("CONSUL_HTTP_ADDRESS environment variable not set")
	}

        // get Consul client
	consul, err := getConsul(consulAddress)
	if err != nil {
		log.Fatalf("Error connecting to Consul: %s", err)
	}

        // get key/value pair with Consul client
	redisPattern, err := getKvPair(consul, "REDISPATTERN")
	if err != nil || redisPattern == nil {
		log.Fatalf("Could not get REDISPATTERN: %s", err)
	}
	log.Printf("KV: %v %s\n", redisPattern.Key, redisPattern.Value)

... func main() continued...

The comments in the code should be self-explanatory. When the REDISPATTERN key is not set or another error occurs, the program will exit. If REDISPATTERN is set, we can use the value later:

pubsub := client.PSubscribe(string(redisPattern.Value))

Looking up a service

That’s great but how do you look up an address of a service? That’s easy with the following basic code via the catalog:

cat := consul.Catalog()
svc, _, err := cat.Service("redisapp-default", "", nil)
log.Printf("Service address and port: %s:%d\n", svc[0].ServiceAddress, 
  svc[0].ServicePort)

consul is a *consulapi.client obtained earlier. You use the Catalog() function to obtain access to catalog service functionality. In this case, we simply retrieve the address and port value of the Kubernetes service redisapp in the default namespace. We can use that information to connect to our Redis back-end.

Conclusion

It’s easy to get started with Consul on Kubernetes and to write some code to take advantage of it. Be aware though that we only scratched the surface here and that this is both a sample deployment (without TLS, RBAC, etc…) and some sample code. In addition, you should only use Consul in more complex application landscapes with many services to discover, traffic to secure and more. If you do think you need it, you should also take a look at managed Consul on Azure. It runs in your subscription but fully managed by Hashicorp! It can be integrated with Azure Kubernetes Service as well.

In a later post, I will take a look at the service mesh capabilities with Connect.

Azure SQL, Azure Active Directory and Seamless SSO: An Overview

Instead of pure lift-and-shift migrations to the cloud, we often encounter lift-shift-tinker migrations. In such a migration, you modify some of the application components to take advantage of cloud services. Often, that’s the database but it could also be your web servers (e.g. replaced by Azure Web App). When you replace SQL Server on-premises with SQL Server or Managed Instance on Azure, we often get the following questions:

  • How does Azure SQL Database or Managed Instance integrate with Active Directory?
  • How do you authenticate to these databases with an Azure Active Directory account?
  • Is MFA (multi-factor authentication) supported?
  • If the user is logged on with an Active Directory account on a domain-joined computer, is single sign-on possible?

In this post, we will look at two distinct configuration options that can be used together if required:

  • Azure AD authentication to SQL Database
  • Single sign-on to Azure SQL Database from a domain-joined computer via Azure AD Seamless SSO

In what follows, I will provide an overview of the steps. Use the links to the Microsoft documentation for the details. There are many!!! πŸ˜‰

Visually, it looks a bit like below. In the image, there’s an actual domain controller in Azure (extra Active Directory site) for local authentication to Active Directory. Later in this post, there is an example Python app that was run on a WVD host joined to this AD.

Azure AD Authentication

Both Azure SQL Database and Managed Instances can be integrated with Azure Active Directory. They cannot be integrated with on-premises Active Directory (ADDS) or Azure Active Directory Domain Services.

For Azure SQL Database, the configuration is at the SQL Server level:

SQL Database Azure AD integration

You should read the full documentation because there are many details to understand. The account you set as admin can be a cloud-only account. It does not need a specific role. When the account is set, you can logon with that account from Management Studio:

Authentication from Management Studio

There are several authentication schemes supported by Management Studio but the Universal with MFA option typically works best. If your account has MFA enabled, you will be challenged for a second factor as usual.

Once connected with the Azure AD “admin”, you can create contained database users with the following syntax:

CREATE USER [user@domain.com] FROM EXTERNAL PROVIDER;

Note that instead of a single user, you can work with groups here. Just use the group name instead of the user principal name. In the database, the user or group appears in Management Studio like so:

Azure AD user (or group) in list of database users

From an administration perspective, the integration steps are straightforward but you create your users differently. When you migrate databases to the cloud, you will have to replace the references to on-premises ADDS users with references to Azure AD users!

Seamless SSO

Now that Azure AD is integrated with Azure SQL Database, we can configure single sign-on for users that are logged on with Active Directory credentials on a domain-joined computer. Note that I am not discussing Azure AD joined or hybrid Azure AD joined devices. The case I am discussing applies to Windows Virtual Desktop (WVD) as well. WVD devices are domain-joined and need line-of-sight to Active Directory domain controllers.

Note: seamless SSO is of course optional but it is a great way to make it easier for users to connect to your application after the migration to Azure

To enable single sign-on to Azure SQL Database, we will use the Seamless SSO feature of Active Directory. That feature works with both password-synchronization and pass-through authentication. All of this is configured via Azure AD Connect. Azure AD Connect takes care of the synchronization of on-premises identities in Active Directory to an Azure Active Directory tenant. If you are not familiar with Azure AD Connect, please check the documentation as that discussion is beyond the scope of this post.

When Seamless SSO is configured, you will see a new computer account in Active Directory, called AZUREADSSOACC$. You will need to turn on advanced settings in Active Directory Users and Computers to see it. That account is important as it is used to provide a Kerberos ticket to Azure AD. For full details, check the documentation. Understanding the flow depicted below is important:

Seamless Single Sign On - Web app flow
Seamless SSO flow (from Microsoft @ https://docs.microsoft.com/en-us/azure/active-directory/hybrid/how-to-connect-sso-how-it-works)

You should also understand the security implications and rotate the Kerberos secret as discussed in the FAQ.

Before trying SSO to Azure SQL Database, log on to a domain-joined device with an identity that is synced to the cloud. Make sure, Internet Explorer is configured as follows:

Add https://autologon.microsoftazuread-sso.com to the Local Intranet zone

Check the docs for more information about the Internet Explorer setting and considerations for other browsers.

Note: you do not need to configure the Local Intranet zone if you want SSO to Azure SQL Database via ODBC (discussed below)

With the Local Intranet zone configured, you should be able to go to https://myapps.microsoft.com and only provide your Azure AD principal (e.g. first.last@yourdomain.com). You should not be asked to provide your password. If you use https://myapps.microsoft.com/yourdomain.com, you will not even be asked your username.

With that out of the way, let’s see if we can connect to Azure SQL Database using an ODBC connection. Make sure you have installed the latest ODBC Driver for SQL Server on the machine (in my case, ODBC Driver 17). Create an ODBC connection with the Azure SQL Server name. In the next step, you see the following authentication options:

ODBC Driver 17 authentication options

Although all the options for Azure Active Directory should work, we are interested in integrated authentication, based on the credentials of the logged on user. In the next steps, I only set the database name and accepted all the other options as default. Now you can test the data source:

Testing the connection

Great, but what about your applications? Depending on the application, there still might be quite some work to do and some code to change. Instead of opening that can of worms πŸ₯«, let’s see how this integrated connection works from a sample Pyhton application.

Integrated Authentication test with Python

The following Python program uses pyodbc to connect with integrated authentication:

import pyodbc 

server = 'tcp:AZURESQLSERVER.database.windows.net' 
database = 'AZURESQLDATABASE' 

cnxn = pyodbc.connect('DRIVER={ODBC Driver 17 for SQL Server};SERVER='+server+';DATABASE='+database+';authentication=ActiveDirectoryIntegrated')
cursor = cnxn.cursor()

cursor.execute("SELECT * from TEST;") 
row = cursor.fetchone() 
while row: 
    print(row[0])
    row = cursor.fetchone()

My SQL Database contains a simple table called test. The logged on user has read and write access. As you can see, there is no user and password specified. In the connection string, “authentication=ActiveDirectoryIntegrated” is doing the trick. The result is just my name (hey, it’s a test):

Result returned from table

Conclusion

In this post, I have highlighted how single sign-on works for domain-joined devices when you use Azure AD Connect password synchronization in combination with the Seamless SSO feature. This scenario is supported by SQL Server ODBC driver version 17 as shown with the Python code. Although I used SQL Database as an example, this scenario also applies to a managed instance.

Progressive Delivery on Kubernetes: what are your options?

If you have ever deployed an application to Kubernetes, even a simple one, you are probably familiar with deployments. A deployment describes the pods to run, how many of them to run and how they should be upgraded. That last point is especially important because the strategy you select has an impact on the availability of the deployment. A deployment supports the following two strategies:

  • Recreate: all existing pods are killed and new ones are created; this obviously leads to some downtime
  • RollingUpdate: pods are gradually replaced which means there is a period when old and new pods coexist; this can result in issues for stateful pods or if there is no backward compatibility

But what if you want to use other methods such as BlueGreen or Canary? Although you could do that with a custom approach that uses deployments, there are some solution that provide a more automated approach. Below, I discuss two of them briefly. Videos provide a more in depth look.

Argo Rollouts

One of the solutions out there is Argo Rollouts. It is very easy to use. If you want to start slowly, with BlueGreen deployments and manual approval for instance, Argo Rollouts is recommended. It has a nice kubectl plugin and integration with Argo CD, a GitOps solution.

The following video demonstrates BlueGreen deployments:

BlueGreen deployments with Argo Rollouts

This video discusses a canary deployment with Argo Rollouts albeit a simple one without metric analysis:

Canary deployments with Argo Rollouts

This video shows the integration between Argo Rollouts and Argo CD:

Argo CD and Argo Rollouts integration

One thing to note is that, instead of a deployment, you will create a rollout object. It is easy to convert an existing deployment into a rollout. Other tools such as Flagger (see below), provide their functionality on top of an existing deployment.

For traffic splitting and metrics analysis, Argo Rollouts does not support Linkerd. More information about traffic splitting and management can be found here.

Flagger

Flagger, by Weaveworks, is another solution that provides BlueGreen and Canary deployment support to Kubernetes. In the video below, I demonstrate the basic look and feel of doing a canary deployment that includes metric analysis. Linkerd is used for gradual traffic shifting to the canary based on the built-in success rate metric of Linkerd:

Canary release with Flagger and Linkerd

If you want to get started with canary releases and easy traffic splitting and metrics, I suggest using the Flagger and Linkerd combination. This is based simply on the fact that Linkerd is much easier to install and use than Istio. Argo Rollouts in combination with Istio and Prometheus could be used to achieve exactly the same result.

Which one to use?

If you just want BlueGreen deployments with manual approvals, I would suggest using Argo Rollouts. When you integrate it with Argo CD, you can even use the Argo CD UI to promote your deployment. If you are comfortable with Istio and Prometheus, you can go a step further and add metrics analysis to automatically progress your deployment. You can also use a simple Kubernetes job to validate your deployment. Also, note that other metrics providers are supported.

Flagger supports more options for traffic splitting and metrics, due to its support for both Linkerd and Istio. Because Linkerd is so easy to use, Flagger is simpler to get started with canary releases and metrics analysis.

Trying Windows Virtual Desktop

Update: Windows Virtual Desktop 2020 Spring Update brings an new fully ARM-based deployment method and portal experience. As of May 2020, these features are in public preview. Check the documentation. Links to the documentation in this article will also reflect the update.

It was Sunday afternoon. I had some time. So I had this crazy idea to try out Windows Virtual Desktop. Just so you know, I am not really into “the desktop” and all its intricacies. I have done my fair share of Remote Desktop Services, App-V and Citrix in the past but that is probably already a decade ago.

Before moving on, review the terminology like tenants, host pools, app groups, etc…. That can be found here: https://docs.microsoft.com/en-us/azure/virtual-desktop/environment-setup

To start, I will share the (wrong) assumptions I made:

  • “I will join the virtual desktops to Azure Active Directory directly! That way I do not have to set up domain controllers and stuff. That must be supported!” – NOPE, that is not supported. You will need domain controllers synced to the Azure AD instance you will be using. You can use Azure AD Domain Services as well.
  • “I will just use the portal to provision a host pool. I have seen there’s a marketplace item for that called Windows Virtual Desktop – Provision a host pool” – NOPE, it’s not meant to work just like that. Move to the next point…
  • “I will not read the documentation. Why would I? It’s desktops we’re talking about, not Kubernetes or Istio or something!” πŸ˜‰

Let’s start with that last assumption shall we? You should definitely read the documentation, especially the following two pages:

The WVD Tenant

Use the link above to create the tenant and follow the instructions to the letter. A tenant is a group of one or more host pools. The host pools contain desktops and servers that your users will connect to. The host pool provisioning wizard in the portal will ask for this tenant.

You will need Global Admin rights in your Azure AD tenant to create this Windows Virtual Desktop tenant. That’s another problem I had since I do not have those access rights at my current employer. I used an Azure subscription tied to my employer’s Azure AD tenant. To fix that, I created an Azure trial with a new Azure AD tenant where I had full control.

Another problem I bumped into is that the account I created the Azure AD tenant with is an account in the baeke.info domain which will become an external account in the directory. You should not use such an account in the host pool provisioning wizard in the portal because it will fail.

When the tenant is created, you can use PowerShell to get information about it (Ids were changed to protect the innocent):

TenantGroupName : Default Tenant Group
AadTenantId : 1a887615-efcb-2022-9279-b9ada644332c
TenantName : BaekeTenant
Description :
FriendlyName :
SsoAdfsAuthority :
SsoClientId :
SsoClientSecret :
SsoClientSecretType : SharedKey
AzureSubscriptionId : 4d29djus-d120-4bac-8681-5e5e33ab77356
LogAnalyticsWorkspaceId :
LogAnalyticsPrimaryKey :

Great, my tenant is called BaekeTenant and the tenant group is Default Tenant Group. During host pool provisioning, Default Tenant Group is automatically proposed as the tenant group.

Service Principals and role assignments

The service principal is used to automate certain Windows Virtual Desktop management tasks. Follow https://docs.microsoft.com/en-us/azure/virtual-desktop/create-service-principal-role-powershell to the letter to create this service principal. It will need a specific role called RDS Owner, via the following PowerShell command (where $svcPrincipal and $myTenantName are variables from previous steps):

New-RdsRoleAssignment -RoleDefinitionName "RDS Owner" -ApplicationId $svcPrincipal.AppId -TenantName $myTenantName

If that role is not granted, all sorts of things might happen. I had an error after domain join, in the dscextension resource. You can check the ARM deployment for that. It’s green below but it was red quite a few times because I used an account that did not have the role:

So, if that step fails for you, you know where to look. If the joindomain resource fails, it probably has to do with the WVD hosts not being able to find the domain controllers. Check the DNS settings! Also check the AD Join section below.

During host pool provisioning, you will need the following in the last step of the wizard (see further down below):

  • The application ID: the user name of the service principal ($svcPrincipal.AppId)
  • The secret: the password of the service principal
  • The Azure AD tenant ID: you can find it in the Properties page of your Azure AD tenant

Active Directory Join

When you provision the host pool, one of the provisioning steps is joining the Windows 10 or Windows Server system to Active Directory. An Azure Active Directory Join is not supported. Because I did not want to install and configure domain controllers, I used Azure AD Domain Services to create a domain that syncs with my new Azure AD tenant:

Using Azure AD Domain Services: Windows 10 systems in the host pool will join this AD

Of course, you need to make sure that Windows Virtual Desktop machines, use DNS servers that can resolve requests for this domain. You can find this in Properties:

DNS servers to use are 10.0.0.4 and 10.0.0.5 in my case

The virtual network that contains the subnet with the virtual desktops, has the following setting for DNS:

DNS servers of VNET so virtual desktop hosts in the VNET can be joined to the Azure AD Domain Services domain

During host pool provisioning, you will be asked for an account that can do domain joins. You can also specify the domain to join if the domain suffix of the account you specify is different. The account you use should be member of the following group as well:

Member of AAD DC Administrators group

When you have all of this stuff in place, you are finally ready to provision the host pool. To recap:

  • Create a WVD tenant
  • Create a Service Principal that has the RDS Owner role
  • Make sure WVD hosts can join Active Directory (Azure AD join not supported); you can use Azure AD Domain Services if you want
  • Make sure WVD hosts use DNS servers that can resolve the necessary DNS records to find the domain controllers; I used the IP addresses of the Azure AD Domain Services domain controllers here
  • Make sure the account you use to join the domain is member of the AAD DC Administrators group

When all this is done, you are finally ready to use the marketplace item in the portal to provision the host pool:

Use the search bar to find the WVD marketplace item

Here are the screens of how I filled them out:

The basics

You can specify more users. Use a comma as separator. I only added my own account here. You can add more users later with the Add-RdsAppGroupUser, to add a user to the default Desktop Application Group. For example:

Add-RdsAppGroupUser -TenantName "BaekeTenant" -HostPoolName "baekepool" -AppGroupName "Desktop Application Group" -UserPrincipalName "other@azurebaeke.onmicrosoft.com"

And now the second step:

I only want 1 machine to test the features

Next, virtual machine settings:

VM settings

I use the Windows 10 Enterprise multi-session image from the gallery. Note you can use your own images as well. I don’t need to specify a domain because the suffix of the AD domain join UPN will be used: azurebaeke.onmicrosoft.com. The virtual network has DNS configured to use the Azure AD Domain Services servers. The domain join user is member of the AAD DC Administrators group.

And now the final screen:

At last….

The WVD BaekeTenant was created earlier with PowerShell. We also created a service principal with the RDS Owner role so we use that here. Part of this process is the deployment of the WVD session host(s) in the subnet you specified:

Deployed WVD session host

If you followed the Microsoft documentation and some of the tips in this post, you should be able to get to your desktop fairly easily. But how? Just check this to connect from your desktop, or this to connect from the browser. Just don’t try to use mstsc.exe, it’s not supported. End result: finally able to logon to the desktop…

Soon, a more integrated Azure Portal experience is coming that will (hopefully) provide better guidance with sufficient checks and tips to make this whole process a lot smoother.