From MQTT to InfluxDB with Dapr

In a previous post, we looked at using the Dapr InfluxDB component to write data to InfluxDB Cloud. In this post, we will take a look at reading data from an MQTT topic and storing it in InfluxDB. We will use Dapr 0.10, which includes both components.

To get up to speed with Dapr, please read the previous post and make sure you have an InfluxDB instance up and running in the cloud.

If you want to see a video instead:

MQTT to Influx with Dapr

Note that the video sends output to both InfluxDB and Azure SignalR. In addition, the video uses a custom-compiled Dapr 0.8 because I was still developing and testing the InfluxDB component at the time.

MQTT Server

Although there are cloud-based MQTT servers you can use, let’s mix it up a little and run the MQTT server from Docker. If you have Docker installed, type the following:

docker run -it -p 1883:1883 -p 9001:9001 eclipse-mosquitto

The above command runs Mosquitto and exposes port 1883 on your local machine. You can use a tool such as MQTT Explorer to send data. Install MQTT Explorer on your local machine and run it. Create a connection like in the below screenshot:

MQTT Explorer connection

Now, click Connect to connect to Mosquitto. With MQTT, you send data to topics of your choice. Publish a JSON message to a topic called test as shown below:

Publish json data to the test topic

You can now click the topic in the list of topics and see its most recent value:

Subscribing to the test topic

Using MQTT with Dapr

You are now ready to read data from an MQTT topic with Dapr. If you have Dapr installed, you can run the following code to read from the test topic and store the data in InfluxDB:

const express = require('express');
const bodyParser = require('body-parser');

const app = express();
app.use(bodyParser.json());

const port = 3000;

// the mqtt component will post messages from the test topic here
app.post('/mqtt', (req, res) => {
    console.log("MQTT Binding Trigger");
    console.log(req.body);

    // body is expected to contain room and temperature
    let room = req.body.room;
    const temperature = req.body.temperature;

    // room should not contain spaces
    room = room.split(" ").join("_");

    // create message for influx component
    const message = {
        "measurement": "stat",
        "tags": `room=${room}`,
        "values": `temperature=${temperature}`
    };
    
    // send the message to influx output binding
    res.send({
        "to": ["influx"],
        "data": message
    });
});

app.listen(port, () => console.log(`Node App listening on port ${port}!`));

In this example, we use Node.js instead of Python to illustrate that Dapr works with any language. You will also need the package.json below; run npm install to install the dependencies:

{
  "name": "mqttapp",
  "version": "1.0.0",
  "description": "",
  "main": "app.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "author": "",
  "license": "ISC",
  "dependencies": {
    "body-parser": "^1.18.3",
    "express": "^4.16.4"
  }
}

In the previous post about InfluxDB, we used an output binding. You use an output binding by posting data to a Dapr HTTP URI.

To use an input binding like MQTT, you will need to create an HTTP server. Above, we create an HTTP server with Express, and listen on port 3000 for incoming requests. Later, we will instruct Dapr to listen for messages on an MQTT topic and, when a message arrives, post it to our server. We can then retrieve the message from the request body.
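
Before wiring up Dapr, you can simulate the POST that the binding will make in order to check the handler. For example, with Python and the requests package; the port and the room/temperature fields match the code above, the actual values are just examples:

import requests

# simulate the POST that the Dapr mqtt input binding will perform
resp = requests.post("http://localhost:3000/mqtt",
                     json={"room": "living room", "temperature": 21.5})
print(resp.status_code)
print(resp.json())  # should contain the "to" and "data" fields for the influx binding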

To tell Dapr what to do, we’ll create a components folder in the same folder that holds the Node.js code. Put a file in that folder with the following contents:

apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: mqtt
spec:
  type: bindings.mqtt
  metadata:
  - name: url
    value: mqtt://localhost:1883
  - name: topic
    value: test

Above, we configure the MQTT component to listen to the topic test on mqtt://localhost:1883. The name we use (in metadata) is important because it needs to correspond to our HTTP handler route (/mqtt).

Like in the previous post, there’s another file that configures the InfluxDB component:

apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: influx
spec:
  type: bindings.influx
  metadata:
  - name: Url
    value: http://localhost:9999
  - name: Token
    value: ""
  - name: Org
    value: ""
  - name: Bucket
    value: ""

Replace the parameters in the file above with your own.

Saving the MQTT request body to InfluxDB

If you look at the Node.js code, you have probably noticed that we send a response body in the /mqtt handler:

res.send({
        "to": ["influx"],
        "data": message
    });

Dapr accepts responses that include a to and a data field in the JSON body. The above response simply tells Dapr to send the message in the data field to the configured influx component.

Does it work?

Let’s run the code with Dapr to see if it works:

dapr run --app-id mqqtinflux --app-port 3000 --components-path=./components node app.js

In dapr run, we also need to specify the port our app uses. Remember that Dapr will post JSON data to our /mqtt handler!

Let’s post some JSON with the expected fields of temperature and room to our MQTT server:

Posting data to the test topic
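
By the way, if you prefer publishing from code instead of MQTT Explorer, a few lines of Python with the paho-mqtt package (pip install paho-mqtt) do the same; the payload values are again just examples:

import json
import paho.mqtt.publish as publish

# publish a room/temperature reading to the test topic on the local Mosquitto server
payload = json.dumps({"room": "living room", "temperature": 21.5})
publish.single("test", payload=payload, hostname="localhost", port=1883)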

The Dapr logs show the following:

Logs from the APP (appear alongside the Dapr logs)

In InfluxDB Cloud table view:

Data stored in InfluxDB Cloud (posted some other data points before)

Conclusion

Dapr makes it really easy to retrieve data with input bindings and send that data somewhere else with output bindings. There are many other input and output bindings so make sure you check them out on GitHub!

Using the Dapr InfluxDB component

A while ago, I created a component that can write to InfluxDB 2.0 from Dapr. This component is now included in the 0.10 release. In this post, we will briefly look at how you can use it.

If you do not know what Dapr is, take a look at https://dapr.io. I also have some videos about Dapr on YouTube. And be sure to check out the video below as well:

Let’s jump in and use the component.

Installing Dapr

You can install Dapr on Windows, Mac and Linux by following the instructions on https://dapr.io/. Just click the Download link and select your operating system. I installed Dapr on WSL 2 (Windows Subsystem for Linux) on Windows 10 with the following command:

wget -q https://raw.githubusercontent.com/dapr/cli/master/install/install.sh -O - | /bin/bash

The above command just installs the Dapr CLI. To initialize Dapr, you need to run dapr init.

Getting an InfluxDB database

InfluxDB is a time-series database. You can easily run it in a container on your local machine, but you can also use InfluxDB Cloud. In this post, we will simply use a free cloud instance. Head over to https://cloud2.influxdata.com/signup and sign up for an account. Follow the steps and choose the free plan. It stores data for a maximum of 30 days and has some other limits as well.

You will need the following information to write data to InfluxDB:

  • Organization: this will be set to the e-mail account you signed up with; it can be renamed if you wish
  • Bucket: your data is stored in a bucket; by default you get a bucket called e-mail-prefix’s Bucket (e.g. geert.baeke’s Bucket)
  • Token: you need a token that provides the necessary access rights such as read and/or write

Let’s rename the bucket to get a feel for the user interface. Click Data, Buckets and then Settings as shown below:

Getting to the bucket settings

Click Rename and follow the steps to rename the bucket:

Renaming the bucket

Now, let’s create a token. In the Load Data screen, click Tokens. Click Generate and then click Read/Write Token. Describe the token and create it like below:

Creating a token

Now click the token you created and copy it to the clipboard. You now have the organization name, a bucket name and a token. You still need a URL to connect to, but that is just the URL you see in the browser (the yellow part):

URL to send your data

Your URL will depend on the cloud you use.

Python code to write to InfluxDB with Dapr

The code below requires Python 3. I used version 3.6.9 but it will work with more recent versions of course.

import time
import requests
import os

# Dapr sidecar HTTP port (default 3500)
dapr_port = os.getenv("DAPR_HTTP_PORT", 3500)

# URI of the influx output binding exposed by the Dapr sidecar
dapr_url = "http://localhost:{}/v1.0/bindings/influx".format(dapr_port)
n = 0.0
while True:
    n += 1.0
    # measurement, tags and values are combined into the InfluxDB line protocol
    payload = {
        "data": {
            "measurement": "temp",
            "tags": "room=dorm,building=building-a",
            "values": "sensor=\"sensor X\",avg={},max={}".format(n, n*2)
            },
        "operation": "create"
    }
    print(payload, flush=True)
    try:
        response = requests.post(dapr_url, json=payload)
        print(response, flush=True)

    except Exception as e:
        print(e, flush=True)

    time.sleep(1)

The code above is just an illustration of using the InfluxDB output binding from Dapr. It is crucial to understand that the program does not talk to InfluxDB directly: it communicates with a Dapr process that needs to be running, either locally on your system or as a Kubernetes sidecar. To that end, we get the Dapr HTTP port from an environment variable or fall back to the default port 3500.

The Python program uses the InfluxDB output binding simply by posting data to an HTTP endpoint. The endpoint is constructed as follows:

dapr_url = "http://localhost:{}/v1.0/bindings/influx".format(dapr_port)

The dapr_url above points to localhost on the Dapr port and selects the influx binding by appending /v1.0/bindings/influx. All bindings have a specific name like influx, mqtt, etc… and that name is appended to /v1.0/bindings/ to make the call work.

So far so good, but how does the binding know where to connect and which organization, bucket and token to use? That's where the component .yaml file comes in. In the same folder where you save your Python code, create a folder called components. In that folder, create a file called influx.yaml (you can give it any name you want). The contents of influx.yaml are shown below:

apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: influx
spec:
  type: bindings.influx
  metadata:
  - name: Url
    value: YOUR URL
  - name: Token
    value: "YOUR TOKEN HERE"
  - name: Org
    value: "YOUR ORG"
  - name: Bucket
    value: "YOUR BUCKET"

Of course, replace the uppercase values above with your own. We will later tell Dapr to look for files like this in the components folder. Because your Python code posts to the influx binding, Dapr automatically looks up the component with that name (of type bindings.influx) and retrieves the required metadata. If any of the metadata is not set, or if the file is missing or improperly formatted, you will get an error.

To actually use the binding, we need to post some data to the URI we constructed. The data we send is in the payload variable as shown below:

 payload = { 
        "data": {
            "measurement": "temp",
            "tags": "room=dorm,building=building-a",
            "values": "sensor=\"sensor X\",avg={},max={}".format(n, n*2)
            }, 
        "operation": "create" 
    }

It requires a measurement field, a tags field and a values field, which the component turns into the InfluxDB line protocol to send the data; for the payload above, that comes down to something like temp,room=dorm,building=building-a sensor="sensor X",avg=1,max=2. You can find more information about the line protocol here.

The data field in the payload is specific to the Influx component. The operation field is required by this Dapr component as it is written to listen for create operations.

Running the code

On your local machine, you will need to run Dapr together with your code to make it work. You use dapr run for this. To run the Python code (saved to app.py in my case), run the command below from the folder that contains the code and the components folder:

dapr run --app-id influx -d ./components python3 app.py

This starts Dapr and our application with app id influx. With -d, we point to the components folder.

When you run the code, Dapr logs and your logs will be printed to the screen. In InfluxDB Cloud, we can check the data from the user interface:

Data Explorer (Note: other organization and bucket than the one used in this post)

Conclusion

Dapr can be used in the cloud and at the edge, in containers or without. In both cases, you often have to write data to databases. With Dapr, you can now easily write data as time series to InfluxDB. Note that Dapr also has an MQTT input and output binding. Using the same simple technique you learned in this post, you can easily read data from an MQTT topic and forward it to InfluxDB. In a later post, we will take a look at that scenario as well. Or check this video instead: https://youtu.be/2vCT79KG24E. Note that the video uses a custom compiled Dapr 0.8 with the InfluxDB component because this video was created during development.

First Look at Azure Static Web Apps

Note: part 2 looks at the authentication and authorization part.

At Build 2020, Microsoft announced Azure Static Web Apps, a new way to host static web apps on Azure. In the past, static web apps, which are just a combination of HTML, JavaScript and CSS, could be hosted in a Storage Account or a regular Azure Web App.

When you compare Azure Static Web Apps with the Storage Account approach, you will notice there are many more features. Some of those features are listed below (also check the docs):

  • GitHub integration: GitHub actions are configured for you to easily deploy your app from your GitHub repository to Azure Static Web Apps
  • Integrated API support: APIs are provided by Azure Functions with an HTTP Trigger
  • Authentication support for Azure Active Directory, GitHub and other providers
  • Authorization role definitions via the portal and a roles.json file in your repository
  • Staging versions based on a pull request

It all works together as shown below:

Azure Static Web Apps (from https://techcommunity.microsoft.com/t5/azure-app-service/introducing-app-service-static-web-apps/ba-p/1394451)

As a Netlify user, this type of functionality is not new to me. Next to static site hosting, they also provide serverless functions, identity etc…

If you are more into video tutorials…

Creating the app and protecting calls to the API

Let’s check out an example to see how it works on Azure…

GitHub repository

The GitHub repo I used is over at https://github.com/gbaeke/az-static-web-app. You will already see the .github/workflows folder that contains the .yml file that defines the GitHub Actions. That folder will be created for you when you create the Azure Static Web App.

The static web app in this case is a simple index.html that contains HTML, JavaScript and some styling. Vue.js is used as well. When you are authenticated, the application reads a list of devices from Cosmos DB. When you select a device, the application connects to a socket.io server, waiting for messages from the chosen device. The messages themselves come from Redis. Note that the socket.io server and Redis configuration are not described in this post. Here's a screenshot of the app with a message from device01. User gbaeke is authenticated via GitHub. When authenticated, the device list is populated; when you log out, the device list is empty. There's no error checking here, so when the device list cannot be populated, you will see a 404 error in the console. 😉

Azure Static Web App in action

Note: Azure Static Web Apps provides a valid certificate for your app, whether it uses a custom domain or not. In the above screenshot, Not secure is shown because the application connects to the socket.io server over HTTP and Mixed Content is allowed. That is easy to fix by enabling SSL on the socket.io server, but I chose not to configure that.

The API

Although API is probably too big a word for it, the devices drop-down list obtains its data from Cosmos DB, via an Azure Function. It was added from Visual Studio Code as follows:

  • add the api folder to your project
  • add a new Function Project and choose the api folder: simply use F1 in Visual Studio Code and choose Azure Functions: Create New Project… You will be asked for the folder. Choose api.
  • modify the code of the Function App to request data from Cosmos DB

To add an Azure Function in Visual Studio Code, make sure you install the Azure Functions extension and the Azure Functions Core Tools. I installed the Linux version of Core Tools in WSL 2.

Adding the function (JavaScript; HTTP Trigger, anonymous, name of GetDevice) should result in the following structure:

Function app as part of the static web app (api folder)

Next, I modified function.json to include a Cosmos DB input next to the existing HTTP input and output:

{
  "bindings": [
    {
      "authLevel": "anonymous",
      "type": "httpTrigger",
      "direction": "in",
      "name": "req",
      "methods": [
        "get",
        "post"
      ],
      "route": "device"
    },
    {
      "type": "http",
      "direction": "out",
      "name": "res"
    },
    {
      "name": "devices",
      "type": "cosmosDB",
      "direction": "in",
      "databaseName": "geba",
      "collectionName": "devices",
      "sqlQuery": "SELECT c.id, c.room FROM c",
      "connectionStringSetting": "CosmosDBConnection"    
    }
  ]
}

In my case, I have a Cosmos DB database geba with a devices collection. Device documents contain an id and room field which simply get selected with the query: SELECT c.id, c.room FROM c.

Note: with route set to device, the API will need to be called with /api/device instead of /api/GetDevice.

The actual function in index.js is kept as simple as possible:

module.exports = async function (context, req) {
    context.log('Send devices from Cosmos');
  
    context.res = {
        // status: 200, /* Defaults to 200 */
        body: context.bindings.devices
    };
    
};

Yes, the above code is all that is required to retrieve the JSON output of the Cosmos DB query and set it as the HTTP response.

Note that local.settings.json contains the Cosmos DB connection string in CosmosDBConnection:

{
  "IsEncrypted": false,
  "Values": {
    "AzureWebJobsStorage": "",
    "FUNCTIONS_WORKER_RUNTIME": "node",
    "CosmosDBConnection": "AccountEndpoint=https://geba-cosmos.documents.a...;"
  }
}

You will have to make sure the Cosmos DB connection string is made known to Azure Static Web App later. During local testing, local.settings.json is used to retrieve it. local.settings.json is automatically added to .gitignore to not push it to the remote repository.

Local Testing

We can test the app locally with the Live Server extension. But first, modify .vscode/settings.json and add a proxy for your api:

"liveServer.settings.proxy": {
        "enable": true,
        "baseUri": "/api",
        "proxyUri": "http://172.28.242.32:7071/api"
    }

With the above setting, a call to /api via Live Server will be proxied to Azure Functions on your local machine. Note that the IP address refers to the IP address of WSL 2 on my Windows 10 machine. Find it by running ifconfig in WSL 2.

Before we can test the application locally, start your function app by pressing F5. You should see:

Function App started locally
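
With the Functions host running, you can optionally check the API directly, outside of Live Server. The sketch below assumes the host listens on its default port 7071 (the same port used in the proxy setting above) and uses the device route from function.json; from a Windows session, replace localhost with the WSL 2 IP address:

import requests

# call the local Azure Function directly to verify the Cosmos DB query works
devices = requests.get("http://localhost:7071/api/device").json()
print(devices)  # list of documents with id and room fields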

Now go to index.html, right click and select Open with Live Server. The populated list of devices shows that the query to Cosmos DB works and that the API is working locally:

Test the static web app and API locally

Notes on using WSL 2:

  • for some reason, http://localhost:5500/index.html (Live Server running in WSL 2) did not work from the Windows session although it should; in the screenshot above, you see I replaced localhost with the IP address of WSL 2
  • time skew can be an issue with WSL 2; if you get an error during the Cosmos DB query of authorization token is not valid at the current time, perform a time sync with ntpdate time.windows.com from your WSL 2 session

Deploy the Static Web App

Create a new Static Web App in the portal. The first screen will be similar to the one below:

Static Web App wizard first screen

You will need to authenticate to GitHub and choose your repository and branch as shown above. Click Next. Fill in the Build step as follows:

Static Web App wizard second screen

Our app will indeed run off the root. We are not using a framework that outputs a build to a folder like dist so you can leave the artifact location blank. We are just serving index.html off the root.

Complete the steps for the website to be created. Your GitHub Action will be created and run for the first time. You can easily check the GitHub Action runs from the Overview screen:

Checking the GitHub Action runs

Here’s an example of a GitHub action run:

A GitHub Action run

When the GitHub Action is finished, your website will be available on a URL provided by Azure Static Web Apps. In my case: https://polite-cliff-01b6ab303.azurestaticapps.net.

To make sure the connection to Cosmos DB works, add an Application Setting via Configuration:

Adding the Cosmos DB connection string

The Function App that previously obtained the Cosmos DB connection string from local.settings.json can now retrieve the value from Application Settings. Note that you can also change these settings via Azure CLI.

Conclusion

In this post, we created a simple web app in combination with a function app that serves as the API. You can easily create and test the web app and function app locally with the help of Live Server and a Live Server proxy. Setting up the web app is easy via the Azure Portal, which also creates a GitHub Action that takes care of deployment for you. In the next post, we will take a look at enabling authentication via the GitHub identity provider and only allowing authorized users to retrieve the list of devices.

Azure SQL, Azure Active Directory and Seamless SSO: An Overview

Instead of pure lift-and-shift migrations to the cloud, we often encounter lift-shift-tinker migrations. In such a migration, you modify some of the application components to take advantage of cloud services. Often, that's the database, but it could also be your web servers (e.g. replaced by Azure Web App). When you replace on-premises SQL Server with Azure SQL Database or Managed Instance, we often get the following questions:

  • How does Azure SQL Database or Managed Instance integrate with Active Directory?
  • How do you authenticate to these databases with an Azure Active Directory account?
  • Is MFA (multi-factor authentication) supported?
  • If the user is logged on with an Active Directory account on a domain-joined computer, is single sign-on possible?

In this post, we will look at two distinct configuration options that can be used together if required:

  • Azure AD authentication to SQL Database
  • Single sign-on to Azure SQL Database from a domain-joined computer via Azure AD Seamless SSO

In what follows, I will provide an overview of the steps. Use the links to the Microsoft documentation for the details. There are many!!! 😉

Visually, it looks a bit like below. In the image, there’s an actual domain controller in Azure (extra Active Directory site) for local authentication to Active Directory. Later in this post, there is an example Python app that was run on a WVD host joined to this AD.

Azure AD Authentication

Both Azure SQL Database and Managed Instances can be integrated with Azure Active Directory. They cannot be integrated with on-premises Active Directory (ADDS) or Azure Active Directory Domain Services.

For Azure SQL Database, the configuration is at the SQL Server level:

SQL Database Azure AD integration

You should read the full documentation because there are many details to understand. The account you set as admin can be a cloud-only account. It does not need a specific role. When the account is set, you can logon with that account from Management Studio:

Authentication from Management Studio

There are several authentication schemes supported by Management Studio but the Universal with MFA option typically works best. If your account has MFA enabled, you will be challenged for a second factor as usual.

Once connected with the Azure AD “admin”, you can create contained database users with the following syntax:

CREATE USER [user@domain.com] FROM EXTERNAL PROVIDER;

Note that instead of a single user, you can work with groups here. Just use the group name instead of the user principal name. In the database, the user or group appears in Management Studio like so:

Azure AD user (or group) in list of database users

From an administration perspective, the integration steps are straightforward but you create your users differently. When you migrate databases to the cloud, you will have to replace the references to on-premises ADDS users with references to Azure AD users!

Seamless SSO

Now that Azure AD is integrated with Azure SQL Database, we can configure single sign-on for users that are logged on with Active Directory credentials on a domain-joined computer. Note that I am not discussing Azure AD joined or hybrid Azure AD joined devices. The case I am discussing applies to Windows Virtual Desktop (WVD) as well. WVD devices are domain-joined and need line-of-sight to Active Directory domain controllers.

Note: seamless SSO is of course optional but it is a great way to make it easier for users to connect to your application after the migration to Azure

To enable single sign-on to Azure SQL Database, we will use the Seamless SSO feature of Active Directory. That feature works with both password-synchronization and pass-through authentication. All of this is configured via Azure AD Connect. Azure AD Connect takes care of the synchronization of on-premises identities in Active Directory to an Azure Active Directory tenant. If you are not familiar with Azure AD Connect, please check the documentation as that discussion is beyond the scope of this post.

When Seamless SSO is configured, you will see a new computer account in Active Directory, called AZUREADSSOACC$. You will need to turn on advanced settings in Active Directory Users and Computers to see it. That account is important as it is used to provide a Kerberos ticket to Azure AD. For full details, check the documentation. Understanding the flow depicted below is important:

Seamless Single Sign On - Web app flow
Seamless SSO flow (from Microsoft @ https://docs.microsoft.com/en-us/azure/active-directory/hybrid/how-to-connect-sso-how-it-works)

You should also understand the security implications and rotate the Kerberos secret as discussed in the FAQ.

Before trying SSO to Azure SQL Database, log on to a domain-joined device with an identity that is synced to the cloud. Make sure Internet Explorer is configured as follows:

Add https://autologon.microsoftazuread-sso.com to the Local Intranet zone

Check the docs for more information about the Internet Explorer setting and considerations for other browsers.

Note: you do not need to configure the Local Intranet zone if you want SSO to Azure SQL Database via ODBC (discussed below)

With the Local Intranet zone configured, you should be able to go to https://myapps.microsoft.com and only provide your Azure AD principal (e.g. first.last@yourdomain.com). You should not be asked to provide your password. If you use https://myapps.microsoft.com/yourdomain.com, you will not even be asked your username.

With that out of the way, let’s see if we can connect to Azure SQL Database using an ODBC connection. Make sure you have installed the latest ODBC Driver for SQL Server on the machine (in my case, ODBC Driver 17). Create an ODBC connection with the Azure SQL Server name. In the next step, you see the following authentication options:

ODBC Driver 17 authentication options

Although all the options for Azure Active Directory should work, we are interested in integrated authentication, based on the credentials of the logged on user. In the next steps, I only set the database name and accepted all the other options as default. Now you can test the data source:

Testing the connection

Great, but what about your applications? Depending on the application, there still might be quite some work to do and some code to change. Instead of opening that can of worms 🥫, let's see how this integrated connection works from a sample Python application.

Integrated Authentication test with Python

The following Python program uses pyodbc to connect with integrated authentication:

import pyodbc

# Azure SQL server and database to connect to
server = 'tcp:AZURESQLSERVER.database.windows.net'
database = 'AZURESQLDATABASE'

# no user or password: ActiveDirectoryIntegrated authenticates as the logged on AD user
cnxn = pyodbc.connect('DRIVER={ODBC Driver 17 for SQL Server};SERVER='+server+';DATABASE='+database+';authentication=ActiveDirectoryIntegrated')
cursor = cnxn.cursor()

# print the first column of every row in the TEST table
cursor.execute("SELECT * from TEST;")
row = cursor.fetchone()
while row:
    print(row[0])
    row = cursor.fetchone()

My SQL Database contains a simple table called test. The logged on user has read and write access. As you can see, there is no user or password specified. In the connection string, "authentication=ActiveDirectoryIntegrated" is doing the trick. The result is just my name (hey, it's a test):

Result returned from table

Conclusion

In this post, I have highlighted how single sign-on works for domain-joined devices when you use Azure AD Connect password synchronization in combination with the Seamless SSO feature. This scenario is supported by SQL Server ODBC driver version 17 as shown with the Python code. Although I used SQL Database as an example, this scenario also applies to a managed instance.

Update to IoT Simulator

Quite a while ago, I wrote a small IoT Simulator in Go that creates or deletes multiple IoT devices in IoT Hub and sends telemetry at a preset interval. However, when you use version 0.4 of the simulator, you will encounter issues in the following cases:

  • You create a route to store telemetry in an Azure Storage account: the telemetry will be base 64 encoded
  • You create an Event Grid subscription that forwards the telemetry to an Azure Function or other target: the telemetry will be base 64 encoded

For example, in Azure Storage, when you store telemetry in JSON format, you will see something like this with versions 0.4 and older:

{"EnqueuedTimeUtc":"2020-02-10T14:13:19.0770000Z","Properties":{},"SystemProperties":{"connectionDeviceId":"dev35","connectionAuthMethod":"{\"scope\":\"hub\",\"type\":\"sas\",\"issuer\":\"iothub\",\"acceptingIpFilterRule\":null}","connectionDeviceGenerationId":"637169341138506565","contentType":"application/json","contentEncoding":"","enqueuedTime":"2020-02-10T14:13:19.0770000Z"},"Body":"eyJUZW1wZXJhdHVyZSI6MjYuNjQ1NjAwNTMyMTg0OTA0LCJIdW1pZGl0eSI6NDQuMzc3MTQxODcxODY5OH0="}

Note that the body is base64-encoded. The encoding stems from the fact that UTF-8 encoding was not specified, as you can see in the JSON: contentEncoding is empty and contentType does not mention the character set.
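
If you want to check what is inside such a Body field yourself, decoding it takes just a few lines of Python:

import base64
import json

# Body field from the JSON above
body = "eyJUZW1wZXJhdHVyZSI6MjYuNjQ1NjAwNTMyMTg0OTA0LCJIdW1pZGl0eSI6NDQuMzc3MTQxODcxODY5OH0="

# decoding reveals the original telemetry with Temperature and Humidity
print(json.loads(base64.b64decode(body)))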

To fix that, a small code change was required. Note that the code uses HTTP to send telemetry, not MQTT or AMQP:

Setting the character set as part of the content type

With the character set as UTF-8, the telemetry in the Storage Account will look like this:

{"EnqueuedTimeUtc":"2020-02-11T15:02:07.9520000Z","Properties":{},"SystemProperties":{"connectionDeviceId":"dev15","connectionAuthMethod":"{\"scope\":\"hub\",\"type\":\"sas\",\"issuer\":\"iothub\",\"acceptingIpFilterRule\":null}","connectionDeviceGenerationId":"637169341138088841","contentType":"application/json; charset=utf-8","contentEncoding":"","enqueuedTime":"2020-02-11T15:02:07.9520000Z"},"Body":{"Temperature":20.827852028684607,"Humidity":49.95058826575425}}

Note that contentEncoding is still empty here, but contentType includes the charset. That is enough for the body to be in plain text.

The change will also allow you to use queries on the body in IoT Hub message routing filters or Event Grid subscription filters (for example, a routing query like $body.Temperature > 25 can now be evaluated).

Enjoy the new version 0.5! All three of you… 😉😉😉

Writing a Kubernetes operator with Kopf

In today’s post, we will write a simple operator with Kopf, which is a Python framework created by Zalando. A Kubernetes operator is a piece of software, running in Kubernetes, that does something application specific. To see some examples of what operators are used for, check out operatorhub.io.

Our operator will do something simple in order to easily grasp how it works:

  • the operator will create a deployment that runs nginx
  • nginx will serve a static website based on a git repository that you specify; we will use an init container to grab the website from git and store it in a volume
  • you can control the number of instances via a replicas parameter

That’s great but how will the operator know when it has to do something, like creating or updating resources? We will use custom resources for that. Read on to learn more…

Note: source files are on GitHub

Custom Resource Definition (CRD)

Kubernetes allows you to define your own resources. We will create a resource of type (kind) DemoWeb. The CRD is created with the YAML below:

# A simple CRD to deploy a demo website from a git repo
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: demowebs.baeke.info
spec:
  scope: Namespaced
  group: baeke.info
  versions:
    - name: v1
      served: true
      storage: true
  names:
    kind: DemoWeb
    plural: demowebs
    singular: demoweb
    shortNames:
      - dweb
  additionalPrinterColumns:
    - name: Replicas
      type: string
      priority: 0
      JSONPath: .spec.replicas
      description: Amount of replicas
    - name: GitRepo
      type: string
      priority: 0
      JSONPath: .spec.gitrepo
      description: Git repository with web content

For more information (and there is a lot) about CRDs, see the documentation.

Once you create the above resource with kubectl apply (or create), you can create a custom resource based on the definition:

apiVersion: baeke.info/v1
kind: DemoWeb
metadata:
  name: demoweb1
spec:
  replicas: 2
  gitrepo: "https://github.com/gbaeke/static-web.git"

Note that we specified our own API group and version in the CRD (baeke.info/v1) and that we set the kind to DemoWeb. In additionalPrinterColumns, we defined some properties from the spec that will also be printed on screen. When you list resources of kind DemoWeb, you will see the Replicas and GitRepo columns:

Custom resources based on the DemoWeb CRD

Of course, creating the CRD and the custom resources is not enough. To actually create the nginx deployment when the custom resource is created, we need to write and run the operator.

Writing the operator

I wrote the operator on a Mac with Python 3.7.6 (64-bit). On Windows, for best results, make sure you use Miniconda instead of Python from the Windows Store. First install Kopf and the Kubernetes package:

pip3 install kopf kubernetes

Verify you can run kopf:

Running kopf

Let’s write the operator. You can find it in full here. Here’s the first part:

Naturally, we import kopf and other necessary packages. As noted before, kopf and kubernetes will have to be installed with pip. Next, we define a handler that runs whenever a resource of our custom type is spotted by the operator (with the @kopf.on.create decorator). The handler has two parameters:

  • spec object: allows us to retrieve our custom properties with spec.get (e.g. spec.get(‘replicas’, 1) – the second parameter is the default value)
  • **kwargs: a dictionary with lots of extra values we can use; we use it to retrieve the name of our custom resource (e.g. demoweb1); we can use that name to derive the name of our deployment and to set labels for our pods

Note: instead of using **kwargs to retrieve the name, you can also define an extra name parameter in the handler like so: def create_fn(spec, name, **kwargs); see the docs for more information

Our deployment is just yaml stored in the doc variable with some help from the Python yaml package. We use spec.get and the name variable to customise it.

After the doc variable, the following code completes the event handler:

The rest of the operator

With kopf.adopt, we make sure the deployment we create is a child of our custom resource. When we delete the custom resource, its children are also deleted.

Next, we simply use the kubernetes client to create a deployment via the apps/v1 api. The method create_namespaced_deployment takes two required parameters: the namespace and the deployment specification. Note there is only minimal error checking here. There is much more you can do with regards to error checking, retries, etc…
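
Because the handler itself is only shown as screenshots above, here is a rough, self-contained sketch of what with_create.py could look like, based on the description in this post. The real code is in the GitHub repo; details such as the alpine/git init container image, the deployment name and the labels are assumptions on my part:

# sketch of with_create.py: create an nginx deployment for each DemoWeb resource
import kopf
import kubernetes
import yaml

try:
    kubernetes.config.load_incluster_config()  # running as a pod in the cluster
except kubernetes.config.ConfigException:
    kubernetes.config.load_kube_config()       # running locally against your kube config

@kopf.on.create('baeke.info', 'v1', 'demowebs')
def create_fn(spec, name, namespace, logger, **kwargs):
    replicas = spec.get('replicas', 1)
    gitrepo = spec.get('gitrepo')

    # deployment: an init container clones the git repo into a shared volume,
    # nginx then serves that volume as its web root
    doc = yaml.safe_load(f"""
        apiVersion: apps/v1
        kind: Deployment
        metadata:
          name: {name}-deployment
          labels:
            app: {name}
        spec:
          replicas: {replicas}
          selector:
            matchLabels:
              app: {name}
          template:
            metadata:
              labels:
                app: {name}
            spec:
              initContainers:
              - name: git-clone
                image: alpine/git
                args: ["clone", "{gitrepo}", "/web"]
                volumeMounts:
                - name: web
                  mountPath: /web
              containers:
              - name: nginx
                image: nginx:alpine
                volumeMounts:
                - name: web
                  mountPath: /usr/share/nginx/html
              volumes:
              - name: web
                emptyDir: {{}}
    """)

    # make the deployment a child of the DemoWeb resource so it is deleted with it
    kopf.adopt(doc)

    # create the deployment via the apps/v1 API
    api = kubernetes.client.AppsV1Api()
    api.create_namespaced_deployment(namespace=namespace, body=doc)
    logger.info(f"deployment {name}-deployment created with {replicas} replica(s)")

The flow is what matters: read the custom properties from the spec, build the deployment yaml, make it a child of the custom resource with kopf.adopt and create it with the Kubernetes client.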

Now we can run the operator with:

kopf run operator-filename.py

You can perfectly run this on your local workstation if you have a working kube config pointing at a running cluster with the CRD installed. Kopf will automatically use that for authentication:

Running the operator on your workstation

Running the operator in your cluster

To run the operator in your cluster, create a Dockerfile that produces an image with Python, kopf, kubernetes and your operator code. In my case:

FROM python:3.7
RUN mkdir /src
ADD with_create.py /src
RUN pip install kopf
RUN pip install kubernetes
CMD kopf run /src/with_create.py --verbose

We added the verbose parameter for extra logging. Next, run the following commands to build and push the image (example with my image name):

docker build -t gbaeke/kopf-demoweb .
docker push gbaeke/kopf-demoweb

Now you can deploy the operator to the cluster:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: demowebs-operator
spec:
  replicas: 1
  strategy:
    type: Recreate
  selector:
    matchLabels:
      application: demowebs-operator
  template:
    metadata:
      labels:
        application: demowebs-operator
    spec:
      serviceAccountName: demowebs-account
      containers:
      - name: demowebs
        image: gbaeke/kopf-demoweb

The above is just a regular deployment, but the serviceAccountName is extremely important. It gives kopf and your operator the required access rights to create the deployment in the target namespace. Check out the documentation to find out more about the creation of the service account and the required roles. Note that you should only run one instance of the operator!

Once the operator is deployed, you will see it running as a normal pod:

The operator is running

To see what is going on, check the logs. Let’s show them with octant:

Your operator logs

At the bottom, you see what happens when a creation event is detected for a resource of type DemoWeb. The spec is shown with the git repository and the number of replicas.

Now you can create resources of kind DemoWeb and see what happens. If you have your own git repository with some HTML in it, try to use that. Otherwise, just use mine at https://github.com/gbaeke/static-web.

Conclusion

Writing an operator is easy to do with the Kopf framework. Do note that we only touched on the basics to get started. We only have an on.create handler, and no on.update handler. So if you want to increase the number of replicas, you will have to delete the custom resource and create a new one. Based on the example though, it should be pretty easy to fix that. The git repo contains an example of an operator that also implements the on.update handler (with_update.py).
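
For reference, an on.update handler could look something like the sketch below. This is again an assumption on my part (the repo's with_update.py is the authoritative version); it reuses the imports and the deployment name from the earlier sketch and, for simplicity, only patches the replica count:

@kopf.on.update('baeke.info', 'v1', 'demowebs')
def update_fn(spec, name, namespace, logger, **kwargs):
    replicas = spec.get('replicas', 1)

    # patch the deployment created by the on.create handler with the new replica count
    api = kubernetes.client.AppsV1Api()
    api.patch_namespaced_deployment(
        name=f"{name}-deployment",
        namespace=namespace,
        body={"spec": {"replicas": replicas}},
    )
    logger.info(f"scaled {name}-deployment to {replicas} replica(s)")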

A quick tour of Kustomize

Image above from: https://kustomize.io/

When you have to deploy an application to multiple environments like dev, test and production there are many solutions available to you. You can manually deploy the app (Nooooooo! 😉), use a CI/CD system like Azure DevOps and its release pipelines (with or without Helm) or maybe even a “GitOps” approach where deployments are driven by a tool such as Flux or Argo based on a git repository.

In the latter case, you probably want to use a configuration management tool like Kustomize for environment management. Instead of explaining what it does, let’s take a look at an example. Suppose I have an app that can be deployed with the following yaml files:

  • redis-deployment.yaml: simple deployment of Redis
  • redis-service.yaml: service to connect to Redis on port 6379 (Cluster IP)
  • realtime-deployment.yaml: application that uses the socket.io library to display real-time updates coming from a Redis channel
  • realtime-service.yaml: service to connect to the socket.io application on port 80 (Cluster IP)
  • realtime-ingress.yaml: ingress resource that defines the hostname and TLS certificate for the socket.io application (works with nginx ingress controller)

Let’s call this collection of files the base and put them all in a folder:

Base files for the application

Now I would like to modify these files just a bit, to install them in a dev namespace called realtime-dev. In the ingress definition I want to change the name of the host to realdev.baeke.info instead of real.baeke.info for production. We can use Kustomize to reach that goal.

In the base folder, we can add a kustomization.yaml file like so:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- realtime-ingress.yaml
- realtime-service.yaml
- redis-deployment.yaml
- redis-service.yaml
- realtime-deployment.yaml

This lists all the resources we would like to deploy.

Now we can create a folder for our patches. The patches define the changes to the base. Create a folder called dev (next to base). We will add the following files (one file blurred because it’s not relevant to this post):

kustomization.yaml contains the following:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: realtime-dev
resources:
- ./namespace.yaml
bases:
- ../base
patchesStrategicMerge:
- realtime-ingress.yaml
 

The namespace: realtime-dev ensures that our base resource definitions are updated with that namespace. In resources, we ensure that namespace gets created. The file namespace.yaml contains the following:

apiVersion: v1
kind: Namespace
metadata:
  name: realtime-dev 

With patchesStrategicMerge we specify the file(s) that contain(s) our patches, in this case just realtime-ingress.yaml to modify the hostname:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    kubernetes.io/ingress.class: nginx
  name: realtime-ingress
spec:
  rules:
  - host: realdev.baeke.info
    http:
      paths:
      - backend:
          serviceName: realtime
          servicePort: 80
        path: /
  tls:
  - hosts:
    - realdev.baeke.info
    secretName: real-dev-baeke-info-tls

Note that we also use certmanager here to issue a certificate to use on the ingress. For dev environments, it is better to use the Let’s Encrypt staging issuer instead of the production issuer.

We are now ready to generate the manifests for the dev environment. From the parent folder of base and dev, run the following command:

kubectl kustomize dev

The above command generates the patched manifests like so:

apiVersion: v1 
kind: Namespace
metadata:      
  name: realtime-dev
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: realtime
  name: realtime
  namespace: realtime-dev
spec:
  ports:
  - port: 80
    targetPort: 8080
  selector:
    app: realtime
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: redis
  name: redis
  namespace: realtime-dev
spec:
  ports:
  - port: 6379
    targetPort: 6379
  selector:
    app: redis
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: realtime
  name: realtime
  namespace: realtime-dev
spec:
  replicas: 1
  selector:
    matchLabels:
      app: realtime
  template:
    metadata:
      labels:
        app: realtime
    spec:
      containers:
      - env:
        - name: REDISHOST
          value: redis:6379
        image: gbaeke/fluxapp:1.0.5
        name: realtime
        ports:
        - containerPort: 8080
        resources:
          limits:
            cpu: 150m
            memory: 150Mi
          requests:
            cpu: 25m
            memory: 50Mi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: redis
  name: redis
  namespace: realtime-dev
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
      - image: redis:4-32bit
        name: redis
        ports:
        - containerPort: 6379
        resources:
          requests:
            cpu: 200m
            memory: 100Mi
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    kubernetes.io/ingress.class: nginx
  name: realtime-ingress
  namespace: realtime-dev
spec:
  rules:
  - host: realdev.baeke.info
    http:
      paths:
      - backend:
          serviceName: realtime
          servicePort: 80
        path: /
  tls:
  - hosts:
    - realdev.baeke.info
    secretName: real-dev-baeke-info-tls

Note that namespace realtime-dev is used everywhere and that the Ingress resource uses realdev.baeke.info. The original Ingress resource looked like below:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: realtime-ingress
  annotations:
    kubernetes.io/ingress.class: nginx
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
spec:
  tls:
  - hosts:
    - real.baeke.info
    secretName: real-baeke-info-tls
  rules:
  - host: real.baeke.info
    http:
      paths:
      - path: /
        backend:
          serviceName: realtime
          servicePort: 80 

As you can see, Kustomize has updated the host in tls: and rules: and also modified the secret name (which will be created by certmanager).

You have probably seen that Kustomize is integrated with kubectl. It’s also available as a standalone executable.

To directly apply the patched manifests to your cluster, run kubectl apply -k dev. The result:

namespace/realtime-dev created
service/realtime created
service/redis created
deployment.apps/realtime created
deployment.apps/redis created
ingress.extensions/realtime-ingress created

In another post, we will look at using Kustomize with Flux. Stay tuned!

GitOps with Weaveworks Flux – Installing and Updating Applications

In a previous post, we installed Weaveworks Flux. Flux synchronizes the contents of a git repository with your Kubernetes cluster. Together with the Helm operator, it also makes it easy to install applications packaged as Helm charts. As an example, we installed Traefik by adding the following yaml to the synced repository:

apiVersion: helm.fluxcd.io/v1
kind: HelmRelease
metadata:
  name: traefik
  namespace: default
  annotations:
    fluxcd.io/ignore: "false"
spec:
  releaseName: traefik
  chart:
    repository: https://kubernetes-charts.storage.googleapis.com/
    name: traefik
    version: 1.78.0
  values:
    serviceType: LoadBalancer
    rbac:
      enabled: true
    dashboard:
      enabled: true   

It does not matter where you put this file because Flux scans the complete repository. I added the file to a folder called traefik.

If you look more closely at the YAML file, you’ll notice its kind is HelmRelease. You need an operator that can handle this type of file, which is this one. In the previous post, we installed the custom resource definition and the operator manually.

Adding a custom application

Now it’s time to add our own application. You do not need to use Helm packages or the Helm operator to install applications. Regular yaml will do just fine.

The application we will deploy needs a Redis backend. Let’s deploy that first. Add the following yaml file to your repository:

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
  labels:
    app: redis       
spec:
  selector:
    matchLabels:     
      app: redis
  replicas: 1        
  template:          
    metadata:
      labels:        
        app: redis
    spec:            
      containers:
      - name: redis
        image: redis
        resources:
          requests:
            cpu: 200m
            memory: 100Mi
        ports:
        - containerPort: 6379
---        
apiVersion: v1
kind: Service        
metadata:
  name: redis
  labels:            
    app: redis
spec:
  ports:
  - port: 6379       
    targetPort: 6379
  selector:          
    app: redis

After committing this file, wait a moment or run fluxctl sync. When you run kubectl get pods for the default namespace, you should see the Redis pod:

Redis is running — yay!!!

Now it’s time to add the application. I will use an image, based on the following code: https://github.com/gbaeke/realtime-go (httponly branch because master contains code to automatically request a certificate with Let’s Encrypt). I pushed the image to Docker Hub as gbaeke/fluxapp:1.0.0. Now let’s deploy the app with the following yaml:

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: realtime
  labels:
    app: realtime       
spec:
  selector:
    matchLabels:     
      app: realtime
  replicas: 1        
  template:          
    metadata:
      labels:        
        app: realtime
    spec:            
      containers:
      - name: realtime
        image: gbaeke/fluxapp:1.0.0
        env:
        - name: REDISHOST
          value: "redis:6379"
        resources:
          requests:
            cpu: 50m
            memory: 50Mi
          limits:
            cpu: 150m
            memory: 150Mi
        ports:
        - containerPort: 8080
---        
apiVersion: v1
kind: Service        
metadata:
  name: realtime
  labels:            
    app: realtime
spec:
  ports:
  - port: 80       
    targetPort: 8080
  selector:          
    app: realtime
---
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: realtime-ingress
spec:
  rules:
  - host: realtime.IP.xip.io
    http:
      paths:
      - path: /
        backend:
          serviceName: realtime
          servicePort: 80

In the above yaml, replace IP in the Ingress specification with the IP of the external load balancer used by your Ingress Controller. Once you add the yaml to the git repository and run fluxctl sync, the application should be deployed. You see the following page when you browse to http://realtime.IP.xip.io:

Web app deployed via Flux and standard yaml

Great, v1.0.0 of the app is deployed using the gbaeke/fluxapp:1.0.0 image. But what if I have a new version of the image and the yaml specification does not change? Read on…

Upgrading the application

If you have been following along, you can now run the following command:

fluxctl list-workloads -a

This will list all workloads on the cluster, including the ones that were not installed by Flux. If you check the list, none of the workloads are automated. When a workload is automated, it can automatically upgrade the application when a new image appears. Let’s try to automate the fluxapp. To do so, you can either add annotations to your yaml or use fluxctl. Let’s use the yaml approach by adding the following to our deployment:

annotations:
    flux.weave.works/automated: "true"
    flux.weave.works/tag.realtime: semver:~1.0

Note: Flux only works with immutable tags; do not use latest

After committing the file and running fluxctl sync, you can run fluxctl list-workloads -a again. The deployment should now be automated:

fluxapp is now automated

Now let's see what happens when we add a new version of the image with tag 1.0.1. That image uses a different header color to show the difference. Flux monitors the image repository for changes. When it detects a new version of the image that matches the semver filter, it will modify the deployment. Let's check with fluxctl list-workloads -a:

new image deployed

And here’s the new color:

New color in version 1.0.1. Exciting! 😊

But wait… what about the git repo?

With the configuration of a deploy key, Flux has access to the git repository. When a deployment is automated and the image is changed, that change is also reflected in the git repo:

Weave Flux updated the realtime yaml file

In the yaml, version 1.0.1 is now used:

Flux updated the yaml file

What if I don’t like this release? With fluxctl, you can rollback to a previous version like so:

Rolling back a release – will also update the git repo

Although this works, the deployment will be updated to 1.0.1 again since it is automated. To avoid that, first lock the deployment (or workload) and then force the release of the old image:

fluxctl lock -w=deployment/realtime

fluxctl release -n default --workload=deployment/realtime --update-image=gbaeke/fluxapp:1.0.0 --force

In your yaml, there will be an additional annotation: fluxcd.io/locked: ‘true’ and the image will be set to 1.0.0.

Conclusion

In this post, we looked at deploying and updating an application via Flux automation. You only need a couple of annotations to make this work. This was just a simple example. For an example with dev, staging and production branches and promotion from staging to production, be sure to look at https://github.com/fluxcd/helm-operator-get-started as well.

Using the OAuth Client Credentials Flow

I often get questions about protecting applications like APIs using OAuth. I guess you know the drill:

  • you have to obtain a token (typically a JWT or JSON Web Token)
  • the client submits the token to your backend (via an Authorization HTTP header)
  • the token needs to be verified (do you trust it?)
  • you need to grab some fields from the token to use in your application (claims).

When the client is a daemon or some server side process, you can use the client credentials grant flow to obtain the token from Azure AD. The flow works as follows:

OAuth Client Credentials Flow (image from Microsoft docs)

The client contacts the Azure AD token endpoint to obtain a token. The client request contains a client ID and client secret to properly authenticate to Azure AD as a known application. The token endpoint returns the token. In this post, I only focus on the access token which is used to access the resource web API. The client uses the access token in the Authorization header of requests to the API.

Let’s see how this works. Oh, and by the way, this flow should be done with Azure AD. Azure AD B2C does not support this type of flow (yet).

Create a client application in Azure AD

In Azure AD, create a new App Registration. This can be a standard app registration for Web APIs. You do not need a redirect URL, nor do you need to configure public clients or implicit grants.

Standard run of the mill app registration

In Certificates & secrets, create a client secret and write it down. It will not be shown anymore when you later come back to this page:

Yes, I set it to Never Expire!

From the Overview page, note the application ID (also client ID). You will need that later to request a token.

Why do we even create this application? It represents the client application that will call your APIs. With this application, you control the secret that the client application uses but also the access rights to the APIs as we will see later. The client application will request a token, specifying the client ID and the client secret. Let’s now create another application that represents the backend API.

Create an API application in Azure AD

This is another App Registration, just like the app registration for the client. In this case, it represents the API. Its settings are a bit different though. There is no need to specify redirect URIs or other settings in the Authentication setting. There is also no need for a client secret. We do want to use the Expose an API page though:

Expose API page

Make sure you get the application ID URI. In the example above, it is api://06b2a484-141c-42d3-9d73-32bec5910b06 but you can change that to something more descriptive.

When you use the client credentials grant, you do not use user scopes. As such, the Scopes defined by this API list is empty. Instead, you want to use application roles which are defined in the manifest:

Application role in the manifest

There is one role here called invokeRole. You need to generate a GUID manually and use that as the id. Make sure allowedMemberTypes contains Application.

Great! But now we need to grant the client the right to obtain a token for one or more of the roles. You do that in the client application, in API Permissions:

Client application is granted access to the invokeRole application role of the API application

To grant the permission, just click Add a permission, select My APIs, click your API and select the role:

Selecting the role

Delegated permissions is greyed out because there are no user scopes. Application permissions is active because we defined an application role on the API application.

Obtaining a token

The server-side application only needs to do one call to the token endpoint to obtain the access token. Here is an example call with curl:

curl -d "grant_type=client_credentials&client_id=f1f695cb-2d00-4c0f-84a5-437282f3f3fd&client_secret=SECRET&audience=api%3A%2F%2F06b2a484-141c-42d3-9d73-32bec5910b06&scope=api%3A%2F%2F06b2a484-141c-42d3-9d73-32bec5910b06%2F.default" -X POST "https://login.microsoftonline.com/019486dd-8ffb-45a9-9232-4132babb1324/oauth2/v2.0/token"

Ouch, lots of gibberish here. Let’s break it down:

  • the POST needs to send URL encoded data in the body; curl’s -d takes care of that but you need to perform the URL encoding yourself
  • grant_type: client_credentials to indicate you want to use this flow
  • client_id: the application ID of the client app registration in Azure AD
  • client_secret: URL encoded secret that you generated when you created the client app registration
  • audience: the resource you want an access token for; it is the URL encoding of api://06b2a484-141c-42d3-9d73-32bec5910b06 as set in Expose an API
  • scope: this one is a bit special; for the v2 endpoint that we use here it needs to be api://06b2a484-141c-42d3-9d73-32bec5910b06/.default (but URL encoded); the scope (or roles) that the client application has access to will be included in the token

The POST goes to the Azure AD v2.0 token endpoint. There is also a v1 endpoint which would require other fields. See the Microsoft docs for more info. Note that I also updated the application manifests to issue v2 tokens via the accessTokenAcceptedVersion field (set to 2).
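
If you would rather request the token from code, the same call in Python with the requests package looks roughly like this (requests takes care of the URL encoding; the IDs are the ones from the curl example above):

import requests

tenant = "019486dd-8ffb-45a9-9232-4132babb1324"
token_url = f"https://login.microsoftonline.com/{tenant}/oauth2/v2.0/token"

# client credentials grant against the v2.0 endpoint; the scope determines the audience
payload = {
    "grant_type": "client_credentials",
    "client_id": "f1f695cb-2d00-4c0f-84a5-437282f3f3fd",
    "client_secret": "SECRET",
    "scope": "api://06b2a484-141c-42d3-9d73-32bec5910b06/.default",
}

resp = requests.post(token_url, data=payload)
access_token = resp.json()["access_token"]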

The call returns only an access token (there is no refresh token in the client credentials flow). Something like below, with the token shortened:

{"token_type":"Bearer","expires_in":3600,"ext_expires_in":3600,"access_token":"eyJ0e..."}

The access_token can be decoded on https://jwt.ms:

Decoded token

Note that the invokeRole is present because the client application was granted access to that role. We also know the application ID that represents the API, which is in the aud field. The azp field contains the application ID of the client application.
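
You can also inspect these claims from code instead of jwt.ms; for example, with the PyJWT package and the access_token from the previous snippet (note that this only decodes the token for inspection, it does not validate the signature):

import jwt  # PyJWT 2.x

# decode without signature validation, just to look at the claims
claims = jwt.decode(access_token, options={"verify_signature": False})
print(claims["aud"], claims["azp"], claims.get("roles"))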

Great, we can now use this token to call our API. The raw HTTP request would be in this form.

GET https://somehost/calc/v1/add/1/1 HTTP/1.1 
Host: somehost 
Authorization: Bearer eyJ0e...

Of course, your application needs to verify the token somehow. This can be done in your application or in an intermediate layer such as API Management. We will take a look at how to do this with API Management in a later post.

Conclusion

Authentication, authorization and, on a broader scale, identity can be very challenging. Technically though, a flow such as the client credentials flow, is fairly simple to implement once you have done it a few times. Hopefully, if you are/were struggling with this type of flow, this post has given you some pointers!

Exposing a local endpoint to the Internet with inlets

A while ago, I learned about inlets by Alex Ellis. It allows you to expose an endpoint on your internal network via a tunnel to an exit node. To actually reach your internal website, you navigate to the public IP and port of the exit node. Something like this:

Internet user --> public IP:port of exit node -- tunnel --> your local endpoint

On both the exit node and your local network, you need to run inlets. Let’s look at an example. Suppose I want to expose my Magnificent Image Classifier 😀 running on my local machine to the outside world. The classifier is actually just a container you can run as follows:

docker run -p 9090:9090 -d gbaeke/nasnet

The container image is big, so it will take a while to start. When the container is started, just navigate to http://localhost:9090 to see the UI. You can upload a picture to classify it.

So far so good. Now you need an exit node with a public IP. I deployed a small Azure B-series Linux VM (B1s; 7 euros/month). SSH into that VM and install the inlets CLI (yeah, I know piping a script to sudo sh is dangerous 😏):

curl -sLS https://get.inlets.dev | sudo sh

Now run the inlets server (from instructions here):

export token=$(head -c 16 /dev/urandom | shasum | cut -d" " -f1) 
inlets server --port=9090 --token="$token"

The first line just generates a random token. You can use any token you want or even omit a token (not recommended). The second command runs the server on port 9090. It’s the same port as my local endpoint but that is not required. You can use any valid port.

TIP: the Azure VM had a network security group (NSG) configured so I had to add TCP port 9090 to the allow list

Now that the server is running, let’s run the client. Install inlets like above or use brew install inlets on a Mac and run the following commands:

export REMOTE="IP OF EXIT NODE:9090"
export TOKEN="TOKEN FROM SERVER"  
inlets client \
   --remote=$REMOTE \
   --upstream=http://127.0.0.1:9090 \
   --token $TOKEN

The inlets client will establish a web sockets connection to the inlets server on the exit node. The --upstream option is used to specify the local endpoint. In my case, that's the classifier container (nasnet-go).

I can now browse to the public IP and port of the inlets server to see the classifier UI:

The inlets server will show the logs:

I think inlets is a fantastic tool that is useful in many scenarios. I have used ngrok in the past but it has some limits. You can pay to remove those limits. Inlets, on the other hand, is fully open source and not limited in any way. Be sure to check out the inlets GitHub page which has lots more details. Highly recommended!!!