Trying out WebAssembly on Azure Kubernetes Service

Introduction

In October 2021, Microsoft announced the public preview of AKS support for deploying WebAssembly System Interface (WASI) workloads in Kubernetes. You can read the announcement here. In short, that means we can run another type of workload on Kubernetes, besides containers!

WebAssembly is maybe best known for the ability to write code with languages such as C#, Go and Rust that can run in the browser, alongside JavaScript code. One example of this is Blazor, which allows you to build client web apps with C#.

Besides the browser, there are ways to run WebAssembly modules directly on the operating system. Because WebAssembly modules do not contain machine code suitable for a specific operating system and CPU architecture, you will need a runtime that can interpret the WebAssembly byte code. At the same time, WebAssembly modules should be able to interface with the operating system, for instance to access files. In other words, WebAssembly code should be able to access specific parts of the operating system outside the sandbox it is running in by default.

The WebAssembly System Interface (or WASI) allows WebAssembly modules to interact with the outside world. It allows you to declare what the module is allowed to see and access.

One example of a standalone runtime that can run WebAssembly modules is wasmtime. It supports interacting with the host environment via WASI as discussed above. For example, you can specify access to files on the host via the –dir flag and be very specific about what files and folders are allowed.

An example with Rust

In what follows, we will create Hello World-style application with Rust. You do not have to know anything about Rust to follow along. As a matter of fact, I do not know that much about Rust either. I just want a simple app to run on Azure Kubernetes Service later. Here’s the source code:

use std::env;

fn main() {
  println!("Content-Type: text/plain\n");
  println!("Hello, world!");

  printenv();
  
}

fn printenv() {
  for (key, value) in env::vars() {
    println!("{}: {}", key, value);
  }
}

Note: Because I am a bit more comfortable with Go, I first created a demo app with Go and used TinyGo to build the WebAssembly module. That worked great with wasmtime but did not work well on AKS. There is probably a good explanation for that. I will update this post when I learn more.

To continue with the Rust application, it is pretty clear what it does: it prints the Content-Type for a HTTP response, a Hello, World! message, and all environment variables. Why we set the Content-Type will become clearer later on!

To build this app, we need to target wasm32-wasi to build a WebAssembly module that supports WASI as well. You can run the following commands to do so (requires that Rust is installed on your system):

rustup target add wasm32-wasi
cargo build --release --target wasm32-wasi

The rustup command should only be run once. It adds wasm32-wasi as a supported target. The cargo build command then builds the WebAssembly module. On my system, that results in a file in the target/wasm32-wasi/release folder called sample.wasm (name comes from a setting in cargo.toml) . With WebAssembly support in VS Code, I can right click the file and use Show WebAssembly:

Showing the WebAssembly Module in VS Code (WebAssembly Toolkit for VS Code extension)

We can run this module with cargo run but that runs the app directly on the operating system. In my case that’s Ubuntu in Windows 11’s WSL2. To run the WebAssembly module , you can use wasmtime:

wasmtime sample.wasm

The module will not read the environment variables from the host. Instead, you pass environment variables from the wasmtime cli like so (command and result shown below):

wasmtime --env test=hello sample.wasm

Content-Type: text/plain

Hello, world!
test: hello

Publishing to Azure Container Registry

A WebAssembly can be published to Azure Container Registry with wasm-to-oci (see GitHub repo). The command below publishes our module:

wasm-to-oci push sample.wasm <ACRNAME>.azurecr.io/sample:1.0.0

Make sure you are logged in to ACR with az acr login -n <ACRNAME>. I also enabled anonymous pull on ACR to not run into issues with pulls from WASI-enabled AKS pools later. Indeed, AKS will be able to pull these artefacts to run them on a WASI node.

Here is the artefact as shown in ACR:

WASM module in ACR with mediaType = application/vnd.wasm.content.layer.v1+wasm

Running the module on AKS

To run WebAssembly modules on AKS nodes, you need to enable the preview as described here. After enabling the preview, I deployed a basic Kubernetes cluster with one node. It uses kubenet by default. That’s good because Azure CNI is not supported by WASI node pools.

az aks create -n wademo -g rg-aks --node-count 1

After finishing the deployment, I added a WASI nodepool:

az aks nodepool add \
    --resource-group rg-aks \
    --cluster-name wademo \
    --name wasipool \
    --node-count 1 \
    --workload-runtime wasmwasi

The aks-preview extension (install or update it!!!) for the Azure CLI supports the –workload-runtime flag. It can be set to wasmwasi to deploy nodes that can execute WebAssembly modules. The piece of technology that enables this is the krustlet project as described here: https://krustlet.dev. Krustlet is basically a WebAssembly kubelet. It stands for Kubernetes Rust Kubelet.

After running the above commands, the command kubectl get nodes -o wide will look like below:

NAME                                STATUS   ROLES   AGE    VERSION         INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION     CONTAINER-RUNTIME
aks-nodepool1-23291395-vmss000000   Ready    agent   3h6m   v1.20.9         10.240.0.4    <none>        Ubuntu 18.04.6 LTS   5.4.0-1059-azure   containerd://1.4.9+azure
aks-wasipool-23291395-vmss000000    Ready    agent   3h2m   1.0.0-alpha.1   10.240.0.5    <none>        <unknown>            <unknown>          mvp

As you can see it’s early days here! 😉 But we do have a node that can run WebAssembly! Let’s try to run our module by deploying a pod via the manifest below:

apiVersion: v1
kind: Pod
metadata:
  name: sample
  annotations:
    alpha.wagi.krustlet.dev/default-host: "0.0.0.0:3001"
    alpha.wagi.krustlet.dev/modules: |
      {
        "sample": {"route": "/"}
      }
spec:
  hostNetwork: true
  containers:
    - name: sample
      image: <ARCNAME>.azurecr.io/sample:1.0.0
      imagePullPolicy: Always
  nodeSelector:
    kubernetes.io/arch: wasm32-wagi
  tolerations:
    - key: "node.kubernetes.io/network-unavailable"
      operator: "Exists"
      effect: "NoSchedule"
    - key: "kubernetes.io/arch"
      operator: "Equal"
      value: "wasm32-wagi"
      effect: "NoExecute"
    - key: "kubernetes.io/arch"
      operator: "Equal"
      value: "wasm32-wagi"
      effect: "NoSchedule"

Wait a moment! There is a new acronym here: WAGI! WASI has no network primitives such as sockets so you should not expect to build a full webserver with it. WAGI, which stands for WebAssembly Gateway Interface, allows you to run WASI modules as HTTP handlers. It is heavily based on CGI, the Common Gateway Interface that allows mapping HTTP requests to executables (e.g. a Windows or Linux executable) via something like IIS or Apache.

We will need a way to map a route such as / to a module and the response to a requests should be HTTP responses. That is why we set the Content-Type in the example by simply printing it to stdout. WAGI will also set several environment variables with information about the incoming request. That is the reason we print all the environment variables. This feels a bit like the early 90’s to me when CGI was the hottest web tech in town! 😂

The mapping of routes to modules is done via annotations, as shown in the YAML. This is similar to the modules.toml file used to start a Wagi server manually. Because the WASI nodes are tainted, tolerations are used to allow the pod to be scheduled on such nodes. With the nodeSelector, the pod needs to be scheduled on such a node.

To run the WebAssembly module, apply the manifest above to the cluster as usual (assuming the manifest is in pod.yaml:

kubectl apply -f pod.yaml

Now run kubectl get pods. If the status is Registered vs Running, this is expected. The pod will not be ready either:

NAME    READY   STATUS       RESTARTS   AGE
sample  0/1     Registered   0          108m

In order to reach the workload from the Internet, you need to install nginx with a value.yaml file that contains the internal IP address of the WASI node as documented here.

After doing that, I can curl the public IP address of the nginx service of type LoadBalancer:

~ curl IP

Hello, world!
HTTP_ACCEPT: */*
QUERY_STRING: 
SERVER_PROTOCOL: HTTP/1.0
GATEWAY_INTERFACE: CGI/1.1
REQUEST_METHOD: GET
SERVER_PORT: 3001
REMOTE_ADDR: 10.240.0.4
X_FULL_URL: http://10.240.0.5:3001/
X_RAW_PATH_INFO: 
CONTENT_TYPE: 
SERVER_NAME: 10.240.0.5
SCRIPT_NAME: /
AUTH_TYPE: 
PATH_TRANSLATED: 
PATH_INFO: 
CONTENT_LENGTH: 0
X_MATCHED_ROUTE: /
REMOTE_HOST: 10.240.0.4
REMOTE_USER: 
SERVER_SOFTWARE: WAGI/1
HTTP_HOST: 10.240.0.5:3001
HTTP_USER_AGENT: curl/7.58.0

As you can see, WAGI has set environment variables that allows your handler to know more about the incoming request such as the HTTP User Agent.

Conclusion

Although WebAssembly is gaining in popularity to build browser-based applications, it is still early days for running these workloads on Kubernetes. WebAssembly will not replace containers anytime soon. In fact, that is not the actual goal. It just provides an additional choice that might make sense for some applications in the future. And as always, the future will arrive sooner than expected!

Update to IoT Simulator

Quite a while ago, I wrote a small IoT Simulator in Go that creates or deletes multiple IoT devices in IoT Hub and sends telemetry at a preset interval. However, when you use version 0.4 of the simulator, you will encounter issues in the following cases:

  • You create a route to store telemetry in an Azure Storage account: the telemetry will be base 64 encoded
  • You create an Event Grid subscription that forwards the telemetry to an Azure Function or other target: the telemetry will be base 64 encoded

For example, in Azure Storage, when you store telemetry in JSON format, you will see something like this with versions 0.4 and older:

{"EnqueuedTimeUtc":"2020-02-10T14:13:19.0770000Z","Properties":{},"SystemProperties":{"connectionDeviceId":"dev35","connectionAuthMethod":"{\"scope\":\"hub\",\"type\":\"sas\",\"issuer\":\"iothub\",\"acceptingIpFilterRule\":null}","connectionDeviceGenerationId":"637169341138506565","contentType":"application/json","contentEncoding":"","enqueuedTime":"2020-02-10T14:13:19.0770000Z"},"Body":"eyJUZW1wZXJhdHVyZSI6MjYuNjQ1NjAwNTMyMTg0OTA0LCJIdW1pZGl0eSI6NDQuMzc3MTQxODcxODY5OH0="}

Note that the body is base 64 encoded. The encoding stems from the fact that UTF-8 encoding was not specified as can be seen in the JSON. contentEncoding is indeed empty and the contentType does not mention the character set.

To fix that, a small code change was required. Note that the code uses HTTP to send telemetry, not MQTT or AMQP:

Setting the character set as part of the content type

With the character set as UTF-8, the telemetry in the Storage Account will look like this:

{"EnqueuedTimeUtc":"2020-02-11T15:02:07.9520000Z","Properties":{},"SystemProperties":{"connectionDeviceId":"dev15","connectionAuthMethod":"{\"scope\":\"hub\",\"type\":\"sas\",\"issuer\":\"iothub\",\"acceptingIpFilterRule\":null}","connectionDeviceGenerationId":"637169341138088841","contentType":"application/json; charset=utf-8","contentEncoding":"","enqueuedTime":"2020-02-11T15:02:07.9520000Z"},"Body":{"Temperature":20.827852028684607,"Humidity":49.95058826575425}}

Note that contentEncoding is still empty here, but contentType includes the charset. That is enough for the body to be in plain text.

The change will also allow you to use queries on the body in IoT Hub message routing filters or Event Grid subscription filters.

Enjoy the new version 0.5! All three of you… 😉😉😉

Streamlined Kubernetes Development with Draft

A longer time ago, I wrote a post about draft. Draft is a tool to streamline your Kubernetes development experience. It basically automates, based on your code, the creation of a container image, storing the image in a registry and installing a container based on that image using a Helm chart. Draft is meant to be used during the development process while you are still messing around with your code. It is not meant as a deployment mechanism in production.

The typical workflow is the following:

  • in the folder with your source files, run draft create
  • to build, push and install the container run draft up; in the background a Helm chart is used
  • to see the logs and connect to the app in your container over an SSH tunnel, run draft connect
  • modify your code and run draft up again
  • rinse and repeat…

Let’s take a look at how it works in a bit more detail, shall we?

Prerequisites

Naturally, you need a Kubernetes cluster with kubectl, the Kubernetes cli, configured to use that cluster.

Next, install Helm on your system and install Tiller, the server-side component of Helm on the cluster. Full installation instructions are here. If your cluster uses rbac, check out how to configure the proper service account and role binding. Run helm init to initialize Helm locally and install Tiller at the same time.

Now install draft on your system. Check out the quickstart for installation instructions. Run draft init to initialize it.

Getting some source code

Let’s use a small Go program to play with draft. You can use the realtime-go repository. Clone it to your system and checkout the httponly branch:

git clone https://github.com/gbaeke/realtime-go.git
git checkout httponly

You will need a redis server as a back-end for the realtime server. Let’s install that the quick and dirty way:

kubectl run redis --image=redis --replicas=1 
kubectl expose deploy/redis –port 6379  

Running draft create

In the realtime-go folder, run draft create. You should get the following output:

draft create output

The command tries to detect the language and it found several. In this case, because there is no pack for Coq (what is that? 😉) and HTML, it used Go. Knowing the language, draft creates a simple Dockerfile if there is no such file in the folder:

FROM golang
ENV PORT 8080
EXPOSE 8080

WORKDIR /go/src/app
COPY . .

RUN go get -d -v ./...
RUN go install -v ./...

CMD ["app"] 

Usually, I do not use the Dockerfile created by draft. If there already is a Dockerfile in the folder, draft will use that one. That’s what happened in our case because the folder contains a 2-stage Dockerfile.

Draft created some other files as well:

  • draft.toml: configuration file (more info); can be used to create environments like staging and production with different settings such as the Kubernetes namespace to deploy to or the Dockerfile to use
  • draft.tasks.toml: run commands before or after you deploy your container with draft (more info); we could have used this to install and remove the redis container
  • .draftignore: yes, to ignore stuff

Draft also created a charts folder that contains the Helm chart that draft will use to deploy your container. It can be modified to suit your particular needs as we will see later.

Helm charts folder and a partial view on the deployment.yaml file in the chart

Setting the container registry

In older versions of draft, the source files were compressed and sent to a sever-side component that created the container. At present though, the container is built locally and then pushed to a registry of your choice. If you want to use Azure Container Registry (ACR), run the following commands (set and login):

draft config set registry REGISTRYNAME.azurecr.io
az acr login -n REGISTRYNAME

Note that you need the Azure CLI for the last command. You also need to set the subscription to the one that contains the registry you reference.

With this configuration, you need Docker on your system. Docker will build and push the container. If you want to build in the cloud, you can use ACR Build Tasks. To do that, use these commands:

draft config set container-builder acrbuild
draft config set registry REGISTRYNAME.azurecr.io
draft config set resource-group-name RESOURCEGROUPNAME

Make sure your are logged in to the subscription (az login) and login to ACR as well before continuing. In this example, I used ACR build tasks.

Note: because ACR build tasks do not cache intermediate layers, this approach can lead to longer build times; when the image is small as in this case, doing a local build and push is preferred!

Running draft up

We are now ready to run draft up. Let’s do so and see what happens:

results of draft up

YES!!!! Draft built the container image and released it. Run helm ls to check the release. It did not have to push the image because it was built in ACR and pushed from there. Let’s check the ACR build logs in the portal (you can also use the draft logs command):

acr build log for the 2-stage Docker build

Fixing issues

Although the container is properly deployed (check it with helm ls), if you run kubectl get pods you will notice an error:

container error

In this case, the container errors out because it cannot find the redis host, which is a dependency. We can tell the container to look for redis via a REDISHOST environment variable. You can add it to deployment.yaml in the chart like so:

environment variable in deployment.yaml

After this change, just run draft up again and hope for the best!

Running draft connect

With the realtime-go container up and running, run draft connect:

output of draft connect

This maps a local port on your system to the remote port over an ssh tunnel. In addition, it streams the logs from the container. You can now connect to http://localhost:18181 (or whatever port you’ll get):

Great success! The app is running

If you want a public IP for your service, you can modify the Helm chart. In values.yaml, set service.type to LoadBalancer instead of ClusterIP and run draft up again. You can verify the external IP by running kubectl get svc.

Conclusion

Working with draft while your are working on one or more containers and still hacking away at your code really is smooth sailing. If you are not using it yet, give it a go and see if you like it. I bet you will!

Creating and containerizing a TensorFlow Go application

In an earlier post, I discussed using a TensorFlow model from a Go application. With the TensorFlow bindings for Go, you can load a model that was exported with TensorFlow’s SavedModelBuilder module. That module saves a “snapshot” of a trained model which can be used for inference.

In this post, we will actually use the model in a web application. The application presents the user with a page to upload an image:

The upload page

The class and its probability is displayed, including the processed image:

Clearly a hen!

The source code of the application can be found at https://github.com/gbaeke/nasnet-go. If you just want to try the application, use Docker and issue the following command (replace port 80 with another port if there is a conflict):

docker run -p 80:9090 -d gbaeke/nasnet

The image is around 2.55GB in size so be patient when you first run the application. When the container has started, open your browser at http://localhost to see the upload page.

To quickly try it, you can run the container on Azure Container Instances. If you use the Portal, specify port 9090 as the container port.

Nasnet container in ACI

A closer look at the appN

**UPDATE**: since first publication, the http handler code was moved into from main.go to handlers/handlers.go

In the init() function, the nasnet model is loaded with tf.LoadSavedModel. The ImageNet categories are also loaded with a call to getCategories() and stored in categories which is a map of int to a string array.

In main(), we simply print the TensorFlow version (1.12). Next, http.HandleFunc is used to setup a handler (upload func) when users connect to the root of the web app.

Naturally, most of the logic is in the upload function. In summary, it does the following:

  • when users just navigate to the page (HTTP GET verb), render the upload.gtpl template; that template contains the upload form and uses a bit of bootstrap to make it just a bit better looking (and that’s already an overstatement); to learn more about Go web templates, see this link.
  • when users submit a file (POST), the following happens:
    • read the image
    • convert the image to a tensor with the getTensor function; getTensor returns a *tf.Tensor; the tensor is created from a [1][224][224][3] array; note that each pixel value gets normalized by subtracting by 127.5 and then dividing by 127.5 which is the same preprocessing applied as in Keras (divide by 127.5 and subtract 1)
    • run a session by inputting the tensor and getting the categories and probabilities as output
    • look for the highest probability and save it, together with the category name in a variable of type ResultPageData (a struct)
    • the struct data is used as input for the response.gtpl template

Note that the image is also shown in the output. The processed image (resized to 224×224) gets converted to a base64-encoded string. That string can be used in HTML image rendering as follows (where {{.Picture}} in the template will be replaced by the encoded string):

 <img src="data:image/jpg;base64,{{.Picture}}"> 

Note that the application lacks sufficient error checking to gracefully handle the upload of non-image files. Maybe I’ll add that later! 😉

Containerization

To containerize the application, I used the Dockerfile from https://github.com/tinrab/go-tensorflow-image-recognition but removed the step that downloads the InceptionV3 model. My application contains a ready to use NasnetMobile model.

The container image is based on tensorflow/tensorflow:1.12.0. It is further modified as required with the TensorFlow C API and the installation of Go. As discussed earlier, I uploaded a working image on Docker Hub.

Conclusion

Once you know how to use TensorFlow models from Go applications, it is easy to embed them in any application, from command-line tools to APIs to web applications. Although this application does server-side processing, you can also use a model directly in the browser with TensorFlow.js or ONNX.js. For ONNX, try https://microsoft.github.io/onnxjs-demo/#/resnet50 to perform image classification with ResNet50 in the browser. You will notice that it will take a while to get started due to the model being downloaded. Once the model is downloaded, you can start classifying images. Personally, I prefer the server-side approach but it all depends on the scenario.

Using TensorFlow models in Go

Image via www.vpnsrus.com

In earlier posts, I discussed hosting a deep learning model such as Resnet50 on Kubernetes or Azure Container Instances. The model can then be used as any API which receives input as JSON and returns a result as JSON.

Naturally, you can also run the model in offline scenarios and directly from your code. In this post, I will take a look at calling a TensorFlow model from Go. If you want to follow along, you will need Linux or MacOS because the Go module does not support Windows.

Getting Ready

I installed an Ubuntu Data Science Virtual Machine on Azure and connected to it with X2Go:

Data Science Virtual Machine (Ubuntu) with X2Go

The virtual machine has all the required machine learning tools installed such as TensorFlow and Python. It also has Visual Studio Code. There are some extra requirements though:

  • Go: follow the instructions here to download and install Go
  • TensorFlow C API: follow the instructions here to download and install the C API; the TensorFlow package for Go requires this; it is recommended to also build and run the Hello from TensorFlow C program to verify that the library works (near the bottom of the instructions page)

After installing Go and the TensorFlow C API, install the TensorFlow Go package with the following command:

go get github.com/tensorflow/tensorflow/tensorflow/go

Test the package with go test:

go test github.com/tensorflow/tensorflow/tensorflow/go

The above command should return:

ok      github.com/tensorflow/tensorflow/tensorflow/go  0.104s

The go get command installed the package in $HOME/go/src/github.com if you did not specify a custom $GOPATH (see this wiki page for more info).

Getting a model

A model describes how the input (e.g. an image for image classification) gets translated to an output (e.g. a list of classes with probabilities). The model contains thousands or even millions of parameters which means a model can be quite large. In this example, we will use NASNetMobile which can be used to classify images.

Now we need some code to save the model in TensorFlow format so that it can be used from a Go program. The code below is based on the sample code on the NASNetMobile page from modeldepot.io. It also does a quick test inference on a cat image.

import keras
from keras.applications.nasnet import NASNetMobile
from keras.preprocessing import image
from keras.applications.xception import preprocess_input, decode_predictions
import numpy as np
import tensorflow as tf
from keras import backend as K

sess = tf.Session()
K.set_session(sess)

model = NASNetMobile(weights="imagenet")
img = image.load_img('cat.jpg', target_size=(224,224))
img_arr = np.expand_dims(image.img_to_array(img), axis=0)
x = preprocess_input(img_arr)
preds = model.predict(x)
print('Prediction:', decode_predictions(preds, top=5)[0])

#save the model for use with TensorFlow
builder = tf.saved_model.builder.SavedModelBuilder("nasnet")

#Tag the model, required for Go
builder.add_meta_graph_and_variables(sess, ["atag"])
builder.save()
sess.close()

On the Ubuntu Data Science Virtual Machine, the above code should execute without any issues because all Python packages are already installed. I used the py35 conda environment. Use activate py35 to make sure you are in that environment.

The above code results in a nasnet folder, which contains the saved_model.pb file for the graph structure. The actual weights are in the variables subfolder. In total, the nasnet folder is around 38MB.

Great! Now we need a way to use the model from our Go program.

Using the saved model from Go

The model can be loaded with the LoadSavedModel function of the TensorFlow package. That package is imported like so:

import (
tf "github.com/tensorflow/tensorflow/tensorflow/go"
)

LoadSavedModel is used like so:

model, err := tf.LoadSavedModel("nasnet",
[]string{"atag"}, nil)
if err != nil {
log.Fatal(err)
}

The above code simply tries to load the model from the nasnet folder. We also need to specify the tag.

Next, we need to load an image and convert the image to a tensor with the following dimensions [1][224][224][3]. This is similar to my earlier ResNet50 post.

Now we need to pass the tensor to the model as input, and retrieve the class predictions as output. The following code achieves this:

output, err := model.Session.Run(
map[tf.Output]*tf.Tensor{
model.Graph.Operation("input_1").Output(0): input,
},
[]tf.Output{
model.Graph.Operation("predictions/Softmax").Output(0),
},
nil,
)
if err != nil {
log.Fatal(err)
}

What the heck is this? The run method is defined as follows:

func (s *Session) Run(feeds map[Output]*Tensor, fetches []Output, targets []*Operation) ([]*Tensor, error)

When you build a model, you can give names to tensors and operations. In this case the input tensor (of dimensions [1][224][224][3]) is called input_1 and needs to be specified as a map. The inference operation is called predictions/Softmax and the output needs to be specified as an array.

The actual predictions can be retrieved from the output variable:

predictions, ok := output[0].Value().([][]float32)
if !ok {
log.Fatal(fmt.Sprintf("output has unexpected type %T", output[0].Value()))
}

If you are not very familiar with Go, the code above uses type assertion to verify that predictions is a 2-dimensional array of float32. If the type assertion succeeds, the predictions variable will contain the actual predictions: [[<probability class 1 (tench)>, <probability class 2 (goldfish)>, …]]

You can now simply find the top prediction(s) in the array and match them with the list of classes for NASNet (actually the ImageNet classes). I get the following output with a cat image:

Yep, it’s a tabby!

If you are wondering what image I used:

Tabby?

Conclusion

With Go’s TensorFlow bindings, you can load TensorFlow models from disk and use them for inference locally, without having to call a remote API. We used Python to prepare the model with some help from Keras.

Building a real-time messaging server in Go

Often, I need a simple real-time server and web interface that shows real-time events. Although there are many options available like socket.io for Node.js or services like Azure SignalR and PubNub, I decided to create a real-time server in Go with a simple web front-end:

The impressive UI of the real-time web front-end

For a real-time server in Go, there are several options. You could use Gorilla WebSocket of which there is an excellent tutorial, and use native WebSockets in the browser. There’s also Glue. However, if you want to use the socket.io client, you can use https://github.com/googollee/go-socket.io. It is an implementation, although not a complete one, of socket.io. For production scenarios, I recommend using socket.io with Node.js because it is heavily used, has more features, better documentation, etc…

With that out of the way, let’s take a look at the code. Some things to note in advance:

  • the code uses the concept of rooms (as in a chat room); clients can join a room and only see messages for that room; you can use that concept to create a “room” for a device and only subscribe to messages for that device
  • the code use the excellent https://github.com/mholt/certmagic to enable https via a Let’s Encrypt certificate (DNS-01 verification)
  • the code uses Redis as the back-end; applications send messages to Redis via a PubSub channel; the real-time Go server checks for messages via a subscription to one or more Redis channels

The code is over at https://github.com/gbaeke/realtime-go.

Server

Let’s start with the imports. Naturally we need Redis support, the actual go-socket.io packages and certmagic. The cloudflare package is needed because my domain baeke.info is managed by CloudFlare. The package gives certmagic the ability to create the verification record that Let’s Encrypt will check before issuing the certificate:

import (
"log"
"net/http"
"os"

"github.com/go-redis/redis"
socketio "github.com/googollee/go-socket.io"
"github.com/mholt/certmagic"
"github.com/xenolf/lego/providers/dns/cloudflare"
)

Next, the code checks if the RTHOST environment variable is set. RTHOST should contain the hostname you request the certificate for (e.g. rt.baeke.info).

Let’s check the block of code that sets up the Redis connection.

// redis connection
client := redis.NewClient(&redis.Options{
Addr: getEnv("REDISHOST", "localhost:6379"),
})

// subscribe to all channels
pubsub := client.PSubscribe("*")
_, err := pubsub.Receive()
if err != nil {
panic(err)
}

// messages received on a Go channel
ch := pubsub.Channel()

First, we create a new Redis client. We either use the address in the REDISHOST environment variable or default to localhost:6379. I will later run this server on Azure Container Instances (ACI) in a multi-container setup that also includes Redis.

With the call to PSubscribe, a pattern subscribe is used to subscribe to all PubSub channels (*). If the subscribe succeeds, a Go channel is setup to actually receive messages on.

Now that the Redis connection is configured, let’s turn to socket.io:

server, err := socketio.NewServer(nil)
if err != nil {
log.Fatal(err)
}

server.On("connection", func(so socketio.Socket) {
log.Printf("New connection from %s ", so.Id())

so.On("channel", func(channel string) {
log.Printf("%s joins channel %s\n", so.Id(), channel)
so.Join(channel)
})

so.On("disconnection", func() {
log.Printf("disconnect from %s\n", so.Id())
})
})

The above code is pretty simple. We create a new socket.io server and subsequently setup event handlers for the following events:

  • connection: code that runs when a web client connects; gives us the socket the client connects on which is further used by the channel and disconnection handler
  • channel: this handler runs when a client sends a message of the chosen type channel; the channel contains the name of the socket.io room to join; this is used by the client to indicate what messages to show (e.g. just for device01); in the browser, the client sends a channel message that contains the text “device01”
  • disconnection: code to run when the client disconnects from the socket

Naturally, something crucial is missing. We need to check Redis for messages in Redis channels and broadcast them to matching socket.io “channels”. This is done in a Go routine that runs concurrently with the main code:

 go func(srv *socketio.Server) {
   for msg := range ch {
      log.Println(msg.Channel, msg.Payload)
      srv.BroadcastTo(msg.Channel, "message", msg.Payload)
   }
 }(server)

The anonymous function accepts a parameter of type socketio.Server. We use the BroadcastTo method of socketio.Server to broadcast messages arriving on the Redis PubSub channels to matching socket.io channels. Note that we send a message of type “message” so the client will have to check for “message” coming in as well. Below is a snippet of client-side code that does that. It adds messages to the messages array defined on the Vue.js app:

socket.on('message', function(msg){
app.messages.push(msg)
}

The rest of the server code basically configures certmagic to request the Let’s Encrypt certificate and sets up the http handlers for the static web client and the socket.io server:

// certificate magic
certmagic.Agreed = true
certmagic.CA = certmagic.LetsEncryptStagingCA

cloudflare, err := cloudflare.NewDNSProvider()
if err != nil {
log.Fatal(err)
}

certmagic.DNSProvider = cloudflare

mux := http.NewServeMux()
mux.Handle("/socket.io/", server)
mux.Handle("/", http.FileServer(http.Dir("./assets")))

certmagic.HTTPS([]string{rthost}, mux)

Let’s try it out! The GitHub repository contains a file called multi.yaml, which deploys both the socket.io server and Redis to Azure Container Instances. The following images are used:

  • gbaeke/realtime-go-le: built with this Dockerfile; the image has a size of merely 14MB
  • redis: the official Redis image

To make it work, you will need to update the environment variables in multi.yaml with the domain name and your CloudFlare credentials. If you do not use CloudFlare, you can use one of the other providers. If you want to use the Let’s Encrypt production CA, you will have to change the code, rebuild the container, store it in your registry and modify multi.yaml accordingly.

In Azure Container Instances, the following is shown:

socket.io and Redis container in ACI

To test the setup, I can send a message with redis-cli, from a console to the realtime-redis container:

Testing with redis-cli in the Redis container

You should be aware that using CertMagic with ephemeral storage is NOT a good idea due to potential Let’s Encrypt rate limiting. You should store the requested certificates in persistent storage like an Azure File Share and mount it at /.local/share/certmagic!

Client

The client is a Vue.js app. It was not created with the Vue cli so it just grabs the Vue.js library from the content delivery network (CDN) and has all logic in a single page. The socket.io library (v1.3.7) is also pulled from the CDN. The socket.io client code is kept at a minimum for demonstration purposes:

 var socket = io();
socket.emit('channel','device01');
socket.on('message', function(msg){
app.messages.push(msg)
})

When the page loads, the client emits a channel message to the server with a payload of device01. As you have seen in the server section, the server reacts to this message by joining this client to a socket.io room, in this case with name device01.

Whenever the client receives a message from the server, it adds the message to the messages array which is bound to a list item (li) with a v-for directive.

Surprisingly easy no? With a few lines of code you have a fully functional real-time messaging solution!

Running a GoCV application in a container

In earlier posts (like here and here) I mentioned GoCV. GoCV allows you to use the popular OpenCV library from your Go programs. To avoid installing OpenCV and having to compile it from source, a container that runs your GoCV app can be beneficial. This post provides information about doing just that.

The following GitHub repository, https://github.com/denismakogon/gocv-alpine, contains all you need to get started. It’s for OpenCV 3.4.2 so you will run into issues when you want to use OpenCV 4.0. The pull request, https://github.com/denismakogon/gocv-alpine/pull/7, contains the update to 4.0 but it has not been merged yet. I used the proposed changes in the pull request to build two containers:

  • the build container: gbaeke/gocv-4.0.0-build
  • the run container: gbaeke/gocv-4.0.0-run

They are over on Docker Hub, ready for use. To actually use the above images in a typical two-step build, I used the following Dockerfile:

FROM gbaeke/gocv-4.0.0-build as build       
RUN go get -u -d gocv.io/x/gocv
RUN go get -u -d github.com/disintegration/imaging
RUN go get -u -d github.com/gbaeke/emotion
RUN cd $GOPATH/src/github.com/gbaeke/emotion && go build -o $GOPATH/bin/emo ./main.go

FROM gbaeke/gocv-4.0.0-run
COPY --from=build /go/bin/emo /emo
ADD haarcascade_frontalface_default.xml /

ENTRYPOINT ["/emo"]

The above Dockerfile uses the webcam emotion detection program from https://github.com/gbaeke/emotion. To run it on a Linux system, use the following command:

docker run -it --rm --device=/dev/video0 --env SCOREURI="YOUR-SCORE-URI" --env VIDEO=0 gbaeke/emo

The SCOREURI environment variable needs to refer to the score URI offered by the ONNX FER+ container as discussed in Detecting Emotions with FER+. With VIDEO=0 the GUI window that shows the webcam video stream is turned off (required). Detected emotions will be logged to the console.

To be able to use the actual webcam of the host, the –device flag is used to map /dev/video0 from the host to the container. That works well on a Linux host and was tested on a laptop running Ubuntu 16.04.

ResNet50v2 classification in Go with a local container

To quickly go to the code, go here. Otherwise, keep reading…

In a previous blog post, I wrote about classifying images with the ResNet50v2 model from the ONNX Model Zoo. In that post, the container ran on a Kubernetes cluster with GPU nodes. The nodes had an NVIDIA v100 GPU. The actual classification was done with a simple Python script with help from Keras and Numpy. Each inference took around 25 milliseconds.

In this post, we will do two things:

  • run the scoring container (CPU) on a local machine that runs Docker
  • perform the scoring (classification) in Go

Installing the scoring container locally

I pushed the scoring container with the ONNX ResNet50v2 image to the following location: https://cloud.docker.com/u/gbaeke/repository/docker/gbaeke/onnxresnet50v2. Run the container with the following command:

docker run -d -p 5001:5001 gbaeke/onnxresnet50

The container will be pulled and started. The scoring URI is on http://localhost:5001/score.

Note that in the previous post, Azure Machine Learning deployed two containers: the scoring container (the one described above) and a front-end container. In that scenario, the front-end container handles the HTTP POST requests (optionally with SSL) and route the request to the actual scoring container.

The scoring container accepts the same payload as the front-end container. That means it can be used on its own, as we are doing now.

Note that you can also use IoT Edge, as explained in an earlier post. That actually shows how easy it is to push AI models to the edge and use them locally, befitting your business case.

Scoring with Go

To actually classify images, I wrote a small Go program to do just that. Although there are some scientific libraries for Go, they are not really needed in this case. That means we do have to create the 4D tensor payload and interpret the softmax result manually. If you check the code, you will see that is not awfully difficult.

The code can be found in the following GitHub repository: https://github.com/gbaeke/resnet-score.

Remember that this model expects the input as a 4D tensor with the following dimensions:

  • dimension 0: batch (we only send one image here)
  • dimension 1: channels (one for each; RGB)
  • dimension 2: height
  • dimension 3: width

The 4D tensor needs to be serialized to JSON in a field called data. We send that data with HTTP POST to the scoring URI at http://localhost:5001/score.

The response from the container will be JSON with two fields: a result field with the 1000 softmax values and a time field with the inference time. We can use the following two structs for marshaling and unmarshaling

Input and output of the model

Note that this model expects pictures to be scaled to 224 by 224 as reflected by the height and width dimensions of the uint8 array. The rest of the code is summarized below:

  • read the image; the path of the image is passed to the code via the -image command line parameter
  • the image is resized with the github.com/disintegration/imaging package (linear method)
  • the 4D tensor is populated by iterating over all pixels of the image, extracting r,g and b and placing them in the BCHW array; note that the r,g and b values are uint16 and scaled to fit in a uint8
  • construct the input which is a struct of type InputData
  • marshal the InputData struct to JSON
  • POST the JSON to the local scoring URI
  • read the HTTP response and unmarshal the response in a struct of type OutputData
  • find the highest probability in the result and note the index where it was found
  • read the 1000 ImageNet categories from imagenet_class_index.json and marshal the JSON into a map of string arrays
  • print the category using the index with the highest probability and the map

What happens when we score the image below?

What is this thing?

Running the code gives the following result:

$ ./class -image images/cassette.jpg

Highest prob is 0.9981583952903748 at 481 (inference time: 0.3309464454650879 )
Probably [n02978881 cassette

The inference time is 1/3 of a second on my older Linux laptop with a dual-core i7.

Try it yourself by running the container and the class program. Download it from here (Linux).

Deploying Azure Cognitive Services Containers with IoT Edge

Introduction

Azure Cognitive Services is a collection of APIs that make your applications smarter. Some of those APIs are listed below:

  • Vision: image classification, face detection (including emotions), OCR
  • Language: text analytics (e.g. key phrase or sentiment analysis), language detection and translation

To use one of the APIs you need to provision it in an Azure subscription. After provisioning, you will get an endpoint and API key. Every time you want to classify an image or detect sentiment in a piece of text, you will need to post an appropriate payload to the cloud endpoint and pass along the API key as well.

What if you want to use these services but you do not want to pass your payload to a cloud endpoint for compliance or latency reasons? In that case, the Cognitive Services containers can be used. In this post, we will take a look at the Text Analytics containers, specifically the one for Sentiment Analysis. Instead of deploying the container manually, we will deploy the container with IoT Edge.

IoT Edge Configuration

To get started, create an IoT Hub. The free tier will do just fine. When the IoT Hub is created, create an IoT Edge device. Next, configure your actual edge device to connect to IoT Hub with the connection string of the device you created in IoT Hub. Microsoft have a great tutorial to do all of the above, using a virtual machine in Azure as the edge device. The tutorial I linked to is the one for an edge device running Linux. When finished, the device should report its status to IoT Hub:

If you want to install IoT Edge on an existing device like a laptop, follow the procedure for Linux x64.

Once you have your edge device up and running, you can use the following command to obtain the status of your edge device: sudo systemctl status iotedge. The result:

Deploy Sentiment Analysis container

With the IoT Edge daemon up and running, we can deploy the Sentiment Analysis container. In IoT Hub, select your IoT Edge device and select Set modules:

In Set Modules you have the ability to configure the modules for this specific device. Modules are always deployed as containers and they do not have to be specifically designed or developed for use with IoT Edge. In the three step wizard, add the Sentiment Analysis container in the first step. Click Add and then select IoT Edge Module. Provide the following settings:

Although the container can freely be pulled from the Image URI, the container needs to be configured with billing info and an API key. In the Billing environment variable, specify the endpoint URL for the API you configured in the cloud. In ApiKey set your API key. Note that the container always needs to be connected to the cloud to verify that you are allowed to use the service. Remember that although your payload is not sent to the cloud, your container usage is. The full container create options are listed below:

{
"Env": [
"Eula=accept",
"Billing=https://westeurope.api.cognitive.microsoft.com/text/analytics/v2.0",
"ApiKey=<yourKEY>"
],
"HostConfig": {
"PortBindings": {
"5000/tcp": [
{
"HostPort": "5000"
}
]
}
}
}

In HostConfig we ask the container runtime (Docker) to map port 5000 of the container to port 5000 of the host. You can specify other create options as well.

On the next page, you can configure routing between IoT Edge modules. Because we do not use actual IoT Edge modules, leave the configuration as shown below:

Now move to the last page in the Set Modules wizard to review the configuration and click Submit.

Give the deployment some time to finish. After a while, check your edge device with the following command: sudo iotedge list. Your TextAnalytics container should be listed. Alternatively, use sudo docker ps to list the Docker containers on your edge device.

Testing the Sentiment Analysis container

If everything went well, you should be able to go to http://localhost:5000/swagger to see the available endpoints. Open Sentiment Analysis to try out a sample:

You can use curl to test as well:

curl -X POST "http://localhost:5000/text/analytics/v2.0/sentiment" -H  "accept: application/json" -H  "Content-Type: application/json-patch+json" -d "{  \"documents\": [    {      \"language\": \"en\",      \"id\": \"1\",      \"text\": \"I really really despise this product!! DO NOT BUY!!\"    }  ]}"

As you can see, the API expects a JSON payload with a documents array. Each document object has three fields: language, id and text. When you run the above command, the result is:

{"documents":[{"id":"1","score":0.0001703798770904541}],"errors":[]}

In this case, the text I really really despise this product!! DO NOT BUY!! clearly results in a very bad score. As you might have guessed, 0 is the absolute worst and 1 is the absolute best.

Just for fun, I created a small Go program to test the API:

The Go program can be found here: https://github.com/gbaeke/sentiment. You can download the executable for Linux with: wget https://github.com/gbaeke/sentiment/releases/download/v0.1/ta. Make ta executable and use ./ta –help for help with the parameters.

Summary

IoT Edge is a great way to deploy containers to edge devices running Linux or Windows. Besides deploying actual IoT Edge modules, you can deploy any container you want. In this post, we deployed a Cognitive Services container that does Sentiment Analysis at the edge.

%d bloggers like this: