Dapr Service Invocation between an HTTP Python client and a GRPC Go server

Recently, I published several videos about Dapr on my Youtube channel. The videos cover the basics of state management, PubSub and service invocation.

The Getting Started with state management and service invocation:

Let’s take a closer look at service invocation with HTTP, Python and Node.

Service Invocation with HTTP

Service invocation (from Dapr docs)

The services you write (here Service A and B) talk to each other using the Dapr runtime. On Kubernetes, you talk to a Dapr sidecar deployed alongside your service container. On your development machine, you run your services via dapr run.

If you want to expose a method on Service B and you use HTTP, you just need to expose an HTTP handler or route. For example, with Express in Node you would use something like:

const express = require('express');
const app = express();
app.post('/neworder', (req, res) => { your code }

You then run your service and annotate it with the proper Dapr annotations (Kubernetes):

annotations:
        dapr.io/enabled: "true"
        dapr.io/id: "node"
        dapr.io/port: "3000"

On your local machine, you would just run the service via dapr run:

dapr run --app-id node --app-port 3000 node app.js

In the last example, the Dapr id is node and we indicate that the service is listening on port 3000. To invoke the method from service A, it can use the following code (Python example shown):

dapr_port = os.getenv("DAPR_HTTP_PORT", 3500)
dapr_url = "http://localhost:{}/v1.0/invoke/node/method/neworder".format(dapr_port)
message = {"data": {"orderId": 1234}}
response = requests.post(dapr_url, json=message, timeout=5)

As you can see, service A does not contact service B directly. It just talks to its Dapr sidecar on localhost (or Dapr on your dev machine) and asks it to invoke the neworder method via a service that uses Dapr id node. It is also clear that both service A and B use HTTP only. Because you just use HTTP to expose and invoke methods, you can use any language or framework.

You can find a complete example here with Node and Python.

Service Invocation with HTTP and GRPC

Dapr has SDKs available for C#, Go and other languages. You might prefer those over the generic HTTP approach. In the case of Go, the SDK uses GRPC to interface with the Dapr runtime. With Dapr in between, one service can use HTTP while another uses GRPC.

Let’s take a look at a service that exposes a method (HelloFromGo) from a Go application. The full example is here. Instead of creating an HTTP route with the name of your method, you use an OnInvoke handler that looks like this (only the start is shown, see the full code):

func (s *server) OnInvoke(ctx context.Context, in *commonv1pb.InvokeRequest) (*commonv1pb.InvokeResponse, error) {
	var response string

	switch in.Method {
	case "HelloFromGo":

		response = s.HelloFromGo()

Naturally, you also have to implement an HelloFromGo() method as well:

// HelloFromGo is a simple demo method to invoke
func (s *server) HelloFromGo() string {
	return "Hello"

}

Another service can use any language or framework and invoke the above method with a POST to the following URL if the Dapr id of the Go service is goserver:

http://localhost:3500/v1.0/invoke/goserver/method/HelloFromGo

A POST to the above URL tells Dapr to execute the OnInvoke method via GRPC which will run the HelloFromGo function. It is perfectly possible to include a payload in your POST and have the OnInvoke handler to process that payload. The full example is here which also includes sending and processing a JSON payload and sending back a text response. You will need to somewhat understand how GRPC works and also understand protocol buffers. A good book on GRPC is the following one: https://learning.oreilly.com/library/view/grpc-up-and/9781492058328/.

Conclusion

Dapr allows you to choose between HTTP and GRPC interfaces to interact with the runtime. You can choose whatever is most comfortable to you. One team can use HTTP with Python, JavaScript etc… while other teams use GRPC with their language of choice. Whatever you choose, the Dapr runtime will make sure service invocation just works allowing you to focus on the code.

Update to IoT Simulator

Quite a while ago, I wrote a small IoT Simulator in Go that creates or deletes multiple IoT devices in IoT Hub and sends telemetry at a preset interval. However, when you use version 0.4 of the simulator, you will encounter issues in the following cases:

  • You create a route to store telemetry in an Azure Storage account: the telemetry will be base 64 encoded
  • You create an Event Grid subscription that forwards the telemetry to an Azure Function or other target: the telemetry will be base 64 encoded

For example, in Azure Storage, when you store telemetry in JSON format, you will see something like this with versions 0.4 and older:

{"EnqueuedTimeUtc":"2020-02-10T14:13:19.0770000Z","Properties":{},"SystemProperties":{"connectionDeviceId":"dev35","connectionAuthMethod":"{\"scope\":\"hub\",\"type\":\"sas\",\"issuer\":\"iothub\",\"acceptingIpFilterRule\":null}","connectionDeviceGenerationId":"637169341138506565","contentType":"application/json","contentEncoding":"","enqueuedTime":"2020-02-10T14:13:19.0770000Z"},"Body":"eyJUZW1wZXJhdHVyZSI6MjYuNjQ1NjAwNTMyMTg0OTA0LCJIdW1pZGl0eSI6NDQuMzc3MTQxODcxODY5OH0="}

Note that the body is base 64 encoded. The encoding stems from the fact that UTF-8 encoding was not specified as can be seen in the JSON. contentEncoding is indeed empty and the contentType does not mention the character set.

To fix that, a small code change was required. Note that the code uses HTTP to send telemetry, not MQTT or AMQP:

Setting the character set as part of the content type

With the character set as UTF-8, the telemetry in the Storage Account will look like this:

{"EnqueuedTimeUtc":"2020-02-11T15:02:07.9520000Z","Properties":{},"SystemProperties":{"connectionDeviceId":"dev15","connectionAuthMethod":"{\"scope\":\"hub\",\"type\":\"sas\",\"issuer\":\"iothub\",\"acceptingIpFilterRule\":null}","connectionDeviceGenerationId":"637169341138088841","contentType":"application/json; charset=utf-8","contentEncoding":"","enqueuedTime":"2020-02-11T15:02:07.9520000Z"},"Body":{"Temperature":20.827852028684607,"Humidity":49.95058826575425}}

Note that contentEncoding is still empty here, but contentType includes the charset. That is enough for the body to be in plain text.

The change will also allow you to use queries on the body in IoT Hub message routing filters or Event Grid subscription filters.

Enjoy the new version 0.5! All three of you… 😉😉😉

Streamlined Kubernetes Development with Draft

A longer time ago, I wrote a post about draft. Draft is a tool to streamline your Kubernetes development experience. It basically automates, based on your code, the creation of a container image, storing the image in a registry and installing a container based on that image using a Helm chart. Draft is meant to be used during the development process while you are still messing around with your code. It is not meant as a deployment mechanism in production.

The typical workflow is the following:

  • in the folder with your source files, run draft create
  • to build, push and install the container run draft up; in the background a Helm chart is used
  • to see the logs and connect to the app in your container over an SSH tunnel, run draft connect
  • modify your code and run draft up again
  • rinse and repeat…

Let’s take a look at how it works in a bit more detail, shall we?

Prerequisites

Naturally, you need a Kubernetes cluster with kubectl, the Kubernetes cli, configured to use that cluster.

Next, install Helm on your system and install Tiller, the server-side component of Helm on the cluster. Full installation instructions are here. If your cluster uses rbac, check out how to configure the proper service account and role binding. Run helm init to initialize Helm locally and install Tiller at the same time.

Now install draft on your system. Check out the quickstart for installation instructions. Run draft init to initialize it.

Getting some source code

Let’s use a small Go program to play with draft. You can use the realtime-go repository. Clone it to your system and checkout the httponly branch:

git clone https://github.com/gbaeke/realtime-go.git
git checkout httponly

You will need a redis server as a back-end for the realtime server. Let’s install that the quick and dirty way:

kubectl run redis --image=redis --replicas=1 
kubectl expose deploy/redis –port 6379  

Running draft create

In the realtime-go folder, run draft create. You should get the following output:

draft create output

The command tries to detect the language and it found several. In this case, because there is no pack for Coq (what is that? 😉) and HTML, it used Go. Knowing the language, draft creates a simple Dockerfile if there is no such file in the folder:

FROM golang
ENV PORT 8080
EXPOSE 8080

WORKDIR /go/src/app
COPY . .

RUN go get -d -v ./...
RUN go install -v ./...

CMD ["app"] 

Usually, I do not use the Dockerfile created by draft. If there already is a Dockerfile in the folder, draft will use that one. That’s what happened in our case because the folder contains a 2-stage Dockerfile.

Draft created some other files as well:

  • draft.toml: configuration file (more info); can be used to create environments like staging and production with different settings such as the Kubernetes namespace to deploy to or the Dockerfile to use
  • draft.tasks.toml: run commands before or after you deploy your container with draft (more info); we could have used this to install and remove the redis container
  • .draftignore: yes, to ignore stuff

Draft also created a charts folder that contains the Helm chart that draft will use to deploy your container. It can be modified to suit your particular needs as we will see later.

Helm charts folder and a partial view on the deployment.yaml file in the chart

Setting the container registry

In older versions of draft, the source files were compressed and sent to a sever-side component that created the container. At present though, the container is built locally and then pushed to a registry of your choice. If you want to use Azure Container Registry (ACR), run the following commands (set and login):

draft config set registry REGISTRYNAME.azurecr.io
az acr login -n REGISTRYNAME

Note that you need the Azure CLI for the last command. You also need to set the subscription to the one that contains the registry you reference.

With this configuration, you need Docker on your system. Docker will build and push the container. If you want to build in the cloud, you can use ACR Build Tasks. To do that, use these commands:

draft config set container-builder acrbuild
draft config set registry REGISTRYNAME.azurecr.io
draft config set resource-group-name RESOURCEGROUPNAME

Make sure your are logged in to the subscription (az login) and login to ACR as well before continuing. In this example, I used ACR build tasks.

Note: because ACR build tasks do not cache intermediate layers, this approach can lead to longer build times; when the image is small as in this case, doing a local build and push is preferred!

Running draft up

We are now ready to run draft up. Let’s do so and see what happens:

results of draft up

YES!!!! Draft built the container image and released it. Run helm ls to check the release. It did not have to push the image because it was built in ACR and pushed from there. Let’s check the ACR build logs in the portal (you can also use the draft logs command):

acr build log for the 2-stage Docker build

Fixing issues

Although the container is properly deployed (check it with helm ls), if you run kubectl get pods you will notice an error:

container error

In this case, the container errors out because it cannot find the redis host, which is a dependency. We can tell the container to look for redis via a REDISHOST environment variable. You can add it to deployment.yaml in the chart like so:

environment variable in deployment.yaml

After this change, just run draft up again and hope for the best!

Running draft connect

With the realtime-go container up and running, run draft connect:

output of draft connect

This maps a local port on your system to the remote port over an ssh tunnel. In addition, it streams the logs from the container. You can now connect to http://localhost:18181 (or whatever port you’ll get):

Great success! The app is running

If you want a public IP for your service, you can modify the Helm chart. In values.yaml, set service.type to LoadBalancer instead of ClusterIP and run draft up again. You can verify the external IP by running kubectl get svc.

Conclusion

Working with draft while your are working on one or more containers and still hacking away at your code really is smooth sailing. If you are not using it yet, give it a go and see if you like it. I bet you will!

Using TensorFlow models in Go

Image via www.vpnsrus.com

In earlier posts, I discussed hosting a deep learning model such as Resnet50 on Kubernetes or Azure Container Instances. The model can then be used as any API which receives input as JSON and returns a result as JSON.

Naturally, you can also run the model in offline scenarios and directly from your code. In this post, I will take a look at calling a TensorFlow model from Go. If you want to follow along, you will need Linux or MacOS because the Go module does not support Windows.

Getting Ready

I installed an Ubuntu Data Science Virtual Machine on Azure and connected to it with X2Go:

Data Science Virtual Machine (Ubuntu) with X2Go

The virtual machine has all the required machine learning tools installed such as TensorFlow and Python. It also has Visual Studio Code. There are some extra requirements though:

  • Go: follow the instructions here to download and install Go
  • TensorFlow C API: follow the instructions here to download and install the C API; the TensorFlow package for Go requires this; it is recommended to also build and run the Hello from TensorFlow C program to verify that the library works (near the bottom of the instructions page)

After installing Go and the TensorFlow C API, install the TensorFlow Go package with the following command:

go get github.com/tensorflow/tensorflow/tensorflow/go

Test the package with go test:

go test github.com/tensorflow/tensorflow/tensorflow/go

The above command should return:

ok      github.com/tensorflow/tensorflow/tensorflow/go  0.104s

The go get command installed the package in $HOME/go/src/github.com if you did not specify a custom $GOPATH (see this wiki page for more info).

Getting a model

A model describes how the input (e.g. an image for image classification) gets translated to an output (e.g. a list of classes with probabilities). The model contains thousands or even millions of parameters which means a model can be quite large. In this example, we will use NASNetMobile which can be used to classify images.

Now we need some code to save the model in TensorFlow format so that it can be used from a Go program. The code below is based on the sample code on the NASNetMobile page from modeldepot.io. It also does a quick test inference on a cat image.

import keras
from keras.applications.nasnet import NASNetMobile
from keras.preprocessing import image
from keras.applications.xception import preprocess_input, decode_predictions
import numpy as np
import tensorflow as tf
from keras import backend as K

sess = tf.Session()
K.set_session(sess)

model = NASNetMobile(weights="imagenet")
img = image.load_img('cat.jpg', target_size=(224,224))
img_arr = np.expand_dims(image.img_to_array(img), axis=0)
x = preprocess_input(img_arr)
preds = model.predict(x)
print('Prediction:', decode_predictions(preds, top=5)[0])

#save the model for use with TensorFlow
builder = tf.saved_model.builder.SavedModelBuilder("nasnet")

#Tag the model, required for Go
builder.add_meta_graph_and_variables(sess, ["atag"])
builder.save()
sess.close()

On the Ubuntu Data Science Virtual Machine, the above code should execute without any issues because all Python packages are already installed. I used the py35 conda environment. Use activate py35 to make sure you are in that environment.

The above code results in a nasnet folder, which contains the saved_model.pb file for the graph structure. The actual weights are in the variables subfolder. In total, the nasnet folder is around 38MB.

Great! Now we need a way to use the model from our Go program.

Using the saved model from Go

The model can be loaded with the LoadSavedModel function of the TensorFlow package. That package is imported like so:

import (
tf "github.com/tensorflow/tensorflow/tensorflow/go"
)

LoadSavedModel is used like so:

model, err := tf.LoadSavedModel("nasnet",
[]string{"atag"}, nil)
if err != nil {
log.Fatal(err)
}

The above code simply tries to load the model from the nasnet folder. We also need to specify the tag.

Next, we need to load an image and convert the image to a tensor with the following dimensions [1][224][224][3]. This is similar to my earlier ResNet50 post.

Now we need to pass the tensor to the model as input, and retrieve the class predictions as output. The following code achieves this:

output, err := model.Session.Run(
map[tf.Output]*tf.Tensor{
model.Graph.Operation("input_1").Output(0): input,
},
[]tf.Output{
model.Graph.Operation("predictions/Softmax").Output(0),
},
nil,
)
if err != nil {
log.Fatal(err)
}

What the heck is this? The run method is defined as follows:

func (s *Session) Run(feeds map[Output]*Tensor, fetches []Output, targets []*Operation) ([]*Tensor, error)

When you build a model, you can give names to tensors and operations. In this case the input tensor (of dimensions [1][224][224][3]) is called input_1 and needs to be specified as a map. The inference operation is called predictions/Softmax and the output needs to be specified as an array.

The actual predictions can be retrieved from the output variable:

predictions, ok := output[0].Value().([][]float32)
if !ok {
log.Fatal(fmt.Sprintf("output has unexpected type %T", output[0].Value()))
}

If you are not very familiar with Go, the code above uses type assertion to verify that predictions is a 2-dimensional array of float32. If the type assertion succeeds, the predictions variable will contain the actual predictions: [[<probability class 1 (tench)>, <probability class 2 (goldfish)>, …]]

You can now simply find the top prediction(s) in the array and match them with the list of classes for NASNet (actually the ImageNet classes). I get the following output with a cat image:

Yep, it’s a tabby!

If you are wondering what image I used:

Tabby?

Conclusion

With Go’s TensorFlow bindings, you can load TensorFlow models from disk and use them for inference locally, without having to call a remote API. We used Python to prepare the model with some help from Keras.

Building a real-time messaging server in Go

Often, I need a simple real-time server and web interface that shows real-time events. Although there are many options available like socket.io for Node.js or services like Azure SignalR and PubNub, I decided to create a real-time server in Go with a simple web front-end:

The impressive UI of the real-time web front-end

For a real-time server in Go, there are several options. You could use Gorilla WebSocket of which there is an excellent tutorial, and use native WebSockets in the browser. There’s also Glue. However, if you want to use the socket.io client, you can use https://github.com/googollee/go-socket.io. It is an implementation, although not a complete one, of socket.io. For production scenarios, I recommend using socket.io with Node.js because it is heavily used, has more features, better documentation, etc…

With that out of the way, let’s take a look at the code. Some things to note in advance:

  • the code uses the concept of rooms (as in a chat room); clients can join a room and only see messages for that room; you can use that concept to create a “room” for a device and only subscribe to messages for that device
  • the code use the excellent https://github.com/mholt/certmagic to enable https via a Let’s Encrypt certificate (DNS-01 verification)
  • the code uses Redis as the back-end; applications send messages to Redis via a PubSub channel; the real-time Go server checks for messages via a subscription to one or more Redis channels

The code is over at https://github.com/gbaeke/realtime-go.

Server

Let’s start with the imports. Naturally we need Redis support, the actual go-socket.io packages and certmagic. The cloudflare package is needed because my domain baeke.info is managed by CloudFlare. The package gives certmagic the ability to create the verification record that Let’s Encrypt will check before issuing the certificate:

import (
"log"
"net/http"
"os"

"github.com/go-redis/redis"
socketio "github.com/googollee/go-socket.io"
"github.com/mholt/certmagic"
"github.com/xenolf/lego/providers/dns/cloudflare"
)

Next, the code checks if the RTHOST environment variable is set. RTHOST should contain the hostname you request the certificate for (e.g. rt.baeke.info).

Let’s check the block of code that sets up the Redis connection.

// redis connection
client := redis.NewClient(&redis.Options{
Addr: getEnv("REDISHOST", "localhost:6379"),
})

// subscribe to all channels
pubsub := client.PSubscribe("*")
_, err := pubsub.Receive()
if err != nil {
panic(err)
}

// messages received on a Go channel
ch := pubsub.Channel()

First, we create a new Redis client. We either use the address in the REDISHOST environment variable or default to localhost:6379. I will later run this server on Azure Container Instances (ACI) in a multi-container setup that also includes Redis.

With the call to PSubscribe, a pattern subscribe is used to subscribe to all PubSub channels (*). If the subscribe succeeds, a Go channel is setup to actually receive messages on.

Now that the Redis connection is configured, let’s turn to socket.io:

server, err := socketio.NewServer(nil)
if err != nil {
log.Fatal(err)
}

server.On("connection", func(so socketio.Socket) {
log.Printf("New connection from %s ", so.Id())

so.On("channel", func(channel string) {
log.Printf("%s joins channel %s\n", so.Id(), channel)
so.Join(channel)
})

so.On("disconnection", func() {
log.Printf("disconnect from %s\n", so.Id())
})
})

The above code is pretty simple. We create a new socket.io server and subsequently setup event handlers for the following events:

  • connection: code that runs when a web client connects; gives us the socket the client connects on which is further used by the channel and disconnection handler
  • channel: this handler runs when a client sends a message of the chosen type channel; the channel contains the name of the socket.io room to join; this is used by the client to indicate what messages to show (e.g. just for device01); in the browser, the client sends a channel message that contains the text “device01”
  • disconnection: code to run when the client disconnects from the socket

Naturally, something crucial is missing. We need to check Redis for messages in Redis channels and broadcast them to matching socket.io “channels”. This is done in a Go routine that runs concurrently with the main code:

 go func(srv *socketio.Server) {
   for msg := range ch {
      log.Println(msg.Channel, msg.Payload)
      srv.BroadcastTo(msg.Channel, "message", msg.Payload)
   }
 }(server)

The anonymous function accepts a parameter of type socketio.Server. We use the BroadcastTo method of socketio.Server to broadcast messages arriving on the Redis PubSub channels to matching socket.io channels. Note that we send a message of type “message” so the client will have to check for “message” coming in as well. Below is a snippet of client-side code that does that. It adds messages to the messages array defined on the Vue.js app:

socket.on('message', function(msg){
app.messages.push(msg)
}

The rest of the server code basically configures certmagic to request the Let’s Encrypt certificate and sets up the http handlers for the static web client and the socket.io server:

// certificate magic
certmagic.Agreed = true
certmagic.CA = certmagic.LetsEncryptStagingCA

cloudflare, err := cloudflare.NewDNSProvider()
if err != nil {
log.Fatal(err)
}

certmagic.DNSProvider = cloudflare

mux := http.NewServeMux()
mux.Handle("/socket.io/", server)
mux.Handle("/", http.FileServer(http.Dir("./assets")))

certmagic.HTTPS([]string{rthost}, mux)

Let’s try it out! The GitHub repository contains a file called multi.yaml, which deploys both the socket.io server and Redis to Azure Container Instances. The following images are used:

  • gbaeke/realtime-go-le: built with this Dockerfile; the image has a size of merely 14MB
  • redis: the official Redis image

To make it work, you will need to update the environment variables in multi.yaml with the domain name and your CloudFlare credentials. If you do not use CloudFlare, you can use one of the other providers. If you want to use the Let’s Encrypt production CA, you will have to change the code, rebuild the container, store it in your registry and modify multi.yaml accordingly.

In Azure Container Instances, the following is shown:

socket.io and Redis container in ACI

To test the setup, I can send a message with redis-cli, from a console to the realtime-redis container:

Testing with redis-cli in the Redis container

You should be aware that using CertMagic with ephemeral storage is NOT a good idea due to potential Let’s Encrypt rate limiting. You should store the requested certificates in persistent storage like an Azure File Share and mount it at /.local/share/certmagic!

Client

The client is a Vue.js app. It was not created with the Vue cli so it just grabs the Vue.js library from the content delivery network (CDN) and has all logic in a single page. The socket.io library (v1.3.7) is also pulled from the CDN. The socket.io client code is kept at a minimum for demonstration purposes:

 var socket = io();
socket.emit('channel','device01');
socket.on('message', function(msg){
app.messages.push(msg)
})

When the page loads, the client emits a channel message to the server with a payload of device01. As you have seen in the server section, the server reacts to this message by joining this client to a socket.io room, in this case with name device01.

Whenever the client receives a message from the server, it adds the message to the messages array which is bound to a list item (li) with a v-for directive.

Surprisingly easy no? With a few lines of code you have a fully functional real-time messaging solution!

Running a GoCV application in a container

In earlier posts (like here and here) I mentioned GoCV. GoCV allows you to use the popular OpenCV library from your Go programs. To avoid installing OpenCV and having to compile it from source, a container that runs your GoCV app can be beneficial. This post provides information about doing just that.

The following GitHub repository, https://github.com/denismakogon/gocv-alpine, contains all you need to get started. It’s for OpenCV 3.4.2 so you will run into issues when you want to use OpenCV 4.0. The pull request, https://github.com/denismakogon/gocv-alpine/pull/7, contains the update to 4.0 but it has not been merged yet. I used the proposed changes in the pull request to build two containers:

  • the build container: gbaeke/gocv-4.0.0-build
  • the run container: gbaeke/gocv-4.0.0-run

They are over on Docker Hub, ready for use. To actually use the above images in a typical two-step build, I used the following Dockerfile:

FROM gbaeke/gocv-4.0.0-build as build       
RUN go get -u -d gocv.io/x/gocv
RUN go get -u -d github.com/disintegration/imaging
RUN go get -u -d github.com/gbaeke/emotion
RUN cd $GOPATH/src/github.com/gbaeke/emotion && go build -o $GOPATH/bin/emo ./main.go

FROM gbaeke/gocv-4.0.0-run
COPY --from=build /go/bin/emo /emo
ADD haarcascade_frontalface_default.xml /

ENTRYPOINT ["/emo"]

The above Dockerfile uses the webcam emotion detection program from https://github.com/gbaeke/emotion. To run it on a Linux system, use the following command:

docker run -it --rm --device=/dev/video0 --env SCOREURI="YOUR-SCORE-URI" --env VIDEO=0 gbaeke/emo

The SCOREURI environment variable needs to refer to the score URI offered by the ONNX FER+ container as discussed in Detecting Emotions with FER+. With VIDEO=0 the GUI window that shows the webcam video stream is turned off (required). Detected emotions will be logged to the console.

To be able to use the actual webcam of the host, the –device flag is used to map /dev/video0 from the host to the container. That works well on a Linux host and was tested on a laptop running Ubuntu 16.04.

Detecting emotions with FER+

In an earlier post, I discussed classifying images with the ResNet50v2 model. Azure Machine Learning Service was used to create a container image that used the ONNX ResNet50v2 model and the ONNX Runtime for scoring.

Continuing on that theme, I created a container image that uses the ONNX FER+ model that can detect emotions in an image. The container image also uses the ONNX Runtime for scoring.

You might wonder why you would want to detect emotions this way when there are many services available that can do this for you with a simple API call! You could use Microsoft’s Face API or Amazon’s Rekognition for example. While those services are easy to use and provide additional features, they do come at a cost. If all you need is basic detection of emotions, using this FER+ container is sufficient and cost effective.

Azure Face API (image from Microsoft website)

A notebook to create the image and deploy a container to Azure Container Instances (ACI) can be found here. The notebook uses the Azure Machine Learning SDK to register the model to an Azure Machine Learning workspace, build a container image from that model and deploy the container to ACI. The scoring script score.py is shown below.

score.py

The model expects an 64×64 gray scale image of a face in an array with the following dimensions: [1][1][64][64]. The output is JSON with a results array that contains the probabilities for each emotion and a time field with the inference time.

The emotion probabilities are in this order:

0: "neutral", 1: "happy", 2: "surprise", 3: "sadness", 4: "anger", 5: "disgust", 6: "fear", 7: "contempt

To actually capture the emotions, I wrote a small demo program in Go that uses OpenCV (via GoCV). You can find it on GitHub: https://github.com/gbaeke/emotion. You will need to install OpenCV and GoCV. Find the instructions here: https://gocv.io/getting-started/linux/. There are similar instructions for Mac and Windows but I have not tried those

The program is still a little rough around the edges but it does the trick. The scoring URI is hard coded to http://localhost:5002/score. With Docker installed, use the following command to install the scoring container:

 docker run -d -p 5002:5001 gbaeke/onnxferplus

Have fun with it!

ResNet50v2 classification in Go with a local container

To quickly go to the code, go here. Otherwise, keep reading…

In a previous blog post, I wrote about classifying images with the ResNet50v2 model from the ONNX Model Zoo. In that post, the container ran on a Kubernetes cluster with GPU nodes. The nodes had an NVIDIA v100 GPU. The actual classification was done with a simple Python script with help from Keras and Numpy. Each inference took around 25 milliseconds.

In this post, we will do two things:

  • run the scoring container (CPU) on a local machine that runs Docker
  • perform the scoring (classification) in Go

Installing the scoring container locally

I pushed the scoring container with the ONNX ResNet50v2 image to the following location: https://cloud.docker.com/u/gbaeke/repository/docker/gbaeke/onnxresnet50v2. Run the container with the following command:

docker run -d -p 5001:5001 gbaeke/onnxresnet50

The container will be pulled and started. The scoring URI is on http://localhost:5001/score.

Note that in the previous post, Azure Machine Learning deployed two containers: the scoring container (the one described above) and a front-end container. In that scenario, the front-end container handles the HTTP POST requests (optionally with SSL) and route the request to the actual scoring container.

The scoring container accepts the same payload as the front-end container. That means it can be used on its own, as we are doing now.

Note that you can also use IoT Edge, as explained in an earlier post. That actually shows how easy it is to push AI models to the edge and use them locally, befitting your business case.

Scoring with Go

To actually classify images, I wrote a small Go program to do just that. Although there are some scientific libraries for Go, they are not really needed in this case. That means we do have to create the 4D tensor payload and interpret the softmax result manually. If you check the code, you will see that is not awfully difficult.

The code can be found in the following GitHub repository: https://github.com/gbaeke/resnet-score.

Remember that this model expects the input as a 4D tensor with the following dimensions:

  • dimension 0: batch (we only send one image here)
  • dimension 1: channels (one for each; RGB)
  • dimension 2: height
  • dimension 3: width

The 4D tensor needs to be serialized to JSON in a field called data. We send that data with HTTP POST to the scoring URI at http://localhost:5001/score.

The response from the container will be JSON with two fields: a result field with the 1000 softmax values and a time field with the inference time. We can use the following two structs for marshaling and unmarshaling

Input and output of the model

Note that this model expects pictures to be scaled to 224 by 224 as reflected by the height and width dimensions of the uint8 array. The rest of the code is summarized below:

  • read the image; the path of the image is passed to the code via the -image command line parameter
  • the image is resized with the github.com/disintegration/imaging package (linear method)
  • the 4D tensor is populated by iterating over all pixels of the image, extracting r,g and b and placing them in the BCHW array; note that the r,g and b values are uint16 and scaled to fit in a uint8
  • construct the input which is a struct of type InputData
  • marshal the InputData struct to JSON
  • POST the JSON to the local scoring URI
  • read the HTTP response and unmarshal the response in a struct of type OutputData
  • find the highest probability in the result and note the index where it was found
  • read the 1000 ImageNet categories from imagenet_class_index.json and marshal the JSON into a map of string arrays
  • print the category using the index with the highest probability and the map

What happens when we score the image below?

What is this thing?

Running the code gives the following result:

$ ./class -image images/cassette.jpg

Highest prob is 0.9981583952903748 at 481 (inference time: 0.3309464454650879 )
Probably [n02978881 cassette

The inference time is 1/3 of a second on my older Linux laptop with a dual-core i7.

Try it yourself by running the container and the class program. Download it from here (Linux).

%d bloggers like this: