ResNet50v2 classification in Go with a local container

To quickly go to the code, go here. Otherwise, keep reading…

In a previous blog post, I wrote about classifying images with the ResNet50v2 model from the ONNX Model Zoo. In that post, the container ran on a Kubernetes cluster with GPU nodes. The nodes had an NVIDIA v100 GPU. The actual classification was done with a simple Python script with help from Keras and Numpy. Each inference took around 25 milliseconds.

In this post, we will do two things:

  • run the scoring container (CPU) on a local machine that runs Docker
  • perform the scoring (classification) in Go

Installing the scoring container locally

I pushed the scoring container with the ONNX ResNet50v2 image to the following location: https://cloud.docker.com/u/gbaeke/repository/docker/gbaeke/onnxresnet50v2. Run the container with the following command:

docker run -d -p 5001:5001 gbaeke/onnxresnet50

The container will be pulled and started. The scoring URI is on http://localhost:5001/score.

Note that in the previous post, Azure Machine Learning deployed two containers: the scoring container (the one described above) and a front-end container. In that scenario, the front-end container handles the HTTP POST requests (optionally with SSL) and route the request to the actual scoring container.

The scoring container accepts the same payload as the front-end container. That means it can be used on its own, as we are doing now.

Note that you can also use IoT Edge, as explained in an earlier post. That actually shows how easy it is to push AI models to the edge and use them locally, befitting your business case.

Scoring with Go

To actually classify images, I wrote a small Go program to do just that. Although there are some scientific libraries for Go, they are not really needed in this case. That means we do have to create the 4D tensor payload and interpret the softmax result manually. If you check the code, you will see that is not awfully difficult.

The code can be found in the following GitHub repository: https://github.com/gbaeke/resnet-score.

Remember that this model expects the input as a 4D tensor with the following dimensions:

  • dimension 0: batch (we only send one image here)
  • dimension 1: channels (one for each; RGB)
  • dimension 2: height
  • dimension 3: width

The 4D tensor needs to be serialized to JSON in a field called data. We send that data with HTTP POST to the scoring URI at http://localhost:5001/score.

The response from the container will be JSON with two fields: a result field with the 1000 softmax values and a time field with the inference time. We can use the following two structs for marshaling and unmarshaling

Input and output of the model

Note that this model expects pictures to be scaled to 224 by 224 as reflected by the height and width dimensions of the uint8 array. The rest of the code is summarized below:

  • read the image; the path of the image is passed to the code via the -image command line parameter
  • the image is resized with the github.com/disintegration/imaging package (linear method)
  • the 4D tensor is populated by iterating over all pixels of the image, extracting r,g and b and placing them in the BCHW array; note that the r,g and b values are uint16 and scaled to fit in a uint8
  • construct the input which is a struct of type InputData
  • marshal the InputData struct to JSON
  • POST the JSON to the local scoring URI
  • read the HTTP response and unmarshal the response in a struct of type OutputData
  • find the highest probability in the result and note the index where it was found
  • read the 1000 ImageNet categories from imagenet_class_index.json and marshal the JSON into a map of string arrays
  • print the category using the index with the highest probability and the map

What happens when we score the image below?

What is this thing?

Running the code gives the following result:

$ ./class -image images/cassette.jpg

Highest prob is 0.9981583952903748 at 481 (inference time: 0.3309464454650879 )
Probably [n02978881 cassette

The inference time is 1/3 of a second on my older Linux laptop with a dual-core i7.

Try it yourself by running the container and the class program. Download it from here (Linux).

Deploying Azure Cognitive Services Containers with IoT Edge

Introduction

Azure Cognitive Services is a collection of APIs that make your applications smarter. Some of those APIs are listed below:

  • Vision: image classification, face detection (including emotions), OCR
  • Language: text analytics (e.g. key phrase or sentiment analysis), language detection and translation

To use one of the APIs you need to provision it in an Azure subscription. After provisioning, you will get an endpoint and API key. Every time you want to classify an image or detect sentiment in a piece of text, you will need to post an appropriate payload to the cloud endpoint and pass along the API key as well.

What if you want to use these services but you do not want to pass your payload to a cloud endpoint for compliance or latency reasons? In that case, the Cognitive Services containers can be used. In this post, we will take a look at the Text Analytics containers, specifically the one for Sentiment Analysis. Instead of deploying the container manually, we will deploy the container with IoT Edge.

IoT Edge Configuration

To get started, create an IoT Hub. The free tier will do just fine. When the IoT Hub is created, create an IoT Edge device. Next, configure your actual edge device to connect to IoT Hub with the connection string of the device you created in IoT Hub. Microsoft have a great tutorial to do all of the above, using a virtual machine in Azure as the edge device. The tutorial I linked to is the one for an edge device running Linux. When finished, the device should report its status to IoT Hub:

If you want to install IoT Edge on an existing device like a laptop, follow the procedure for Linux x64.

Once you have your edge device up and running, you can use the following command to obtain the status of your edge device: sudo systemctl status iotedge. The result:

Deploy Sentiment Analysis container

With the IoT Edge daemon up and running, we can deploy the Sentiment Analysis container. In IoT Hub, select your IoT Edge device and select Set modules:

In Set Modules you have the ability to configure the modules for this specific device. Modules are always deployed as containers and they do not have to be specifically designed or developed for use with IoT Edge. In the three step wizard, add the Sentiment Analysis container in the first step. Click Add and then select IoT Edge Module. Provide the following settings:

Although the container can freely be pulled from the Image URI, the container needs to be configured with billing info and an API key. In the Billing environment variable, specify the endpoint URL for the API you configured in the cloud. In ApiKey set your API key. Note that the container always needs to be connected to the cloud to verify that you are allowed to use the service. Remember that although your payload is not sent to the cloud, your container usage is. The full container create options are listed below:

{
"Env": [
"Eula=accept",
"Billing=https://westeurope.api.cognitive.microsoft.com/text/analytics/v2.0",
"ApiKey=<yourKEY>"
],
"HostConfig": {
"PortBindings": {
"5000/tcp": [
{
"HostPort": "5000"
}
]
}
}
}

In HostConfig we ask the container runtime (Docker) to map port 5000 of the container to port 5000 of the host. You can specify other create options as well.

On the next page, you can configure routing between IoT Edge modules. Because we do not use actual IoT Edge modules, leave the configuration as shown below:

Now move to the last page in the Set Modules wizard to review the configuration and click Submit.

Give the deployment some time to finish. After a while, check your edge device with the following command: sudo iotedge list. Your TextAnalytics container should be listed. Alternatively, use sudo docker ps to list the Docker containers on your edge device.

Testing the Sentiment Analysis container

If everything went well, you should be able to go to http://localhost:5000/swagger to see the available endpoints. Open Sentiment Analysis to try out a sample:

You can use curl to test as well:

curl -X POST "http://localhost:5000/text/analytics/v2.0/sentiment" -H  "accept: application/json" -H  "Content-Type: application/json-patch+json" -d "{  \"documents\": [    {      \"language\": \"en\",      \"id\": \"1\",      \"text\": \"I really really despise this product!! DO NOT BUY!!\"    }  ]}"

As you can see, the API expects a JSON payload with a documents array. Each document object has three fields: language, id and text. When you run the above command, the result is:

{"documents":[{"id":"1","score":0.0001703798770904541}],"errors":[]}

In this case, the text I really really despise this product!! DO NOT BUY!! clearly results in a very bad score. As you might have guessed, 0 is the absolute worst and 1 is the absolute best.

Just for fun, I created a small Go program to test the API:

The Go program can be found here: https://github.com/gbaeke/sentiment. You can download the executable for Linux with: wget https://github.com/gbaeke/sentiment/releases/download/v0.1/ta. Make ta executable and use ./ta –help for help with the parameters.

Summary

IoT Edge is a great way to deploy containers to edge devices running Linux or Windows. Besides deploying actual IoT Edge modules, you can deploy any container you want. In this post, we deployed a Cognitive Services container that does Sentiment Analysis at the edge.

Using certmagic to add SSL to webhookd

A while ago, I stumbled upon https://github.com/ncarlier/webhookd. It is a simple webhook server, written in Go, that can execute shell scripts. To use it, simply install it on a Linux box and execute it. By default, the executable looks at the ./scripts folder and maps each shell script to a URL you can call. It is well documented so do take a look at the GitHub page for full details on its configuration.

Out of the box, webhookd supports basic authentication if you supply a .htpasswd file. It does not, however, support SSL. That can be fixed in several ways though. In my case, I wanted one executable that supports SSL with Let’s Encrypt certificates. As it turns out, there is a great solution for that: https://github.com/mholt/certmagic.

To simplify using webhookd together with certmagic, I forked the webhookd repo and added certmagic support. The fork is here: https://github.com/gbaeke/webhookd. To use it, use go get github.com/gbaeke/webhookd and work from there. The fork could be improved by adding extra parameters for e-mail address and DNS name. For now, change the code by following the steps below:

  • In main.go, search for mail@mail.com and replace it with a valid e-mail address; although not required it is a good practice to supply an e-mail address to the folks at Let’s Encrypt
  • In main.go, search for www.example.com and replace it with a valid DNS name
  • The DNS name you use needs to resolve to the IP address of the machine that runs webhookd; it should be a public IP address because the code uses the HTTP challenger
  • The machine that runs webhookd should expose port 80 and port 443
  • If you want to use the Let’s Encrypt staging servers during testing (recommended) change certmagic.LetsEncryptProductionCA to certmagic.LetsEncryptStagingCA

In my case, the machine that runs webhookd is a small Linux machine running on Microsoft Azure. The DNS name is actually a CNAME record that is an alias for the DNS name of the virtual machine (e.g. vmname.westeurope.cloudapp.azure.com). You are now ready to build webhookd with go build. When it’s ready, just execute webhookd. When you run this the first time, certmagic will notice there is no certificate and will start to talk to Let’s Encrypt using the ACME protocol. By default, HTTP verification is used which just means Let’s Encrypt will tell certmagic to host a file over plain HTTP. When Let’s Encrypt can retrieve that file, it concludes you must be the owner of the DNS name used in the certificate. The certificate will be issued and stored on the file system under $home/.local/share/certmagic/acme.

You will get some messages regarding the certificate request process as shown below. When the cached certificate is found and it is valid, you will just get the Serving HTTP->HTTPS message:

image

Note that you will not be able to bind to low ports like 80 and 443 as a non-root user. So either run webhookd as root or use setcap. For instance sudo setcap cap_net_bind_service=+ep /path/to/webhookd. After running the setcap command, you can run webhookd as a non-root user and it will be able to bind to port 80 and 443.

I also have basic authentication enabled for a user called api. To test the configuration, I can use curl like so:

image

Due to the use of the Let’s Encrypt production CA, there is no need to use the –insecure flag with curl. The certificate is fully trusted on my (Windows) machine. If you pulled down the complete repository, the scripts folder contains a shell script called echo.sh. That script is automatically made available as /echo. Everything the script echoes to stdout is used as output of the HTTP call. Simple but effective!

In a follow-up post, we will take a look at using webhookd to deploy Azure resources using a managed identity and the Azure CLI. Stay tuned!

Draft: a simpler way to deploy to Kubernetes during development

If you work with containers and work with Kubernetes, Draft makes it easier to deploy your code while you are in the earlier development stages. You use Draft while you are working on your code but before you commit it to version control. The idea is simple:

  • You have some code written in something like Node.js, Go or another supported language
  • You then use draft create to containerize the application based on Draft packs; several packs come with the tool and provide a Dockerfile and a Helm chart depending on the development language
  • You then use draft up to deploy the application to Kubernetes; the application is made accessible via a public URL

Let’s demonstrate how Draft is used, based on a simple Go application that is just a bit more complex than the Go example that comes with Draft. I will use the go-data service that I blogged about earlier. You can find the source code on GitHub. The go-data service is a very simple REST API. By calling the endpoint /data/{deviceid}, it will check if a “device” exists and then actually return no data. Hey, it’s just a sample! The service uses the Gorilla router but also Go Micro to call a device service running in the Kubernetes cluster. If the device service does not run, the data service will just report that the device does not exist.

Note that this post does not cover how to install Draft and its prerequisites like Helm and a Kubernetes Ingress Controller. You will also need a Kubernetes cluster (I used Azure ACS) and a container registry (I used Docker Hub). I installed all client-side components in the Windows 10 Linux shell which works great!

The only thing you need on your development box that has Helm and Draft installed is main.go and an empty glide.yaml file. The first command to run is draft create

This results in several files and folders being created, based on the Golang Draft pack. Draft detected you used Go because of glide.yaml. No Docker container is created at this point.

  • Dockerfile: a simple Dockerfile that builds an image based on the golang:onbuild image
  • draft.toml: the Draft configuration file that contains the name of the application (set randomly), the namespace to deploy to and if the folder needs to be watched for changes after you do draft up
  • chart folder: contains the Helm chart for your application; you might need to make changes here if you want to modify the Kubernetes deployment as we will do soon

When you deploy, Draft will do several things. It will package up the chart and your code and send it to the Draft server-side component running in Kubernetes. It will then instruct Draft to build your container, push it to a configured registry and then install the application in Kubernetes. All those tasks are performed by the Draft server component, not your client!

In my case, after running draft up, I get the following on my prompt (after the build, push and deploy steps):

image

In my case, the name of the application was set to exacerbated-ragdoll (in draft.toml). Part of what makes Draft so great is that it then makes the service available using that name and the configured domain. That works because of the following:

  • During installation of Draft, you need to configure an Ingress Controller in Kubernetes; you can use a Helm chart to make that easy; the Ingress Controller does the magic of mapping the incoming request to the correct application
  • When you configure Draft for the first time with draft init you can pass the domain (in my case baeke.info); this requires a wildcard A record (e.g. *.baeke.info) that points to the public IP of the Ingress Controller; note that in my case, I used Azure Container Services which makes that IP the public IP of an Azure load balancer that load balances traffic between the Ingress Controller instances (ngnix)

So, with only my source code and a few simple commands, the application was deployed to Kubernetes and made available on the Internet! There is only one small problem here. If you check my source code, you will see that there is no route for /. The Draft pack for Golang includes a livenessProbe on / and a readinessProbe on /. The probes are in deployment.yaml which is the file that defines the Kubernetes deployment. You will need to change the path in livenessProbe and readinessProbe to point to /data/device like so:

- containerPort: {{ .Values.service.internalPort }}
livenessProbe:
  httpGet:
   path: /data/device
   port: {{ .Values.service.internalPort }}
  readinessProbe:
   httpGet:
   path: /data/device
   port: {{ .Values.service.internalPort }}

If you already deployed the application but Draft is still watching the folder, you can simply make the above changes and save the deployment.yaml file (in chart/templates). The container will then be rebuilt and the deployment will be updated. When you now check the service with curl, you should get something like:

curl http://exacerbated-ragdoll.baeke.info/data/device1

Device active:  false
Oh and, no data for you!

To actually make the Go Micro features work, we will have to make another change to deployment.yaml. We will need to add an environment variable that instructs our code to find other services developed with Go Micro using the kubernetes registry:

- name: {{ .Chart.Name }}
  image: "{{ .Values.image.registry }}/{{ .Values.image.org }}/{{ .Values.image.name }}:{{ .Values.image.tag }}"
  imagePullPolicy: {{ .Values.image.pullPolicy }}
  env:
   - name: MICRO_REGISTRY
     value: kubernetes

To actually test this, use the following command to deploy the device service.

kubectl create -f https://raw.githubusercontent.com/gbaeke/go-device/master/go-device-dep.yaml

You can then check if it works by running the curl command again. It should now return the following:

Device active:  true
Oh and, no data for you!

Hopefully, you have seen how you can work with Draft from your development box and that you can modify the files generated by Draft to control how your application gets deployed. In our case, we had to modify the health checks to make sure the service can be reached. In addition, we had to add an environment variable because the code uses the Go Micro microservices framework.