Detecting emotions with FER+

In an earlier post, I discussed classifying images with the ResNet50v2 model. Azure Machine Learning Service was used to create a container image that used the ONNX ResNet50v2 model and the ONNX Runtime for scoring.

Continuing on that theme, I created a container image that uses the ONNX FER+ model, which can detect emotions in an image. The container image also uses the ONNX Runtime for scoring.

You might wonder why you would want to detect emotions this way when there are many services available that can do this for you with a simple API call! You could use Microsoft’s Face API or Amazon’s Rekognition for example. While those services are easy to use and provide additional features, they do come at a cost. If all you need is basic detection of emotions, using this FER+ container is sufficient and cost-effective.

Azure Face API (image from Microsoft website)

A notebook to create the image and deploy a container to Azure Container Instances (ACI) can be found here. The notebook uses the Azure Machine Learning SDK to register the model to an Azure Machine Learning workspace, build a container image from that model and deploy the container to ACI. The scoring script score.py is shown below.

score.py

The model expects a 64×64 grayscale image of a face in an array with the following dimensions: [1][1][64][64]. The output is JSON with a results array that contains the probabilities for each emotion and a time field with the inference time.

The emotion probabilities are in this order:

0: "neutral", 1: "happy", 2: "surprise", 3: "sadness", 4: "anger", 5: "disgust", 6: "fear", 7: "contempt"
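To make the ordering concrete, here is a minimal Python sketch that maps a list of eight probabilities to the most likely emotion. It assumes the probabilities have already been pulled out of the JSON response as a flat list; the name top_emotion is my own and not part of the container.

```python
# Emotion labels in the order the FER+ model reports them.
EMOTIONS = ["neutral", "happy", "surprise", "sadness",
            "anger", "disgust", "fear", "contempt"]


def top_emotion(probabilities):
    """Return (label, probability) for the most likely emotion."""
    if len(probabilities) != len(EMOTIONS):
        raise ValueError("expected %d probabilities" % len(EMOTIONS))
    # Index of the largest probability.
    index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return EMOTIONS[index], probabilities[index]


# Made-up probabilities for illustration (not real model output).
label, probability = top_emotion([0.05, 0.80, 0.05, 0.02, 0.03, 0.02, 0.02, 0.01])
print(label, probability)
```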

To actually capture the emotions, I wrote a small demo program in Go that uses OpenCV (via GoCV). You can find it on GitHub: https://github.com/gbaeke/emotion. You will need to install OpenCV and GoCV. Find the instructions here: https://gocv.io/getting-started/linux/. There are similar instructions for Mac and Windows but I have not tried those.

The program is still a little rough around the edges but it does the trick. The scoring URI is hard-coded to http://localhost:5002/score. With Docker installed, use the following command to start the scoring container:

 docker run -d -p 5002:5001 gbaeke/onnxferplus
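If you would rather test the container without the Go program, a rough Python sketch could look like the one below. The "data" field name is an assumption based on the ResNet50v2 scoring container, and build_payload and score are hypothetical helper names; check score.py in the notebook for the actual input format.

```python
import json
import urllib.request

SCORING_URI = "http://localhost:5002/score"  # matches the docker run port mapping


def build_payload(gray_pixels):
    """Wrap a 64x64 grid of grayscale values in a [1][1][64][64] JSON payload."""
    if len(gray_pixels) != 64 or any(len(row) != 64 for row in gray_pixels):
        raise ValueError("expected a 64x64 image")
    # ASSUMPTION: the container reads the array from a field called "data".
    return json.dumps({"data": [[gray_pixels]]})


def score(payload, uri=SCORING_URI):
    """POST the payload to the scoring container and return the parsed reply."""
    request = urllib.request.Request(
        uri,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# A flat gray test image; a real client would crop and resize a detected face.
payload = build_payload([[128.0] * 64 for _ in range(64)])
# reply = score(payload)  # uncomment once the container is running
```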

Have fun with it!

Recognizing images with Azure Machine Learning and the ONNX ResNet50v2 model

Featured image from: https://medium.com/comet-app/review-of-deep-learning-algorithms-for-object-detection-c1f3d437b852

In a previous post, I discussed the creation of a container image that uses the ResNet50v2 model for image classification. If you want to perform tasks such as localization or segmentation, there are other models that serve that purpose. The image was built with GPU support. Adding GPU support was pretty easy:

  • Use the enable_gpu flag in the Azure Machine Learning SDK or check the GPU box in the Azure Portal; the service will build an image that supports NVIDIA CUDA
  • Add GPU support in your score.py file and/or conda dependencies file (the scoring script uses the ONNX Runtime, so we added the onnxruntime-gpu package)

In this post, we will deploy the image to a Kubernetes cluster with GPU nodes. We will use Azure Kubernetes Service (AKS) for this purpose. Check my previous post if you want to use NVIDIA V100 GPUs. In this post, I use hosts with one V100 GPU.

To get started, make sure you have the Kubernetes cluster deployed and that you followed the steps in my previous post to create the GPU container image. Make sure you attached the cluster to the workspace’s compute.

Deploy image to Kubernetes

Click the container image you created from the previous post and deploy it to the Kubernetes cluster you attached to the workspace by clicking + Create Deployment:

Starting the deployment from the image in the workspace

The Create Deployment screen is shown. Select AKS as deployment target and select the Kubernetes cluster you attached. Then press Create.

Azure Machine Learning now deploys the containers to Kubernetes. Note that I said containers in plural. In addition to the scoring container, a front-end container is added as well. You send your requests to the front-end container using HTTP POST. The front-end container talks to the scoring container over TCP port 5001 and passes the result back. The front-end container can be configured with certificates to support SSL.

Check the deployment and wait until it is healthy. We did not specify advanced settings during deployment so the default settings were chosen. Click the deployment to see the settings:

Deployment settings including authentication keys and scoring URI

As you can see, the deployment has authentication enabled. When you send your HTTP POST request to the scoring URI, make sure you pass an Authorization header like so: Bearer primary-or-secondary-key. The primary and secondary key are in the settings above. You can regenerate those keys at any time.
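As an illustration, the header can be set from Python with nothing but the standard library. This is only a sketch: the URI and key below are placeholders, and authorized_request is a hypothetical helper name.

```python
import urllib.request


def authorized_request(scoring_uri, key, payload):
    """Build a POST request that carries the deployment key as a bearer token."""
    return urllib.request.Request(
        scoring_uri,
        data=payload.encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + key,
        },
        method="POST",
    )


# Placeholder URI and key; substitute the values from your deployment settings.
request = authorized_request(
    "http://example.invalid/score", "primary-or-secondary-key", "{}")
print(request.get_header("Authorization"))
```

Passing the request to urllib.request.urlopen sends it; nothing is sent by the code above.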

Checking the deployment

From the Azure Cloud Shell, issue the following commands in order to list the pods deployed to your Kubernetes cluster:

  • az aks list -o table
  • az aks get-credentials -g RESOURCEGROUP -n CLUSTERNAME
  • kubectl get pods

Listing the deployed pods

Azure Machine Learning has deployed three front-ends (default; can be changed via Advanced Settings during deployment) and one scoring container. Let’s check the scoring pod with: kubectl get pod onnxgpu-5d6c65789b-rnc56 -o yaml. Replace the pod name with yours. In the output, you should find the following:

resources:
  limits:
    nvidia.com/gpu: "1"
  requests:
    cpu: 100m
    memory: 500m
    nvidia.com/gpu: "1"

The above allows the pod to use the GPU on the host. The NVIDIA drivers on the host are mapped to the pod with a volume:

volumeMounts:
- mountPath: /usr/local/nvidia
  name: nvidia

Great! We did not have to bother with doing this ourselves. Let’s now try to recognize an image by sending requests to the front-end pods.

Recognizing images

To recognize an image, we need to POST a JSON payload to the scoring URI. The scoring URI can be found in the deployment properties in the workspace. In my case, the URI is:

http://23.97.218.34/api/v1/service/onnxgpu/score

The JSON payload needs to be in the below format:

{"data": [[[[143.06100463867188, 130.22100830078125, 122.31999969482422, ... ]]]]} 

The data field is a multi-dimensional array, serialized to JSON. The shape of the array is (1,3,224,224). The dimensions correspond to the batch size, channels (RGB), height and width.
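To illustrate the shape, here is a minimal NumPy sketch that turns a (224, 224, 3) image array into such a payload. Random values stand in for a real image, and to_payload is a hypothetical helper name rather than code from the gist.

```python
import json

import numpy as np


def to_payload(image_hwc):
    """Turn a (224, 224, 3) image array into a (1, 3, 224, 224) JSON payload."""
    if image_hwc.shape != (224, 224, 3):
        raise ValueError("expected a (224, 224, 3) array")
    chw = np.moveaxis(image_hwc, -1, 0)  # channels first: (3, 224, 224)
    batch = np.expand_dims(chw, axis=0)  # add batch dimension: (1, 3, 224, 224)
    return json.dumps({"data": batch.tolist()})


# Random pixel values stand in for a real image here.
fake_image = np.random.uniform(0, 255, size=(224, 224, 3)).astype(np.float32)
payload = to_payload(fake_image)
```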

You only have to read an image and put the pixel values in the array! Easy, right? Well, as usual the answer is: “it depends”! The easiest way to do it, in my opinion, is with Python and a collection of helper packages. The code is in the following GitHub gist: https://gist.github.com/gbaeke/b25849f3813e9eb984ee691659d1d05a. You need to run the code on a machine with Python 3 installed. Make sure you also install Keras and NumPy (pip3 install keras / pip3 install numpy). The code uses two images, cat.jpg and car.jpg, but you can use your own. When I run the code, I get the following result:

Using TensorFlow backend.
channels_last
Loading and preprocessing image… cat.jpg
Array shape (224, 224, 3)
Array shape afer moveaxis: (3, 224, 224)
Array shape after expand_dims (1, 3, 224, 224)
prediction time (as measured by the scoring container) 0.025304794311523438
Probably a: Egyptian_cat 0.9460222125053406
Loading and preprocessing image… car.jpg
Array shape (224, 224, 3)
Array shape afer moveaxis: (3, 224, 224)
Array shape after expand_dims (1, 3, 224, 224)
prediction time (as measured by the scoring container) 0.02526378631591797
Probably a: sports_car 0.948998749256134

It takes about 25 milliseconds to classify an image, or 40 images/second. By increasing the number of GPUs and scoring containers (we only deployed one), we can easily scale out the solution.

With a bit of help from Keras and NumPy, the code does the following:

  • check the image format reported by the Keras back-end: it reports channels_last which means that, by default, the RGB channels are the last dimension of the image array
  • load the image; the resulting array has a (224,224,3) shape
  • our container expects the channels_first format; we use moveaxis to move the last axis to the front; the array now has a (3,224,224) shape
  • our container expects a first dimension with a batch size; we use expand_dims to end up with a (1,3,224,224) shape
  • we convert the 4D array to a list and construct the JSON payload
  • we send the payload to the scoring URI and pass an authorization header
  • we get a JSON response with two fields: result and time; we print the inference time as reported by the container
  • from keras.applications.resnet50, we use the decode_predictions function to process the result field; result contains the 1000 values computed by the softmax function in the container; decode_predictions knows the categories and returns the top five
  • we print the name and probability of the category with the highest probability (item 0)
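The last two steps boil down to picking the highest values from the 1000 probabilities. The sketch below shows that idea with plain NumPy instead of decode_predictions; the labels and scores are made up, and top_k is a hypothetical helper name.

```python
import numpy as np


def top_k(probabilities, labels, k=5):
    """Return the k (label, probability) pairs with the highest probability."""
    probs = np.asarray(probabilities, dtype=np.float64).ravel()
    best = np.argsort(probs)[::-1][:k]  # indices sorted by descending probability
    return [(labels[i], float(probs[i])) for i in best]


# Made-up labels and scores; decode_predictions uses the real ImageNet names.
labels = ["class_%d" % i for i in range(1000)]
scores = np.full(1000, 0.0001)
scores[42] = 0.9
top = top_k(scores, labels)
print("Probably a:", top[0][0], top[0][1])
```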

What happens when you use a scoring container that uses the CPU? In that case, you could run the container in Azure Container Instances (ACI). Using ACI is much less costly! In ACI with the default setting of 0.1 CPU, it will take around 2 seconds to score an image. Ouch! With a full CPU (in ACI), the scoring time goes down to around 180-220ms per image. To achieve better results, simply increase the number of CPUs. On the Standard_NC6s_v3 Kubernetes node with 6 cores, scoring time with CPU hovers around 60ms.
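To compare these numbers on the same footing, you can convert each scoring time to images per second. The small sketch below does that for the approximate figures mentioned in this post; 0.2 seconds is simply taken as the midpoint of the 180-220ms range, and it assumes a single container that scores one image at a time.

```python
# Approximate per-image scoring times in seconds, taken from this post.
latencies = {
    "GPU (V100)": 0.025,
    "CPU, 6 cores": 0.060,
    "ACI, 1 CPU": 0.200,  # midpoint of 180-220ms
    "ACI, 0.1 CPU": 2.0,
}


def images_per_second(latency_seconds):
    """Throughput of a single container that scores one image at a time."""
    return 1.0 / latency_seconds


for name, latency in latencies.items():
    print("%-14s %6.1f images/second" % (name, images_per_second(latency)))
```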

Conclusion

In this post, you have seen how Azure Machine Learning makes it straightforward to deploy GPU scoring images to a Kubernetes cluster with GPU nodes. The service automatically configures the resource requests for the GPU and maps the NVIDIA drivers to the scoring container. The only thing left to do is to start scoring images with the service. We have seen how easy that is with a bit of help from Keras and NumPy. In practice, always start with CPU scoring and scale out that solution to match your requirements. But if you do need GPUs for scoring, Azure Machine Learning makes it pretty easy to do so!

Creating a GPU container image for scoring with Azure Machine Learning

In a previous post, I discussed how you can add an existing Kubernetes cluster to an Azure Machine Learning workspace. Adding an existing cluster is necessary when the workspace does not support auto creation of a cluster. That is the case when you want to use the Standard_NC6s_v3 virtual machine size. I also used a container for scoring pictures with the ResNet50v2 model from the ONNX Model Zoo. Now we will take a look at actually creating that container image with GPU support. Note that in many cases, inference with CPUs is more than sufficient but the GPU case is more interesting to look at!

To get started, you need an Azure subscription with an Azure Machine Learning workspace. Take a look here for instructions.

Once you have a workspace, there are a few steps to take. If you look at the diagram at the top of this post, we will perform the steps starting from Register and manage your model:

  • Register model: we will add the ResNet50v2 model from the ONNX Model Zoo; we are using this existing model instead of our own; ResNet50v2 can recognize pictures in 1000 categories
  • Create container image: from the model in the workspace, we create a container image with GPU support
  • Deploy container image: from the image in the workspace, we deploy the image to compute that supports GPUs

Machine Learning SDK

The Azure Machine Learning service has a Machine Learning SDK for Python. All the steps discussed above can be performed with code. You can find an example of the Python code to use in the following Jupyter notebook hosted on Azure Notebooks: https://gebaml-geba.notebooks.azure.com/j/notebooks/ONNXResnet.ipynb. Note that the Azure Notebooks service is still in preview and a bit rough around the edges. The Machine Learning SDK is available by default in Azure Notebooks.

At the beginning of the notebook, we import azureml.core which allows you to check the version of the SDK (among other things):
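A minimal version of that check might look like this. The try/except is my own addition so the snippet also runs on a machine without the SDK; in Azure Notebooks the import should simply succeed.

```python
# Import the SDK and report its version; fall back gracefully if it is missing.
try:
    import azureml.core
    sdk_version = azureml.core.VERSION
except ImportError:
    sdk_version = None  # SDK not installed in this environment

print("Azure ML SDK version:", sdk_version)
```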

Registering the model

First, we download the model to the notebook project. In the notebook, the urllib module is used to download the compressed version of the ResNet50v2 model. The tarball is extracted to resnet50v2/resnet50v2.onnx. You should see the model as a complex function with, in this case, millions of parameters (weights). The inputs to the function are the pixels of your picture (their red, green and blue values). The output of the function is a category: cat, guitar, …
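The download step can be sketched with the standard library as shown below. The helper name download_and_extract and the archive file name are my own choices; the model URL is deliberately left out, so take it from the notebook or the ONNX Model Zoo.

```python
import os
import tarfile
import urllib.request


def download_and_extract(url, target_dir="."):
    """Download a .tar.gz archive and unpack it into target_dir."""
    os.makedirs(target_dir, exist_ok=True)
    archive_path = os.path.join(target_dir, "model.tar.gz")
    urllib.request.urlretrieve(url, filename=archive_path)
    # Only extract archives from a source you trust.
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(path=target_dir)
    return archive_path


# download_and_extract(MODEL_URL)  # MODEL_URL: the model link used in the notebook
```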

Now that we have the model, we need to add it to the workspace, which means we also have to authenticate. Create a file called config.json with the following contents:

{
  "subscription_id": "your Azure subscription ID",
  "resource_group": "your Azure ML resource group",
  "workspace_name": "your Azure ML workspace name"
}

With the Workspace class from azureml.core we authenticate to Azure and grab a reference to the workspace with the ws variable. The Workspace.from_config() function searches for the config.json file.
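As a sketch, both steps can be wrapped in two small helpers. The helper names are hypothetical; Workspace.from_config() is the SDK call mentioned above and is imported lazily so that write_config also works on a machine without the SDK.

```python
import json


def write_config(path, subscription_id, resource_group, workspace_name):
    """Write the config.json file that Workspace.from_config() looks for."""
    config = {
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "workspace_name": workspace_name,
    }
    with open(path, "w") as handle:
        json.dump(config, handle, indent=2)
    return config


def load_workspace():
    """Authenticate and return the workspace described by config.json."""
    # Imported lazily; requires the Azure ML SDK (azureml-core).
    from azureml.core import Workspace
    return Workspace.from_config()


# ws = load_workspace()  # requires the SDK and a valid config.json
```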

Now we can finally register the model in the workspace using Model.register:
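A registration call along these lines should do the job. Treat it as a sketch: the model name, description and tags are illustrative values, and register_resnet is a hypothetical wrapper around Model.register.

```python
def register_resnet(ws, model_path="resnet50v2/resnet50v2.onnx",
                    model_name="resnet50v2"):
    """Register the downloaded ONNX file as a model in workspace ws."""
    # Imported lazily; requires the Azure ML SDK (azureml-core).
    from azureml.core.model import Model
    return Model.register(
        workspace=ws,
        model_path=model_path,   # local path of the extracted model file
        model_name=model_name,   # name shown in the workspace (illustrative)
        description="ResNet50v2 from the ONNX Model Zoo",
        tags={"onnx": "demo"},   # illustrative tags
    )


# model = register_resnet(ws)  # ws comes from Workspace.from_config()
```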

The above is the same as adding a model using the Azure Portal. You might hit file upload limits in the portal so adding the model via code is the better approach. Your model is now registered in the workspace:

Creating a GPU container image from the model

Now that we have the model, we can create the container image. The model will be included in the image which will add about 100MB to its size. The container image in Azure Machine Learning is created from four settings/artifacts:

  • model: registered in the workspace
  • score file: a file score.py with an init() and run() function; helper functions can also be included
  • dependency file: used to indicate the Python modules that need to be installed in the image (see https://conda.io/docs/)
  • GPU support: set to True or False
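To give an idea of the structure of a score file, here is a stripped-down sketch with an init() and a run() function. It is not the score.py from the notebook: the registered model name "resnet50v2" is an assumption, preprocessing is left out, and the error handling is my own addition.

```python
import json
import time

import numpy as np

session = None     # ONNX Runtime session, created once in init()
input_name = None  # name of the model's input tensor


def init():
    """Called once when the container starts: load the model."""
    global session, input_name
    # Imported lazily; both packages must be listed in the conda file.
    import onnxruntime
    from azureml.core.model import Model
    model_path = Model.get_model_path("resnet50v2")  # assumed model name
    session = onnxruntime.InferenceSession(model_path)
    input_name = session.get_inputs()[0].name


def run(raw_data):
    """Called for every request: score the data field of the JSON payload."""
    try:
        data = np.array(json.loads(raw_data)["data"], dtype=np.float32)
        start = time.time()
        outputs = session.run(None, {input_name: data})
        elapsed = time.time() - start
        return {"result": outputs[0].tolist(), "time": elapsed}
    except Exception as exc:
        # Report the problem to the caller instead of crashing the container.
        return {"error": str(exc)}
```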

You will find the score file in the notebook. It was copied from a Microsoft-supplied sample. If you do not have some experience with machine learning and neural networks (in this case), it will be difficult to create this from scratch. The ResNet50v2 model expects a 4-dimensional tensor with the following dimensions:

  • 0: batch (1 when you send 1 image)
  • 1: channels (3 channels for red, green and blue; RGB)
  • 2: height (224 pixels)
  • 3: width (224 pixels)

For inference, you will actually send the above data in a JSON payload as the data field. The preprocess() function in score.py grabs the data field and converts it to a NumPy array. The data is then normalized by dividing each pixel by 255, subtracting the mean values (of each channel) and dividing by the standard deviation (of each channel). The normalized data is then sent to the model which outputs an array with 1000 probabilities that sum to 1 (via a softmax function).
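The normalization and the softmax step can be sketched in a few lines of NumPy. The mean and standard deviation values below are the commonly used ImageNet statistics and are an assumption on my part; check score.py for the values it really uses.

```python
import numpy as np

# Commonly used ImageNet statistics; ASSUMED here, verify against score.py.
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def normalize(batch):
    """Scale a (N, 3, H, W) batch of 0-255 pixels and standardize each channel."""
    scaled = np.asarray(batch, dtype=np.float32) / 255.0
    # Reshape to (1, 3, 1, 1) so the statistics broadcast over N, H and W.
    return (scaled - MEAN.reshape(1, 3, 1, 1)) / STD.reshape(1, 3, 1, 1)


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    values = np.asarray(scores, dtype=np.float64)
    exp = np.exp(values - values.max())  # subtract the max for numerical stability
    return exp / exp.sum()


batch = np.full((1, 3, 224, 224), 255.0)
print(normalize(batch).shape, softmax([1.0, 2.0, 3.0]).sum())
```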

Why are there a thousand probabilities? The model was trained on a thousand different categories of images and for each of these categories, a probability is output. After inference we will need a list of these categories so we can find the one that matches with our uploaded image and that has the highest probability!

This particular score.py file uses the ONNX runtime for inference. To enable GPU support, make sure you include the onnxruntime-gpu package in your conda dependencies as shown below:
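As a rough illustration, the snippet below writes a conda environment file that includes onnxruntime-gpu. The environment name, the Python version and the other packages are assumptions; the file generated in the notebook may differ.

```python
# Conda environment file contents; everything except onnxruntime-gpu is assumed.
CONDA_ENV = """\
name: project_environment
dependencies:
  - python=3.6.2
  - pip:
    - numpy
    - onnxruntime-gpu
    - azureml-core
"""


def write_conda_file(path="myenv.yml"):
    """Write the conda environment file used to build the image."""
    with open(path, "w") as handle:
        handle.write(CONDA_ENV)
    return path


# write_conda_file()  # creates myenv.yml in the current directory
```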

With score.py and myenv.yml, the container image with GPU support can be created. Note that we are specifying the score.py file, the conda file and the model. GPU support is enabled as well via enable_gpu=True.
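With the ContainerImage class of the SDK, the build could be sketched as follows. The image name and description are illustrative, and build_gpu_image is a hypothetical wrapper; ws and model are the workspace and registered model from the earlier steps.

```python
def build_gpu_image(ws, model, image_name="onnxresnet50v2"):
    """Build a GPU-enabled scoring image from a registered model."""
    # Imported lazily; requires the Azure ML SDK (azureml-core).
    from azureml.core.image import ContainerImage

    image_config = ContainerImage.image_configuration(
        execution_script="score.py",  # scoring script with init() and run()
        runtime="python",
        conda_file="myenv.yml",       # must list onnxruntime-gpu
        enable_gpu=True,              # request an image with GPU support
        description="ResNet50v2 scoring image with GPU support",
    )
    image = ContainerImage.create(
        name=image_name,              # illustrative image name
        models=[model],
        image_config=image_config,
        workspace=ws,
    )
    image.wait_for_creation(show_output=True)  # blocks until the build finishes
    return image


# image = build_gpu_image(ws, model)
```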

The code above should result in the following image in your workspace (after several minutes of building):

In the background, this image is stored in the container registry that got created when you deployed the Azure Machine Learning workspace. You are now ready for the third step, deploying the image to compute that supports GPUs (for instance Kubernetes). That step, together with some code to actually recognize images, will be for another post. In that post, we will also compare CPU to GPU speed.

Conclusion

In this post, we looked at creating a scoring (inference) container image with GPU support. Instead of creating and using our own model, we used the ResNet50v2 model from the ONNX Model Zoo. The model file, together with a score.py file and conda dependency file, was used to build a container image. Azure Machine Learning builds the container image for you and stores it in a container registry. Although Azure Machine Learning takes care of most of the infrastructure work, you still need to know how to write the scoring file. In this post, the scoring file uses the ONNX Runtime but you can use other runtimes or frameworks such as TensorFlow or MXNet.