Creating and deploying a model with Azure Machine Learning Service

In this post, we will take a look at creating a simple machine learning model for text classification and deploying it as a container with Azure Machine Learning service. This post is not intended to discuss the finer details of creating a text classification model. In fact, we will use the Keras library and its Reuters newswire dataset to create a simple dense neural network. You can find many online examples based on this dataset. For further information, be sure to check out and buy 👍 Deep Learning with Python by François Chollet, the creator of Keras and now at Google. It contains a section that explains using this dataset in much more detail!

Machine Learning service workspace

To get started, you need an Azure subscription. Once you have the subscription, create a Machine Learning service workspace. Below, you see such a workspace:

My Machine Learning service workspace (gebaml)

Together with the workspace, you also get a storage account, a key vault, application insights and a container registry. In later steps, we will create a container and store it in this registry. That all happens behind the scenes though. You will just write a few simple lines of code to make that happen!

Note the Authoring (Preview) section! These features were added just before Build 2019 started. For now, we will not use them.

Azure Notebooks

To create the model and interact with the workspace, we will use a free Jupyter notebook in Azure Notebooks. At this point in time (8 May 2019), Azure Notebooks is still in preview. To get started, find the link below in the Overview section of the Machine Learning service workspace:

Getting Started with Notebooks

To quickly get the notebook, you can clone my public project: https://notebooks.azure.com/geba/projects/textclassificationblog.

Creating the model

When you open the notebook, you will see the following first four cells:

Getting the dataset

It’s always easy when a prepared dataset is handed to you like in the above example. You simply use the reuters class of keras.datasets and call its load_data method to get the data, directly assigning it to variables that hold the train and test data plus their labels.

In this case, the data consists of newswires with a corresponding label that indicates the category of the newswire (e.g. an earnings call newswire). There are 46 categories in this dataset. In the real world, you would have the newswire in text format. Here, the newswire has already been converted (preprocessed) for you into an array of integers, with each integer corresponding to a word in a dictionary.
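For reference, that first cell boils down to something like the following sketch (variable names may differ from the notebook):

# Minimal sketch: load the Reuters newswire dataset with Keras.
# num_words=10000 keeps only the 10000 most frequent words.
from keras.datasets import reuters

(train_data, train_labels), (test_data, test_labels) = reuters.load_data(num_words=10000)

print(train_data[0][:10])   # a newswire is a list of word indices
print(train_labels[0])      # the category (0-45) of that newswire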

A bit further in the notebook, you will find a Vectorization section:

Vectorization

In this section, the train and test data are vectorized using a one-hot encoding method. Because we specified, in the very first cell of the notebook, to only use the 10000 most frequent words, each article can be converted to a vector of 10000 values. Each value is either 1 or 0, indicating whether the word is present in the article or not.

This bag-of-words approach is one of the ways to represent text in a data structure that can be used in a machine learning model. Besides vectorizing the training and test samples, the categories are also one-hot encoded.
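A sketch of that vectorization code, closely following the approach in Deep Learning with Python:

import numpy as np
from keras.utils import to_categorical

def vectorize_sequences(sequences, dimension=10000):
    # One-hot encode each newswire: a 10000-wide vector with a 1 at the
    # index of every word that occurs in the article
    results = np.zeros((len(sequences), dimension))
    for i, sequence in enumerate(sequences):
        results[i, sequence] = 1.
    return results

x_train = vectorize_sequences(train_data)
x_test = vectorize_sequences(test_data)

# One-hot encode the 46 category labels as well
one_hot_train_labels = to_categorical(train_labels)
one_hot_test_labels = to_categorical(test_labels)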

Now the dense neural network model can be created:

Dense neural net with Keras

The above code defines a very simple dense neural network. A dense neural network is not necessarily the best type but that’s ok for this post. The specifics are not that important. Just note that the nn variable is our model. We will use this variable later when we convert the model to the ONNX format.
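The definition is roughly the following (a sketch; the exact layer sizes may differ from the notebook, but note the nn variable):

from keras import models, layers

# Two hidden layers and a 46-way softmax output, one unit per category
nn = models.Sequential()
nn.add(layers.Dense(64, activation='relu', input_shape=(10000,)))
nn.add(layers.Dense(64, activation='relu'))
nn.add(layers.Dense(46, activation='softmax'))

nn.compile(optimizer='rmsprop',
           loss='categorical_crossentropy',
           metrics=['accuracy'])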

The last cell (16 above) does the actual training in 9 epochs. Training will be fast because the dataset is relatively small and the neural network is simple. Using the Azure Notebooks compute is sufficient. After 9 epochs, this is the result:
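That training cell is roughly equivalent to the snippet below (the batch size is an assumption):

# Train for 9 epochs and check accuracy on the test set
history = nn.fit(x_train, one_hot_train_labels,
                 epochs=9,
                 batch_size=512,
                 validation_data=(x_test, one_hot_test_labels))

test_loss, test_acc = nn.evaluate(x_test, one_hot_test_labels)
print(test_acc)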

Training result

Not exactly earth-shattering: 78% accuracy on the test set!

Saving the model in ONNX format

ONNX is an open format to store deep learning models. When your model is in that format, you can use the ONNX runtime for inference.

Converting the Keras model to ONNX is easy with the onnxmltools:

Converting the Keras model to ONNX

The result of the above code is a file called reuters.onnx in your notebook project.
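For reference, the conversion amounts to something like this:

import onnxmltools

# Convert the trained Keras model (nn) to ONNX and save it to disk
onnx_model = onnxmltools.convert_keras(nn)
onnxmltools.utils.save_model(onnx_model, 'reuters.onnx')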

Predict with the ONNX model

Let’s try to predict the category of the first newswire in the test set. Its real label is 3, which means it’s a newswire about an earnings call (earn class):

Inferencing with the ONNX model

We will use similar code later in score.py, a file that will be used in a container we will create to expose the model as an API. The code is pretty simple: start an inference session based on the reuters.onnx file, grab the input and output names, and use run to predict. The resulting array is the output of the softmax layer, and we use argmax to extract the category with the highest probability.
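A sketch of that inference code:

import numpy as np
import onnxruntime

# Start an inference session based on the saved ONNX file
session = onnxruntime.InferenceSession('reuters.onnx')
input_name = session.get_inputs()[0].name
output_name = session.get_outputs()[0].name

# Score the first (vectorized) newswire of the test set
sample = x_test[0:1].astype(np.float32)
result = session.run([output_name], {input_name: sample})

# result[0] holds the softmax output; argmax gives the most likely category
print(np.argmax(result[0]))   # expected: 3 (earn)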

Saving the model to the workspace

With the model in reuters.onnx, we can add it to the workspace:

Saving the model in the workspace

You will need a file in your Azure Notebook project called config.json with the following contents:

{
     "subscription_id": "<subscription-id>",
     "resource_group": "<resource-group>",
     "workspace_name": "<workspace-name>" 
} 

With that file in place, when you run cell 27 (see above), you will need to authenticate to Azure to be able to interact with the workspace. The code is pretty self-explanatory: the reuters.onnx model will be added to the workspace:
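In code, that registration is along these lines (the model name and description are assumptions):

from azureml.core import Workspace
from azureml.core.model import Model

# Reads config.json and prompts for interactive authentication
ws = Workspace.from_config()

model = Model.register(workspace=ws,
                       model_path='reuters.onnx',
                       model_name='reuters',
                       description='Keras Reuters classifier in ONNX format')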

Models added to the workspace

As you can see, you can save multiple versions of the model. This happens automatically when you save a model with the same name.

Creating the scoring container image

The scoring (or inference) container image is used to expose an API to predict categories of newswires. Obviously, you will need to give some instructions on how scoring should be done. This is done via score.py:

score.py

The code is similar to the code we wrote earlier to test the ONNX model. score.py needs an init() and run() function. The other functions are helper functions. In init(), we need to grab a reference to the ONNX model. The ONNX model file will be placed in the container during the build process. Next, we start an InferenceSession via the ONNX runtime. In run(), the code is similar to our earlier example. It predicts via session.run and returns the result as JSON. We do not have to worry about the rest of the code that runs the API. That is handled by Machine Learning service.
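A sketch of what such a score.py can look like (the registered model name and the 'data' field in the JSON payload are assumptions here):

import json
import numpy as np
import onnxruntime
from azureml.core.model import Model

def init():
    global session, input_name, output_name
    # Resolve the location of the registered model file inside the container
    model_path = Model.get_model_path('reuters')
    session = onnxruntime.InferenceSession(model_path)
    input_name = session.get_inputs()[0].name
    output_name = session.get_outputs()[0].name

def run(raw_data):
    try:
        # Expect a one-hot encoded article in the 'data' field of the JSON body
        data = np.array(json.loads(raw_data)['data']).astype(np.float32)
        result = session.run([output_name], {input_name: data})
        return json.dumps({"result": result[0].tolist()})
    except Exception as e:
        return json.dumps({"error": str(e)})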

Note: using ONNX is not a requirement; we could have persisted and used the native Keras model for instance

In this post, we only need score.py since we do not train our model via Azure Machine Learning service. If you train a model with the service, you would create a train.py file to instruct how training should be done, for instance based on data in a storage account. You would also provision compute resources for training. In our case, that is not required, so we train, save and export the model directly from the notebook.

Training and scoring with Machine Learning service

Now we need to create an environment file to indicate the required Python packages and start the image build process:

Create an environment yml file via the API and build the container

The build process is handled by the service and makes sure the model file is in the container, in addition to score.py and myenv.yml. The result is a fully functional container that exposes an API that takes an input (a newswire) and outputs an array of probabilities. Of course, it is up to you to define what the input and output should be. In this case, you are expected to provide a one-hot encoded article as input.
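Those steps roughly translate to the code below (the package list and image name are assumptions):

from azureml.core.conda_dependencies import CondaDependencies
from azureml.core.image import ContainerImage

# Write an environment file with the required Python packages
myenv = CondaDependencies.create(pip_packages=["numpy", "onnxruntime", "azureml-core"])
with open("myenv.yml", "w") as f:
    f.write(myenv.serialize_to_string())

# Build a container image from score.py, myenv.yml and the registered model
image_config = ContainerImage.image_configuration(execution_script="score.py",
                                                  runtime="python",
                                                  conda_file="myenv.yml")
image = ContainerImage.create(name="reuters-onnx",
                              models=[model],
                              image_config=image_config,
                              workspace=ws)
image.wait_for_creation(show_output=True)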

The container image will be listed in the workspace, potentially with multiple versions of it:

Container images for the reuters ONNX model

Deploy to Azure Container Instances

When the image is ready, you can deploy it via the Machine Learning service to Azure Container Instances (ACI) or Azure Kubernetes Service (AKS). To deploy to ACI:

Deploying to ACI
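In code, the ACI deployment is along the following lines (the service name and resource sizes are assumptions):

from azureml.core.webservice import AciWebservice, Webservice

aci_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)

aci_service = Webservice.deploy_from_image(workspace=ws,
                                           name="reuters-onnx-svc",
                                           image=image,
                                           deployment_config=aci_config)
aci_service.wait_for_deployment(show_output=True)
print(aci_service.scoring_uri)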

When the deployment is finished, it will be listed:

Deployment (ACI)

When you click on the deployment, the scoring URI will be shown (e.g. http://IPADDRESS:80/score). You can now use Postman or any other method to score an article. To quickly test the service from the notebook:

Testing the service

The helper method run of aci_service will post the JSON in test_sample to the service. It knows the scoring URI from the deployment earlier.
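A sketch of that test (assuming score.py expects a 'data' field as in the earlier sketch):

import json

# Wrap a one-hot encoded article from the test set in a JSON payload
test_sample = json.dumps({"data": x_test[0:1].tolist()})

result = aci_service.run(input_data=test_sample)
print(result)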

Conclusion

Containerizing a machine learning model and exposing it as an API is made surprisingly simple with Azure Machine Learning service. It saves time so you can focus on the hard work of creating a model that performs well in the field. In this post, we used a sample dataset and a simple dense neural network to illustrate how you can build such a model, convert it to the ONNX format and use the ONNX runtime for scoring.

Creating and containerizing a TensorFlow Go application

In an earlier post, I discussed using a TensorFlow model from a Go application. With the TensorFlow bindings for Go, you can load a model that was exported with TensorFlow’s SavedModelBuilder module. That module saves a “snapshot” of a trained model which can be used for inference.

In this post, we will actually use the model in a web application. The application presents the user with a page to upload an image:

The upload page

The class and its probability are displayed, together with the processed image:

Clearly a hen!

The source code of the application can be found at https://github.com/gbaeke/nasnet-go. If you just want to try the application, use Docker and issue the following command (replace port 80 with another port if there is a conflict):

docker run -p 80:9090 -d gbaeke/nasnet

The image is around 2.55GB in size so be patient when you first run the application. When the container has started, open your browser at http://localhost to see the upload page.

To quickly try it, you can run the container on Azure Container Instances. If you use the Portal, specify port 9090 as the container port.

Nasnet container in ACI

A closer look at the app

UPDATE: since first publication, the HTTP handler code was moved from main.go to handlers/handlers.go.

In the init() function, the nasnet model is loaded with tf.LoadSavedModel. The ImageNet categories are also loaded with a call to getCategories() and stored in categories which is a map of int to a string array.

In main(), we simply print the TensorFlow version (1.12). Next, http.HandleFunc is used to set up a handler (the upload function) for requests to the root of the web app.

Naturally, most of the logic is in the upload function. In summary, it does the following:

  • when users just navigate to the page (HTTP GET verb), render the upload.gtpl template; that template contains the upload form and uses a bit of bootstrap to make it just a bit better looking (and that’s already an overstatement); to learn more about Go web templates, see this link.
  • when users submit a file (POST), the following happens:
    • read the image
    • convert the image to a tensor with the getTensor function; getTensor returns a *tf.Tensor; the tensor is created from a [1][224][224][3] array; note that each pixel value gets normalized by subtracting 127.5 and then dividing by 127.5, which is the same preprocessing as in Keras (divide by 127.5 and subtract 1)
    • run a session by inputting the tensor and getting the categories and probabilities as output
    • look for the highest probability and save it, together with the category name in a variable of type ResultPageData (a struct)
    • the struct data is used as input for the response.gtpl template

Note that the image is also shown in the output. The processed image (resized to 224×224) gets converted to a base64-encoded string. That string can be used in HTML image rendering as follows (where {{.Picture}} in the template will be replaced by the encoded string):

 <img src="data:image/jpg;base64,{{.Picture}}"> 

Note that the application lacks sufficient error checking to gracefully handle the upload of non-image files. Maybe I’ll add that later! 😉

Containerization

To containerize the application, I used the Dockerfile from https://github.com/tinrab/go-tensorflow-image-recognition but removed the step that downloads the InceptionV3 model. My application contains a ready-to-use NASNetMobile model.

The container image is based on tensorflow/tensorflow:1.12.0. It is further modified as required with the TensorFlow C API and the installation of Go. As discussed earlier, I uploaded a working image on Docker Hub.

Conclusion

Once you know how to use TensorFlow models from Go applications, it is easy to embed them in any application, from command-line tools to APIs to web applications. Although this application does server-side processing, you can also use a model directly in the browser with TensorFlow.js or ONNX.js. For ONNX, try https://microsoft.github.io/onnxjs-demo/#/resnet50 to perform image classification with ResNet50 in the browser. You will notice that it will take a while to get started due to the model being downloaded. Once the model is downloaded, you can start classifying images. Personally, I prefer the server-side approach but it all depends on the scenario.

Virtual Node support in Azure Kubernetes Service (AKS)

Although I am using Kubernetes a lot, I didn’t quite get to trying the virtual nodes support. Virtual nodes are basically the AKS implementation of the virtual kubelet project. The virtual kubelet project allows Kubernetes nodes to be backed by other services that support containers such as AWS Fargate, IoT Edge, Hyper.sh or Microsoft’s ACI (Azure Container Instances). The idea is that you spin up containers using the familiar Kubernetes API but on services like Fargate and ACI that can instantly scale and only charge you for the seconds the containers are running.

As expected, the virtual nodes support that is built into AKS uses ACI as the backing service. To use it, you need to deploy Kubernetes with virtual nodes support. Use either the CLI or the Azure Portal:

  • CLI: uses the Azure CLI on your machine or from cloud shell
  • Portal: uses the Azure Portal

Note that virtual nodes for AKS are currently in preview. Virtual nodes require AKS to be configured with the advanced networking option. You will need to provide a subnet for the virtual nodes that will be dedicated to ACI. The advanced networking option gives you additional control over IP ranges and also allows you to deploy a cluster in an existing virtual network. Note that advanced networking results in the use of the Azure Virtual Network container network interface. Each pod on a regular host gets its own IP address on the virtual network. You will see them in the network as connected devices:

Connected devices on the Kubernetes VNET (includes pods)

In contrast, the containers you will create in the steps below will not show up as connected devices since they are managed by ACI which works differently.

Ok, go ahead and deploy a Kubernetes cluster or just follow along. After deployment, kubectl get nodes will show a result similar to the screenshot below:

kubectl get nodes output with virtual node

With the virtual node online, we can deploy containers to it. Let’s deploy the ONNX ResNet50v2 container from an earlier post and scale it up. Create a YAML file like below and use kubectl apply -f path_to_yaml to create a deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: resnet
spec:
  replicas: 1
  selector:
    matchLabels:
      app: resnet
  template:
    metadata:
      labels:
        app: resnet
    spec:
      containers:
      - name: onnxresnet50v2
        image: gbaeke/onnxresnet50v2
        ports:
        - containerPort: 5001
        resources:
          requests:
            cpu: 1
          limits:
            cpu: 1
      nodeSelector:
        kubernetes.io/role: agent
        beta.kubernetes.io/os: linux
        type: virtual-kubelet
      tolerations:
      - key: virtual-kubelet.io/provider
        operator: Exists
      - key: azure.com/aci
        effect: NoSchedule

With the nodeSelector, you constrain a pod to run on particular nodes in your cluster. In this case, we want to deploy on a host of type virtual-kubelet. With the tolerations, you specify that the pod can be scheduled on nodes that carry the matching taints. There are two taints here, virtual-kubelet.io/provider and azure.com/aci, which are applied to the virtual kubelet node.

After executing the above YAML, I get the following result after kubectl get pods -o wide:

The pod is pending on node virtual-node-aci-linux

After a while, the pod will be running, but it’s actually just a container on ACI.

Let’s expose the deployment with a public IP via an Azure load balancer:

kubectl expose deployment resnet --port=80 --target-port=5001 --type=LoadBalancer

The above command creates a service of type LoadBalancer that maps port 80 of the Azure load balancer to, eventually, port 5001 of the container. Just use kubectl get svc to see the external IP address. Configuring the load balancer usually takes around one minute.

Now let’s try to scale the deployment to 100 containers:

kubectl scale --replicas=100 deployments/resnet

Instantly, the containers will be provisioned on ACI via the virtual kubelet:

NAME                      READY   STATUS     RESTARTS   AGE
resnet-6d7954d5d7-26n6l   0/1     Waiting    0          30s
resnet-6d7954d5d7-2bjgp   0/1     Creating   0          30s
resnet-6d7954d5d7-2jsrs   0/1     Creating   0          30s
resnet-6d7954d5d7-2lvqm   0/1     Pending    0          27s
resnet-6d7954d5d7-2qxc9   0/1     Creating   0          30s
resnet-6d7954d5d7-2wnn6   0/1     Creating   0          28s
resnet-6d7954d5d7-44rw7   0/1     Creating   0          30s
.... repeat ....

When you run kubectl get endpoints you will see all the endpoints “behind” the resnet service:

NAME     ENDPOINTS
resnet   40.67.216.68:5001,40.67.219.10:5001,40.67.219.22:5001 + 97 more…

In container monitoring:

Hey, one pod has an issue! Who cares right?

As you can see, it is really easy to get started with virtual nodes and to scale up a deployment. In a later post, I will take a look at auto scaling containers on a virtual node.

Microsoft Face API with a local container

A few days ago, I obtained access to the Face container. It provides access to the Face API via a container you can run where you want: on your PC, at the network edge or in your datacenter. You should allocate 6 GB of RAM and 2 cores for the container to run well. Note that you still need to create a Face API resource in the Azure Portal. The container needs to be associated with the Azure Face API via the endpoint and access key:

Face API with a West Europe (Amsterdam) endpoint

I used the Standard tier, which charges 0.84 euros per 1000 calls. As noted, the container will not function without associating it with an Azure Face API resource.

When you gain access to the container registry, you can pull the container:

docker pull containerpreview.azurecr.io/microsoft/cognitive-services-face:latest

After that, you can run the container as follows (for API billing endpoint in West Europe):

docker run --rm -it -p 5000:5000 --memory 6g --cpus 2 containerpreview.azurecr.io/microsoft/cognitive-services-face Eula=accept Billing=https://westeurope.api.cognitive.microsoft.com/face/v1.0 ApiKey=YOUR_API_KEY

The container will start. You will see the output (-it):

Running Face API container

And here’s the spec:

API spec Face API v1

Before showing how to use the detection feature, note that the container needs Internet access for billing purposes. You will not be able to run the container in fully offline scenarios.

Over at https://github.com/gbaeke/msface-go, you can find a simple example in Go that uses the container. The Face API can take a byte stream of an image or a URL to an image. The example takes the first approach and loads an image from disk as specified by the -image parameter. The resulting io.Reader is passed to the getFace function which does the actual call to the API (uri = http://localhost:5000/face/v1.0/detect):

request, err := http.NewRequest("POST", uri+"?returnFaceAttributes="+params, m)
request.Header.Add("Content-Type", "application/octet-stream")

// Send the request to the local web service
resp, err := client.Do(request)
if err != nil {
    return "", err
}

The response contains a Body attribute and that attribute is unmarshalled into a variable of type interface{}. That value is then marshalled with indentation into a byte slice (b), which is returned by the function as a string:

var response interface{}
err = json.Unmarshal(respBody, &response)
if err != nil {
    return "", err
}
b, err := json.MarshalIndent(response, "", "\t")

Now you can use a picture like the one below:

Is he smiling?

Here are some parts of the output, following the command
detectface -image smiling.jpg

Emotion is clearly happiness with additional features such as age, gender, hair color, etc…

[
    {
        "faceAttributes": {
            "accessories": [],
            "age": 33,
            "blur": {
                "blurLevel": "high",
                "value": 1
            },
            "emotion": {
                "anger": 0,
                "contempt": 0,
                "disgust": 0,
                "fear": 0,
                "happiness": 1,
                "neutral": 0,
                "sadness": 0,
                "surprise": 0
            },
            "exposure": {
                "exposureLevel": "goodExposure",
                "value": 0.71
            },
            "facialHair": {
                "beard": 0.6,
                "moustache": 0.6,
                "sideburns": 0.6
            },
            "gender": "male",
            "glasses": "NoGlasses",
            "hair": {
                "bald": 0.26,
                "hairColor": [
                    {
                        "color": "black",
                        "confidence": 1
                    }
                ]
            }
        },
        "faceId": "b6d924c1-13ef-4d19-8bc9-34b0bb21f0ce",
        "faceRectangle": {
            "height": 1183,
            "left": 944,
            "top": 167,
            "width": 1183
        }
    }
]

That’s it! Give the Face API container a go with the tool. You can get it here: https://github.com/gbaeke/msface-go/releases/tag/v0.0.1 (Windows)

Running a GoCV application in a container

In earlier posts (like here and here) I mentioned GoCV. GoCV allows you to use the popular OpenCV library from your Go programs. To avoid installing OpenCV and having to compile it from source, a container that runs your GoCV app can be beneficial. This post provides information about doing just that.

The following GitHub repository, https://github.com/denismakogon/gocv-alpine, contains all you need to get started. It’s for OpenCV 3.4.2 so you will run into issues when you want to use OpenCV 4.0. The pull request, https://github.com/denismakogon/gocv-alpine/pull/7, contains the update to 4.0 but it has not been merged yet. I used the proposed changes in the pull request to build two containers:

  • the build container: gbaeke/gocv-4.0.0-build
  • the run container: gbaeke/gocv-4.0.0-run

They are over on Docker Hub, ready for use. To actually use the above images in a typical two-step build, I used the following Dockerfile:

FROM gbaeke/gocv-4.0.0-build as build       
RUN go get -u -d gocv.io/x/gocv
RUN go get -u -d github.com/disintegration/imaging
RUN go get -u -d github.com/gbaeke/emotion
RUN cd $GOPATH/src/github.com/gbaeke/emotion && go build -o $GOPATH/bin/emo ./main.go

FROM gbaeke/gocv-4.0.0-run
COPY --from=build /go/bin/emo /emo
ADD haarcascade_frontalface_default.xml /

ENTRYPOINT ["/emo"]

The above Dockerfile uses the webcam emotion detection program from https://github.com/gbaeke/emotion. To run it on a Linux system, use the following command:

docker run -it --rm --device=/dev/video0 --env SCOREURI="YOUR-SCORE-URI" --env VIDEO=0 gbaeke/emo

The SCOREURI environment variable needs to refer to the score URI offered by the ONNX FER+ container as discussed in Detecting Emotions with FER+. With VIDEO=0, the GUI window that shows the webcam video stream is turned off, which is required because the container has no display to render to. Detected emotions will be logged to the console.

To be able to use the actual webcam of the host, the --device flag is used to map /dev/video0 from the host to the container. That works well on a Linux host and was tested on a laptop running Ubuntu 16.04.

Using the Microsoft Face API to detect emotions in photos and video

In a previous post, I blogged about detecting emotions with the ONNX FER+ model. As an alternative, you can use cloud models hosted by major cloud providers such as Microsoft, Amazon and Google. Besides those, there are many other services to choose from.

To detect facial emotions with Azure, there is a Face API in two flavours:

  • Cloud: API calls are sent to a cloud-hosted endpoint in the selected deployment region
  • Container: API calls are sent to a container that you deploy anywhere, including the edge (e.g. IoT Edge device)

To use the container version, you need to request access via this link. In another blog post, I already used the Text Analytics container to detect sentiment in a piece of text.

Note that the container version is not free and needs to be configured with an API key. The API key is obtained by deploying the Face API in the cloud. Doing so generates a primary and secondary key. Be aware that the Face API container, like the Text Analytics container, needs connectivity to the cloud to ensure proper billing. It cannot be used in completely offline scenarios. In short, no matter the flavour you use, you need to deploy the Face API. It will appear in the portal as shown below:

Deployed Face API (part of Cognitive Services)

Using the API is a simple matter. An image can be delivered to the API in two ways:

  • Link: just provide a URL to an image
  • Octet-stream: POST binary data (the image’s bytes) to the API

In the Go example you can find on GitHub, the second approach is used. You simply open the image file (e.g. a jpg or png) and pass the byte array to the endpoint. The endpoint is in the following form for emotion detection:

https://westeurope.api.cognitive.microsoft.com/face/v1.0/detect?returnFaceAttributes=emotion

Instead of emotion, you can ask for other attributes or a combination of attributes: age, gender, headPose, smile, facialHair, glasses, emotion, hair, makeup, occlusion, accessories, blur, exposure and noise. You simply add them together with +’s (e.g. emotion+age+gender). When you add attributes, the cost per call will increase slightly as will the response time. With the additional attributes, the Face API is much more useful than the simple FER+ model. The Face API has several additional features such as storing and comparing faces. Check out the documentation for full details.

To detect emotion in a video, the sample at https://github.com/gbaeke/emotion/blob/master/main.go contains some commented-out code in the import section and around line 100. With those lines enabled, the Face API is used via the GetEmotion() function of the github.com/gbaeke/emotion/faceapi/msface package instead of the GetEmotion() function in the code itself. Because we have the full webcam image and the face in an OpenCV mat, some extra code is needed to serialize it to a byte stream in a format the Face API understands:

encodedImage, _ := gocv.IMEncode(gocv.JPEGFileExt, face)       
emotion, err = msface.GetEmotion(bytes.NewReader(encodedImage))

In the above example, the face region detected by OpenCV is encoded to a JPG format as a byte slice. The byte slice is simply converted to an io.Reader and handed to the GetEmotion() function in the msface package.

When you use the Face API to detect emotions in a video stream from a webcam (or a video file), you will be hitting the API quite hard. You will surely need the standard tier of the API, which allows 10 transactions per second. To add face and emotion detection to video, the solution discussed in Detecting emotions with FER+ is a better option.

Detecting emotions with FER+

In an earlier post, I discussed classifying images with the ResNet50v2 model. Azure Machine Learning Service was used to create a container image that used the ONNX ResNet50v2 model and the ONNX Runtime for scoring.

Continuing on that theme, I created a container image that uses the ONNX FER+ model that can detect emotions in an image. The container image also uses the ONNX Runtime for scoring.

You might wonder why you would want to detect emotions this way when there are many services available that can do this for you with a simple API call! You could use Microsoft’s Face API or Amazon’s Rekognition for example. While those services are easy to use and provide additional features, they do come at a cost. If all you need is basic detection of emotions, using this FER+ container is sufficient and cost effective.

Azure Face API (image from Microsoft website)

A notebook to create the image and deploy a container to Azure Container Instances (ACI) can be found here. The notebook uses the Azure Machine Learning SDK to register the model to an Azure Machine Learning workspace, build a container image from that model and deploy the container to ACI. The scoring script score.py is shown below.

score.py

The model expects a 64×64 grayscale image of a face in an array with the following dimensions: [1][1][64][64]. The output is JSON with a results array that contains the probabilities for each emotion, and a time field with the inference time.

The emotion probabilities are in this order:

0: "neutral", 1: "happy", 2: "surprise", 3: "sadness", 4: "anger", 5: "disgust", 6: "fear", 7: "contempt

To actually capture the emotions, I wrote a small demo program in Go that uses OpenCV (via GoCV). You can find it on GitHub: https://github.com/gbaeke/emotion. You will need to install OpenCV and GoCV. Find the instructions here: https://gocv.io/getting-started/linux/. There are similar instructions for Mac and Windows, but I have not tried those.

The program is still a little rough around the edges but it does the trick. The scoring URI is hard coded to http://localhost:5002/score. With Docker installed, use the following command to run the scoring container:

 docker run -d -p 5002:5001 gbaeke/onnxferplus

Have fun with it!