Creating and deploying a model with Azure Machine Learning Service

In this post, we will take a look at creating a simple machine learning model for text classification and deploying it as a container with Azure Machine Learning service. This post is not intended to discuss the finer details of creating a text classification model. In fact, we will use the Keras library and its Reuters newswire dataset to create a simple dense neural network. You can find many online examples based on this dataset. For further information, be sure to check out and buy 👍 Deep Learning with Python by François Chollet, the creator of Keras and now at Google. It contains a section that explains using this dataset in much more detail!

Machine Learning service workspace

To get started, you need an Azure subscription. Once you have the subscription, create a Machine Learning service workspace. Below, you see such a workspace:

My Machine Learning service workspace (gebaml)

Together with the workspace, you also get a storage account, a key vault, Application Insights and a container registry. In later steps, we will create a container image and store it in this registry. That all happens behind the scenes though. You will just write a few simple lines of code to make that happen!

Note the Authoring (Preview) section! Those features were added just before Build 2019 started. For now, we will not use them.

Azure Notebooks

To create the model and interact with the workspace, we will use a free Jupyter notebook in Azure Notebooks. At this point in time (8 May 2019), Azure Notebooks is still in preview. To get started, find the link below in the Overview section of the Machine Learning service workspace:

Getting Started with Notebooks

To quickly get the notebook, you can clone my public project: ⏩⏩⏩ https://notebooks.azure.com/geba/projects/textclassificationblog.

Creating the model

When you open the notebook, you will see the following first four cells:

Getting the dataset

It’s always easy when a prepared dataset is handed to you like in the example above. You simply use the reuters dataset of keras.datasets and call its load_data method to get the data, directly assigning it to variables that hold the train and test data plus labels.

In this case, the data consists of newswires with a corresponding label that indicates the category of the newswire (e.g. an earnings call newswire). There are 46 categories in this dataset. In the real world, you would have the newswire in text format. Here, the newswire has already been converted (preprocessed) for you into an array of integers, with each integer corresponding to a word in a dictionary.
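
The notebook cells are shown as screenshots, but the data loading boils down to a few lines; a rough sketch (assuming the standard Keras API and the 10000-word limit used later in the post):

from keras.datasets import reuters

# Load the Reuters newswire dataset, keeping only the 10000 most frequent words
(train_data, train_labels), (test_data, test_labels) = reuters.load_data(num_words=10000)

print(len(train_data), 'training samples and', len(test_data), 'test samples')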

A bit further in the notebook, you will find a Vectorization section:

Vectorization

In this section, the train and test data is vectorized using a one-hot encoding method. Because we specified, in the very first cell of the notebook, that only the 10000 most frequent words should be used, each article can be converted to a vector with 10000 values. Each value is either 1 or 0, indicating whether the word is present in the text or not.

This bag-of-words approach is one of the ways to represent text in a data structure that can be used in a machine learning model. Besides vectorizing the training and test samples, the categories are also one-hot encoded.
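
As a sketch of that encoding, modeled on the approach in Deep Learning with Python and continuing from the loading sketch above (the helper function is an illustration, not necessarily the exact notebook code):

import numpy as np
from keras.utils import to_categorical

def vectorize_sequences(sequences, dimension=10000):
    # One row per article; a 1 at every index of a word that occurs in it
    results = np.zeros((len(sequences), dimension))
    for i, sequence in enumerate(sequences):
        results[i, sequence] = 1.
    return results

x_train = vectorize_sequences(train_data)
x_test = vectorize_sequences(test_data)

# One-hot encode the 46 category labels as well
one_hot_train_labels = to_categorical(train_labels)
one_hot_test_labels = to_categorical(test_labels)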

Now the dense neural network model can be created:

Dense neural net with Keras

The above code defines a very simple dense neural network. A dense neural network is not necessarily the best type of network for this problem, but that’s OK for this post. The specifics are not that important. Just note that the nn variable holds our model. We will use this variable later when we convert the model to the ONNX format.
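
The cell itself is only shown as a screenshot; a model along the following lines would do the job (a sketch, so the exact layer sizes in the notebook may differ), including the compile and fit calls from the next cells:

from keras import models, layers

# Simple dense network: two hidden layers and a 46-way softmax output
nn = models.Sequential()
nn.add(layers.Dense(64, activation='relu', input_shape=(10000,)))
nn.add(layers.Dense(64, activation='relu'))
nn.add(layers.Dense(46, activation='softmax'))

nn.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])

# Training (the last cell in the notebook): 9 epochs
nn.fit(x_train, one_hot_train_labels, epochs=9, batch_size=512, validation_split=0.1)
print(nn.evaluate(x_test, one_hot_test_labels))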

The last cell (16 above) does the actual training in 9 epochs. Training will be fast because the dataset is relatively small and the neural network is simple. Using the Azure Notebooks compute is sufficient. After 9 epochs, this is the result:

Training result

Not exactly earth-shattering: 78% accuracy on the test set!

Saving the model in ONNX format

ONNX is an open format to store deep learning models. When your model is in that format, you can use the ONNX runtime for inference.

Converting the Keras model to ONNX is easy with the onnxmltools package:

Converting the Keras model to ONNX

The result of the above code is a file called reuters.onnx in your notebook project.
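
For reference, the conversion boils down to something like this (a sketch, assuming the convert_keras and save_model helpers of onnxmltools):

import onnxmltools

# Convert the trained Keras model (the nn variable from earlier) to ONNX
onnx_model = onnxmltools.convert_keras(nn)

# Save the ONNX model to disk as reuters.onnx
onnxmltools.utils.save_model(onnx_model, 'reuters.onnx')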

Predict with the ONNX model

Let’s try to predict the category of the first newswire in the test set. Its real label is 3, which means it’s a newswire about an earnings call (earn class):

Inferencing with the ONNX model

We will use similar code later in score.py, a file that will be used in a container we will create to expose the model as an API. The code is pretty simple: start an inference session based on the reuters.onnx file, grab the input and output and use run to predict. The resulting array is the output of the softmax layer and we use argmax to extract the category with the highest probability.
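
A sketch of that inference code, assuming the onnxruntime Python package and the vectorized test data from earlier:

import numpy as np
import onnxruntime as rt

# Start an inference session based on the reuters.onnx file
session = rt.InferenceSession('reuters.onnx')
input_name = session.get_inputs()[0].name
output_name = session.get_outputs()[0].name

# Predict the category of the first newswire in the test set
sample = x_test[0:1].astype(np.float32)
probabilities = session.run([output_name], {input_name: sample})[0]
print('Predicted category:', np.argmax(probabilities))  # real label is 3 (earn)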

Saving the model to the workspace

With the model in reuters.onnx, we can add it to the workspace:

Saving the model in the workspace

You will need a file in your Azure Notebook project called config.json with the following contents:

{
    "subscription_id": "<subscription-id>",
    "resource_group": "<resource-group>",
    "workspace_name": "<workspace-name>"
}

With that file in place, when you run cell 27 (see above), you will need to authenticate to Azure to be able to interact with the workspace. The code is pretty self-explanatory: the reuters.onnx model will be added to the workspace:
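
A sketch of what that cell does, assuming the azureml-core SDK as it existed at the time (the model name and description below are illustrative):

from azureml.core import Workspace
from azureml.core.model import Model

# Reads config.json and prompts for interactive authentication if needed
ws = Workspace.from_config()

# Register (or version) the ONNX model in the workspace
model = Model.register(workspace=ws,
                       model_path='reuters.onnx',
                       model_name='reuters',
                       description='Reuters newswire classification (ONNX)')
print(model.name, model.version)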

Models added to the workspace

As you can see, you can save multiple versions of the model. This happens automatically when you save a model with the same name.

Creating the scoring container image

The scoring (or inference) container image is used to expose an API that predicts categories of newswires. Obviously, you will need to provide some instructions on how scoring needs to be done. This is done via score.py:

score.py

The code is similar to the code we wrote earlier to test the ONNX model. score.py needs an init() and run() function. The other functions are helper functions. In init(), we need to grab a reference to the ONNX model. The ONNX model file will be placed in the container during the build process. Next, we start an InferenceSession via the ONNX runtime. In run(), the code is similar to our earlier example. It predicts via session.run and returns the result as JSON. We do not have to worry about the rest of the code that runs the API. That is handled by Machine Learning service.
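
The real score.py is shown as a screenshot above; below is a minimal sketch of its structure, assuming the onnxruntime package and the Azure ML conventions of the time (the model name, helper layout and JSON field names are assumptions):

import json
import numpy as np
import onnxruntime as rt
from azureml.core.model import Model

def init():
    global session, input_name, output_name
    # Locate the registered ONNX model inside the container
    model_path = Model.get_model_path('reuters')
    session = rt.InferenceSession(model_path)
    input_name = session.get_inputs()[0].name
    output_name = session.get_outputs()[0].name

def run(raw_data):
    # Expect JSON like {"data": [[0, 1, 0, ...]]} holding a one-hot encoded article
    data = np.array(json.loads(raw_data)['data']).astype(np.float32)
    result = session.run([output_name], {input_name: data})[0]
    return json.dumps({'result': result.tolist()})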

Note: using ONNX is not a requirement; we could have persisted and used the native Keras model, for instance.

In this post, we only need score.py since we do not train our model via Azure Machine Learning service. If you train a model with the service, you would create a train.py file to instruct how training should be done, based on data in a storage account for instance. You would also provision compute resources for training. In our case, that is not required, so we train, save and export the model directly from the notebook.

Training and scoring with Machine Learning service

Now we need to create an environment file to indicate the required Python packages and start the image build process:

Create an environment yml file via the API and build the container

The build process is handled by the service and makes sure the model file is in the container, in addition to score.py and myenv.yml. The result is a fully functional container that exposes an API that takes an input (a newswire) and outputs an array of probabilities. Of course, it is up to you to define what the input and output should be. In this case, you are expected to provide a one-hot encoded article as input.
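
A sketch of those cells, using the ws and model variables from the registration sketch and assuming the CondaDependencies and ContainerImage classes of the azureml-core SDK that were current at the time (the package list and image name are illustrative):

from azureml.core.conda_dependencies import CondaDependencies
from azureml.core.image import ContainerImage

# Write an environment file listing the required Python packages
myenv = CondaDependencies()
myenv.add_pip_package('numpy')
myenv.add_pip_package('onnxruntime')
with open('myenv.yml', 'w') as f:
    f.write(myenv.serialize_to_string())

# Build the scoring image from score.py, myenv.yml and the registered model
image_config = ContainerImage.image_configuration(execution_script='score.py',
                                                  runtime='python',
                                                  conda_file='myenv.yml')
image = ContainerImage.create(name='reuters-onnx',
                              models=[model],
                              image_config=image_config,
                              workspace=ws)
image.wait_for_creation(show_output=True)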

The container image will be listed in the workspace, potentially multiple versions of it:

Container images for the reuters ONNX model

Deploy to Azure Container Instances

When the image is ready, you can deploy it via the Machine Learning service to Azure Container Instances (ACI) or Azure Kubernetes Service (AKS). To deploy to ACI:

Deploying to ACI
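
A sketch of that deployment code, again assuming the SDK of the time and reusing the ws and image variables from the sketches above (the service name is illustrative):

from azureml.core.webservice import AciWebservice, Webservice

aci_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)
aci_service = Webservice.deploy_from_image(workspace=ws,
                                           name='reuters-onnx-svc',
                                           image=image,
                                           deployment_config=aci_config)
aci_service.wait_for_deployment(show_output=True)
print(aci_service.scoring_uri)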

When the deployment is finished, it will be listed in the workspace:

Deployment (ACI)

When you click on the deployment, the scoring URI will be shown (e.g. http://IPADDRESS:80/score). You can now use Postman or any other method to score an article. To quickly test the service from the notebook:

Testing the service

The helper method run of aci_service will post the JSON in test_sample to the service. It knows the scoring URI from the deployment earlier.
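
As a sketch, the test amounts to something like this (the payload layout mirrors the data field used in the score.py sketch above):

import json

# One-hot encoded article from the test set (x_test from the earlier sketches)
test_sample = json.dumps({'data': x_test[0:1].tolist()})

result = aci_service.run(input_data=test_sample)
print(result)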

Conclusion

Containerizing a machine learning model and exposing it as an API is made surprisingly simple with Azure Machine Learning service. It saves time so you can focus on the hard work of creating a model that performs well in the field. In this post, we used a sample dataset and a simple dense neural network to illustrate how you can build such a model, convert it to ONNX format and use the ONNX runtime for scoring.

Creating and containerizing a TensorFlow Go application

In an earlier post, I discussed using a TensorFlow model from a Go application. With the TensorFlow bindings for Go, you can load a model that was exported with TensorFlow’s SavedModelBuilder module. That module saves a “snapshot” of a trained model which can be used for inference.

In this post, we will actually use the model in a web application. The application presents the user with a page to upload an image:

The upload page

The class and its probability are displayed, together with the processed image:

Clearly a hen!

The source code of the application can be found at https://github.com/gbaeke/nasnet-go. If you just want to try the application, use Docker and issue the following command (replace port 80 with another port if there is a conflict):

docker run -p 80:9090 -d gbaeke/nasnet

The image is around 2.55GB in size so be patient when you first run the application. When the container has started, open your browser at http://localhost to see the upload page.

To quickly try it, you can run the container on Azure Container Instances. If you use the Portal, specify port 9090 as the container port.

Nasnet container in ACI

A closer look at the app

**UPDATE**: since first publication, the HTTP handler code was moved from main.go to handlers/handlers.go.

In the init() function, the nasnet model is loaded with tf.LoadSavedModel. The ImageNet categories are also loaded with a call to getCategories() and stored in categories which is a map of int to a string array.

In main(), we simply print the TensorFlow version (1.12). Next, http.HandleFunc is used to set up a handler (the upload function) for when users connect to the root of the web app.

Naturally, most of the logic is in the upload function. In summary, it does the following:

  • when users just navigate to the page (HTTP GET verb), render the upload.gtpl template; that template contains the upload form and uses a bit of bootstrap to make it just a bit better looking (and that’s already an overstatement); to learn more about Go web templates, see this link.
  • when users submit a file (POST), the following happens:
    • read the image
    • convert the image to a tensor with the getTensor function; getTensor returns a *tf.Tensor; the tensor is created from a [1][224][224][3] array; note that each pixel value gets normalized by subtracting 127.5 and then dividing by 127.5, which is the same preprocessing as Keras applies (divide by 127.5 and subtract 1); see the small check after this list
    • run a session by inputting the tensor and getting the categories and probabilities as output
    • look for the highest probability and save it, together with the category name in a variable of type ResultPageData (a struct)
    • the struct data is used as input for the response.gtpl template
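
In NumPy terms, the two formulations of that preprocessing are identical, as this small check illustrates:

import numpy as np

pixels = np.array([0, 64, 127.5, 255], dtype=np.float32)

a = (pixels - 127.5) / 127.5   # subtract 127.5, then divide by 127.5
b = pixels / 127.5 - 1         # divide by 127.5, then subtract 1

print(np.allclose(a, b))  # True: both map pixel values to the range [-1, 1]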

Note that the image is also shown in the output. The processed image (resized to 224×224) gets converted to a base64-encoded string. That string can be used in HTML image rendering as follows (where {{.Picture}} in the template will be replaced by the encoded string):

 <img src="data:image/jpg;base64,{{.Picture}}"> 

Note that the application lacks sufficient error checking to gracefully handle the upload of non-image files. Maybe I’ll add that later! 😉

Containerization

To containerize the application, I used the Dockerfile from https://github.com/tinrab/go-tensorflow-image-recognition but removed the step that downloads the InceptionV3 model. My application contains a ready-to-use NASNetMobile model.

The container image is based on tensorflow/tensorflow:1.12.0. It is further modified as required with the TensorFlow C API and the installation of Go. As discussed earlier, I uploaded a working image on Docker Hub.

Conclusion

Once you know how to use TensorFlow models from Go applications, it is easy to embed them in any application, from command-line tools to APIs to web applications. Although this application does server-side processing, you can also use a model directly in the browser with TensorFlow.js or ONNX.js. For ONNX, try https://microsoft.github.io/onnxjs-demo/#/resnet50 to perform image classification with ResNet50 in the browser. You will notice that it will take a while to get started due to the model being downloaded. Once the model is downloaded, you can start classifying images. Personally, I prefer the server-side approach but it all depends on the scenario.

Using TensorFlow models in Go

Image via www.vpnsrus.com

In earlier posts, I discussed hosting a deep learning model such as ResNet50 on Kubernetes or Azure Container Instances. The model can then be used like any API that receives input as JSON and returns a result as JSON.

Naturally, you can also run the model in offline scenarios and directly from your code. In this post, I will take a look at calling a TensorFlow model from Go. If you want to follow along, you will need Linux or macOS because the TensorFlow Go package does not support Windows.

Getting Ready

I installed an Ubuntu Data Science Virtual Machine on Azure and connected to it with X2Go:

Data Science Virtual Machine (Ubuntu) with X2Go

The virtual machine has all the required machine learning tools installed such as TensorFlow and Python. It also has Visual Studio Code. There are some extra requirements though:

  • Go: follow the instructions here to download and install Go
  • TensorFlow C API: follow the instructions here to download and install the C API; the TensorFlow package for Go requires this; it is recommended to also build and run the Hello from TensorFlow C program to verify that the library works (near the bottom of the instructions page)

After installing Go and the TensorFlow C API, install the TensorFlow Go package with the following command:

go get github.com/tensorflow/tensorflow/tensorflow/go

Test the package with go test:

go test github.com/tensorflow/tensorflow/tensorflow/go

The above command should return:

ok      github.com/tensorflow/tensorflow/tensorflow/go  0.104s

The go get command installed the package in $HOME/go/src/github.com if you did not specify a custom $GOPATH (see this wiki page for more info).

Getting a model

A model describes how the input (e.g. an image for image classification) gets translated to an output (e.g. a list of classes with probabilities). The model contains thousands or even millions of parameters which means a model can be quite large. In this example, we will use NASNetMobile which can be used to classify images.

Now we need some code to save the model in TensorFlow format so that it can be used from a Go program. The code below is based on the sample code on the NASNetMobile page from modeldepot.io. It also does a quick test inference on a cat image.

import keras
from keras.applications.nasnet import NASNetMobile
from keras.preprocessing import image
from keras.applications.xception import preprocess_input, decode_predictions
import numpy as np
import tensorflow as tf
from keras import backend as K

sess = tf.Session()
K.set_session(sess)

model = NASNetMobile(weights="imagenet")
img = image.load_img('cat.jpg', target_size=(224,224))
img_arr = np.expand_dims(image.img_to_array(img), axis=0)
x = preprocess_input(img_arr)
preds = model.predict(x)
print('Prediction:', decode_predictions(preds, top=5)[0])

#save the model for use with TensorFlow
builder = tf.saved_model.builder.SavedModelBuilder("nasnet")

#Tag the model, required for Go
builder.add_meta_graph_and_variables(sess, ["atag"])
builder.save()
sess.close()

On the Ubuntu Data Science Virtual Machine, the above code should execute without any issues because all Python packages are already installed. I used the py35 conda environment. Use activate py35 to make sure you are in that environment.

The above code results in a nasnet folder, which contains the saved_model.pb file for the graph structure. The actual weights are in the variables subfolder. In total, the nasnet folder is around 38MB.

Great! Now we need a way to use the model from our Go program.

Using the saved model from Go

The model can be loaded with the LoadSavedModel function of the TensorFlow package. That package is imported like so:

import (
    tf "github.com/tensorflow/tensorflow/tensorflow/go"
)

LoadSavedModel is used like so:

model, err := tf.LoadSavedModel("nasnet", []string{"atag"}, nil)
if err != nil {
    log.Fatal(err)
}

The above code simply tries to load the model from the nasnet folder. We also need to specify the tag.

Next, we need to load an image and convert the image to a tensor with the following dimensions [1][224][224][3]. This is similar to my earlier ResNet50 post.

Now we need to pass the tensor to the model as input, and retrieve the class predictions as output. The following code achieves this:

output, err := model.Session.Run(
    map[tf.Output]*tf.Tensor{
        model.Graph.Operation("input_1").Output(0): input,
    },
    []tf.Output{
        model.Graph.Operation("predictions/Softmax").Output(0),
    },
    nil,
)
if err != nil {
    log.Fatal(err)
}

What the heck is this? The Run method is defined as follows:

func (s *Session) Run(feeds map[Output]*Tensor, fetches []Output, targets []*Operation) ([]*Tensor, error)

When you build a model, you can give names to tensors and operations. In this case, the input tensor (of dimensions [1][224][224][3]) is called input_1 and needs to be specified in a map. The inference operation is called predictions/Softmax and the output needs to be specified as an array.

The actual predictions can be retrieved from the output variable:

predictions, ok := output[0].Value().([][]float32)
if !ok {
    log.Fatal(fmt.Sprintf("output has unexpected type %T", output[0].Value()))
}

If you are not very familiar with Go, the code above uses type assertion to verify that predictions is a 2-dimensional array of float32. If the type assertion succeeds, the predictions variable will contain the actual predictions: [[<probability class 1 (tench)>, <probability class 2 (goldfish)>, …]]

You can now simply find the top prediction(s) in the array and match them with the list of classes for NASNet (actually the ImageNet classes). I get the following output with a cat image:

Yep, it’s a tabby!

If you are wondering what image I used:

Tabby?

Conclusion

With Go’s TensorFlow bindings, you can load TensorFlow models from disk and use them for inference locally, without having to call a remote API. We used Python to prepare the model with some help from Keras.

Microsoft Face API with a local container

A few days ago, I obtained access to the Face container. It provides access to the Face API via a container you can run wherever you want: on your PC, at the network edge or in your datacenter. You should allocate 6 GB of RAM and 2 cores for the container to run well. Note that you still need to create a Face API resource in the Azure Portal. The container needs to be associated with the Azure Face API via the endpoint and access key:

Face API with a West Europe (Amsterdam) endpoint

I used the Standard tier, which charges 0.84 euros per 1000 calls. As noted, the container will not function without associating it with an Azure Face API resource.

When you gain access to the container registry, you can pull the container:

docker pull containerpreview.azurecr.io/microsoft/cognitive-services-face:latest

After that, you can run the container as follows (for API billing endpoint in West Europe):

docker run --rm -it -p 5000:5000 --memory 6g --cpus 2 containerpreview.azurecr.io/microsoft/cognitive-services-face Eula=accept Billing=https://westeurope.api.cognitive.microsoft.com/face/v1.0 ApiKey=YOUR_API_KEY

The container will start and, because of the -it flags, you will see its output:

Running Face API container

And here’s the spec:

API spec Face API v1

Before showing how to use the detection feature, note that the container needs Internet access for billing purposes. You will not be able to run the container in fully offline scenarios.

Over at https://github.com/gbaeke/msface-go, you can find a simple example in Go that uses the container. The Face API can take a byte stream of an image or a URL to an image. The example takes the first approach and loads an image from disk as specified by the -image parameter. The resulting io.Reader is passed to the getFace function which does the actual call to the API (uri = http://localhost:5000/face/v1.0/detect):

request, err := http.NewRequest("POST", uri+"?returnFaceAttributes="+params, m)
request.Header.Add("Content-Type", "application/octet-stream")

// Send the request to the local web service
resp, err := client.Do(request)
if err != nil {
    return "", err
}

The response contains a Body attribute and that attribute is unmarshalled into a variable of type interface{}. That variable is then marshalled with indentation to a byte slice (b), which is returned by the function as a string:

var response interface{}
err = json.Unmarshal(respBody, &response)
if err != nil {
    return "", err
}
b, err := json.MarshalIndent(response, "", "\t")

Now you can use a picture like the one below:

Is he smiling?

Here are some parts of the output, following the command
detectface -image smiling.jpg

Emotion is clearly happiness with additional features such as age, gender, hair color, etc…

[
    {
        "faceAttributes": {
            "accessories": [],
            "age": 33,
            "blur": {
                "blurLevel": "high",
                "value": 1
            },
            "emotion": {
                "anger": 0,
                "contempt": 0,
                "disgust": 0,
                "fear": 0,
                "happiness": 1,
                "neutral": 0,
                "sadness": 0,
                "surprise": 0
            },
            "exposure": {
                "exposureLevel": "goodExposure",
                "value": 0.71
            },
            "facialHair": {
                "beard": 0.6,
                "moustache": 0.6,
                "sideburns": 0.6
            },
            "gender": "male",
            "glasses": "NoGlasses",
            "hair": {
                "bald": 0.26,
                "hairColor": [
                    {
                        "color": "black",
                        "confidence": 1
                    }
                ]
            }
        },
        "faceId": "b6d924c1-13ef-4d19-8bc9-34b0bb21f0ce",
        "faceRectangle": {
            "height": 1183,
            "left": 944,
            "top": 167,
            "width": 1183
        }
    }
]

That’s it! Give the Face API container a go with the tool. You can get it here: https://github.com/gbaeke/msface-go/releases/tag/v0.0.1 (Windows)

Using the Microsoft Face API to detect emotions in photos and video

In a previous post, I blogged about detecting emotions with the ONNX FER+ model. As an alternative, you can use cloud models hosted by major cloud providers such as Microsoft, Amazon and Google. Besides those, there are many other services to choose from.

To detect facial emotions with Azure, there is a Face API in two flavours:

  • Cloud: API calls are sent to a cloud-hosted endpoint in the selected deployment region
  • Container: API calls are sent to a container that you deploy anywhere, including the edge (e.g. IoT Edge device)

To use the container version, you need to request access via this link. In another blog post, I already used the Text Analytics container to detect sentiment in a piece of text.

Note that the container version is not free and needs to be configured with an API key. The API key is obtained by deploying the Face API in the cloud. Doing so generates a primary and secondary key. Be aware that the Face API container, like the Text Analytics container, needs connectivity to the cloud to ensure proper billing. It cannot be used in completely offline scenarios. In short, no matter the flavour you use, you need to deploy the Face API. It will appear in the portal as shown below:

Deployed Face API (part of Cognitive Services)

Using the API is a simple matter. An image can be delivered to the API in two ways:

  • Link: just provide a URL to an image
  • Octet-stream: POST binary data (the image’s bytes) to the API

In the Go example you can find on GitHub, the second approach is used. You simply open the image file (e.g. a jpg or png) and pass the byte array to the endpoint. The endpoint is in the following form for emotion detection:

https://westeurope.api.cognitive.microsoft.com/face/v1.0/detect?returnFaceAttributes=emotion

Instead of emotion, you can ask for other attributes or a combination of attributes: age, gender, headPose, smile, facialHair, glasses, emotion, hair, makeup, occlusion, accessories, blur, exposure and noise. You simply add them together with +’s (e.g. emotion+age+gender). When you add attributes, the cost per call will increase slightly as will the response time. With the additional attributes, the Face API is much more useful than the simple FER+ model. The Face API has several additional features such as storing and comparing faces. Check out the documentation for full details.
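
As an illustration of the octet-stream approach outside of Go, here is a minimal Python sketch (the image file name is a placeholder; the endpoint and the Ocp-Apim-Subscription-Key header follow the standard Cognitive Services conventions):

import requests

endpoint = 'https://westeurope.api.cognitive.microsoft.com/face/v1.0/detect'
params = {'returnFaceAttributes': 'emotion'}
headers = {
    'Content-Type': 'application/octet-stream',
    'Ocp-Apim-Subscription-Key': '<your-face-api-key>'
}

# POST the raw image bytes to the detect endpoint
with open('face.jpg', 'rb') as f:
    response = requests.post(endpoint, params=params, headers=headers, data=f.read())

print(response.json())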

To detect emotion in a video, the sample at https://github.com/gbaeke/emotion/blob/master/main.go contains some commented-out code in the import section and around line 100, so that you can use the Face API via the GetEmotion() function of the github.com/gbaeke/emotion/faceapi/msface package instead of the GetEmotion() function in the code. Because we have the full webcam image and face in an OpenCV mat, some extra code is needed to serialize it to a byte stream in a format the Face API understands:

encodedImage, _ := gocv.IMEncode(gocv.JPEGFileExt, face)       
emotion, err = msface.GetEmotion(bytes.NewReader(encodedImage))

In the above example, the face region detected by OpenCV is encoded to a JPG format as a byte slice. The byte slice is simply converted to an io.Reader and handed to the GetEmotion() function in the msface package.

When you use the Face API to detect emotions in a video stream from a webcam (or a video file), you will be hitting the API quite hard. You will surely need the standard tier of the API which allows you to do 10 transactions per second. To add face and emotion detection to video, the solution discussed in Detecting Emotions in FER+ is a better option.

Detecting emotions with FER+

In an earlier post, I discussed classifying images with the ResNet50v2 model. Azure Machine Learning Service was used to create a container image that used the ONNX ResNet50v2 model and the ONNX Runtime for scoring.

Continuing on that theme, I created a container image that uses the ONNX FER+ model that can detect emotions in an image. The container image also uses the ONNX Runtime for scoring.

You might wonder why you would want to detect emotions this way when there are many services available that can do this for you with a simple API call! You could use Microsoft’s Face API or Amazon’s Rekognition for example. While those services are easy to use and provide additional features, they do come at a cost. If all you need is basic detection of emotions, using this FER+ container is sufficient and cost effective.

Azure Face API (image from Microsoft website)

A notebook to create the image and deploy a container to Azure Container Instances (ACI) can be found here. The notebook uses the Azure Machine Learning SDK to register the model to an Azure Machine Learning workspace, build a container image from that model and deploy the container to ACI. The scoring script score.py is shown below.

score.py

The model expects a 64×64 grayscale image of a face in an array with the following dimensions: [1][1][64][64]. The output is JSON with a results array that contains the probabilities for each emotion, and a time field with the inference time.

The emotion probabilities are in this order:

0: "neutral", 1: "happy", 2: "surprise", 3: "sadness", 4: "anger", 5: "disgust", 6: "fear", 7: "contempt

To actually capture the emotions, I wrote a small demo program in Go that uses OpenCV (via GoCV). You can find it on GitHub: https://github.com/gbaeke/emotion. You will need to install OpenCV and GoCV. Find the instructions here: https://gocv.io/getting-started/linux/. There are similar instructions for Mac and Windows but I have not tried those.

The program is still a little rough around the edges but it does the trick. The scoring URI is hard-coded to http://localhost:5002/score. With Docker installed, use the following command to run the scoring container:

 docker run -d -p 5002:5001 gbaeke/onnxferplus

Have fun with it!

ResNet50v2 classification in Go with a local container

To quickly go to the code, go here. Otherwise, keep reading…

In a previous blog post, I wrote about classifying images with the ResNet50v2 model from the ONNX Model Zoo. In that post, the container ran on a Kubernetes cluster with GPU nodes. The nodes had an NVIDIA V100 GPU. The actual classification was done with a simple Python script with help from Keras and NumPy. Each inference took around 25 milliseconds.

In this post, we will do two things:

  • run the scoring container (CPU) on a local machine that runs Docker
  • perform the scoring (classification) in Go

Installing the scoring container locally

I pushed the scoring container with the ONNX ResNet50v2 image to the following location: https://cloud.docker.com/u/gbaeke/repository/docker/gbaeke/onnxresnet50v2. Run the container with the following command:

docker run -d -p 5001:5001 gbaeke/onnxresnet50

The container will be pulled and started. The scoring URI is on http://localhost:5001/score.

Note that in the previous post, Azure Machine Learning deployed two containers: the scoring container (the one described above) and a front-end container. In that scenario, the front-end container handles the HTTP POST requests (optionally with SSL) and routes the requests to the actual scoring container.

The scoring container accepts the same payload as the front-end container. That means it can be used on its own, as we are doing now.

Note that you can also use IoT Edge, as explained in an earlier post. That actually shows how easy it is to push AI models to the edge and use them locally, befitting your business case.

Scoring with Go

To actually classify images, I wrote a small Go program to do just that. Although there are some scientific libraries for Go, they are not really needed in this case. That means we do have to create the 4D tensor payload and interpret the softmax result manually. If you check the code, you will see that it is not awfully difficult.

The code can be found in the following GitHub repository: https://github.com/gbaeke/resnet-score.

Remember that this model expects the input as a 4D tensor with the following dimensions:

  • dimension 0: batch (we only send one image here)
  • dimension 1: channels (one for each color: R, G and B)
  • dimension 2: height
  • dimension 3: width

The 4D tensor needs to be serialized to JSON in a field called data. We send that data with HTTP POST to the scoring URI at http://localhost:5001/score.
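
A sketch of that request in Python (the Go code in the repository does the equivalent; the tensor below is a placeholder, and the result and time response fields are described next):

import json
import numpy as np
import requests

# Placeholder 4D tensor in BCHW layout: batch, channels (RGB), height, width
tensor = np.zeros((1, 3, 224, 224), dtype=np.uint8)

payload = json.dumps({'data': tensor.tolist()})
response = requests.post('http://localhost:5001/score', data=payload,
                         headers={'Content-Type': 'application/json'})

output = response.json()
result = np.array(output['result'])  # 1000 softmax values
print('Top class index:', int(np.argmax(result)), 'inference time:', output['time'])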

The response from the container will be JSON with two fields: a result field with the 1000 softmax values and a time field with the inference time. We can use the following two structs for marshaling and unmarshaling:

Input and output of the model

Note that this model expects pictures to be scaled to 224 by 224 as reflected by the height and width dimensions of the uint8 array. The rest of the code is summarized below:

  • read the image; the path of the image is passed to the code via the -image command line parameter
  • the image is resized with the github.com/disintegration/imaging package (linear method)
  • the 4D tensor is populated by iterating over all pixels of the image, extracting r,g and b and placing them in the BCHW array; note that the r,g and b values are uint16 and scaled to fit in a uint8
  • construct the input which is a struct of type InputData
  • marshal the InputData struct to JSON
  • POST the JSON to the local scoring URI
  • read the HTTP response and unmarshal the response in a struct of type OutputData
  • find the highest probability in the result and note the index where it was found
  • read the 1000 ImageNet categories from imagenet_class_index.json and marshal the JSON into a map of string arrays
  • print the category using the index with the highest probability and the map

What happens when we score the image below?

What is this thing?

Running the code gives the following result:

$ ./class -image images/cassette.jpg

Highest prob is 0.9981583952903748 at 481 (inference time: 0.3309464454650879 )
Probably [n02978881 cassette

The inference time is 1/3 of a second on my older Linux laptop with a dual-core i7.

Try it yourself by running the container and the class program. Download it from here (Linux).