Trying Google Cloud Run

With the release of Google’s Cloud Run, I decided to check it out with my nasnet container.

With Cloud Run, you simply deploy your container and let Google scale it based on the requests it receives. When your container is not used, it gets scaled to zero. As such, it combines the properties of a serverless offering such as Azure Functions with the flexibility of standard containers. Today, you cannot put limits on scaling (e.g. a maximum of X instances).

You might be tempted to compare Cloud Run to something like Azure Container Instances but it is not exactly the same. True, Azure Container Instances (ACI) allows you to simply deploy a container without the need for an orchestrator such as Kubernetes. With ACI however, memory and CPU capacity are reserved and it does not scale your container based on the requests it receives. ACI can be used in conjunction with virtual nodes in AKS (Kubernetes on Azure) to achieve somewhat similar results at higher cost and complexity. However, ACI can be used in broader scenarios such as stateful applications beyond the simple HTTP use case.

Prerequisites

Cloud Run containers should be able to fit in 2GB of memory. They should be stateless and all computation should be scoped to an HTTP request.

Your container needs to be invocable via HTTP requests on port 8080. It is against best practices, though, to hardcode this port. Instead, you should check the PORT environment variable that is automatically injected into the container by Cloud Run. Google might change the port in the future! In the nasnet container, the code checks this as follows:

port := getEnv("PORT", "9090") 

getEnv is a custom function that checks the environment variable. If it is not set, port is set to the value of the second parameter:

func getEnv(key, fallback string) string {
	value, exists := os.LookupEnv(key)
	if !exists {
		value = fallback
	}
	return value
}

Later, in the call to ListenAndServe, the port variable is used as follows:

log.Fatal(http.ListenAndServe(":"+port, nil)) 

Deploying to Cloud Run

Make sure you have access to Google Cloud and create a project. I created a project called CRTest.

Next, clone the nasnet-go repository:

git clone https://github.com/gbaeke/nasnet-go.git

If you have Docker installed, issue the following command to build and tag the container (from the nasnet-go folder which is a folder created by the git clone command above):

docker build -t gcr.io/<PROJECT>/nasnet:latest .

In the above command, replace <PROJECT> with your Google Cloud project name.

To push the container image to Google Container Registry, install gcloud. When you run gcloud init, you will have to authenticate to Google Cloud. We install gcloud here to make the authentication process to Google Container Registry easier. To configure Docker to use your Google credentials, run the following command:

gcloud auth configure-docker

Next, authenticate to the registry:

docker login gcr.io/<PROJECT>

Now that you are logged in, push the image:

docker push gcr.io/<PROJECT>/nasnet:latest

If you don’t want to bother yourself with local build and push, you can use Google Cloud Build instead:

gcloud builds submit --tag gcr.io/crtest/nasnet .

The above command will package your source files, submit them to Cloud Build and build the container in Google Cloud. When finished, the container image will be pushed to gcr.io/crtest. Either way, when the push is done, check the image in the console:

The nasnet container image in gcr

Now that we have the image in the registry, we can use it with Cloud Run. In the console, navigate to Cloud Run and click Create Service:

Creating a Cloud Run service for nasnet

By default, allocated memory is set to 256MB, which is too low for this image; if you set it too low, your container will be restarted. Click Optional Settings and set the allocated memory to 1GB:

Change the allocated memory for the nasnet container

Note: this particular container writes the files you upload to the local file system; this is not recommended since data written to the file system counts against the container's memory.

Note: a container instance handles multiple concurrent requests; currently the maximum is 80 concurrent requests per container.

Now finish the configuration. The image will be pulled and the Cloud Run service will be started:

Cloud Run gives you an https URL to connect to your container. You can configure custom domains as well. When you browse to the URL, you should see the following:

Try it by uploading an image to classify it!

Conclusion

Google Cloud Run makes it easy to deploy HTTP-invocable containers in a serverless fashion. In this example, I modified the nasnet code to check the PORT environment variable. In the runtime configuration, I set the amount of memory to 1GB. That was all that was needed to get this container to run. Note that Cloud Run can also be used in conjunction with GKE (Google Kubernetes Engine). That’s a post for some other time!

Azure Front Door Revisited

A while ago, I wrote a post about Azure Front Door. In that post, I wrote that http to https redirection was not possible. With Azure Front Door now generally available, let’s check whether that is still the case.

In the previous post, I had the following configuration in Front Door Designer:

Azure Front Door Designer

The above configuration exposes a static website hosted in an Azure Storage Account (the backend in the backend pool). The custom domain deploy.baeke.info maps to geba.azurefd.net using a CNAME in my CloudFlare hosted domain. The routing rule routeall maps all requests to the backend.

The above configuration does not, however, redirect http://deploy.baeke.info to https://deploy.baeke.info which is clearly not what we want. In order to achieve that goal, the routing rules can be changed. A redirect routing rule looks as follows:

Redirect routing rule (Replace destination host was not required)

The routeall rule looks like this:

Routing rule

The routing rule simply routes https://deploy.baeke.info to the azdeploy backend pool which only contains the single static website hosted in a storage account.

The full config looks like this:

Full config in Front Door designer

Although not very useful for this static website, I also added WAF (Web Application Firewall) rules to Azure Front Door. In the Azure Portal, just search for WAF and add a policy. I added a default policy and associated it with this Azure Front Door website:

WAF rules associated with the Azure Front Door frontend

If required, you can enable/disable WAF rules:

Using TensorFlow models in Go

Image via www.vpnsrus.com

In earlier posts, I discussed hosting a deep learning model such as Resnet50 on Kubernetes or Azure Container Instances. The model can then be used as any API which receives input as JSON and returns a result as JSON.

Naturally, you can also run the model in offline scenarios and directly from your code. In this post, I will take a look at calling a TensorFlow model from Go. If you want to follow along, you will need Linux or macOS because the TensorFlow Go package does not support Windows.

Getting Ready

I installed an Ubuntu Data Science Virtual Machine on Azure and connected to it with X2Go:

Data Science Virtual Machine (Ubuntu) with X2Go

The virtual machine has all the required machine learning tools installed such as TensorFlow and Python. It also has Visual Studio Code. There are some extra requirements though:

  • Go: follow the instructions here to download and install Go
  • TensorFlow C API: follow the instructions here to download and install the C API; the TensorFlow package for Go requires this; it is recommended to also build and run the Hello from TensorFlow C program to verify that the library works (near the bottom of the instructions page)

After installing Go and the TensorFlow C API, install the TensorFlow Go package with the following command:

go get github.com/tensorflow/tensorflow/tensorflow/go

Test the package with go test:

go test github.com/tensorflow/tensorflow/tensorflow/go

The above command should return:

ok      github.com/tensorflow/tensorflow/tensorflow/go  0.104s

The go get command installed the package in $HOME/go/src/github.com if you did not specify a custom $GOPATH (see this wiki page for more info).

Getting a model

A model describes how the input (e.g. an image for image classification) gets translated to an output (e.g. a list of classes with probabilities). The model contains thousands or even millions of parameters which means a model can be quite large. In this example, we will use NASNetMobile which can be used to classify images.

Now we need some code to save the model in TensorFlow format so that it can be used from a Go program. The code below is based on the sample code on the NASNetMobile page from modeldepot.io. It also does a quick test inference on a cat image.

import keras
from keras.applications.nasnet import NASNetMobile
from keras.preprocessing import image
from keras.applications.xception import preprocess_input, decode_predictions
import numpy as np
import tensorflow as tf
from keras import backend as K

sess = tf.Session()
K.set_session(sess)

model = NASNetMobile(weights="imagenet")
img = image.load_img('cat.jpg', target_size=(224,224))
img_arr = np.expand_dims(image.img_to_array(img), axis=0)
x = preprocess_input(img_arr)
preds = model.predict(x)
print('Prediction:', decode_predictions(preds, top=5)[0])

#save the model for use with TensorFlow
builder = tf.saved_model.builder.SavedModelBuilder("nasnet")

#Tag the model, required for Go
builder.add_meta_graph_and_variables(sess, ["atag"])
builder.save()
sess.close()

On the Ubuntu Data Science Virtual Machine, the above code should execute without any issues because all Python packages are already installed. I used the py35 conda environment. Use activate py35 to make sure you are in that environment.

The above code results in a nasnet folder, which contains the saved_model.pb file for the graph structure. The actual weights are in the variables subfolder. In total, the nasnet folder is around 38MB.

Great! Now we need a way to use the model from our Go program.

Using the saved model from Go

The model can be loaded with the LoadSavedModel function of the TensorFlow package. That package is imported like so:

import (
	tf "github.com/tensorflow/tensorflow/tensorflow/go"
)

LoadSavedModel is used like so:

model, err := tf.LoadSavedModel("nasnet",
	[]string{"atag"}, nil)
if err != nil {
	log.Fatal(err)
}

The above code simply tries to load the model from the nasnet folder. We also need to specify the tag.

Next, we need to load an image and convert the image to a tensor with the following dimensions [1][224][224][3]. This is similar to my earlier ResNet50 post.
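A minimal sketch of that conversion is shown below. It assumes the github.com/disintegration/imaging package for resizing (the same package used in a later post) and a simple scaling of pixel values to the 0-1 range; the exact preprocessing should match what the model was trained with.

import (
	"github.com/disintegration/imaging"
	tf "github.com/tensorflow/tensorflow/tensorflow/go"
)

// imageToTensor loads an image, resizes it to 224x224 and builds a
// [1][224][224][3]float32 tensor that can be fed to the model.
func imageToTensor(path string) (*tf.Tensor, error) {
	img, err := imaging.Open(path)
	if err != nil {
		return nil, err
	}
	img = imaging.Resize(img, 224, 224, imaging.Linear)

	var data [1][224][224][3]float32
	for y := 0; y < 224; y++ {
		for x := 0; x < 224; x++ {
			r, g, b, _ := img.At(x, y).RGBA() // 16-bit color values
			data[0][y][x][0] = float32(r>>8) / 255.0
			data[0][y][x][1] = float32(g>>8) / 255.0
			data[0][y][x][2] = float32(b>>8) / 255.0
		}
	}
	return tf.NewTensor(data)
}

The input variable used below is the *tf.Tensor returned by a function like this.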

Now we need to pass the tensor to the model as input, and retrieve the class predictions as output. The following code achieves this:

output, err := model.Session.Run(
	map[tf.Output]*tf.Tensor{
		model.Graph.Operation("input_1").Output(0): input,
	},
	[]tf.Output{
		model.Graph.Operation("predictions/Softmax").Output(0),
	},
	nil,
)
if err != nil {
	log.Fatal(err)
}

What the heck is this? The Run method is defined as follows:

func (s *Session) Run(feeds map[Output]*Tensor, fetches []Output, targets []*Operation) ([]*Tensor, error)

When you build a model, you can give names to tensors and operations. In this case the input tensor (of dimensions [1][224][224][3]) is called input_1 and needs to be specified as a map. The inference operation is called predictions/Softmax and the output needs to be specified as an array.
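If you are unsure about the names used in a saved model, a quick way to discover them (a sketch, not part of the original program) is to list the operations in the loaded graph:

// Print all operation names and types in the graph to find candidate
// input and output names such as input_1 and predictions/Softmax.
for _, op := range model.Graph.Operations() {
	fmt.Println(op.Name(), op.Type())
}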

The actual predictions can be retrieved from the output variable:

predictions, ok := output[0].Value().([][]float32)
if !ok {
	log.Fatal(fmt.Sprintf("output has unexpected type %T", output[0].Value()))
}

If you are not very familiar with Go, the code above uses type assertion to verify that predictions is a 2-dimensional array of float32. If the type assertion succeeds, the predictions variable will contain the actual predictions: [[<probability class 1 (tench)>, <probability class 2 (goldfish)>, …]]
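A small helper to pick the winning class could look like the sketch below; the returned index still needs to be mapped to the ImageNet class list.

// topClass returns the index and value of the highest probability in the
// first (and only) row of the predictions returned by the model.
func topClass(predictions [][]float32) (int, float32) {
	best := 0
	for i, p := range predictions[0] {
		if p > predictions[0][best] {
			best = i
		}
	}
	return best, predictions[0][best]
}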

Matching the index of the highest probability with the list of classes for NASNet (actually the ImageNet classes) gives the final answer. I get the following output with a cat image:

Yep, it’s a tabby!

If you are wondering what image I used:

Tabby?

Conclusion

With Go’s TensorFlow bindings, you can load TensorFlow models from disk and use them for inference locally, without having to call a remote API. We used Python to prepare the model with some help from Keras.

ResNet50v2 classification in Go with a local container

To quickly go to the code, go here. Otherwise, keep reading…

In a previous blog post, I wrote about classifying images with the ResNet50v2 model from the ONNX Model Zoo. In that post, the container ran on a Kubernetes cluster with GPU nodes. The nodes had an NVIDIA V100 GPU. The actual classification was done with a simple Python script with help from Keras and NumPy. Each inference took around 25 milliseconds.

In this post, we will do two things:

  • run the scoring container (CPU) on a local machine that runs Docker
  • perform the scoring (classification) in Go

Installing the scoring container locally

I pushed the scoring container with the ONNX ResNet50v2 image to the following location: https://cloud.docker.com/u/gbaeke/repository/docker/gbaeke/onnxresnet50v2. Run the container with the following command:

docker run -d -p 5001:5001 gbaeke/onnxresnet50v2

The container will be pulled and started. The scoring URI is on http://localhost:5001/score.

Note that in the previous post, Azure Machine Learning deployed two containers: the scoring container (the one described above) and a front-end container. In that scenario, the front-end container handles the HTTP POST requests (optionally with SSL) and routes the request to the actual scoring container.

The scoring container accepts the same payload as the front-end container. That means it can be used on its own, as we are doing now.

Note that you can also use IoT Edge, as explained in an earlier post. That actually shows how easy it is to push AI models to the edge and use them locally, if that fits your business case.

Scoring with Go

To actually classify images, I wrote a small Go program. Although there are some scientific libraries for Go, they are not really needed in this case. That means we do have to create the 4D tensor payload and interpret the softmax result manually. If you check the code, you will see that it is not awfully difficult.

The code can be found in the following GitHub repository: https://github.com/gbaeke/resnet-score.

Remember that this model expects the input as a 4D tensor with the following dimensions:

  • dimension 0: batch (we only send one image here)
  • dimension 1: channels (one each for red, green and blue; RGB)
  • dimension 2: height
  • dimension 3: width

The 4D tensor needs to be serialized to JSON in a field called data. We send that data with HTTP POST to the scoring URI at http://localhost:5001/score.

The response from the container will be JSON with two fields: a result field with the 1000 softmax values and a time field with the inference time. We can use the following two structs for marshaling and unmarshaling:

Input and output of the model
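A sketch of those structs is shown below. The struct names match the ones used in the repository; the exact field types are assumptions based on the payload described above.

// InputData is marshaled to JSON and sent to the scoring URI; the data
// field holds the 4D tensor in BCHW order.
type InputData struct {
	Data [1][3][224][224]uint8 `json:"data"`
}

// OutputData holds the response: the softmax values and the inference time.
type OutputData struct {
	Result [][]float32 `json:"result"`
	Time   float64     `json:"time"`
}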

Note that this model expects pictures to be scaled to 224 by 224 as reflected by the height and width dimensions of the uint8 array. The rest of the code is summarized below:

  • read the image; the path of the image is passed to the code via the -image command line parameter
  • the image is resized with the github.com/disintegration/imaging package (linear method)
  • the 4D tensor is populated by iterating over all pixels of the image, extracting r, g and b and placing them in the BCHW array; note that the r, g and b values are 16-bit and are scaled to fit in a uint8
  • construct the input which is a struct of type InputData
  • marshal the InputData struct to JSON
  • POST the JSON to the local scoring URI (see the sketch after this list)
  • read the HTTP response and unmarshal the response in a struct of type OutputData
  • find the highest probability in the result and note the index where it was found
  • read the 1000 ImageNet categories from imagenet_class_index.json and marshal the JSON into a map of string arrays
  • print the category using the index with the highest probability and the map
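Putting the marshal, POST and unmarshal steps together, the scoring call itself could be sketched as follows (error handling abbreviated; the InputData and OutputData structs are the ones shown earlier):

import (
	"bytes"
	"encoding/json"
	"net/http"
)

// score marshals the input tensor to JSON, posts it to the local scoring URI
// and decodes the JSON response into an OutputData struct.
func score(input InputData) (OutputData, error) {
	var out OutputData

	payload, err := json.Marshal(input)
	if err != nil {
		return out, err
	}

	resp, err := http.Post("http://localhost:5001/score", "application/json",
		bytes.NewReader(payload))
	if err != nil {
		return out, err
	}
	defer resp.Body.Close()

	err = json.NewDecoder(resp.Body).Decode(&out)
	return out, err
}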

What happens when we score the image below?

What is this thing?

Running the code gives the following result:

$ ./class -image images/cassette.jpg

Highest prob is 0.9981583952903748 at 481 (inference time: 0.3309464454650879 )
Probably [n02978881 cassette

The inference time is 1/3 of a second on my older Linux laptop with a dual-core i7.

Try it yourself by running the container and the class program. Download it from here (Linux).

Creating a GPU container image for scoring with Azure Machine Learning

In a previous post, I discussed how you can add an existing Kubernetes cluster to an Azure Machine Learning workspace. Adding an existing cluster is necessary when the workspace does not support auto-creation of a cluster. That is the case when you want to use the Standard_NC6s_v3 virtual machine size. I also used a container for scoring pictures with the ResNet50v2 model from the ONNX Model Zoo. Now we will take a look at actually creating that container image with GPU support. Note that in many cases, inference with CPUs is more than sufficient, but the GPU case is more interesting to look at!

To get started, you need an Azure subscription with an Azure Machine Learning workspace. Take a look here for instructions.

Once you have a workspace, there are a few steps to take. If you look at the diagram at the top of this post, we will perform the steps starting from Register and manage your model:

  • Register model: we will add the ResNet50v2 model from the ONNX Model Zoo; we are using this existing model instead of our own; ResNet50v2 can recognize pictures in 1000 categories
  • Create container image: from the model in the workspace, we create a container image with GPU support
  • Deploy container image: from the image in the workspace, we deploy the image to compute that supports GPUs

Machine Learning SDK

The Azure Machine Learning service has a Machine Learning SDK for Python. All the steps discussed above can be performed with code. You can find an example of the Python code to use in the following Jupyter notebook hosted on Azure Notebooks: https://gebaml-geba.notebooks.azure.com/j/notebooks/ONNXResnet.ipynb. Note that the Azure Notebooks service is still in preview and a bit rough around the edges. The Machine Learning SDK is available by default in Azure Notebooks.

At the beginning of the notebook, we import azureml.core which allows you to check the version of the SDK (among other things):

Registering the model

First, we download the model to the notebook project. In the notebook, the urllib module is used to download the compressed version of the ResNet50v2 model. The tarball is untarred in resnet50v2/resnet50v2.onnx. Think of the model as a complex function with, in this case, millions of parameters (weights). The inputs to the function are the pixels of your picture (their red, green and blue values). The output of the function is a category: cat, guitar, …

Now that we have the model, we need to add it to the workspace, which means we also have to authenticate. Create a file called config.json with the following contents:

{
  "subscription_id": "your Azure subscription ID",
  "resource_group": "your Azure ML resource group",
  "workspace_name": "your Azure ML workspace name"
}

With the Workspace class from azureml.core we authenticate to Azure and grab a reference to the workspace with the ws variable. The Workspace.from_config() function searches for the config.json file.

Now we can finally register the model in the workspace using Model.register:

The above is the same as adding a model using the Azure Portal. You might hit file upload limits in the portal so adding the model via code is the better approach. Your model is now registered in the workspace:

Creating a GPU container image from the model

Now that we have the model, we can create the container image. The model will be included in the image which will add about 100MB to its size. The container image in Azure Machine Learning is created from four settings/artifacts:

  • model: registered in the workspace
  • score file: a file score.py with an init() and run() function; helper functions can also be included
  • dependency file: used to indicate the Python modules that need to be installed in the image (see https://conda.io/docs/)
  • GPU support: set to True or False

You will find the score file in the notebook. It was copied from a Microsoft supplied sample. If you do not have some experience with Machine Learning and neural networks (in this case), it will be difficult to create this from scratch. The ResNet50v2 model expects a 4-dimensional tensor with the following dimensions:

  • 0: batch (1 when you send 1 image)
  • 1: channels (3 channels for red, green and blue; RGB)
  • 2: height (224 pixels)
  • 3: width (224 pixels)

For inference, you will actually send the above data in a JSON payload as the data field. The preprocess() function in score.py grabs the data field and converts it to a NumPy array. The data is then normalized by dividing each pixel by 255, subtracting the mean values (of each channel) and dividing by the standard deviation (of each channel). The normalized data is then sent to the model which outputs an array with 1000 probabilities that sum to 1 (via a softmax function).

Why are there a thousand probabilities? The model was trained on a thousand different categories of images and for each of these categories, a probability is output. After inference we will need a list of these categories so we can find the one that matches with our uploaded image and that has the highest probability!

This particular score.py file uses the ONNX runtime for inference. To enable GPU support, make sure you include the onnxruntime-gpu package in your conda dependencies as shown below:

With score.py and myenv.yml, the container image with GPU support can be created. Note that we are specifying the score.py file, the conda file and the model. GPU support is enabled as well via enable_gpu=True.

The code above should result in the following image in your workspace (after several minutes of building):

In the background, this image is stored in the container registry that got created when you deployed the Azure Machine Learning workspace. You are now ready for the third step, deploying the image to compute that supports GPUs (for instance Kubernetes). That step, together with some code to actually recognize images, will be for another post. In that post, we will also compare CPU to GPU speed.

Conclusion

In this post, we looked at creating a scoring (inference) container image with GPU support. Instead of creating and using our own model, we used the ResNet50v2 model from the ONNX Model Zoo. The model file, together with a score.py file and conda dependency file was used to build a container image. Azure Machine Learning builds the container image for you and stores it in a container registry. Although Azure Machine Learning takes care of most of the infrastructure work, you still need to know how to write the scoring file. In this post, the scoring file uses the ONNX runtime but you can use other runtimes or frameworks such as TensorFlow or MXNET.


Deploying Azure Cognitive Services Containers with IoT Edge

Introduction

Azure Cognitive Services is a collection of APIs that make your applications smarter. Some of those APIs are listed below:

  • Vision: image classification, face detection (including emotions), OCR
  • Language: text analytics (e.g. key phrase or sentiment analysis), language detection and translation

To use one of the APIs you need to provision it in an Azure subscription. After provisioning, you will get an endpoint and API key. Every time you want to classify an image or detect sentiment in a piece of text, you will need to post an appropriate payload to the cloud endpoint and pass along the API key as well.

What if you want to use these services but you do not want to pass your payload to a cloud endpoint for compliance or latency reasons? In that case, the Cognitive Services containers can be used. In this post, we will take a look at the Text Analytics containers, specifically the one for Sentiment Analysis. Instead of deploying the container manually, we will deploy the container with IoT Edge.

IoT Edge Configuration

To get started, create an IoT Hub. The free tier will do just fine. When the IoT Hub is created, create an IoT Edge device. Next, configure your actual edge device to connect to IoT Hub with the connection string of the device you created in IoT Hub. Microsoft has a great tutorial to do all of the above, using a virtual machine in Azure as the edge device. The tutorial I linked to is the one for an edge device running Linux. When finished, the device should report its status to IoT Hub:

If you want to install IoT Edge on an existing device like a laptop, follow the procedure for Linux x64.

Once you have your edge device up and running, you can use the following command to obtain the status of your edge device: sudo systemctl status iotedge. The result:

Deploy Sentiment Analysis container

With the IoT Edge daemon up and running, we can deploy the Sentiment Analysis container. In IoT Hub, select your IoT Edge device and select Set modules:

In Set Modules you have the ability to configure the modules for this specific device. Modules are always deployed as containers and they do not have to be specifically designed or developed for use with IoT Edge. In the three step wizard, add the Sentiment Analysis container in the first step. Click Add and then select IoT Edge Module. Provide the following settings:

Although the container image can be pulled freely from the Image URI, the container needs to be configured with billing info and an API key. In the Billing environment variable, specify the endpoint URL for the API you provisioned in the cloud. In ApiKey, set your API key. Note that the container always needs to be connected to the cloud to verify that you are allowed to use the service. Remember that although your payload is not sent to the cloud, your container usage is. The full container create options are listed below:

{
  "Env": [
    "Eula=accept",
    "Billing=https://westeurope.api.cognitive.microsoft.com/text/analytics/v2.0",
    "ApiKey=<yourKEY>"
  ],
  "HostConfig": {
    "PortBindings": {
      "5000/tcp": [
        {
          "HostPort": "5000"
        }
      ]
    }
  }
}

In HostConfig we ask the container runtime (Docker) to map port 5000 of the container to port 5000 of the host. You can specify other create options as well.

On the next page, you can configure routing between IoT Edge modules. Because we do not use actual IoT Edge modules, leave the configuration as shown below:

Now move to the last page in the Set Modules wizard to review the configuration and click Submit.

Give the deployment some time to finish. After a while, check your edge device with the following command: sudo iotedge list. Your TextAnalytics container should be listed. Alternatively, use sudo docker ps to list the Docker containers on your edge device.

Testing the Sentiment Analysis container

If everything went well, you should be able to go to http://localhost:5000/swagger to see the available endpoints. Open Sentiment Analysis to try out a sample:

You can use curl to test as well:

curl -X POST "http://localhost:5000/text/analytics/v2.0/sentiment" -H  "accept: application/json" -H  "Content-Type: application/json-patch+json" -d "{  \"documents\": [    {      \"language\": \"en\",      \"id\": \"1\",      \"text\": \"I really really despise this product!! DO NOT BUY!!\"    }  ]}"

As you can see, the API expects a JSON payload with a documents array. Each document object has three fields: language, id and text. When you run the above command, the result is:

{"documents":[{"id":"1","score":0.0001703798770904541}],"errors":[]}

In this case, the text I really really despise this product!! DO NOT BUY!! clearly results in a very bad score. As you might have guessed, 0 is the absolute worst and 1 is the absolute best.

Just for fun, I created a small Go program to test the API:
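A minimal sketch of such a client (not the full program from the repository below) could look like this; it sends the same payload as the curl example above:

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log"
	"net/http"
)

// Document and SentimentRequest mirror the JSON payload from the curl example.
type Document struct {
	Language string `json:"language"`
	ID       string `json:"id"`
	Text     string `json:"text"`
}

type SentimentRequest struct {
	Documents []Document `json:"documents"`
}

// SentimentResponse captures the fields we care about from the API response.
type SentimentResponse struct {
	Documents []struct {
		ID    string  `json:"id"`
		Score float64 `json:"score"`
	} `json:"documents"`
}

func main() {
	req := SentimentRequest{
		Documents: []Document{
			{Language: "en", ID: "1", Text: "I really really despise this product!! DO NOT BUY!!"},
		},
	}
	payload, _ := json.Marshal(req)

	resp, err := http.Post("http://localhost:5000/text/analytics/v2.0/sentiment",
		"application/json", bytes.NewReader(payload))
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	var result SentimentResponse
	if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
		log.Fatal(err)
	}
	for _, doc := range result.Documents {
		fmt.Printf("Document %s: sentiment score %f\n", doc.ID, doc.Score)
	}
}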

The Go program can be found here: https://github.com/gbaeke/sentiment. You can download the executable for Linux with: wget https://github.com/gbaeke/sentiment/releases/download/v0.1/ta. Make ta executable and use ./ta --help for help with the parameters.

Summary

IoT Edge is a great way to deploy containers to edge devices running Linux or Windows. Besides deploying actual IoT Edge modules, you can deploy any container you want. In this post, we deployed a Cognitive Services container that does Sentiment Analysis at the edge.

Deploying Azure resources using webhookd

In the previous blog post, I discussed adding SSL to webhookd. In this post, I will briefly show how to use this solution to deploy Azure resources.

To run webhookd, I deployed a small Standard_B1s machine (1GB RAM, 1 vCPU) with a system assigned managed identity. After deployment, information about the managed identity is available via the Identity link.

Code running on a machine with a managed identity needs to do something specific to obtain a token for that identity. With curl, you would issue the following command:

curl 'http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https%3A%2F%2Fmanagement.azure.com%2F' -H Metadata:true -s

The response would be JSON that contains a field called access_token. You could parse out the access_token and then use the token in a call to the Azure Resource Manager APIs. You would use the token in the authorization header. Full details about acquiring these tokens can be found here. On that page, you will find details about acquiring the token with Go, JavaScript and several other languages.
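For illustration, a Go sketch of acquiring such a token from the instance metadata endpoint could look like this (field names follow the documented token response):

import (
	"encoding/json"
	"net/http"
)

// tokenResponse captures the access_token field of the metadata response.
type tokenResponse struct {
	AccessToken string `json:"access_token"`
}

// getToken requests a token for the system-assigned managed identity from
// the instance metadata endpoint.
func getToken() (string, error) {
	url := "http://169.254.169.254/metadata/identity/oauth2/token" +
		"?api-version=2018-02-01&resource=https%3A%2F%2Fmanagement.azure.com%2F"

	req, err := http.NewRequest("GET", url, nil)
	if err != nil {
		return "", err
	}
	req.Header.Set("Metadata", "true")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	var tok tokenResponse
	if err := json.NewDecoder(resp.Body).Decode(&tok); err != nil {
		return "", err
	}
	// Use the token in the Authorization header of ARM calls:
	// Authorization: Bearer <access_token>
	return tok.AccessToken, nil
}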

Because we are using webhookd and shell scripts, the Azure CLI is the ideal way to create Azure resources. The Azure CLI can easily authenticate with the managed identity using a simple command: az login --identity. Here’s a shell script that uses it to create a virtual machine:

#!/bin/bash
echo "Authenticating...`az login --identity`"

echo "Creating the resource group...`az group create -n $rg -l westeurope`"

echo "Creating the vm...`az vm create --no-wait --size Standard_B1s --resource-group $rg --name $vmname --image win2016datacenter --admin-username azureuser --admin-password $pw`"

The script expects three parameters: rg, vmname and pw. We can pass these parameters as HTTP query parameters. If the above script were in the ./scripts/vm folder as create.sh, I could make the following call to webhookd:

curl --user api -XPOST "https://<public_server_dns>/vm/create?vmname=myvm&rg=myrg&pw=Abcdefg$$$$!!!!" 

The response to the above call would contain the output from the three az commands. The az login command would output the following:

data: {
data:   "environmentName": "AzureCloud",
data:   "id": "<id>",
data:   "isDefault": true,
data:   "name": "<subscription name>",
data:   "state": "Enabled",
data:   "tenantId": "<tenant_id>",
data:   "user": {
data:     "assignedIdentityInfo": "MSI",
data:     "name": "systemAssignedIdentity",
data:     "type": "servicePrincipal"
data:   }
data: }
Notice the user object, which clearly indicates we are using a system-assigned managed identity. In my case, the managed identity has the contributor role on an Azure subscription used for testing. With that role, the shell script has the required access rights to deploy the virtual machine.

As you can see, it is very easy to use webhookd to deploy Azure resources if the Azure virtual machine that runs webhookd has a managed identity with the required access rights.