Infrastructure as Code: exploring Pulumi

Image: from the Pulumi website

In my Twitter feed, I often come across Pulumi so I decided to try it out. Pulumi is an Infrastructure as Code solution that allows you to use familiar development languages such as JavaScript, Python and Go. The idea is that you define your infrastructure in the language that you prefer, versus some domain specific language. When ready, you merely use pulumi up to deploy your resources (and pulumi update, pulumi destroy, etc…). The screenshot below shows the deployment of an Azure resource group, storage account, file share and a container group on Azure Container Instances. The file share is mapped as a volume to one of the containers in the container group:

Deploying infrastructure with pulumi up

Installation is extremely straightforward. I chose to write the code in JavaScript as I had all the tools already installed on my Windows box. It is also more polished than the Go option (for now). I installed Pulumi per their instructions over at https://pulumi.io/quickstart/install.html.

Next, I used their cloud console to create a new project. Eventually, you will need to run a pulumi new command on your local machine. The cloud console will provide you with the command to use which is handy when you are just getting started. The cloud console provides a great overview of all your activities:

Nice and green (because I did not include the failed ones 😉)

In Resources, you can obtain a graph of the deployed resources:

Don’t you just love pretty graphs like this?

Let’s take a look at the code. The complete code is in the following gist: https://gist.github.com/gbaeke/30ae42dd10836881e7d5410743e4897c.

Resource group, storage account and share

The above code creates the resource group, storage account and file share. It is so straightforward that there is no need to explain it, especially if you know how it works with ARM. The simplicity of just referring to properties of resources you just created is awesome!

Next, we create a container group with two containers:

Creating the container group

If you have ever created a container group with a YAML file or ARM template, the above code will be very familiar. It defines a DNS label for the group and sets the type to Linux (ACI also supports Windows). Then two containers are added. The realtime-go container uses CertMagic to obtain Let’s Encrypt certificates. The certificates should be stored in persistent storage and that is what the Azure File Share is used for. It is mounted on /.local/share/certmagic because that is where the files will be placed in a scratch container.

I did run into a small issue with the container group. The realtime-go container should expose both port 80 and 443 but the port setting is a single numeric value. In YAML or ARM, multiple ports can be specified which makes total sense. Pulumi has another cross-cloud option to deploy containers which might do the trick.

All in all, I am pleasantly surprised with Pulumi. It’s definitely worth a more in-depth investigation!

Detecting emotions with FER+

In an earlier post, I discussed classifying images with the ResNet50v2 model. Azure Machine Learning Service was used to create a container image that used the ONNX ResNet50v2 model and the ONNX Runtime for scoring.

Continuing on that theme, I created a container image that uses the ONNX FER+ model that can detect emotions in an image. The container image also uses the ONNX Runtime for scoring.

You might wonder why you would want to detect emotions this way when there are many services available that can do this for you with a simple API call! You could use Microsoft’s Face API or Amazon’s Rekognition for example. While those services are easy to use and provide additional features, they do come at a cost. If all you need is basic detection of emotions, using this FER+ container is sufficient and cost effective.

Azure Face API (image from Microsoft website)

A notebook to create the image and deploy a container to Azure Container Instances (ACI) can be found here. The notebook uses the Azure Machine Learning SDK to register the model to an Azure Machine Learning workspace, build a container image from that model and deploy the container to ACI. The scoring script score.py is shown below.

score.py

The model expects an 64×64 gray scale image of a face in an array with the following dimensions: [1][1][64][64]. The output is JSON with a results array that contains the probabilities for each emotion and a time field with the inference time.

The emotion probabilities are in this order:

0: "neutral", 1: "happy", 2: "surprise", 3: "sadness", 4: "anger", 5: "disgust", 6: "fear", 7: "contempt

To actually capture the emotions, I wrote a small demo program in Go that uses OpenCV (via GoCV). You can find it on GitHub: https://github.com/gbaeke/emotion. You will need to install OpenCV and GoCV. Find the instructions here: https://gocv.io/getting-started/linux/. There are similar instructions for Mac and Windows but I have not tried those

The program is still a little rough around the edges but it does the trick. The scoring URI is hard coded to http://localhost:5002/score. With Docker installed, use the following command to install the scoring container:

 docker run -d -p 5002:5001 gbaeke/onnxferplus

Have fun with it!

ResNet50v2 classification in Go with a local container

To quickly go to the code, go here. Otherwise, keep reading…

In a previous blog post, I wrote about classifying images with the ResNet50v2 model from the ONNX Model Zoo. In that post, the container ran on a Kubernetes cluster with GPU nodes. The nodes had an NVIDIA v100 GPU. The actual classification was done with a simple Python script with help from Keras and Numpy. Each inference took around 25 milliseconds.

In this post, we will do two things:

  • run the scoring container (CPU) on a local machine that runs Docker
  • perform the scoring (classification) in Go

Installing the scoring container locally

I pushed the scoring container with the ONNX ResNet50v2 image to the following location: https://cloud.docker.com/u/gbaeke/repository/docker/gbaeke/onnxresnet50v2. Run the container with the following command:

docker run -d -p 5001:5001 gbaeke/onnxresnet50

The container will be pulled and started. The scoring URI is on http://localhost:5001/score.

Note that in the previous post, Azure Machine Learning deployed two containers: the scoring container (the one described above) and a front-end container. In that scenario, the front-end container handles the HTTP POST requests (optionally with SSL) and route the request to the actual scoring container.

The scoring container accepts the same payload as the front-end container. That means it can be used on its own, as we are doing now.

Note that you can also use IoT Edge, as explained in an earlier post. That actually shows how easy it is to push AI models to the edge and use them locally, befitting your business case.

Scoring with Go

To actually classify images, I wrote a small Go program to do just that. Although there are some scientific libraries for Go, they are not really needed in this case. That means we do have to create the 4D tensor payload and interpret the softmax result manually. If you check the code, you will see that is not awfully difficult.

The code can be found in the following GitHub repository: https://github.com/gbaeke/resnet-score.

Remember that this model expects the input as a 4D tensor with the following dimensions:

  • dimension 0: batch (we only send one image here)
  • dimension 1: channels (one for each; RGB)
  • dimension 2: height
  • dimension 3: width

The 4D tensor needs to be serialized to JSON in a field called data. We send that data with HTTP POST to the scoring URI at http://localhost:5001/score.

The response from the container will be JSON with two fields: a result field with the 1000 softmax values and a time field with the inference time. We can use the following two structs for marshaling and unmarshaling

Input and output of the model

Note that this model expects pictures to be scaled to 224 by 224 as reflected by the height and width dimensions of the uint8 array. The rest of the code is summarized below:

  • read the image; the path of the image is passed to the code via the -image command line parameter
  • the image is resized with the github.com/disintegration/imaging package (linear method)
  • the 4D tensor is populated by iterating over all pixels of the image, extracting r,g and b and placing them in the BCHW array; note that the r,g and b values are uint16 and scaled to fit in a uint8
  • construct the input which is a struct of type InputData
  • marshal the InputData struct to JSON
  • POST the JSON to the local scoring URI
  • read the HTTP response and unmarshal the response in a struct of type OutputData
  • find the highest probability in the result and note the index where it was found
  • read the 1000 ImageNet categories from imagenet_class_index.json and marshal the JSON into a map of string arrays
  • print the category using the index with the highest probability and the map

What happens when we score the image below?

What is this thing?

Running the code gives the following result:

$ ./class -image images/cassette.jpg

Highest prob is 0.9981583952903748 at 481 (inference time: 0.3309464454650879 )
Probably [n02978881 cassette

The inference time is 1/3 of a second on my older Linux laptop with a dual-core i7.

Try it yourself by running the container and the class program. Download it from here (Linux).

Creating a GPU container image for scoring with Azure Machine Learning

In a previous post, I discussed how you can add an existing Kubernetes cluster to an Azure Machine Learning workspace. Adding an existing cluster is necessary when the workspace does not support auto creation of a cluster. That is the case when you want to use the Standard_NC6s_v3 virtual machine image. I also used a container for scoring pictures with the ResNet50v2 model from the ONNX Model Zoo. Now we will take a look at actually creating that container image with GPU support. Note that in many cases, inference with CPUs is more than sufficient but the GPU case is more interesting to look at!

To get started, you need an Azure subscription with an Azure Machine Learning workspace. Take a look here for instructions.

Once you have a workspace, there are a few steps to take. If you look at the diagram at the top of this post, we will perform the steps starting from Register and manage your model:

  • Register model: we will add the Resnet50v2 model from the ONNX Model Zoo; we are using this existing model instead of our own; ResNet50v2 can recognize pictures in 1000 categories
  • Create container image: from the model in the workspace, we create a container image with GPU support
  • Deploy container image: from the image in the workspace, we deploy the image to compute that supports GPUs

Machine Learning SDK

The Azure Machine Learning service has a Machine Learning SDK for Python. All the steps discussed above can be performed with code. You can find an example of the Python code to use in the following Jupyter notebook hosted on Azure Notebooks: https://gebaml-geba.notebooks.azure.com/j/notebooks/ONNXResnet.ipynb. Note that the Azure Notebooks service is still in preview and a bit rough around the edges. The Machine Learning SDK is available by default in Azure Notebooks.

At the beginning of the notebook, we import azureml.core which allows you to check the version of the SDK (among other things):

Registering the model

First, we download the model to the notebook project. In the notebook, the urllib module is used to download the compressed version of the ResNet50v2 model. The tarball is untarred in resnet50v2/resnet50v2.onnx. You should see the model as a complex function with, in this case, millions of parameters (weights). The input to the function are the pixels of your picture (their red, green and blue values). The output of the function is a category: cat, guitar, …

Now that we have the model, we need to add it to the workspace, which means we also have to authenticate. Create a file called config.json with the following contents:

{
"subscription_id": "your Azure subscription ID", "resource_group": "your Azure ML resource group",
"workspace_name": "your Azure ML workspace name"
}

With the Workspace class from azureml.core we authenticate to Azure and grab a reference to the workspace with the ws variable. The Workspace.from_config() function searches for the config.json file.

Now we can finally register the model in the workspace using Model.register:

The above is the same as adding a model using the Azure Portal. You might hit file upload limits in the portal so adding the model via code is the better approach. Your model is now registered in the workspace:

Creating a GPU container image from the model

Now that we have the model, we can create the container image. The model will be included in the image which will add about 100MB to its size. The container image in Azure Machine Learning is created from four settings/artifacts:

  • model: registered in the workspace
  • score file: a file score.py with an init() and run() function; helper functions can also be included
  • dependency file: used to indicate the Python modules that need to be installed in the image (see https://conda.io/docs/)
  • GPU support: set to True or False

You will find the score file in the notebook. It was copied from a Microsoft supplied sample. If you do not have some experience with Machine Learning and neural networks (in this case), it will be difficult to create this from scratch. The ResNet50v2 model expects a 4-dimensional tensor with the following dimensions:

  • 0: batch (1 when you send 1 image)
  • 1: channels (3 channels for red, green and blue; RGB)
  • 2: height (224 pixels)
  • 3: width (224 pixels)

For inference, you will actually send the above data in a JSON payload as the data field. The preprocess() function in score.py grabs the data field and converts it to a NumPy array. The data is then normalized by dividing each pixel by 255, subtracting the mean values (of each channel) and dividing by the standard deviation (of each channel) . The normalized data is then sent to the model which outputs an array with 1000 probabilities that sum to 1 (via a softmax function).

Why are there a thousand probabilities? The model was trained on a thousand different categories of images and for each of these categories, a probability is output. After inference we will need a list of these categories so we can find the one that matches with our uploaded image and that has the highest probability!

This particular score.py file uses the ONNX runtime for inference. To enable GPU support, make sure you include the onnxruntime-gpu package in your conda dependencies as shown below:

With score.py and myenv.yml, the container image with GPU support can be created. Note that we are specifying the score.py file, the conda file and the model. GPU support is enabled as well via enable_gpu=True.

The code above should result in the following image in your workspace (after several minutes of building):

In the background, this image is stored in the container registry that got created when you deployed the Azure Machine Learning workspace. You are now ready for the third step, deploying the image to compute that supports GPUs (for instance Kubernetes). That step, together with some code to actually recognize images, will be for another post. In that post, we will also compare CPU to GPU speed.

Conclusion

In this post, we looked at creating a scoring (inference) container image with GPU support. Instead of creating and using our own model, we used the ResNet50v2 model from the ONNX Model Zoo. The model file, together with a score.py file and conda dependency file was used to build a container image. Azure Machine Learning builds the container image for you and stores it in a container registry. Although Azure Machine Learning takes care of most of the infrastructure work, you still need to know how to write the scoring file. In this post, the scoring file uses the ONNX runtime but you can use other runtimes or frameworks such as TensorFlow or MXNET.


Attaching Kubernetes clusters with NVIDIA V100 GPUs to Azure Machine Learning Service

Azure Machine Learning Service allows you to easily deploy compute for training and inference via a machine learning workspace. Although one of the compute types is Kubernetes, the workspace is a bit picky about the node VM sizes. I wanted to use two Standard_NC6s_v3 instances with NVIDIA Tesla V100 GPUs but that was not allowed. Other GPU instances, such as the Standard_NC6 type (K80 GPU) can be deployed from the workspace.

Luckily, you can deploy clusters on your own and then attach the cluster to your Azure Machine Learning workspace. You can create the cluster with the below command. Make sure you ask for a quota increase that allows 12 cores of Standard_NC6s_v3.

az aks create -g RESOURCE_GROUP --generate-ssh-keys --node-vm-size Standard_NC6s_v3 --node-count 2 --disable-rbac --name NAME --admin-username azureuser --kubernetes-version 1.11.5

Before I ran the above command, I created an Azure Machine Learning workspace to a resource group called ml-rg. The above command was run with RESOURCE_GROUP set to ml-rg and NAME set to mlkub. After a few minutes, you should have your cluster up and running. Be mindful of the price of this cluster. GPU instances are not cheap!

Now we can Add Compute to the workspace. In your workspace, navigate to Compute and use the + Add Compute button. Complete the form as below. The compute name does not need to match the cluster name.

After a while, the Kubernetes cluster should be attached:

Manually deployed cluster attached

Note that detaching a cluster does not remove it. Be sure to remove the cluster manually!

You can now deploy container images to the cluster that take advantage of the GPU of each node. When you a deploy an image marked as a GPU image, Azure Machine Learning takes care of all the parameters that allow your container to use the GPU on the Kubernetes node.

The screenshot below shows a deployment of an image that can be used for inference. It uses an ONNX ResNet50v2 model.

Deployment of container for scoring (inference; ResNet50v2)

With the below picture of a cat, the model used by the container guesses it is an Egyptian Cat (it’s not but it is close) with close to 94% certainty.

Egyptian Cat (not)

Using your own compute with the Azure Machine Learning service is very easy to do. The more interesting and somewhat more complicated parts such as the creation of the inference container that supports GPUs is something I will discuss in a later post. In a follow-up post, I will also discuss how you send image data to the scoring container.

Deploying Azure Cognitive Services Containers with IoT Edge

Introduction

Azure Cognitive Services is a collection of APIs that make your applications smarter. Some of those APIs are listed below:

  • Vision: image classification, face detection (including emotions), OCR
  • Language: text analytics (e.g. key phrase or sentiment analysis), language detection and translation

To use one of the APIs you need to provision it in an Azure subscription. After provisioning, you will get an endpoint and API key. Every time you want to classify an image or detect sentiment in a piece of text, you will need to post an appropriate payload to the cloud endpoint and pass along the API key as well.

What if you want to use these services but you do not want to pass your payload to a cloud endpoint for compliance or latency reasons? In that case, the Cognitive Services containers can be used. In this post, we will take a look at the Text Analytics containers, specifically the one for Sentiment Analysis. Instead of deploying the container manually, we will deploy the container with IoT Edge.

IoT Edge Configuration

To get started, create an IoT Hub. The free tier will do just fine. When the IoT Hub is created, create an IoT Edge device. Next, configure your actual edge device to connect to IoT Hub with the connection string of the device you created in IoT Hub. Microsoft have a great tutorial to do all of the above, using a virtual machine in Azure as the edge device. The tutorial I linked to is the one for an edge device running Linux. When finished, the device should report its status to IoT Hub:

If you want to install IoT Edge on an existing device like a laptop, follow the procedure for Linux x64.

Once you have your edge device up and running, you can use the following command to obtain the status of your edge device: sudo systemctl status iotedge. The result:

Deploy Sentiment Analysis container

With the IoT Edge daemon up and running, we can deploy the Sentiment Analysis container. In IoT Hub, select your IoT Edge device and select Set modules:

In Set Modules you have the ability to configure the modules for this specific device. Modules are always deployed as containers and they do not have to be specifically designed or developed for use with IoT Edge. In the three step wizard, add the Sentiment Analysis container in the first step. Click Add and then select IoT Edge Module. Provide the following settings:

Although the container can freely be pulled from the Image URI, the container needs to be configured with billing info and an API key. In the Billing environment variable, specify the endpoint URL for the API you configured in the cloud. In ApiKey set your API key. Note that the container always needs to be connected to the cloud to verify that you are allowed to use the service. Remember that although your payload is not sent to the cloud, your container usage is. The full container create options are listed below:

{
"Env": [
"Eula=accept",
"Billing=https://westeurope.api.cognitive.microsoft.com/text/analytics/v2.0",
"ApiKey=<yourKEY>"
],
"HostConfig": {
"PortBindings": {
"5000/tcp": [
{
"HostPort": "5000"
}
]
}
}
}

In HostConfig we ask the container runtime (Docker) to map port 5000 of the container to port 5000 of the host. You can specify other create options as well.

On the next page, you can configure routing between IoT Edge modules. Because we do not use actual IoT Edge modules, leave the configuration as shown below:

Now move to the last page in the Set Modules wizard to review the configuration and click Submit.

Give the deployment some time to finish. After a while, check your edge device with the following command: sudo iotedge list. Your TextAnalytics container should be listed. Alternatively, use sudo docker ps to list the Docker containers on your edge device.

Testing the Sentiment Analysis container

If everything went well, you should be able to go to http://localhost:5000/swagger to see the available endpoints. Open Sentiment Analysis to try out a sample:

You can use curl to test as well:

curl -X POST "http://localhost:5000/text/analytics/v2.0/sentiment" -H  "accept: application/json" -H  "Content-Type: application/json-patch+json" -d "{  \"documents\": [    {      \"language\": \"en\",      \"id\": \"1\",      \"text\": \"I really really despise this product!! DO NOT BUY!!\"    }  ]}"

As you can see, the API expects a JSON payload with a documents array. Each document object has three fields: language, id and text. When you run the above command, the result is:

{"documents":[{"id":"1","score":0.0001703798770904541}],"errors":[]}

In this case, the text I really really despise this product!! DO NOT BUY!! clearly results in a very bad score. As you might have guessed, 0 is the absolute worst and 1 is the absolute best.

Just for fun, I created a small Go program to test the API:

The Go program can be found here: https://github.com/gbaeke/sentiment. You can download the executable for Linux with: wget https://github.com/gbaeke/sentiment/releases/download/v0.1/ta. Make ta executable and use ./ta –help for help with the parameters.

Summary

IoT Edge is a great way to deploy containers to edge devices running Linux or Windows. Besides deploying actual IoT Edge modules, you can deploy any container you want. In this post, we deployed a Cognitive Services container that does Sentiment Analysis at the edge.

Draft: a simpler way to deploy to Kubernetes during development

If you work with containers and work with Kubernetes, Draft makes it easier to deploy your code while you are in the earlier development stages. You use Draft while you are working on your code but before you commit it to version control. The idea is simple:

  • You have some code written in something like Node.js, Go or another supported language
  • You then use draft create to containerize the application based on Draft packs; several packs come with the tool and provide a Dockerfile and a Helm chart depending on the development language
  • You then use draft up to deploy the application to Kubernetes; the application is made accessible via a public URL

Let’s demonstrate how Draft is used, based on a simple Go application that is just a bit more complex than the Go example that comes with Draft. I will use the go-data service that I blogged about earlier. You can find the source code on GitHub. The go-data service is a very simple REST API. By calling the endpoint /data/{deviceid}, it will check if a “device” exists and then actually return no data. Hey, it’s just a sample! The service uses the Gorilla router but also Go Micro to call a device service running in the Kubernetes cluster. If the device service does not run, the data service will just report that the device does not exist.

Note that this post does not cover how to install Draft and its prerequisites like Helm and a Kubernetes Ingress Controller. You will also need a Kubernetes cluster (I used Azure ACS) and a container registry (I used Docker Hub). I installed all client-side components in the Windows 10 Linux shell which works great!

The only thing you need on your development box that has Helm and Draft installed is main.go and an empty glide.yaml file. The first command to run is draft create

This results in several files and folders being created, based on the Golang Draft pack. Draft detected you used Go because of glide.yaml. No Docker container is created at this point.

  • Dockerfile: a simple Dockerfile that builds an image based on the golang:onbuild image
  • draft.toml: the Draft configuration file that contains the name of the application (set randomly), the namespace to deploy to and if the folder needs to be watched for changes after you do draft up
  • chart folder: contains the Helm chart for your application; you might need to make changes here if you want to modify the Kubernetes deployment as we will do soon

When you deploy, Draft will do several things. It will package up the chart and your code and send it to the Draft server-side component running in Kubernetes. It will then instruct Draft to build your container, push it to a configured registry and then install the application in Kubernetes. All those tasks are performed by the Draft server component, not your client!

In my case, after running draft up, I get the following on my prompt (after the build, push and deploy steps):

image

In my case, the name of the application was set to exacerbated-ragdoll (in draft.toml). Part of what makes Draft so great is that it then makes the service available using that name and the configured domain. That works because of the following:

  • During installation of Draft, you need to configure an Ingress Controller in Kubernetes; you can use a Helm chart to make that easy; the Ingress Controller does the magic of mapping the incoming request to the correct application
  • When you configure Draft for the first time with draft init you can pass the domain (in my case baeke.info); this requires a wildcard A record (e.g. *.baeke.info) that points to the public IP of the Ingress Controller; note that in my case, I used Azure Container Services which makes that IP the public IP of an Azure load balancer that load balances traffic between the Ingress Controller instances (ngnix)

So, with only my source code and a few simple commands, the application was deployed to Kubernetes and made available on the Internet! There is only one small problem here. If you check my source code, you will see that there is no route for /. The Draft pack for Golang includes a livenessProbe on / and a readinessProbe on /. The probes are in deployment.yaml which is the file that defines the Kubernetes deployment. You will need to change the path in livenessProbe and readinessProbe to point to /data/device like so:

- containerPort: {{ .Values.service.internalPort }}
livenessProbe:
  httpGet:
   path: /data/device
   port: {{ .Values.service.internalPort }}
  readinessProbe:
   httpGet:
   path: /data/device
   port: {{ .Values.service.internalPort }}

If you already deployed the application but Draft is still watching the folder, you can simply make the above changes and save the deployment.yaml file (in chart/templates). The container will then be rebuilt and the deployment will be updated. When you now check the service with curl, you should get something like:

curl http://exacerbated-ragdoll.baeke.info/data/device1

Device active:  false
Oh and, no data for you!

To actually make the Go Micro features work, we will have to make another change to deployment.yaml. We will need to add an environment variable that instructs our code to find other services developed with Go Micro using the kubernetes registry:

- name: {{ .Chart.Name }}
  image: "{{ .Values.image.registry }}/{{ .Values.image.org }}/{{ .Values.image.name }}:{{ .Values.image.tag }}"
  imagePullPolicy: {{ .Values.image.pullPolicy }}
  env:
   - name: MICRO_REGISTRY
     value: kubernetes

To actually test this, use the following command to deploy the device service.

kubectl create -f https://raw.githubusercontent.com/gbaeke/go-device/master/go-device-dep.yaml

You can then check if it works by running the curl command again. It should now return the following:

Device active:  true
Oh and, no data for you!

Hopefully, you have seen how you can work with Draft from your development box and that you can modify the files generated by Draft to control how your application gets deployed. In our case, we had to modify the health checks to make sure the service can be reached. In addition, we had to add an environment variable because the code uses the Go Micro microservices framework.

Adaptable IoT

On May 24, 2017 I gave a short partner session at Techorama, a technology event in Belgium for both developers and IT Pros. You can find the slides on SlideShare:

Since it was a short session and a short slide deck, this post provides a bit more background information.

First, what do I mean with Adaptable IoT? Basically, an IoT solution should be adaptable at two levels:

  1. The IoT platform: use a platform that can be easily adapted to new conditions such as changed business needs or higher scaling requirements; a platform that allows you to plug in new services
  2. The application you write on the platform: use a flexible architecture that can easily be changed according to changing business needs; and no, that does not mean you have to use microservices

The presentation mainly focuses on the first point, which deals with the platform aspects that should be adaptable end-to-end at the following levels:

  • Devices and edge: devices should not be isolated in the field which means you should provide a two-way communication channel, a way to update firmware and write robust device code as a base requirement
  • Ingestion and management: with most platforms, the service used for ingestion of telemetry also provides management
  • Processing: the platform should be easy to extend with extra processing steps with limited impact on the existing processing pipeline
  • Storage: the platform should provide flexible storage options for both structured and unstructured data
  • Analytics: the platform should provide both descriptive and predictive analytics options that can be used to answer relevant business questions

Before continuing, note that this post focuses on Microsoft Azure with its Azure IoT Suite. The concepts laid out in this post can apply to other platforms as well!

Devices and Edge

There is a lot to say about devices and edge. What we see in the field is that most tend to think that the devices are the easy part. In fact, devices tend to be the most difficult part in an end-to-end IoT solution. Prototyping is easy because you can skip many of the hard parts you encounter in production:

  • Use Arduino or platforms such as particle.io: they are easy to use but do not give you full access to the underlying hardware and speed might be an issue
  • To demonstrate that it works, you can use simple and cheap sensors. But do they work in the long run? What about calibration?
  • You can use any library you find on the net but stability and accuracy might be an issue in production and even in the prototyping phase!
  • You can store secrets to connect to your back-end application directly in the sketch. In production however, you will need to store them securely.
  • Using TLS for secure connections is easy, provided the hardware and libraries support it. But what about certificate checks and expiry of root and leaf certificates?
  • You can just use WiFi because it is easy and convenient.

When you move to production and you want to create truly adaptable devices, you will need to think about several things:

  • Drop Arduino and move to C/C++ directly on the metal; heck, maybe you even have to throw in some assembler depending on the use case (though I hope not!); your focus should be on stability, speed and power usage.
  • Provide two-way communications so that devices can send telemetry and status messages to the back-end and the back-end can send messages back.
  • Make sure you can send messages to groups of devices (e.g. based on some query)
  • Provide a firmware update mechanism. Easier said than done!
  • Make sure the device is secure. Store secrets in a crypto chip.
  • Use stable and supported libraries such as the Azure IoT device SDK for C

Take into account that many devices will not be able to connect to your back-end directly, requiring a gateway at the edge. The edge should be adaptable as well, with options to do edge processing beyond merely relaying messages. What are some of those additional edge features?

  • Inference based on a machine learning algorithm trained in the cloud (e.g. anomaly detection)
  • Aggregation of data (e.g. stream processing with windowing)
  • Launch compute tasks based on conditions (e.g. launch an Azure Function when an anomaly is detected)

Ideally, the edge components are developed and tested in the cloud and then exported to the edge. Azure IoT Edge provides that functionality and uses containers to encapsulate the functionality described above.

Ingestion and management

The central service in the Azure IoT Suite for ingestion and management is Azure IoT Hub. It is highly scalable and makes your IoT solution adaptable by providing configuration and reporting mechanisms for devices. The figure below illustrates what is possible:

iothub

Device Twin functionality provides you with several options to make the solution adaptable and highly configurable:

  • From the back-end, you set desired properties that your devices can pick up. For instance, set a reporting interval to instruct the device to send telemetry more often
  • From the device, you send reported properties like battery status or available memory so you can act accordingly (e.g. send the user an alert to charge the device)
  • From the back-end, set tags to group devices (e.g. set the device location such as building, floor, room, etc…)

In a previous post, I already talked about setting desired properties with Device Twins and that today, you need to use the MQTT protocol to make this work. You can use the MQTT protocol directly or as part of one of the Azure Device SDKs where the protocol can simply be set as configuration.

The concept of jobs makes the solution even more adaptable since desired properties can be set on a group of devices using a query. By creating a query like ‘all devices where tag.building=buildingX’, you can set a desired property like the reporting interval on hundreds of devices at once.

Processing

The selected cloud platform should allow you to create an adaptable processing pipeline. With IoT Hub, the telemetry is made available to downstream components with a multi-consumer queue. An example is shown below:

processing

It is relatively easy to plug in new downstream components or modiy components. As an example, Microsoft recently made Time Series Insights available that uses an IoT Hub or an Event Hub as input. In a recent blogpost, I already described that service. Even if you already have an existing pipeline, it is simple to plug in Time Series Insights and to start using it to analyze your data.

Communication between microservices in Kubernetes with Go Micro

In this post, we continue the story we started with two earlier posts:

In the previous post, I described a very simple service written with the help of Go Micro. It exposes an RPC call Get that retrieves a device from a list of devices. Now we want a simple data service we can call via a RESTful interface like so: http://name_or_ip/data/device1. Note that no actual data is returned by the call. We just return true if the device exists and false if it does not.

The code for the “data” service can be found here: https://github.com/gbaeke/go-data/blob/master/main.go. The code is again very simply. To expose the RESTful interface, I used Gorilla. In the handler for the route /data/{device}, we call the Device service using a Go Micro client. Because the client is configured to use Kubernetes as the registry, it will look up where the Device service lives and call it. Let’s take a look at the code to call the Device service.

It starts with declaring a variable of type device.DevSvcClient which is defined in the generated code by protoc (see code for the device service here):

// devSvc is the service for the client
var (
	cl device.DevSvcClient
)

In the init() function, the client is created and configured to call the go.micro.srv.device service:

func init() {
	// make sure flags are processed
	cmd.Init()

	// initialise a default client for device service
	cl = device.NewDevSvcClient("go.micro.srv.device", client.DefaultClient)

}

In the route handler, the device name is extracted from the URL and then we call another function that returns true if the device exists and is active.

deviceActive(&device.DeviceName{Name: deviceName})

The deviceActive function looks like:

func deviceActive(d *device.DeviceName) bool {
	//call Get method from devSvcClient to obtain the device
	fmt.Println("Getting device", d.Name)
	rsp, err := cl.Get(context.TODO(), d)
	if err != nil {
		fmt.Println(err)
		return false
	}

	return rsp.Active
}

The above function expects a pointer to a DeviceName struct which is again defined by the protoc generated code used by the Device service. As you can see, calling the Get method is trivial. Behind the scenes, the client code locates the Device service in Kubernetes and does all the serialization/deserialization work to and from a binary format.

After the service is deployed in Kubernetes (see this post), we can check if it works using:

curl http://ip_of_loadbalancer/data/device1

The above should return the following:

Device active:  true
Oh and, no data for you!

I told you the service returned no data! 🙂

We now have two services that communicate using RPC in a Kubernetes cluster. Writing RESTful APIs and putting them in front of the RPC services is easy enough but something is off though! We don’t want to deploy many services that offer a RESTful API and then expose them using multiple external IPs because that’s just cumbersome. What we do want is to use the API Gateway pattern. In a future post, I will describe how to use Go Micro’s API gateway and an API service that exposes the device service to the outside world using a RESTful interface. Quite the mouthful… Stay tuned!

Microservices on Kubernetes: a simple example in Go

In the previous post, Getting started with Kubernetes on Azure, we talked about creating a Kubernetes cluster and deploying a couple of services. There are basically two services:

  • Data: a service that exposes an endpoint to pick up data for an IoT device; you call it with http://service_endpoint:8080/data/devicename
  • Device: a service that can be used by the Data API to check if a device exists; if the device exists you will see that in the response

When you call the Data service, it will call the Device service using gRPC, using HTTP as the transport protocol. You define the service using Protocol Buffers. gRPC works across languages and platforms, so I could have implemented each service using a different language like Go for the Device service and Node.js for the Data service. In this example, I decided to use Go in both cases and use Go Micro, a pluggable RPC framework for microservices. Go Micro uses gRPC and protocol buffers under the hood with changes specific to Go Micro.

Ok, enough with the talk, let’s take a look how it is done. The Device service is kept extremely simple for an abvious reason: I just started with Go Micro and then it is best to start with something simple. I do expect you know a bit of Go from here on out. All the code can be found at https://github.com/gbaeke/go-device.

Lets start with the definition of Protocol Buffers, found in proto/device.proto:

syntax = "proto3";

service DevSvc {
    rpc Get(DeviceName) returns (Device) {}
}

message DeviceName {
    string name = 1;
}

message Device {
    string name = 1;
    bool active = 2;
}

We define one RPC call here that expects a DeviceName message as input and returns a Device message. Simple enough but this does not get us very far. To actually use this in Go (or another supported language), we will generate some code from the above definition. You need a couple of things to do that though:

  • protoc compiler: download from Github  for your platform
  • protobuf plugins for code generation for Go Micro: run go get github.com/micro/protobuf/{proto,protoc-gen-go} (if you have issues, use 2 gets, one for proto and one for protoc-gen-go)

To actually compile the proto file, use the following command:

protoc --go_out=plugins=micro:. device.proto

That compiles device.proto to device.pb.go with help from the micro plugin. You can check the generated code here. Among other things, there are Go structs for the DeviceName and Device message plus several methods you can call on these structs such as Reset() and String().

Now for main.go! You’ll need several imports: for the generated code but also for the dependencies to build the service with Go Micro. If you check the code, you will also find the following import:

_ "github.com/micro/go-plugins/registry/kubernetes"

As stated above, Go Micro is a pluggable RPC framework. Out of the box, a microservice written with Go Micro will try to register itself with Consul on localhost for service discovery and configuration. We could run the Consul service in Kubernetes but Kubernetes supports service registration natively. Kubernetes support is something you add with the import above. That is not enough though! You still need to tell Go Micro to use Kubernetes as the registry, either with the —registry command line parameter or with an environment variable MICRO_REGISTRY. Check https://github.com/gbaeke/go-device/blob/master/go-device-dep.yaml file where that environment variable is set. Besides Consul and Kubernetes, there are other alternatives. One of them is multicast DNS (mdns) which is handy when you are testing services on your local machine and you don’t have something like Consul running.

If you want to check the information that is registered, you can do the following (after running kubectl proxy --port=8080):

curl http://localhost:8080/api/v1/pods | grep micro

Each pod will have an annotation with key micro.mu/service-<servicename> with information about the service such as its name, IP address, port, and much more.

Now really over to main.go, which is pretty self explanatory. There’s a struct called DevSvc which has a field called devs which holds the map of strings to Device structs. The DevSvc actually defines the service and you write the RPC calls as methods of that struct. Check out the following code snippet:

// DevSvc defines the service
type DevSvc struct {
	devs map[string]*device.Device
}
func (d *DevSvc) Get(ctx context.Context, req *device.DeviceName, rsp *device.Device) error {
	device, ok := d.devs[req.Name]
	if !ok {
		fmt.Println("Device does not exist")
		return nil
	}

	fmt.Println("Will respond with ", device)

	// this also works
	rsp.Name = device.Name
	rsp.Active = device.Active

	return nil
}

The Get function implements what was defined in the .proto file earlier and uses pointers to a DeviceName struct as input and a pointer to a Device struct as output. The code itself is of course trivial and just looks up a device in the map and returns it with rsp.

Of course, this handler needs to be registered and this happens in the main() function (besides setting up the service and implementing a custom flag):

// register handler and initialise devs map with a list of devices
device.RegisterDevSvcHandler(service.Server(), &DevSvc{devs: LoadDevices()})

If you want to test the service and call it (e.g. on the local machine) then clone the repository (or get it) and run the server as follows:

go run main.go --registry=mdns

In another terminal, run:

go run main.go --registry=mdns --run_client

When you run the code with the run_client option, the runClient function is called which looks like:

func runClient(service micro.Service) {
	// Create new client to call DevSvc service
	DevClient := device.NewDevSvcClient("go.micro.srv.device", service.Client())

	// Call Get to get a device
	rsp, err := DevClient.Get(context.TODO(), &device.DeviceName{Name: "device2"})
	if err != nil {
		fmt.Println(err)
		return
	}

	// Print response
	fmt.Println("Response: ", rsp)
}

This again shows the power of using a framework like Go Micro: you create a client for the DevSvc service and then simply perform the remote procedure call with the Get method, passing in a DeviceName struct with the Name field set to the device you want to check. The client uses the service registry to know where and how to connect. All the serialization and deserialization is handled for you as well using protocol buffers.

So great, you now have a little bit more information about the Device service and you know how to deploy it to Kubernetes. In another post, we’ll see how the Data service works and explore some other options to write that service.

%d bloggers like this: