For a customer that is developing a microservices application, the proposed architecture contains two Kubernetes ingresses:
internal ingress: exposed via an Azure internal load balancer, deployed in a separate subnet in the customer’s VNET; no need for SSL
external ingress: exposed via an external load balancer; SSL via Let’s Encrypt
The internal ingress exposes API endpoints via Azure API Management and its ability to connect to internal subnets. The external ingress exposes web applications via Azure Front Door.
The Ingress Controller of choice is Traefik. We use the Helm chart to deploy Traefik in the cluster. The example below uses Azure Kubernetes Service so I will refer to Azure objects such as VNETs, subnets, etc… Let’s get started!
In values.yaml, use ingressClass to set a custom class. For example:
When you do not set this value, the default ingressClass is traefik. When you define the ingress object, you refer to this class in your manifest via the annotation below:
When we deploy the internal ingress, we need to tell Traefik to create an internal load balancer. Optionally, you can specify a subnet to deploy to. You can add these options under the service section in values.yaml:
The above setting makes sure that the annotations are set on the service that the Helm chart creates to expose Traefik to the “outside” world. The settings are not Traefik specific.
Above, we want Kubernetes to deploy the Azure internal load balancer to a subnet called traefik. That subnet needs to exist in the VNET that contains the Kubernetes subnet. Make sure that the AKS service principal has the necessary access rights to deploy the load balancer in the subnet. If it takes a long time to deploy the load balancer, use kubectl get events in the namespace where you deploy Traefik (typically kube-system).
If you want to provide an static IP address to the internal load balancer, you can do so via the loadBalancerIp setting near the top of values.yaml. You can use any free address in the subnet where you deploy the load balancer.
All done! You can now deploy the internal ingress with:
The external ingress is simple now. Just set the ingressClass to traefik-ext (or leave it at the default of traefik although that’s not very clear) and remove the other settings. If you want a static public IP address, you can create such an address first and specify it in values.yaml. In an Azure context, you would create a public IP object in the resource group that contains your Kubernetes nodes.
If you need multiple ingresses of the same type or brand, use distinct values for ingressClass and reference the class in your ingress manifest file. Naturally, when you use two different solutions, say Kong for APIs and Traefik for web sites, you do not need to do that since they use different ingressClass values by default (kong and traefik). Hope this quick tip was useful!
In some of my previous posts, I talked about Azure Front Door and Web Application Firewall policies to protect a workload like one or more APIs running on Kubernetes or App Service. Although I enabled the Web Application Firewall policies, I did not show what happens when the rules are triggered. Let’s take a look at that! 🕶
Before we get started though, take the following diagram into account:
WAF for Front Door is a global solution. You create a WAF policy in the portal or via other means and attach it to a Front Door frontend. Rules are evaluated and acted upon at the edge versus on your application server.
Azure WAF supports custom rules and Azure-managed rule sets (based on OWASP). The custom rules are interesting because they allow you to restrict IP addresses, configure geographic based access control and more.
There’s an additional rule type called bot protection rule as well. At the time of this writing (beginning June 2019) this feature is in public preview. It uses the Microsoft Intelligent Security Graph to do its magic, similarly to Azure Firewall when you enable Threat Intelligence.
Let’s first use a tool that can scan an endpoint for vulnerabilities to trigger the WAF rules. One such tool is OWASP ZAP, which you need to install on your workstation.
Before we check the logs, note we have set the policy to Detection:
Now let’s take a look at the logs. Use the following query in Log Analytics and modify it for your own host (host_s field):
| where ResourceType == "FRONTDOORS" and Category == "FrontdoorWebApplicationFirewallLog"
| where action_s == "Block"
| where host_s == "api.baeke.info"
Like with any Log Analytics query, you can place alerts on log occurrences. You will need to be in the Log Analytics workspace, and not in the Logs section of Azure Front Door:
Azure Web Application Firewall policies for Azure Front Door integrate with Azure Monitor and Log Analytics, like most other Azure services. With some KQL, the query language for Log Analytics, it is straightforward to request the logs and set alerts on them.
In the post, Securing your API with Kong and CloudFlare, I exposed a dummy API on Kubernetes with Kong and published it securely with CloudFlare. The breadth of features and its ease of use made CloudFlare a joy to work with. It didn’t take long before I got the question: “can’t you do that with Azure only?”. The answer is obvious: “Of course you can!”
In this post, the traffic flow is as follows:
Consumer -- HTTPS --> Azure Front Door with WAF policy -- HTTPS --> Kong (exposed with Azure Load Balancer) -- HTTP --> API Kubernetes service --> API pods
Similarly to CloudFlare, Azure Front Door provides a fully trusted certificate for consumers of the API. In contrast to CloudFlare, Azure Front Door does not provide origin certificates which are trusted by Front Door. That’s easy to solve though by using a fully trusted Let’s Encrypt certificate which is stored as a Kubernetes secret and used in the Kubernetes Ingress definition. For this post, I requested a wildcard certificate for *.baeke.info via https://www.sslforfree.com/
Let’s take it step-by-step, starting at the API and Kong level.
APIs and Kong
Just like in the previous posts, we have a Kubernetes service called func and back-end pods that host the API implemented via Azure Functions in a container. Below you see the API pods in the default namespace. For convenience, Kong is also deployed in that namespace (not recommended in production):
Naturally, certificate and key should be replaced with the base64-encoded strings of the certificate and key you have obtained (in this case from https://www.sslforfree.com).
At the DNS level, api-o.baeke.info should refer to the external IP address of the exposed Kong Ingress Controller (proxy):
For the rest, the Kong configuration is not very different from the configuration in Securing your API with Kong and CloudFlare. I did remove the whitelisting configuration, which needs to be updated for Azure Front Door.
Great, we now have our API listening on https://api-o.baeke.info but it is not exposed via Azure Front Door and it does not have a WAF policy. Let’s change that.
Web Application Firewall (WAF) Policy
You can create a WAF policy from the portal:
The above policy is set to detection only. No custom rules have been defined, but a managed rule set is activated:
The WAF policy was saved as baekeapiwaf. It will be attached to an Azure Front Door frontend. When a policy is attached to a frontend, it will be shown in the policy:
Azure Front Door
We will now add Azure Front Door to obtain the following flow:
Consumer ---> https://api.baeke.info (Front Door + WAF) --> https://api-o.baeke.info
The final configuration in Front Door Designer looks like this:
When a request comes in for api.baeke.info, the response from api-o.baeke.info is served. Caching was not enabled. The frontend and backend are tied together via the routing rule.
The first thing you need to do is to add the azurefd.net frontend which is baeke-api.azurefd.net in the above config. There’s not much to say about that. Just click the blue plus next to Frontend hosts and follow the prompts. I did not attach a WAF policy to that frontend because it will not forward requests to the backend. We will use a custom domain for that.
Next, click the blue plus again to add the custom domain (here api.baeke.info). In your DNS zone, create a CNAME record that maps api.yourdomain.com to the azurefd.net name:
I attached the WAF policy baekeapiwaf to the front-end domain:
Next, I added a certificate. When you select Front Door managed, you will get a Digicert managed image. If the CNAME mapping is not complete, you will get an e-mail from Digicert to approve certificate issuance. Make sure you check your e-mails if it takes long to issue the certificate. It will take a long time either way so be patient! 💤💤💤
Now that we have the frontend, specify the backend that Front Door needs to connect to:
The backend pool uses the API exposed at api-o.baeke.info as defined earlier. With only one backend, priority and weight are of no importance. It should be clear that you can add multiple backends, potentially in different regions, and load balance between them.
You will also need a health probe to check for healthy and unhealthy backends:
Note that the above health check does NOT return a 200 OK status code. That is the only status code that would result in a healthy endpoint. With the above config, Kong will respond with a “no Route matched” 404 Not Found error instead. That does not mean that Front Door will not route to this endpoint though! When all endpoints are in a failed state, Front Door considers them healthy anyway 😲😲😲 and routes traffic using round-robin. See the documentation for more info.
Now that we have the frontend and the backend, let’s tie the two together with a rule:
In the first part of the rule, we specify that we listen for requests to api.baeke.info (and not the azurefd.net domain) and that we only accept https. The pattern /* basically forwards everything to the backend.
In the route details, we specify the backend to route to:
Clearly, we want to route to the api-o backend we defined earlier. We only connect to the backend via HTTPS. It only accepts HTTPS anyway, as defined at the Kong level via a KongIngress resource.
Note that it is possible to create a HTTP to HTTPS redirect rule. See the post Azure Front Door Revisited for more information. Without the rule, you will get the following warning:
Test, test, test
Let’s call the API via the http tool:
Clearly, Azure Front Door has served this request as indicated by the X-Azure-Ref header. Let’s try http:
Azure Front Door throws the above error because the routing rule only accepts https on api.baeke.info!
White listing Azure Front Door
To restrict calls to the backend to Azure Front Door, I used the following KongPlugin definition:
The IP range is documented here. Note that the IP range can and probably will change in the future.
In the ingress definition, I added the plugin via the annotations:
plugins.konghq.com: http-auth, whitelist-fd
Calling the backend API directly will now fail:
Publishing APIs (or any web app), whether they are running on Kubernetes or other systems, is easy to do with the combination of Azure Front Door and Web Application Firewall policies. Do take pricing into account though. It’s a mixture of relatively low fixed prices with variable pricing per GB and requests processed. In general, CloudFlare has the upper hand here, from both a pricing and features perspective. On the other hand, Front Door has advantages when it comes to automating its deployment together with other Azure resources. As always: plan, plan, plan and choose wisely! 🦉
In the previous post, we looked at API Management with Kong and the Kong Ingress Controller. We did not care about security and exposed a sample toy API over a public HTTP endpoint that also required an API key. All in the clear, no firewall, no WAF, nothing… 👎👎👎
In this post, we will expose the API over TLS and configure Kong to use a CloudFlare origin certificate. An origin certificate is issued and trusted by CloudFlare to connect to the origin, which in our case is an API hosted on Kubernetes.
The API consumer will not connect directly to the Kubernetes-hosted API exposed via Kong. Instead, the consumer connects to CloudFlare over TLS and uses a certificate issued by CloudFlare that is fully trusted by browsers and other clients.
The traffic flow is as follows:
Consumer --> CloudFlare (TLS with fully trusted cert, WAF, ...) --> Kong Ingress (TLS with origin cert) --> API (HTTP)
Refer to the previous post for installation instructions. The YAML files to configure the Ingress, KongIngress, Consumer, etc… are almost the same. The Ingress resource has the following changes:
We use a new hostname api.baeke.info
We configure TLS for api.baeke.info by referring to a secret called baeke.info.tls which contains the CloudFlare origin certificate.
We use an additional Kong plugin which provides whitelisting of CloudFlare addresses; only CloudFlare is allowed to connect to the Ingress
Here is the plugin definition for whitelisting with the current (June 15th, 2019) list of IP ranges used by CloudFlare. Note that you have to supply the addresses and ranges as an array. The documentation shows a comma-separated list! 🤷♂️
In the previous post, the protocols array contained the http value.
Note: for whitelisting to work, the Kong proxy service needs externalTrafficPolicy set to Local. Use kubectl edit svc kong-kong-proxy to modify that setting. You can set this value at deployment time as well. This might or might not work for you. I used AKS where this produces the desired outcome.
Get the external IP of the kong-kong-proxy service and create a DNS entry for it. I created a A record for api.baeke.info:
Make sure the orange cloud is active. In this case, this means that requests for api.baeke.info are proxied by CloudFlare. That allows us to cache, enable WAF (web application firewall), rate limiting and more!
In the Firewall section, WAF is turned on. Note that this is a paying feature!
In Crypto, Universal SSL is turned on and set to Full (strict).
Full (strict) means that CloudFlare connects to your origin over HTTPS and that it expects a valid certificate, which is checked. An origin certificate, issued by CloudFlare but not trusted by your operating system is also valid. As stated above, I use such an origin certificate at the Ingress level.
The origin certificate can be issued and/or downloaded from the Crypto section:
I created an origin certificate for *.baeke.info and baeke.info and downloaded the certificate and private key in PEM format. I then encoded the contents of the certificate and key in base64 format and used them in a secret:
As you have seen in the Ingress definition, it referred to this secret via its name, baeke.info.tls.
When a consumer connects to the API, the fully trusted certificate issued by CloudFlare is used:
We also make sure consumers of the API need to use TLS:
With the above configuration, consumers need to securely connect to https://api.baeke.info at CloudFlare. CloudFlare connects securely to the origin, which is the external IP of the ingress. Only CloudFlare is allowed to connect to that external IP because of the whitelisting configuration.
Testing the API
Let’s try the API with the http tool:
All sorts of headers are added by CloudFlare which makes it clear that CloudFlare is proxying the requests. When we don’t add a key or specify a wrong one:
The key is now securely sent from consumer to CloudFlare to origin. Phew! 😎
In this post, we hosted an API on Kubernetes, exposed it with Kong and secured it with CloudFlare. This example can easily be extended with multiple Kong proxies for high availability and multiple APIs (/users, /orders, /products, …) that are all protected by CloudFlare with end-to-end encryption and WAF. CloudFlare lends an extra helping hand by automatically generating both the “front-end” and origin certificates.
In previous posts, I wrote about Azure API Management in combination with APIs hosted on Kubernetes:
API Management with private APIs: requires API Management with virtual network integration because the APIs are reachable via an internal ingress on the Azure virtual network; use the premium tier 💰💰💰
API Management with public APIs: does not require virtual network integration but APIs need to restrict access to the public IP address of the API Management instance; you can use the other less expensive tiers 🎉🎉🎉
Instead of using API Management, there are many other solutions. One of those solutions is Kong 🐵. In this post, we will take a look at Kong Ingress Controller, which can be configured via Kubernetes API objects such as ingresses and custom resource definitions defined by Kong. We will do the following:
The above command installs Kong in the default namespace. List the services in that namespace with kubectl get svc and note the external IP of the kong-kong-proxy service. I associated that IP with a wildcard DNS entry like *.kong.yourdomain.com. That allows me to create an ingress for http://user.kong.yourdomain.com.
Note that you should not make the admin API publicly available via a load balancer. Just remove –set admin.type=LoadBalancer to revert to the default NodePort or set admin.type=ClusterIP.
The Helm chart will automatically install a PostgreSQL instance via a StatefulSet. The instance will have an 8GB disk attached. Use kubectl get pv to check that. You can use an external PostgreSQL instance or Cassandra (even Cosmos DB with the Cassandra API). I would highly recommend to use external state. There is also an option to not use a database but I did not try that.
Install the dummy user service
Use the deployment from the previous post, which deploys two pods with a container based on gbaeke/ingfunc. It contains the dummy API which is actually an Azure Function container running the Kestrel web server.
The ingress.class annotation ensures that Kong picks up this Ingress definition because I also had Traefik installed, which is another Ingress Controller. The plugins.konghq.com annotation refers to two plugins:
rate limiting: we will define this later to limit requests to 1 request/second
key auth: we will define this later to require the consumer to specify a previously defined API key
Go ahead and save the above file and apply it with kubectl apply -f filename.yaml. In subsequent steps, do the same for the other YAML definitions. All resources will be deployed in the default namespace.
Kong-specific ingress properties
The KongIngress custom resource definition can be used to specify additional Kong-specific properties on the Ingress:
The name of the KongIngress resource is func, which is the same name as the Ingress. This associates the KongIngress resource with the Ingress resource automatically. Note that we restricted the methods to GET and that we specify the path to the back-end API as /api/getusers. You also need strip_path set to true to make this work (strips the original path from the request).
To configure rate limiting, a typical capability of an API management solution, use the definition below:
This is a custom resource definition of kind (type) KongPlugin. Via the plugin property we specify the rate-limiting plugin and set it to one request per second. Note that we call this resource http-ratelimit and that we use this name in the annotation of the Ingress specification. That associates the plugin with that specific Ingress resource.
Require an API key
To require an API key, first create a consumer with a KongConsumer object:
Next, create a credential and associate it with the consumer:
A bit further down the output of the admin API, the enabled plug-ins should be listed:
In this post, we looked at the basics of Kong Ingress Controller and a few of its options to translate the path, limit the rate of requests and key authentication. We did not touch on other stuff like SSL, the Enterprise version and many of the other plugins. Hopefully though, this is just enough to get you started with the open source version on Kubernetes. Take a look a the Kong documentation for more in depth information!
You have decided to host your APIs in Kubernetes in combination with an API management solution? You are surely not the only one! In an Azure context, one way of doing this is combining Azure API Management and Azure Kubernetes Service (AKS). This post describes one of the ways to get this done. We will use the following services:
Virtual Network: AKS will use advanced networking and Azure CNI
Private DNS: to host a private DNS zone (private.baeke.info) ; note that private DNS is in public preview
AKS: deployed in a subnet of the virtual network
Traefik: Ingress Controller deployed on AKS, configured to use an internal load balancer in a dedicated subnet of the virtual network
Azure API Management: with virtual network integration which requires Developer or Premium; note that Premium comes at a hefty price though
Let’s take it step by step but note that this post does not contain all the detailed steps. I might do a video later with more details. Check the YouTube channel for more information.
We will setup something like this:
Consumer --> Azure API Management public IP --> ILB (in private VNET) --> Traefik (in Kubernetes) --> API (in Kubernetes - ClusterIP service in front of a deployment)
Create a virtual network in a resource group. We will add a private DNS zone to this network. You should not add resources such as virtual machines to this virtual network before you add the private DNS zone.
I will call my network privdns and add a few subnets (besides default):
aks: used by AKS
traefik: for the internal load balancer (ILB) and the front-end IP addresses
apim: to give API management access to the virtual network
Add a private DNS zone to the virtual network with Azure CLI:
az network dns zone create -g rg-ingress -n private.baeke.info --zone-type Private --resolution-vnets privdns
You can now add records to this private DNS zone:
az network dns record-set a add-record \
-g rg-ingress \
-z private.baeke.info \
-n test \
To test name resolution, deploy a small Linux virtual machine and ping test.private.baeke.info:
Deploy AKS and use advanced networking. Use the aks subnet when asked. Each node you deploy will get 30 IP address in the subnet:
To expose the APIs over an internal IP we will use ingress objects, which require an Ingress Controller. Traefik is just one of the choices available. Any Ingress Controller will work.
Instead of using ingresses, you could also expose your APIs via services of type LoadBalancer and use an internal load balancer. The latter approach would require one IP per API where the ingress approach only requires one IP in total. That IP resolves to Traefik which uses the host header to route to the APIs.
We will install Traefik with Helm. Check my previous post for more info about Traefik and Helm. In this case, I will download and untar the Helm chart and modify values.yaml. To download and untar the Helm chart use the following command:
helm fetch stable/traefik --untar
You will now have a traefik folder, which contains values.yaml. Modify values.yaml as follows:
This will instruct Helm to add the above annotations to the Traefik service object. It instructs the Azure cloud integration components to use an internal load balancer. In addition, the load balancer should be created in the traefik subnet. Make sure that your AKS service principal has the RBAC role on the virtual network to perform this operation.
Now you can install Traefik on AKS. Make sure you are in the traefik folder where the Helm chart was untarred:
When the installation is finished, there should be an internal load balancer in the resource group that is behind your AKS cluster:
The result of kubectl get svc -n kube-system should result in something like:
We can now reach Treafik on the virtual network and create an A record that resolves to this IP. The func.private.baeke.info I will use later, resolves to the above IP.
Azure API Management
Deploy API Management from the portal. API Management will need access to the virtual network which means we need a version (SKU) that has virtual network support. This is needed simply because the APIs are not exposed on the public Internet.
For testing, use the Developer SKU. In production, you should use the Premium SKU although it is very expensive. Microsoft should really make the virtual network integration part of every SKU since it is such a common scenario! Come on Microsoft, you know it’s the right thing to do! 😉
Above, API Management is configured to use the apim subnet of the virtual network. It will also be able to resolve private DNS names via this integration. Note that configuring the network integration takes quite some time.
Deploy a service and ingress
I deployed the following sample API with a simple deployment and service. Save this as func.yaml and run kubectl apply -f func.yaml. You will end up with two pods running a super simple and stupid API plus a service object of type ClusterIP, which is only reachable inside Kubernetes:
Notice I used func.private.baeke.info! Naturally, that name should resolve to the IP address on the ILB that routes to Traefik.
Testing the API from API Management
In API Management, I created an API that uses func.private.baeke.info as the backend. Yes, I know, the API name is bad. It’s just a sample ok? 😎
Let’s test the GET operation I created:
In this post, we looked at one way to expose Kubernetes-hosted APIs to the outside world via Azure API Management. The traffic flow is as follows:
Consumer --> Azure API Management public IP --> ILB (in private VNET) --> Traefik (in Kubernetes) --> API (in Kubernetes - ClusterIP service in front of a deployment)
Because we have to use host names in ingress definitions, we added a private DNS zone to the virtual network. We can create multiple A records, one for each API, and provide access to these APIs with ingress objects.
As stated above, you can also expose each API via an internal load balancer. In that case, you do not need an Ingress Controller such as Traefik. Alternatively, you could also replace Azure API Management with a solution such as Kong. I have used Kong in the past and it is quite good! The choice for one or the other will depend on several factors such as cost, features, ease of use, support, etc…
In this post, we will take a look at creating a simple machine learning model for text classification and deploying it as a container with Azure Machine Learning service. This post is not intended to discuss the finer details of creating a text classification model. In fact, we will use the Keras library and its Reuters newswire dataset to create a simple dense neural network. You can find many online examples based on this dataset. For further information, be sure to check out and buy 👍 Deep Learning with Python by François Chollet, the creator of Keras and now at Google. It contains a section that explains using this dataset in much more detail!
Together with the workspace, you also get a storage account, a key vault, application insights and a container registry. In later steps, we will create a container and store it in this registry. That all happens behind the scenes though. You will just write a few simple lines of code to make that happen!
Note the Authoring (Preview) section! These were added just before Build 2019 started. For now, we will not use them.
To create the model and interact with the workspace, we will use a free Jupyter notebook in Azure Notebooks. At this point in time (8 May 2019), Azure Notebooks is still in preview. To get started, find the link below in the Overview section of the Machine Learning service workspace:
When you open the notebook, you will see the following first four cells:
It’s always simple if a prepared dataset is handed to you like in the above example. Above, you simply use the reuters class of keras.datasets and use the load_data method to get the data and directly assign it to variables to hold the train and test data plus labels.
In this case, the data consists of newswires with a corresponding label that indicates the category of the newswire (e.g. an earnings call newswire). There are 46 categories in this dataset. In the real world, you would have the newswire in text format. In this case, the newswire has already been converted (preprocessed) for you in an array of integers, with each integer corresponding to a word in a dictionary.
A bit further in the notebook, you will find a Vectorization section:
In this section, the train and test data is vectorized using a one-hot encoding method. Because we specified, in the very first cell of the notebook, to only use the 10000 most important words each article can be converted to a vector with 10000 values. Each value is either 1 or 0, indicating the word is in the text or not.
This bag-of-words approach is one of the ways to represent text in a data structure that can be used in a machine learning model. Besides vectorizing the training and test samples, the categories are also one-hot encoded.
Now the dense neural network model can be created:
The above code defines a very simple dense neural network. A dense neural network is not necessarily the best type but that’s ok for this post. The specifics are not that important. Just note that the nn variable is our model. We will use this variable later when we convert the model to the ONNX format.
The last cell (16 above) does the actual training in 9 epochs. Training will be fast because the dataset is relatively small and the neural network is simple. Using the Azure Notebooks compute is sufficient. After 9 epochs, this is the result:
Not exactly earth-shattering: 78% accuracy on the test set!
Saving the model in ONNX format
ONNX is an open format to store deep learning models. When your model is in that format, you can use the ONNX runtime for inference.
Converting the Keras model to ONNX is easy with the onnxmltools:
The result of the above code is a file called reuters.onnx in your notebook project.
Predict with the ONNX model
Let’s try to predict the category of the first newswire in the test set. Its real label is 3, which means it’s a newswire about an earnings call (earn class):
We will use similar code later in score.py, a file that will be used in a container we will create to expose the model as an API. The code is pretty simple: start an inference session based on the reuters.onnx file, grab the input and output and use run to predict. The resulting array is the output of the softmax layer and we use argmax to extract the category with the highest probability.
Saving the model to the workspace
With the model in reuters.onnx, we can add it to the workspace:
You will need a file in your Azure Notebook project called config.json with the following contents:
With that file in place, when you run cell 27 (see above), you will need to authenticate to Azure to be able to interact with the workspace. The code is pretty self-explanatory: the reuters.onnx model will be added to the workspace:
As you can see, you can save multiple versions of the model. This happens automatically when you save a model with the same name.
Creating the scoring container image
The scoring (or inference) container image is used to expose an API to predict categories of newswires. Obviously, you will need to give some instructions how scoring needs to be done. This is done via score.py:
The code is similar to the code we wrote earlier to test the ONNX model. score.py needs an init() and run() function. The other functions are helper functions. In init(), we need to grab a reference to the ONNX model. The ONNX model file will be placed in the container during the build process. Next, we start an InferenceSession via the ONNX runtime. In run(), the code is similar to our earlier example. It predicts via session.run and returns the result as JSON. We do not have to worry about the rest of the code that runs the API. That is handled by Machine Learning service.
Note: using ONNX is not a requirement; we could have persisted and used the native Keras model for instance
In this post, we only need score.py since we do not train our model via Azure Machine learning service. If you train a model with the service, you would create a train.py file to instruct how training should be done based on data in a storage account for instance. You would also provision compute resources for training. In our case, that is not required so we train, save and export the model directly from the notebook.
Now we need to create an environment file to indicate the required Python packages and start the image build process:
The build process is handled by the service and makes sure the model file is in the container, in addition to score.py and myenv.yml. The result is a fully functional container that exposes an API that takes an input (a newswire) and outputs an array of probabilities. Of course, it is up to you to define what the input and output should be. In this case, you are expected to provide a one-hot encoded article as input.
The container image will be listed in the workspace, potentially multiple versions of it:
Deploy to Azure Container Instances
When the image is ready, you can deploy it via the Machine Learning service to Azure Container Instances (ACI) or Azure Kubernetes Service (AKS). To deploy to ACI:
When the deployment is finished, the deployment will be listed:
When you click on the deployment, the scoring URI will be shown (e.g. http://IPADDRESS:80/score). You can now use Postman or any other method to score an article. To quickly test the service from the notebook:
The helper method run of aci_service will post the JSON in test_sample to the service. It knows the scoring URI from the deployment earlier.
Containerizing a machine learning model and exposing it as an API is made surprisingly simple with Azure Machine learning service. It saves time so you can focus on the hard work of creating a model that performs well in the field. In this post, we used a sample dataset and a simple dense neural network to illustrate how you can build such a model, convert it to ONNX format and use the ONNX runtime for scoring.
Deploying Azure Kubernetes Service (AKS) is, like most other Kubernetes-as-a-service offerings such as those from DigitalOcean and Google, very straightforward. It’s either a few clicks in the portal or one or two command lines and you are finished.
Using these services properly and in a secure fashion is another matter though. I am often asked how to secure access to the cluster and its applications. In addition, customers also want visibility and control of incoming and outgoing traffic. Combining Azure Firewall with AKS is one way of achieving those objectives.
This post will take a look at the combination of Azure Firewall and AKS. It is inspired by this post by Dennis Zielke. In that post, Dennis provides all the necessary Azure CLI commands to get to the following setup:
In what follows, I will keep referring to the subnet names and IP addresses as in the above diagram.
Azure Firewall is a stateful firewall, provided as a service with built-in high availability. You deploy it in a subnet of a virtual network. The subnet should have the name AzureFirewallSubnet. The firewall will get two IP addresses:
Internal IP: the first IP address in the subnet (here 10.0.3.4)
Public IP: a public IP address; in the above setup we will use it to provide access to a Kubernetes Ingress controller via a DNAT rule
As in the physical world, you will need to instruct systems to route traffic through the firewall. In Azure, this is done via a route table. The following route table was created:
In (1) a route to 0.0.0.0/0 is defined that routes to the private IP of the firewall. The route will be used when no other route applies! The route table is associated with just the aks-5-subnet (2), which is the subnet where AKS (with advanced networking) is deployed. It’s important to note that now, all external traffic originating from the Kubernetes cluster passes through the firewall.
When you compare Azure Firewall to the Network Virtual Appliances (NVAs) from vendors such as CheckPoint, you will notice that the capabilities are somewhat limited. On the flip side though, Azure Firewall is super simple to deploy when compared with a highly available NVA setup.
Before we look at the firewall rules, let’s take a look at the Kubernetes Ingress Controller.
Kubernetes Ingress Controller
In this example, I will deploy nginx-ingress as an Ingress Controller. It will provide access to HTTP-based workloads running in the cluster and it can route to various workloads based on the URL. I will deploy the nginx-ingress with Helm.
Think of an nginx-ingress as a reverse proxy. It receives http requests, looks at the hostname and path (e.g. mydomain.com/api/user) and routes the request to the appropriate Kubernetes service (e.g. the user service).
Normally, the nginx-ingress service is accessed via an Azure external load balancer. Behind the scenes, this is the result of the service object having spec.type set to the value LoadBalancer. If we want external traffic to nginx-ingress to pass through the firewall, we will need to tell Kubernetes to create an internal load balancer via an annotation. Let’s do that with Helm. First, you will need to install tiller, the server-side component of Helm. Use the following procedure from the Microsoft documentation:
This ensures an internal load balancer gets created. It gets created in the mc-* resource group that backs your AKS deployment:
Note that Kubernetes creates the load balancer, including the rules and probes for port 80 and 443 as defined in the service object that comes with the Helm chart. The load balancer is created in the ing-4-subnet as instructed by the service annotation. Its private IP address is 10.0.4.4 as in the diagram at the top of this post
DNAT Rule to Load Balancer
To provide access to internal resources, Azure Firewall uses DNAT rules which stands for destination network address translation. The concept is simple: traffic to the firewall’s public IP on some port can be forwarded to an internal IP on the same or another port. In our case, traffic to the firewall’s public IP on port 80 and 443 is forwarded to the internal load balancer’s private IP on port 80 and 443. The load balancer will forward the request to nginx-ingress:
If the installation of nginx-ingress was successful, you should end up at the default back-end when you go to http://firewallPublicIP.
If you configured Log Analytics and installed the Azure Firewall solution, you can look at the firewall logs. DNAT actions are logged and can be inspected:
Application and Network Rules
Azure Firewall application rules are rules that allow or deny outgoing HTTP/HTTPS traffic based on the URL. The following rules were defined:
The above rules allow http and https traffic to destinations such as docker.io, cloudflare and more.
Note that another Azure Firewall rule type, network rules, are evaluated first. If a match is found, rule evaluation is stopped. Suppose you have these network rules:
The above network rule allows port 22 and 443 for all sources and destinations. This means that Kubernetes can actually connect to any https-enabled site on the default port, regardless of the defined application rules. See rule processing for more information.
This feature alerts on and/or denies network traffic coming from known bad IP addresses or domains. You can track this via Log Analytics:
Above, you see denied port scans, traffic from botnets or brute force credentials attacks all being blocked by Azure Firewall. This feature is currently in preview.
The AKS documentation has a best practices section that discusses networking. It contains useful information about the networking model (Kubenet vs Azure CNI), ingresses and WAF. It does not, at this point in time (May 2019), desicribe how to use Azure Firewall with AKS. It would be great if that were added in the near future.
Here are a couple of key points to think about:
WAF (Web Application Firewall): Azure Firewall threat intelligence is not WAF; to enable WAF, there are several options:
you can use cloud-native WAFs such as TwistLock (WAF is one of the features of this product; it also provides firewall and vulnerability assessment)
remote access to Kubernetes API: today, the API server is exposed via a public IP address; having the API server on a local IP will be available soon
remote access to Kubernetes hosts using SSH: only allow SSH on the private IP addresses; use a bastion host to enable connectivity
Azure Kubernetes Service (AKS) can be combined with Azure Firewall to control network traffic to and from your Kubernetes cluster. Log Analytics provides the dashboard and logs to report and alert on traffic patterns. Features such as threat intelligence provide an extra layer of defense. For HTTP/HTTPS workloads (so most workloads), you should complement the deployment with a WAF such as Azure Application Gateway or 3rd party.
In this post, we will look at an example of an Azure Function, running in a Premium plan, that queries CosmosDB. We will restrict incoming traffic to the Azure Function from a subnet and only allow CosmosDB to be queried by the same Azure Function. Here’s a diagram:
To restrict incoming traffic to the Azure Function, navigate to the Function App in the portal and select Networking in Platform Features. You will see the following screen:
We will configure the inbound restrictions via Configure Access Restrictions. You can configure restrictions for both the Function App itself and the scm site:
From the moment you add rules, a Deny All rule will appear. In the above rules, I allowed my private IP and the default subnet in the virtual network. The second rule configures the service endpoints. When you open the properties of the subnet, you will see:
Great! When you try to access the function from any other location, you will get a 403 error from the Azure Functions front-end. So don’t expect a connection timeout like with regular network security rules.
The example Azure Function uses an HTTP trigger and a Cosmos DB input (cosmos). Documents contain a name property. The query outputs the name found on the first document:
In order to secure access to Cosmos DB, two features were used:
Azure Functions VNet Integration (VNet integration is currently in preview)
Cosmos DB network service endpoints to restrict access to the subnet that provides the Azure Function hosts with an IP address
Configuring the VNet integration is straightforward, especially when compared to the old style of integration which required a VPN tunnel:
As you can see in the above screenshot, you delegate a subnet to the App Service hosts. In my case, that is subnet func-sec:
The bottom of the screenshot shows the subnet is delegated to the Microsoft.Web/serverFarms service. That is the result of the VNet integration.
You can also see the subnet has service endpoints configured for Cosmos DB. That is the result of the Cosmos DB configuration below:
In Cosmos DB, an existing virtual network was added. I did not enable the Accept connections from within public Azure datacenters option.
When you remove the service endpoint and you run the Azure Function, the following error is thrown:
Unable to proceed with the request. Please check the authorization claims to ensure the required permissions to process the request. ActivityId: 03b2c11f-2b21-44c9-ab44-61b4864539fe, Microsoft.Azure.Documents.Common/22.214.171.124, Windows/10.0.14393 documentdb-netcore-sdk/2.2.0
Does it work from a VM in the default subnet?
If all went well, I should be able to call the Azure Function from the virtual machine in the default subnet. Let’s try with curl:
The name field in the first document is set to itsme so it worked! Great, the function can be called from the default subnet. In case you are wondering about the use of -p in the ssh command: this virtual machine sat behind an Azure Firewall and the VM ssh port was exposed via a DNAT rule over a random port.
From another location, the following error is shown (wrapped around some HTML but this is the main error):
Error 403 - This web app is stopped
With virtual network service endpoints now available for most Azure PaaS (platform as a service) components, you can ensure those services are only accessed from intended locations. In this example, you saw how to secure access to Azure Functions and Cosmos DB. Service endpoints combined with the App Service VNet integration make it straightforward to secure a Function App end-to-end.
Several years ago, when we started our first adventures in the wonderful world of IoT, we created an application for visualizing real-time streams of sensor data. The sensor data came from custom-built devices that used 2G for connectivity. IoT networks and protocols such as SigFox, NB-IoT or Lora were not mainstream at that time. We leveraged what were then new and often preview-level Azure services such as IoT Hub, Stream Analytics, etc… The architecture was loosely based on lambda architecture with a hot and cold path and stateful window-based stream processing. Fun stuff!
Kubernetes already existed but had not taken off yet. Managed Kubernetes services such as Azure Kubernetes Service (AKS) weren’t a thing.
The application (end-user UI and management) was loosely based on a micro-services pattern and we decided to run the services as Docker containers. At that time, Karim Vaes, now a Program Manager for Azure Storage, worked at our company and was very enthusiastic about Rancher. , Rancher was still v1 and we decided to use it in combination with their own container orchestration framework called Cattle.
Our experience with Rancher was very positive. It was easy to deploy and run in production. The combination of GitHub, Shippable and the Rancher CLI made it extremely easy to deploy our code. Rancher, including Cattle, was very stable for our needs.
In recent years though, the growth of Kubernetes as a container orchestrator platform has far outpaced the others. Using an alternative orchestrator such as Cattle made less sense. Rancher 2.0 is now built around Kubernetes but maintains the same experience as earlier versions such as simple deployment and flexible configuration and management.
In this post, I will look at deploying Rancher 2.0 and importing an existing AKS cluster. This is a basic scenario but it allows you to get a feel for how it works. Indeed, besides deploying your cluster with Rancher from scratch (even on-premises on VMware), you can import existing Kubernetes clusters including managed clusters from Google, Amazon and Azure.
For evaluation purposes, it is best to just run Rancher on a single machine. I deployed an Azure virtual machine with the following properties:
Operating system: Ubuntu 16.04 LTS
Size: DS2v3 (2 vCPUs, 8GB of RAM)
Public IP with open ports 22, 80 and 443
DNS name: somename.westeurope.cloudapp.azure.com
In my personal DNS zone on CloudFlare, I created a CNAME record for the above DNS name. Later, when you install Rancher you can use the custom DNS name in combination with Let’s Encrypt support.
On the virtual machine, install Docker. Use the guide here. You can use the convenience script as a quick way to install Docker.
With Docker installed, install Rancher with the following command:
More details about the single node installation can be found here. Note that Rancher uses etcd as a datastore. With the command above, the data will be in /var/lib/rancher inside the container. This is ok if you are just doing a test drive. In other cases, use external storage and mount it on /var/lib/rancher.
A single-node install is great for test and development. For production, use the HA install. This will actually run Rancher on Kubernetes. Rancher recommends a dedicated cluster in this scenario.
To get started, I added an existing three-node AKS cluster to Rancher. After you add the cluster and turn on monitoring, you will see the following screen when you navigate to Clusters and select the imported cluster:
To demonstrate the functionality, I deployed a 3-node cluster (1.11.9) with RBAC enabled and standard networking. After deployment, open up Azure Cloud shell and get your credentials:
az aks list -o table az aks get-credentials -n cluster-name -g cluster-resource-group kubectl cluster-info
The first command lists the clusters in your subscription, including their name and resource group. The second command configures kubectl, the Kubernetes command line admin tool, which is pre-installed in Azure Cloud Shell. To verify you are connected, the last command simply displays cluster information.
Now that the cluster is deployed, let’s try to import it. In Rancher, navigate to Global – Clusters and click Add Cluster:
Click Import, type a name and click Create. You will get a screen with a command to run:
Continue on in Rancher, the cluster will be added (by the components you deployed above):
Click on the cluster:
To see live metrics, you can click Enable Monitoring. This will install and configure Prometheus and Grafana. You can control several parameters of the deployment such as data retention:
Notice that by default, persistent storage for Grafana and Prometheus is not configured.
Note: with monitoring enabled or not, you will notice the following error in the dashboard:
The error is described here. In short, the components are probably healthy. The error is not related to a Rancher issue but an upstream Kubernetes issue.
When the monitoring API is ready, you will see live metrics and Grafana icons. Clicking on the Graphana icon next to Nodes gives you this:
Of course, Azure provides Container Insights for monitoring. The Grafana dashboards are richer though. On the other hand, querying and alerting on logs and metrics from Container Insights is powerful as well. You can of course enable them all and use the best of both worlds.
We briefly looked at Rancher 2.0 and how it can interact with a existing AKS cluster. An existing cluster is easy to add. Once it is added, adding monitoring is “easy peasy lemon squeezy” as my daughter would call it! 😉 As with Rancher 1.x, I am again pleasantly surprised at how Rancher is able to make complex matters simpler and more fun to work with. There is much more to explore and do of course. That’s for some follow-up posts!