Managed Kubernetes: AWS vs. Azure vs. Google Cloud

Which managed Kubernetes service should you choose? Amazon EKS, Azure Kubernetes Service, and Google Kubernetes Engine compared

No doubt about it, Kubernetes is hot. By all indications, the open-source project created by Google, and now shepherded by the CNCF (Cloud Native Computing Foundation), has won the war for container orchestration dominance. Would-be competitors such as Mesosphere and Docker Inc. have adopted Kubernetes, the leading PaaS stacks such as OpenShift and Cloud Foundry now include it, and all of the major cloud vendors now support it.

But that doesn’t mean that all of the Kubernetes offerings are the same—or equal. In this article, we’ll break down the key components of managed Kubernetes, and explore how each of the three major cloud providers—Amazon Elastic Container Service for Kubernetes, Azure Kubernetes Service, and Google Kubernetes Engine—differs in its support of the platform.

Setting up the Kubernetes cluster

In our tests, all three services had no problem bringing up a cluster. Where they start to differ is in the number of steps required. Amazon EKS requires a number of additional steps to create a cluster, whereas with Microsoft’s Azure Kubernetes Service and Google Kubernetes Engine a few quick commands do the trick, and the cluster is up and running within minutes. Amazon also requires you to install separate packages, such as the heptio-authenticator binary, which enables federated authentication using AWS Identity and Access Management (IAM).
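To illustrate the difference, here is roughly what cluster creation looks like on each provider. The cluster, resource group, role, and subnet names below are hypothetical placeholders, and exact flags may vary by CLI version:

```shell
# Google Kubernetes Engine: one command and the cluster is usable in minutes.
gcloud container clusters create demo-cluster --zone us-east1-b --num-nodes 3

# Azure Kubernetes Service: similarly a single command (plus a resource group).
az group create --name demo-rg --location eastus
az aks create --resource-group demo-rg --name demo-cluster \
  --node-count 3 --generate-ssh-keys

# Amazon EKS: the control plane alone needs a pre-created IAM role and VPC,
# and worker nodes are provisioned separately (e.g., via a CloudFormation stack).
aws eks create-cluster --name demo-cluster \
  --role-arn arn:aws:iam::123456789012:role/demo-eks-role \
  --resources-vpc-config subnetIds=subnet-aaaa,subnet-bbbb,securityGroupIds=sg-cccc
```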

It’s important to note that if you’re working with different Kubernetes deployments, you must keep the kubectl command-line tool configured with the right context. For example, I’ve been working with plain Kubernetes and pretty much every managed Kubernetes provider. Most of these services provide a command that adds the cluster’s kubeconfig context to your current file, making it much easier to switch between clusters from different providers. With Amazon EKS, this is a manual task—potentially an unproductive use of your time if you need to quickly create and delete clusters on demand.
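For example, Google and Microsoft each ship a one-line command that merges a cluster’s credentials into your kubeconfig; the cluster and project names below are placeholders:

```shell
# Merge cluster credentials into ~/.kube/config:
gcloud container clusters get-credentials demo-cluster --zone us-east1-b
az aks get-credentials --resource-group demo-rg --name demo-cluster

# List the merged contexts and switch between clusters:
kubectl config get-contexts
kubectl config use-context gke_demo-project_us-east1-b_demo-cluster

# With Amazon EKS, at the time of writing, the equivalent kubeconfig entry
# (including the heptio-authenticator exec stanza) must be written by hand.
```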

From a setup standpoint, Google Kubernetes Engine and Azure Kubernetes Service are very similar, while Amazon EKS requires a significantly greater number of steps. That said, Amazon is already taking steps to speed up cluster creation.

Comparing Kubernetes dashboards

The native Kubernetes dashboard is a simple, web-based UI to all of the Kubernetes services available to you. It provides simple metrics on your deployments, pods, and services, and allows you to manage the cluster. However, while nice to have, the Kubernetes dashboard is by no means necessary. Everything you do from the Kubernetes dashboard you could just as easily do from the command line.

When working with Amazon EKS and Azure Kubernetes Service, the dashboards were very simple to use. I was able to host a cluster locally, start the proxy on my machine, and navigate to the localhost version of the dashboard. But the dashboard of Google Kubernetes Engine was the best of the three. Frankly, I was surprised by how much I liked Google’s UI. Then again, Kubernetes was created by Google, so maybe this shouldn’t have come as a surprise.
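The local-proxy workflow mentioned above looks roughly like this; the dashboard manifest URL reflects the version current at the time of writing and may have moved since:

```shell
# Deploy the dashboard (GKE has it preinstalled; EKS and AKS may not):
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v1.8.3/src/deploy/recommended/kubernetes-dashboard.yaml

# Open an authenticated proxy to the API server on localhost:8001 ...
kubectl proxy

# ... then browse to the dashboard through the proxy:
# http://localhost:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/
```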

Google’s UI really seems like an upgraded version of the native Kubernetes dashboard. It behaves consistently, it runs smoothly, and it’s informative and easy to navigate. Another plus is that it’s ready to use right out of the box, so no manual dashboard setup is necessary. Azure Kubernetes Service has a built-in dashboard as well, but its navigation is a little less intuitive and requires some learning.

Amazon EKS is the only provider that doesn’t provide a functional dashboard out of the box. That’s something to think about if you’re going to be running clusters on external instances (Amazon EC2, Google Compute Engine, Azure Virtual Machines), where people connect to the instance to get to the cluster, instead of locally. For example, I had Kubernetes running on an Amazon EC2 Ubuntu instance and then wiped it out to set up Amazon EKS. Once configured and spun up, my applications (and everything else) ran super smoothly—except there was no dashboard.

To reach the Amazon EKS dashboard, the API server either needed to be exposed, or the machine hosting Kubernetes needed external access to the proxy. Neither was the case: we had only PEM key (SSH) access to the instance, so the local proxy couldn’t be exposed without additional configuration. Eventually I gave up trying to bring up the dashboard, because everything could be done from the command line and there was no point in spending more time setting up the UI.

Your Kubernetes dashboard of choice boils down to individual needs. If you’re just looking to evaluate software, or to make sure that when transitioning to managed Kubernetes all your services run smoothly, I’d choose Google Kubernetes Engine for the built-in dashboard and ease of use. When you think about it, though, the goal is to automate most of the workflow, so by the time your Kubernetes deployment makes its way to production, most of it will be scripted and automated, eliminating the need for a dashboard at all. For example, CERN, the European physics organization, has spun up approximately 210 Kubernetes clusters. I find it hard to believe they are using the dashboard for their production workflow.

Scaling Kubernetes clusters

Scaling, a great perk of Kubernetes, is extremely easy to do from the command line, from the dashboard, or without human input. Of the three Kubernetes offerings I’ve tested, Google Kubernetes Engine is the only one to set up cluster auto-scaling automatically. The main benefits are the ability for Kubernetes to scale up pods if they run out of resources, and to scale down underutilized nodes after moving their pods to other nodes. This makes auto-scaling especially useful for short-lived or bursty workloads.
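The pod-level side of scaling is built into Kubernetes itself and works the same on all three providers; a quick sketch, with hypothetical deployment and cluster names:

```shell
# Manually scale a deployment to five replicas:
kubectl scale deployment demo-app --replicas=5

# Or let the Horizontal Pod Autoscaler adjust replicas based on CPU usage:
kubectl autoscale deployment demo-app --min=2 --max=10 --cpu-percent=80

# Node-level auto-scaling on GKE can be enabled at cluster creation time:
gcloud container clusters create demo-cluster \
  --num-nodes 3 --enable-autoscaling --min-nodes 1 --max-nodes 5
```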

While Amazon EKS and Azure Kubernetes Service both state that their worker nodes run as auto-scaling groups, those scaling policies are not recognized by Kubernetes, which means you have to set up the cluster auto-scaler manually. The configuration is not overly involved, but it is something that still requires setup and maintenance: you deploy the auto-scaler as a deployment in your cluster and then configure its policies. While minor testing revealed no issues, I can tell you from personal experience that maintaining third-party add-ons carries real operational risk. You can view the manual setup docs for the cluster auto-scaler on Amazon and the cluster auto-scaler on Azure on GitHub.
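A rough sketch of that manual setup on Amazon EKS; the manifest file and the auto-scaling group name are placeholders, and the authoritative manifest comes from the cluster-autoscaler docs mentioned above:

```shell
# Deploy the cluster-autoscaler into the kube-system namespace:
kubectl apply -f cluster-autoscaler-aws.yaml

# The key container flags inside that manifest tell the autoscaler which
# auto-scaling group to manage (the --nodes format is min:max:ASG-name):
#   --cloud-provider=aws
#   --nodes=1:10:demo-eks-worker-asg
```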

High availability for Kubernetes clusters

High availability (HA) is about eliminating single points of failure in the cluster. A Kubernetes cluster gains the same stability and fault-tolerance benefits as any set of instances spread across failure domains, but true HA means the cluster can lose one or more master nodes and keep running. That’s the basic definition of highly available Kubernetes.

Kubernetes is massive, though, with a number of components that could break down. As Lucas Käldström, a CNCF volunteer ambassador for Kubernetes who was interviewed at KubeCon + CloudNativeCon North America 2017, explained:

Take kube-dns as an example. Let’s say we have a normal cluster, it has multiple masters, multiple Etcd replicas but still running just one kube-dns. So then if the master running kube-dns fails, your cluster will experience some kind of outage because now suddenly all your service discovery queries may not resolve. So, we really have to go and take it to a deeper level and analyze where are the key components that we have in a cluster and then try to eliminate their single points of failure.

How do the managed Kubernetes providers stack up in terms of high availability?

Google Kubernetes Engine is available in two modes: multi-zone and regional. The primary difference between regional and multi-zone clusters is that regional clusters create three masters and multi-zone clusters create only one. If you’re running a regional cluster in the us-east1 region, for instance, GKE creates a master in each of three zones (us-east1-b, us-east1-c, and us-east1-d, for example). Because regional clusters span an entire region rather than a single zone within it, if one zone goes down, your Kubernetes control plane and resources are not impacted.
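Creating the two modes looks like this with the gcloud CLI; cluster names are placeholders, and regional clusters may require a recent CLI version:

```shell
# Regional cluster: three masters, one per zone in the region:
gcloud container clusters create demo-ha-cluster --region us-east1

# Multi-zone cluster: a single master, with nodes replicated across zones:
gcloud container clusters create demo-cluster --zone us-east1-b \
  --additional-zones us-east1-c,us-east1-d
```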

Lucas Käldström again:

There’s a difference between high availability and multi-master. If you, for example, have three masters and only one Nginx instance in front load balancing to those masters, you have a multi-master cluster but not a highly available one because your Nginx can go down at any time and well, there you go.

Amazon EKS takes a very similar approach to Google Kubernetes Engine: it too runs HA masters and worker nodes across multiple availability zones.

Azure Kubernetes Service is the only provider that does not offer highly available masters to date. Worker nodes can be distributed across fault domains to provide some resilience, but this requires more effort and doesn’t offer the same level of protection that comes with HA master nodes.

At the end of the day, the decision of which Kubernetes vendor you choose should boil down to your organization and its product trade-offs. Core Kubernetes features are largely equivalent across the different vendors, most of which are on version 1.10 as of this writing. The choice should be made based on your organization’s needs.

For example, my team standardizes across AWS for almost all of our services: Docker images hosted on Amazon ECR, instances on Amazon EC2, source code in AWS CodeCommit, hosted files on Amazon S3, and so on. For me, it makes much more sense to use Amazon EKS, as standardization and automation are a focal point of the software we build and work with.

If you rely on multiple vendors for your software, you may not care about this. And if you’re more focused on evaluation and experimentation, or on the importance of auto-scaling and not having to manage scaling manually, you should probably use Google Kubernetes Engine. If you’re a Microsoft shop and you want to take advantage of Azure’s service catalog, which enables a client to request Azure services for applications running on a cluster, you should probably choose Azure Kubernetes Service.

The choice should always boil down to the business use cases you are trying to solve, as there will always be complexity. Did you think managing monolithic apps or a large Docker Compose file was a pain? Try managing hundreds of YAML files for deployments, pods, services, and daemon sets.

Eric Johanson is a software engineer at AppDynamics. Prior to AppDynamics, Eric held engineering and sales roles at AltX before it was acquired by Addepar.

New Tech Forum provides a venue to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to newtechforum@infoworld.com.

Copyright © 2018 IDG Communications, Inc.