5 principles of monitoring microservices

Microservices offer a new way to build, deliver, and manage enterprise applications, but they also require a different approach to how devops teams monitor their systems and the data that gives them insight into overall health.


I started programming when I was seven (on paper and without a computer, but that’s another story). One thing I learned early is that software development, like life, is full of trade-offs: Organizations and developers used to have to choose between performance and simplicity, or between innovation and manageability. But as microservices emerged as part of the container/Docker trend, application development was transformed: an application became a set of small services, each running in its own process and communicating through mechanisms like an API. Microservices’ advantages are obvious: tremendous speed gains in software development and deployment, which save money and, in the right circumstances, give the organization a competitive edge.

I’ve talked with many devops folks over the last few years and learned a lot about their challenges. Those conversations have made it clear to me that as they embrace microservices, organizations need to rethink their software management practices as part of good performance and security hygiene. Use of microservices means changing the approach to software management, specifically how an organization handles monitoring of infrastructure, applications, and data. If those practices are left unchanged, organizations will struggle to understand microservices performance, let alone troubleshoot problems. Here are five ways to make your monitoring of microservices smarter, more responsive, and more efficient.

1. Monitor containers and what runs inside them

As the building blocks of microservices, containers are black boxes that span the developer laptop to the cloud. But without real visibility into containers, it’s hard to perform basic functions like monitoring or troubleshooting a service. You need to know what’s running in the container, how the application and code are performing, and if they’re generating important custom metrics.

And as your organization scales up, possibly running thousands of hosts with tens of thousands of containers, deployments can get expensive and become an orchestration nightmare. To get container monitoring right, you have a few choices: Ask developers to instrument their code directly, run sidecar containers, or leverage universal kernel-level instrumentation to see all application and container activity. Each approach has advantages and drawbacks, so you will need to review which one fulfills your most important objectives. The main point is that the old methods that worked with static workloads on VMs are just not going to cut it anymore.
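To make the first of those choices concrete, here is a minimal sketch of what instrumenting code directly can look like. It is an illustration only: the `TIMINGS` registry, the `timed` decorator, and the `price_order` function are hypothetical names, and a real service would hand the measurements to whatever metrics library or agent the team already uses.

```python
import time
from collections import defaultdict
from functools import wraps

# Hypothetical in-process registry: metric name -> list of call durations (seconds).
TIMINGS = defaultdict(list)


def timed(metric_name):
    """Decorator that records how long each call to the wrapped function takes."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Recorded even when the call raises, so failures are still measured.
                TIMINGS[metric_name].append(time.perf_counter() - start)
        return wrapper
    return decorator


@timed("checkout.price_order")
def price_order(items):
    # Stand-in for real business logic: items are (unit_price, quantity) pairs.
    return sum(price * qty for price, qty in items)


total = price_order([(5, 2), (3, 1)])
print(total, len(TIMINGS["checkout.price_order"]))
```

The trade-off is visible even in this toy: the metric is precise and meaningful to the developer, but every service has to carry and maintain code like this, which is what sidecars and kernel-level instrumentation try to avoid.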

2. Use orchestration systems

One of the most critical processes you can perform is tracking aggregate information from all the containers associated with a function or a service, such as the response time of each service. This kind of on-the-fly aggregation also applies to infrastructure-level monitoring, for example to find out which services’ containers are using resources beyond their allocated CPU shares.
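As a rough sketch of that kind of aggregation, the snippet below groups per-container samples by service, averages response time, and lists containers running over their CPU allocation. The `ContainerSample` fields and the sample values are assumptions made for illustration; a real monitoring system would pull these from its own collectors.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class ContainerSample:
    # Hypothetical shape of one measurement taken from one container.
    container_id: str
    service: str
    response_ms: float
    cpu_used: float       # CPU cores in use
    cpu_allocated: float  # CPU cores the container was allocated


def mean_response_by_service(samples):
    """Average response time across all containers of each service."""
    grouped = defaultdict(list)
    for s in samples:
        grouped[s.service].append(s.response_ms)
    return {service: mean(values) for service, values in grouped.items()}


def containers_over_cpu(samples):
    """Map each service to the containers using more CPU than allocated."""
    over = defaultdict(list)
    for s in samples:
        if s.cpu_used > s.cpu_allocated:
            over[s.service].append(s.container_id)
    return dict(over)


samples = [
    ContainerSample("c1", "cart", 120.0, 0.4, 0.5),
    ContainerSample("c2", "cart", 180.0, 0.7, 0.5),
    ContainerSample("c3", "search", 60.0, 0.2, 0.5),
]

print(mean_response_by_service(samples))
print(containers_over_cpu(samples))
```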

If you are part of a development team, you can tap an orchestration system like Kubernetes, Mesosphere DC/OS, or Docker Swarm to define your microservices and understand the current state of each deployed service. If you are part of a devops team, you should redefine system alerts to get as close to monitoring the experience of the service as possible, because the alerts are the first line of defense in assessing application health. This process is easier if the monitoring system is container-native, meaning that it uses orchestration metadata, dynamically aggregates container and application data, and calculates monitoring metrics on a per-service basis.

Depending on the orchestration tool, there may be different layers of a hierarchy to drill into. Kubernetes, for example, offers Namespaces, ReplicaSets, Pods, and the containers inside them. Aggregating at these layers is critical for logical troubleshooting, regardless of how the service’s containers are physically deployed.
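One way to picture aggregation at these layers is a roll-up keyed by a prefix of the hierarchy. The sketch below assumes each metric record already carries its orchestration metadata as plain dictionary keys; the record layout, the `LEVELS` tuple, and the sample names are hypothetical and are not taken from the Kubernetes API.

```python
from collections import defaultdict

# Hierarchy from coarsest to finest, mirroring the layers named above.
LEVELS = ("namespace", "replicaset", "pod", "container")


def rollup(records, level, metric="cpu"):
    """Sum a metric over every record that shares the same path down to `level`."""
    depth = LEVELS.index(level) + 1
    totals = defaultdict(float)
    for record in records:
        key = tuple(record[name] for name in LEVELS[:depth])
        totals[key] += record[metric]
    return dict(totals)


records = [
    {"namespace": "shop", "replicaset": "cart-7d9", "pod": "cart-7d9-a",
     "container": "app", "cpu": 0.25},
    {"namespace": "shop", "replicaset": "cart-7d9", "pod": "cart-7d9-a",
     "container": "proxy", "cpu": 0.25},
    {"namespace": "shop", "replicaset": "cart-7d9", "pod": "cart-7d9-b",
     "container": "app", "cpu": 0.5},
    {"namespace": "ops", "replicaset": "logs-1", "pod": "logs-1-a",
     "container": "agent", "cpu": 0.125},
]

print(rollup(records, "namespace"))
print(rollup(records, "pod"))
```

Because the key is built from metadata rather than from host names, the result is the same no matter which machines the containers happen to land on.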

3. Prepare for elastic, multilocation services

Container-native environments change quickly, and that sort of dynamism exposes the weaknesses in any monitoring system. Manually defining and tuning metrics may work well for 20 or 30 containers, but microservice monitoring must be able to expand and contract alongside elastic services—and without human intervention. So if you must manually define what service a container is included in for monitoring, you’re likely to miss new containers spun up during the day by Kubernetes or Mesos. In the same vein, if your organization installs a custom stats endpoint when new code is built and deployed, monitoring is complicated as developers pull base images from a Docker registry.
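A small sketch can show the difference between a hand-maintained list and label-driven discovery. It assumes the orchestrator can hand back a snapshot of running containers with their labels, represented here as plain dictionaries; the `SERVICE_LABEL` key and the `reconcile` function are hypothetical names, not part of any orchestrator's API.

```python
# Hypothetical label key that ties a container to its service.
SERVICE_LABEL = "app"


def reconcile(monitored, running):
    """Rebuild the monitored set from a snapshot of running containers.

    monitored: dict of container id -> service currently being watched
    running:   list of {"id": ..., "labels": {...}} from the orchestrator
    Returns (desired, added, removed, unlabeled).
    """
    desired = {}
    unlabeled = []
    for container in running:
        service = container["labels"].get(SERVICE_LABEL)
        if service is None:
            # Surface these instead of silently ignoring them.
            unlabeled.append(container["id"])
        else:
            desired[container["id"]] = service
    added = sorted(set(desired) - set(monitored))
    removed = sorted(set(monitored) - set(desired))
    return desired, added, removed, unlabeled


monitored = {"c1": "cart", "c2": "cart"}
running = [
    {"id": "c1", "labels": {"app": "cart"}},
    {"id": "c3", "labels": {"app": "cart"}},    # scaled up since the last pass
    {"id": "c4", "labels": {"app": "search"}},
    {"id": "c5", "labels": {}},                 # no service label at all
]

desired, added, removed, unlabeled = reconcile(monitored, running)
print(added, removed, unlabeled)
```

Run on every snapshot, a loop like this picks up containers the scheduler started and drops ones it stopped, with nobody editing a list by hand.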

If your organization is using microservices, you also need monitoring that spans multiple data centers or multiple clouds. Using AWS CloudWatch, for example, is only useful if your microservices are limited to AWS.

4. Monitor APIs

As the lingua franca of microservice environments, APIs are the only elements of a service exposed to other teams. An API’s response and consistency may even become the internal service level agreement if a formal SLA hasn’t been defined. That means API monitoring must go beyond the standard, binary up-down checks.

As a user of microservices you will find it’s valuable to understand your most frequently used endpoints as a function of time, letting you see what’s changed in services usage, whether because of a design or user change. Discovering the slowest service endpoints can also show you areas that need optimization. Being able to trace service calls through the system, a function typically used only by developers, will help your organization understand the overall user experience. That aspect of API monitoring will also break down information into infrastructure- and application-based views of the microservices environment.
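The two endpoint views described here, usage over time and slowest endpoints, can be sketched from a plain request log. The log format below (a timestamp in seconds, an endpoint path, and a latency in milliseconds) is an assumption for illustration, and the percentile uses the simple nearest-rank method rather than any particular monitoring product's definition.

```python
import math
from collections import Counter, defaultdict


def nearest_rank(values, pct):
    """Nearest-rank percentile: the smallest value covering pct% of the data."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def calls_per_hour(requests):
    """Count calls per (hour bucket, endpoint) to show usage as a function of time."""
    counts = Counter()
    for r in requests:
        counts[(r["ts"] // 3600, r["endpoint"])] += 1
    return counts


def slowest_endpoints(requests, pct=95, top=3):
    """Rank endpoints by their pct-th percentile latency, slowest first."""
    by_endpoint = defaultdict(list)
    for r in requests:
        by_endpoint[r["endpoint"]].append(r["ms"])
    ranked = sorted(
        ((nearest_rank(latencies, pct), endpoint)
         for endpoint, latencies in by_endpoint.items()),
        reverse=True,
    )
    return [(endpoint, latency) for latency, endpoint in ranked[:top]]


requests = [
    {"ts": 0, "endpoint": "/cart", "ms": 40},
    {"ts": 100, "endpoint": "/cart", "ms": 900},
    {"ts": 200, "endpoint": "/search", "ms": 30},
    {"ts": 3700, "endpoint": "/search", "ms": 50},
    {"ts": 3800, "endpoint": "/search", "ms": 35},
]

print(calls_per_hour(requests))
print(slowest_endpoints(requests))
```

A binary up-down check would report both endpoints as healthy here, while the percentile view shows that `/cart` occasionally takes close to a second.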

5. Map monitoring to the organizational structure

While microservices presage a comprehensive shift in how you and your organization monitor and secure your software infrastructure, it’s essential that you not overlook the people aspects of software monitoring. If your organization wants to benefit from this new software architecture approach, your teams must mirror microservices themselves. That means smaller, more loosely coupled teams with a lot of autonomy but focused on strategic objectives. Each team retains more control than ever over languages used, bug resolution or operational responsibilities. You can then create a monitoring platform that allows each microservice team to isolate its alerts, metrics, and dashboards, while still giving operations a global view across the system.
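In data terms, that split between team views and a global view can be as simple as tagging every alert with its owning team. The sketch below assumes such a tag exists; the alert fields, team names, and function names are hypothetical.

```python
from collections import Counter


def alerts_for_team(alerts, team):
    """The isolated view: only the alerts owned by one microservice team."""
    return [a for a in alerts if a["team"] == team]


def firing_by_team(alerts):
    """The global view for operations: count of firing alerts per team."""
    return Counter(a["team"] for a in alerts if a["firing"])


alerts = [
    {"name": "cart-latency", "team": "checkout", "firing": True},
    {"name": "cart-errors", "team": "checkout", "firing": False},
    {"name": "search-latency", "team": "discovery", "firing": True},
]

print([a["name"] for a in alerts_for_team(alerts, "checkout")])
print(firing_by_team(alerts))
```

Both views read from the same data, so a team can tune its own alerts without hiding anything from operations.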

The speed, efficiency and potential savings of microservices can be helped along by some basic changes in software monitoring. Improved monitoring of microservices makes for robust performance and greater end-user satisfaction. Properly calibrated, microservices will deliver more capabilities in less time, a virtuous circle of service delivery.

This article is published as part of the IDG Contributor Network.