Centralize monitoring when you migrate to the cloud

Data migrations are the best time to add a centralized monitoring system

My checklist for cloud migrations includes several important steps to ensure that you achieve both the business value of operating in the cloud and the reliability of the applications running there. One of those key steps is to set up a holistic monitoring strategy for the cloud’s infrastructure, including networks, security, compute, and storage as well as the applications, databases, and services. Adding monitoring capabilities provides early warnings of unexpected issues and is critical to managing capacity, costs, and longer-term reliability.

Different cloud migration strategies drive business agility in different ways, but investing in monitoring—specifically application monitoring—is at the top of my list for these scenarios:

  • For lift-and-shift strategies where applications are transitioned directly to cloud infrastructure. Monitoring can indicate unexpected performance issues.
  • For in-cloud transformations where applications are re-engineered and optimized to run in the cloud. Added monitoring can alert you to new types of incidents and unforeseen capacity problems.
  • When deploying applications to multiple clouds. The added monitoring can report on latency issues and help identify the root causes of failures in complex transactions that span multiple microservices.

Monitoring cloud applications and services may require new tools

Unfortunately, just adding new monitoring capabilities may not be so easy. For organizations taking their first steps into the cloud, a whole set of new monitoring tools and types of alerts needs to be considered. Organizations that already have investments in data centers are likely to find that the tools used on premises for virtualized systems and private clouds may not work as well for public cloud applications, services, and serverless computing. Even when organizations adopt a multicloud strategy, they likely want to take advantage of each cloud vendor’s built-in monitoring capabilities. This essentially means that any cloud migration is likely to introduce new monitoring tools.

One other factor with cloud migrations is that often new people need to participate in configuring monitoring tools and responding to alerts. For example, a new cloud-native application likely has developers, devops engineers, and business owners who are subject-matter experts on what to monitor and who should be alerted when incidents occur. The new members may use different workflow tools: A new cloud-first team may be using Jira and Slack whereas the data center team may be using ServiceNow and Skype for Business.

Bottom line: Even though there are very good reasons to add monitoring to the checklist, doing so also adds complexity. You can minimize that complexity by executing a centralized monitoring strategy as part of a cloud migration or multicloud strategy.

Implementing a centralized monitoring strategy

You can best understand the need for a centralized monitoring solution, and how it functions, by reviewing how monitoring tools were deployed and configured in the past.

Most IT operations teams started with a number of basic monitoring tools such as Nagios and Perfmon or platforms such as SolarWinds, WhatsUp Gold, and OpManager to report on networks and infrastructure. That’s why operational teams were stronger at responding to infrastructure issues, but historically poor at responding to end-user, application, or database performance issues.

Beyond infrastructure monitoring tools, operations teams commonly added monitoring tools on an as-needed basis. In some cases, tools were added in response to a set of recurring issues, for example, monitoring unreliable databases for capacity and performance problems. In other cases, monitoring was tied to adding new infrastructure such as new data center locations, networks, enterprise systems, or storage devices. Adding cloud infrastructure falls into this second category.

When adding new monitoring tools, it’s common for the engineer assigned to configure the tool to set up reporting and alerts to go directly to his or her team. This may be the easiest approach to get reports and alerts configured quickly, but in the long run it creates siloed access to information and the potential that multiple teams receive alerts from different tools.

A better way is to centralize monitoring. Each monitoring solution still collects data and provides its own proprietary reports for diagnosing issues, but the same monitoring data is also aggregated into a centralized monitor that can perform many functions over a wider scope of data. This centralization has several benefits:

  • Incidents can be logically grouped from multiple monitoring tools. Alerts from independent monitoring tools no longer indiscriminately fire off to independent teams. Instead, alerts are logically consolidated into incidents, analyzed using a wider data set, and intelligently routed to the right teams for response.
  • One central system can analyze slower-changing trends that might indicate capacity, security, or application usability issues.
  • Integration with workflow tools can be implemented more efficiently through the centralized monitoring tool instead of wiring in point-to-point integrations.
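
The first and third benefits can be illustrated with a minimal sketch. It assumes alerts have already been collected from several tools and that alerts for the same service arriving close together belong to one incident. The Alert and Incident types, the five-minute window, and the service and team names in ROUTES are all hypothetical, chosen only for illustration; a real centralized monitor would correlate on far richer data.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Alert:
    source: str       # monitoring tool that raised the alert (hypothetical names)
    service: str      # application or service the alert concerns
    timestamp: float  # seconds since some common epoch
    message: str


@dataclass
class Incident:
    service: str
    alerts: List[Alert] = field(default_factory=list)


def group_alerts(alerts: List[Alert], window_seconds: float = 300.0) -> List[Incident]:
    """Group alerts for the same service into one incident when each alert
    arrives within window_seconds of the previous alert for that service."""
    incidents: List[Incident] = []
    latest: Dict[str, Incident] = {}  # most recent incident per service
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        current = latest.get(alert.service)
        if current and alert.timestamp - current.alerts[-1].timestamp <= window_seconds:
            current.alerts.append(alert)
        else:
            current = Incident(service=alert.service, alerts=[alert])
            latest[alert.service] = current
            incidents.append(current)
    return incidents


# Hypothetical routing table: one place to decide which team owns which service,
# instead of configuring escalations separately in every monitoring tool.
ROUTES = {"checkout": "payments-team", "network": "infrastructure-team"}


def route(incident: Incident, default_team: str = "operations") -> str:
    """Return the team that should respond to the incident."""
    return ROUTES.get(incident.service, default_team)
```

With this shape, two alerts about the same service from different tools, raised two minutes apart, produce one incident routed to one team rather than two separate notifications.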

The intelligence is first enabled by centralizing the data and the integrations with workflow tools. The real benefits materialize as organizations implement autonomous operations and leverage open-box machine learning to intelligently group alerts into manageable incidents.

Cloud transitions are the best time to implement centralized monitoring

The ideal time to create a centralized monitoring solution is when migrating applications and services to the cloud. You must still go through the process of configuring monitoring at the infrastructure, application, and service levels. But instead of configuring escalations in each of these solutions, you integrate their monitoring data and alerts into your central system. This shifts the effort from implementing escalations in individual monitoring tools to implementing them once in the centralized one. In the end, the IT operations teams get all the benefits of centralization with little added effort.
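
As a rough sketch of what that integration step might involve, the code below converts alerts from two tools into one common shape before they reach the central system. Everything here is an assumption made for illustration: the tool names, the payload fields, and the three-level severity scale are invented and do not reflect any real product's format.

```python
from typing import Any, Callable, Dict

Event = Dict[str, Any]


def from_infra_tool(payload: Dict[str, Any]) -> Event:
    """Adapter for a hypothetical infrastructure tool that reports a state string."""
    states = {"OK": "info", "WARNING": "warning", "CRITICAL": "critical"}
    return {
        "resource": payload["host"],
        "severity": states.get(payload["state"], "warning"),
        "summary": payload["output"],
    }


def from_cloud_tool(payload: Dict[str, Any]) -> Event:
    """Adapter for a hypothetical cloud tool that reports a numeric level."""
    levels = {0: "info", 1: "warning", 2: "critical"}
    return {
        "resource": payload["resource_id"],
        "severity": levels.get(payload["level"], "warning"),
        "summary": payload["description"],
    }


# One adapter per monitoring tool; adding a tool means adding one entry here
# rather than configuring a new set of escalations.
ADAPTERS: Dict[str, Callable[[Dict[str, Any]], Event]] = {
    "infra_tool": from_infra_tool,
    "cloud_tool": from_cloud_tool,
}


def normalize(source: str, payload: Dict[str, Any]) -> Event:
    """Convert a tool-specific payload into the common event shape."""
    if source not in ADAPTERS:
        raise ValueError(f"no adapter registered for source {source!r}")
    event = ADAPTERS[source](payload)
    event["source"] = source
    return event
```

The design choice is that each new monitoring tool introduced by the migration needs only one small adapter, while grouping, routing, and workflow integrations are defined once against the common event shape.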

But that also depends on the approach taken to implement centralized monitoring, and there are several strategies. Implementing a proprietary data lake and reporting tool may offer the most flexibility, but it does require an investment in developing a data lake or warehouse, building reports, configuring alerts, and integrating with workflow tools. Companies such as BigPanda offer centralized monitoring with built-in integrations, machine learning, autonomous operations, and unified analytics.

Centralized monitoring can be very powerful, especially when the machine learning properly correlates multiple alerts into single incidents, making it faster and easier to identify root causes. This is why including centralized monitoring in a cloud migration is high on my checklist. It balances the risk of adding new infrastructure capabilities by providing richer monitoring, and it can be implemented more efficiently than configuring alerts in multiple monitoring tools.

Copyright © 2019 IDG Communications, Inc.