Network monitoring is the nervous system of any infrastructure. Keeping tabs on your services -- whether they run locally or in the cloud -- is vital to maintaining a stable, functional infrastructure.
In this week's New Tech Forum, Ernest Mueller, product manager at infrastructure-monitoring provider CopperEgg, walks us through the growing field of network and service monitoring, and what we can now leverage to keep tabs on our ever more complex infrastructures. -- Paul Venezia
If an app slows in Singapore, does ops in Palo Alto know why?
One of the more fascinating points about the multimillion-dollar IT systems business is that most of its operations are invisible to the naked eye. You may be sitting in a server room, but still have no idea what workloads are being performed, whether services are up or down, or what performance is like.
That's why we rely on instrumentation -- the monitoring tools that offer a deep view of what's going on inside this hidden world.
There are many different ways to instrument your applications and systems to get a view into what they're doing. But as with the real, physical world, what you're observing and how you perceive the events in question affect the conclusions you derive.
As an example, with a real user-monitoring tool, you may see that a critical application is working well for your users, who are based in the United States. But a synthetic probe shows there's an issue with access to your site from Singapore, and it will affect users in several hours when that office comes online. Neither form of monitoring is "better," per se -- they're simply instrumenting your system in different ways and therefore showing different levels of information. To have the most comprehensive view of your site, you'd want both.
Let's explore the many ways an organization can acquire and digest monitoring data and the implications of each.
The instrumentation technique is largely defined by two factors: the method by which the instrument takes a sample, and the point in the web of paths and dependencies within the system where the sample is taken.
Instrumentation methods are typically divided into passive and active approaches. Passive instrumentation tries to record data about what is going on in the system without affecting it (though the observer effect is difficult to remove entirely). Active instrumentation specifically provides a stimulus to provoke a response.
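To make the active approach concrete, here is a minimal sketch in Python of a synthetic probe: it deliberately sends a request (the stimulus) and records the response code and total latency. The probe target is a throwaway local HTTP server started in-process so the example runs anywhere; in practice you would point `probe()` at a real endpoint. All names here (`probe`, `Handler`) are illustrative, not from any particular monitoring product.

```python
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    """Stand-in endpoint so the example needs no network access."""
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok")
    def log_message(self, *args):
        pass  # silence per-request logging

# Start the throwaway server on a random free port in a daemon thread.
server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/"

def probe(url, timeout=5):
    """Active check: provoke a response, record status and total latency."""
    start = time.perf_counter()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status = resp.status
    except OSError:
        status = None  # a failed sample is still a sample
    elapsed_ms = (time.perf_counter() - start) * 1000
    return {"status": status, "elapsed_ms": round(elapsed_ms, 2)}

sample = probe(url)
print(sample)
```

A passive counterpart would instead tap data the system already emits (access logs, packet captures) without sending anything -- which is exactly why it is harder to show in a dozen lines.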
Whichever approach is used, the other differentiator is depth: what type of information is captured in a sample and how rich it is. For example, a synthetic Web transaction can simply capture a response code and total time taken for the request, or it can capture an entire "waterfall" of performance of every component and how long results took to render in a real browser. A sample could be taken every minute or every hour.
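The depth distinction can also be sketched in code. Where the probe above captured only a status code and one total time, the hypothetical `deep_sample()` below breaks the same request into phases -- name resolution, TCP connect, and response -- which is a crude, assumption-laden analog of a waterfall view. It again uses a local throwaway server so it runs without network access.

```python
import http.client
import socket
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    """Stand-in endpoint so the example needs no network access."""
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok")
    def log_message(self, *args):
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
host, port = "127.0.0.1", server.server_port

def deep_sample(host, port, path="/"):
    """Richer sample: time each phase of the request separately."""
    t0 = time.perf_counter()
    socket.getaddrinfo(host, port)        # phase 1: name resolution
    t1 = time.perf_counter()
    conn = http.client.HTTPConnection(host, port, timeout=5)
    conn.connect()                        # phase 2: TCP connect
    t2 = time.perf_counter()
    conn.request("GET", path)
    resp = conn.getresponse()
    resp.read()                           # phase 3: full response body
    t3 = time.perf_counter()
    conn.close()
    ms = lambda a, b: round((b - a) * 1000, 2)
    return {
        "status": resp.status,
        "dns_ms": ms(t0, t1),
        "connect_ms": ms(t1, t2),
        "response_ms": ms(t2, t3),
        "total_ms": ms(t0, t3),
    }

sample = deep_sample(host, port)
print(sample)
```

A real browser waterfall goes much further (per-asset timing, rendering), but the trade-off is the same: richer samples cost more to collect, which is also why sampling frequency matters.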
The instrumentation points where you collect data can be on a server, within an application, on the network, in a client browser, or at any other part of the service. Which points you choose to instrument affects what behavior you have direct data on.
For example, if you instrument the JVM (Java virtual machine) that an application runs in, you know specifically what that JVM is doing -- and which behaviors you can only infer. (The JDBC call is slow, but is that because of the network, the database, the database server, or something else entirely?) It's best to choose your points of instrumentation carefully, so you can see the high-level state of your system and its delivery to real users, but also dive deeper to effectively isolate and troubleshoot faults or slowdowns.
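The JDBC blind spot generalizes to any single instrumentation point. As a language-neutral sketch (in Python rather than Java, and with a hypothetical `fetch_orders` standing in for a database call), the decorator below records how long each call takes from inside the application layer. It can tell you the call is slow, but nothing in the sample says whether the time went to the network, the database, or something else -- that answer requires instrumenting those points too.

```python
import functools
import time

def instrumented(fn):
    """Record the wall-clock latency of every call, as seen from this layer."""
    samples = []
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            # We only see total elapsed time; where it was spent is invisible.
            samples.append((time.perf_counter() - start) * 1000)
    wrapper.samples = samples
    return wrapper

@instrumented
def fetch_orders(customer_id):
    time.sleep(0.05)  # stand-in for a remote database round trip
    return ["order-1"]

fetch_orders("cust-42")
print(fetch_orders.samples)  # latencies in milliseconds, one per call
```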