OpenNMS has a place in every enterprise. It's a highly scalable network monitoring system that is completely open source software. A single server can monitor hundreds of thousands of network interfaces and produce nice graphs for metrics such as bandwidth usage, CPU, memory, and more.
You can set thresholds that indicate when a device is busy or down and receive a notification via email, SMS, IM, and so on. Of course you can have separate logins for each member of your NOC team, and you can set up an on-call schedule so that notifications go only to on-duty team members. OpenNMS also has an escalation handler, so if the level-one NOC techs don't take care of an issue right away, an engineer or manager can be notified to oversee issue resolution.
The Cacti graphing solution makes a good complement to OpenNMS. Although OpenNMS has the same graphing capabilities, Cacti's more intuitive Web UI allows nontechnical staff to build and manage collections of graphs that are interesting to them. For example, you could configure Cacti to graph data from your (SNMP-capable) HVAC controllers, and your facility maintenance team members could log in to Cacti and build custom views that display only the data they need to see. If one is watching fan rotation speed and another is tracking electrical power draw, they wouldn't have to view each other's data.
You can organize Cacti's graphs into trees, much as the old Microsoft file system viewers displayed files in a directory structure. And with individual logins for each staff member, everyone gets their own view settings saved under their login.
My TraceRoute (MTR) is not quite as useful as it once was. MTR relies on ICMP packets to judge network latency -- and ICMP packets are the first that modern routers will drop in favor of more important data traffic when they get too busy. However, I still find MTR a great tool for troubleshooting network links that traverse multiple routers. Specify a destination, and MTR shows you a list of routers that your traffic passes through on the way (as well as the destination itself) and the results of a continuous ping to those routers.
MTR updates the statistics of the pings as it runs, so you can see which routers are slow to respond or which are dropping a significant number of ping requests. The results include the percentage of lost packets, the response times from each router (average, best, and worst), and the standard deviations for those times. How many times have you heard a user complaining "the Internet is slow," only to discover that the problem is a particular website or provider upstream from your office? MTR is a great way to see whether there really is a problem and to get a quick idea of where the problem resides.
One of MTR's more commonly used command-line options is -n, which stops MTR from doing reverse DNS lookups on the IP addresses of the routers it pings. This is handy when you're having DNS problems and don't want to wait for the lookups to time out. Another useful option is -r, which issues a single summary report after running a certain number of pings (specified by the -c option) to each router. This can be used with scripts to build regular reports to be printed, emailed, or even inserted into a Web page.
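As a rough illustration of the scripted reports described above, here is a minimal Python sketch that runs MTR with the -n, -r, and -c options and picks out the hops with noticeable packet loss. The function names and the 5 percent loss threshold are made up for this example, and the parser assumes the usual report layout, in which each hop line contains "|--" followed by the host, loss percentage, packets sent, and the last, average, best, worst, and standard deviation times. Column layout can differ between MTR versions, so treat the parsing as an assumption to check against your own output.

```python
import subprocess


def build_mtr_command(destination, cycles=10):
    """Build the mtr argument list: no DNS lookups, report mode, N pings per hop."""
    return ["mtr", "-n", "-r", "-c", str(cycles), destination]


def parse_report(text):
    """Parse mtr report text into a list of per-hop dictionaries.

    Assumes each hop line looks like:
      '  1.|-- 192.0.2.1   0.0%   10   0.4   0.5   0.3   0.9   0.2'
    Lines without '|--' (the Start and HOST header lines) are skipped.
    """
    hops = []
    for line in text.splitlines():
        if "|--" not in line:
            continue
        fields = line.split("|--", 1)[1].split()
        if len(fields) < 8:
            continue  # not the layout we assumed; ignore rather than guess
        host, loss, sent, last, avg, best, worst, stdev = fields[:8]
        hops.append({
            "host": host,
            "loss_pct": float(loss.rstrip("%")),
            "sent": int(sent),
            "last_ms": float(last),
            "avg_ms": float(avg),
            "best_ms": float(best),
            "worst_ms": float(worst),
            "stdev_ms": float(stdev),
        })
    return hops


def lossy_hops(hops, threshold_pct=5.0):
    """Return the hops whose packet loss exceeds the (hypothetical) threshold."""
    return [hop for hop in hops if hop["loss_pct"] > threshold_pct]


def run_report(destination, cycles=10):
    """Run mtr and return the parsed hops. Requires mtr to be installed."""
    result = subprocess.run(
        build_mtr_command(destination, cycles),
        capture_output=True, text=True, check=True,
    )
    return parse_report(result.stdout)
```

Calling run_report("example.com") from a scheduled job and passing the result to lossy_hops() would give you a short list of suspect routers to drop into an email or Web page. Keep in mind that a busy router may deprioritize ICMP, so loss reported at a middle hop that does not continue to the hops after it is not necessarily a real problem.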