Too often, IT administrators think that they can't color outside the lines. Whether it's a custom application or an "unsupported" piece of hardware, there are many of us who believe that if a monitoring tool can't handle it immediately, it can't be handled. That's simply not the case, and with a little bit of elbow grease, just about anything can be monitored, cataloged, and made more visible.
An example might be a custom application with a database back end, like a Web store or an internal finance application. Management wants to see pretty graphs and charts depicting usage data in some form or another. If you're using something like Cacti already, there are several ways to bring this data into the fold, such as constructing a simple Perl or PHP script to run queries on the database and pass counts back to Cacti, or even an SNMP call to the database server using private MIBs (management information bases). It can be done, and it can generally be done easily.
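As a rough illustration of the script approach, here is a minimal Python sketch (rather than Perl or PHP) that counts rows in a database and prints them in the space-separated `name:value` form that Cacti's script-based data input methods can parse. Everything specific in it is an assumption made for the example: the `orders` table, its `status` column, and the in-memory SQLite database that stands in for the application's real back end.

```python
import sqlite3


def fetch_counts(conn):
    """Return {status: row_count} for the hypothetical `orders` table."""
    rows = conn.execute(
        "SELECT status, COUNT(*) FROM orders GROUP BY status"
    ).fetchall()
    return {status: count for status, count in rows}


def format_for_cacti(counts):
    """Render counts as space-separated name:value pairs, sorted by name."""
    return " ".join(f"{name}:{value}" for name, value in sorted(counts.items()))


def build_demo_db():
    """Create an in-memory stand-in for the real application database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
    conn.executemany(
        "INSERT INTO orders (status) VALUES (?)",
        [("paid",), ("paid",), ("shipped",)],
    )
    return conn


if __name__ == "__main__":
    # In real use, replace build_demo_db() with a connection to your database.
    print(format_for_cacti(fetch_counts(build_demo_db())))
```

In practice you would point the script at the production database with a read-only account and register it in Cacti as a data input method, with one output field per name the script prints.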
For unsupported hardware, as long as it speaks SNMP, you can most likely squeeze the data you need out of it with a little research. Once you have the right MIBs to query, you can then use that information to write a Nagios plug-in to monitor the device. An example might be my Nagios plug-ins for APC hardware -- they didn't exist when the hardware was installed, but I wanted to centralize the monitoring of those devices. I wrote a quick plug-in to check the PDUs (power distribution units) for amperage levels, the in-row cooling units for airflow and rack inlet temperatures, and so forth. Now, not only do I have that data in graphs via Cacti, but Nagios watches the same data, looking for anomalies and reporting to me via IM, e-mail, and even SMS if the numbers are out of whack.
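The threshold logic of such a plug-in is small. The sketch below is not the APC plug-in described above; it is a hypothetical Python example that shows only the part every Nagios plug-in shares: compare a reading against warning and critical thresholds, print one status line (with performance data after the `|`), and exit with 0, 1, 2, or 3 for OK, WARNING, CRITICAL, or UNKNOWN. The SNMP query itself is left out, so the amperage is assumed to arrive as a command-line argument, and the `load_amps` label is made up for the example.

```python
import sys

# Exit codes that Nagios interprets as service states.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(amps, warn, crit):
    """Map an amperage reading to a (exit_code, status_line) pair."""
    if amps is None:
        return UNKNOWN, "PDU UNKNOWN - no reading available"
    if amps >= crit:
        code = CRITICAL
    elif amps >= warn:
        code = WARNING
    else:
        code = OK
    # Text after the pipe is performance data: label=value;warn;crit
    line = f"PDU {LABELS[code]} - load {amps:.1f}A | load_amps={amps:.1f};{warn};{crit}"
    return code, line


# Assumed invocation: check_pdu.py <amps> <warn> <crit>
# A real plug-in would fetch <amps> from the device over SNMP instead.
if __name__ == "__main__" and len(sys.argv) == 4:
    reading, warn_at, crit_at = (float(arg) for arg in sys.argv[1:4])
    exit_code, status_line = evaluate(reading, warn_at, crit_at)
    print(status_line)
    sys.exit(exit_code)
```

Because the status line carries performance data, the same check can feed graphs as well as alerts.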
Getting most of these tools running isn't much of a challenge. On a freshly built CentOS box, all you need to do is install the proper repository RPM from RPMForge, then type "yum install nagios ntop cacti" as root, and Nagios, Ntop, and Cacti will download and install. Configuring the tools can take quite a while depending on the size of the infrastructure, but getting them going is a cinch. At the very least, it's worth a test-drive.
No matter what tools you use to keep tabs on your infrastructure, the fact that those tools exist essentially provides the equivalent of at least one more IT admin -- one that can't necessarily fix anything, but one that watches everything, 24/7/365. The up-front time investment is well worth the effort, no matter which way you cut it. Just be sure to run a small set of autonomous monitoring tools on another server, watching the main monitoring server. This is a case where it's always best to ensure that the watcher is being watched.
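Watching the watcher does not need much machinery. As one possible sketch, the Python function below simply tests whether a TCP port on the main monitoring server accepts connections; the function names are invented for this example, and which host and port to probe (for instance, the port serving the monitoring server's Web interface) is left to you.

```python
import socket


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def describe(host, port, timeout=3.0):
    """Return a one-line, human-readable status for logs or alerts."""
    state = "up" if is_reachable(host, port, timeout) else "DOWN"
    return f"{host}:{port} is {state}"
```

Run from cron on a second machine, a check like this can send mail or a text message when the main monitoring server stops answering. A port that accepts connections is only a coarse signal, so treat this as a floor rather than a full health check.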