I just barely finished turning up two new datacenters in two different states within two weeks. Exhausting? Definitely. On the plus side, however, I wrote several new tools and plugins to manage all of the APC gear that went into both sites with Nagios and Cacti.
First, a little background. Both datacenters were built to be nearly identical to each other -- from rack layout to equipment, to color-coded patch cabling. The major difference is that one site is cooled with APC ACSC100 In-row air units, and the other cooled with ACRC100 In-row water-cooling units. Both sites are powered from APC Symmetra PX UPSes and PDUs, and use APC racks and 3-phase zero-U rackmount PDUs. In addition, several NetBotz WallBotz 500 units were implemented to provide external environmental monitoring and surveillance of the rooms. Basically, it's all APC gear. I'll be posting more on the build process over the next few weeks, but I wanted to get some of the code out there first.
I wrote two main plugins for Nagios and Cacti to assist in monitoring all this new stuff. The Nagios plugin checks the most pertinent data on the ACRC and ACSC units, as well as the main sensors on the NetBotz units, and the load on each phase on the PDUs. It's come in very handy since the sites were turned up, since I have a easily-digested central view of all PDUs, or all AC units on one page. Tweaking parameters on the AC units becomes very simple when you have all the data in one place, versus having to log into each unit to get status info, or even using APC's Infrastruxure Central Console.
Usage: ./check_apcext.pl -H <hostip> -C <community> -p <parameter> -w <warnval> -c <critval>
nbmstemp NetBotz main sensor temp
nbmshum NetBotz main sensor humidity
nbmsairflow NetBotz main sensor airflow
APC Metered Rack PDU (3 phase)
rpduamps Amps on each phase