Wednesday, May 11, 2011

Nagios monitoring HP ProLiant running Ubuntu 10.04 Linux

A while back I wrote about how to get the HP bits and pieces installed on Ubuntu 10.04. Now the next step: getting Nagios to poll it via SNMP and report if things are okay.

OK, first make sure net-snmp is installed and answering queries. A quick check is snmpwalk -v1 -cpublic your.host.name system, which should return basic SNMP info such as sysLocation, the contact name and so on.

Now run /sbin/hpsnmpconfig (installed by the hp-snmp-agents package) and answer the prompts. Note that the initial question, "Do you wish to use an existing snmpd.conf", fooled me the first few times through: answer "n" to have it make changes to your existing snmpd.conf file. Is it just me, or is this question confusing?

Anyhow... after that it was all smooth sailing. I answered with the config I wanted and, hey presto, it made its changes to snmpd.conf.

Then from another box, try a few snmpwalks to make sure the HP sub-agent is answering:

snmpwalk -v 1 -c public my.host.name 1.3.6.1.4.1.232

(1.3.6.1.4.1.232 is where HP's SNMP subtree, or arc if you prefer the technical term, starts.)

You will get an abundance of SNMP output, all of it pertaining to HP components.
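
If you'd rather not eyeball all of that, a rough sanity check is to count how many of the returned OIDs actually sit under HP's arc. This is only a sketch: count_hp_oids is a helper name made up for illustration, and it assumes you add -On to snmpwalk so the OIDs are printed numerically.

```shell
#!/bin/sh
# count_hp_oids (hypothetical helper): read snmpwalk output on stdin and
# count the lines whose OID sits below the HP/Compaq arc 1.3.6.1.4.1.232.
# Assumes numeric OIDs, i.e. snmpwalk was run with -On.
count_hp_oids() {
    # grep -c prints the count but exits non-zero when the count is 0,
    # so "|| true" keeps the function's exit status clean.
    grep -c '^\.1\.3\.6\.1\.4\.1\.232\.' || true
}

# Usage, with the host name and community string used in this post:
#   snmpwalk -v 1 -c public -On my.host.name 1.3.6.1.4.1.232 | count_hp_oids
```

A count of 0 suggests the HP sub-agent isn't answering, in which case it's worth going back over the hpsnmpconfig step.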


Then add the check_hp plugin to your Nagios box: download the file, extract the tarball, and cp check_hp /path/to/nagios/executables (for us, on nanoBSD, /usr/local/libexec/nagios/). Note that I had to edit the check_hp script so that its first "use lib" line points to /usr/local/libexec/nagios/. If this path is wrong, check_hp will complain that it can't find utils.pm.
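
Before editing check_hp, it can help to confirm where utils.pm actually lives, since that is the directory "use lib" needs to point at. A minimal sketch: find_utils_pm is a hypothetical helper, and the directory in the usage lines is just the nanoBSD path from this post.

```shell
#!/bin/sh
# find_utils_pm (hypothetical helper): report whether utils.pm exists in the
# directory you intend to put on check_hp's "use lib" line.
find_utils_pm() {
    dir="${1%/}"    # strip one trailing slash, if present
    if [ -f "$dir/utils.pm" ]; then
        echo "found $dir/utils.pm"
    else
        echo "no utils.pm in $dir"
        return 1
    fi
}

# Usage:
#   find_utils_pm /usr/local/libexec/nagios/
#   # list the "use lib" lines in the plugin, with line numbers:
#   grep -n 'use lib' /usr/local/libexec/nagios/check_hp
```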

Then test:
nagios# /usr/local/libexec/nagios/check_hp -H my.host.name
Compaq/HP Agent Check: overall system state OK
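
Nagios doesn't parse that text to decide the state; it goes by the plugin's exit status, which under the standard plugin convention is 0 for OK, 1 for WARNING, 2 for CRITICAL and 3 for UNKNOWN. Here is a small sketch for checking that by hand. nagios_state is a helper name made up for illustration, not part of check_hp.

```shell
#!/bin/sh
# nagios_state (hypothetical helper): map a plugin exit status to the
# state name Nagios conventionally associates with it.
nagios_state() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        *) echo "UNKNOWN" ;;   # 3, or anything unexpected
    esac
}

# Usage, with the plugin path and host name from this post:
#   /usr/local/libexec/nagios/check_hp -H my.host.name
#   rc=$?
#   echo "exit status $rc -> $(nagios_state "$rc")"
```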

Woohoo! Now add the command definition to commands.cfg (for us, /usr/local/etc/nagios/commands.cfg):

# 'check_hp' command definition
define command{
        command_name    check_hp
        command_line    $USER1$/check_hp -H $HOSTADDRESS$
        }

Then add the check to the host in question:

define service{
        use                             local-service         ; Name of service template to use
        host_name                       my.host.name
        service_description             HP System State
        check_command                   check_hp
        }
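
Before restarting, it's worth letting Nagios check the config first, since a typo in either definition will stop it from starting. nagios -v is the stock pre-flight check. The rest is a sketch under assumptions: verify_nagios_config is a made-up wrapper, and the default nagios.cfg path is a guess based on where commands.cfg lives on our box.

```shell
#!/bin/sh
# verify_nagios_config (hypothetical wrapper): run Nagios' own config check
# and print a one-line verdict.
# The default path is an ASSUMPTION; pass your real main config file instead.
verify_nagios_config() {
    cfg="${1:-/usr/local/etc/nagios/nagios.cfg}"
    if nagios -v "$cfg" >/dev/null 2>&1; then
        echo "config OK: $cfg"
    else
        echo "config errors in $cfg (run: nagios -v $cfg)"
        return 1
    fi
}

# Usage:
#   verify_nagios_config /usr/local/etc/nagios/nagios.cfg
```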

Restart Nagios, and watch the newly-added check get run, and (hopefully) green bits appear on screen.

If you're interested in what exactly the check_hp plugin is monitoring, you can run it manually with the -d flag to see debug output about the SNMP values checked. The script is also pretty easy to read, so have a look through it to see what is being checked. I also found a neat-o guide to the HP/Compaq MIBs which is worth a read (and considerably easier than reading the MIB files, which are enormous!).

And now that SNMP is working properly, the previously-blank System Management page (accessed at https://your.host.name:2381) will be full of useful info about your hardware. Yep, it uses SNMP to find out about the system state too. Busted SNMP == no info. No wonder I never found it very useful before!

Also a useful manual from HP here - I wish I'd had this in the first place!
