Sunday, November 28, 2010

ReadyNAS SNMP agent dies... aaarrgggh

If you've been following along, you'll be aware that I set up Nagios monitoring of our ReadyNAS units via SNMP. Happiness ensues! Until Nagios starts spitting out warnings:

readyNAS temp is UNKNOWN
SNMP problem - No data received from host

Oh crud. The box is still happily clicking along: it responds to pings, and FrontView (the web management interface) still works. Strangest of all, most of SNMP is still responding:

$ snmpwalk -v1 -cpublic em-nas system
SNMPv2-MIB::sysDescr.0 = STRING: Linux em-nas #1 Wed Sep 22 04:42:09 PDT 2010 padre
SNMPv2-MIB::sysObjectID.0 = OID: NET-SNMP-MIB::netSnmpAgentOIDs.10
DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (50273528) 5 days, 19:38:55.28
SNMPv2-MIB::sysContact.0 = STRING: root
[snip all the exciting rest of the system section of the MIB]

But try to walk the ReadyNAS-specific section of the MIB and nothing comes back:

$ snmpwalk -v1 -cpublic em-nas enterprises.4526

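The two walks above boil down to a simple test: if the standard system subtree answers but enterprises.4526 comes back empty, snmpd itself is alive and only the ReadyNAS part is missing. Here is a rough sketch of that check as a shell function with Nagios-style exit codes. The function names and the HOST and COMMUNITY defaults are made up for illustration, and it assumes snmpwalk is on the PATH of the monitoring host.

```shell
#!/bin/sh
# Sketch only: tell "snmpd up but ReadyNAS subtree empty" apart from
# "host not answering SNMP at all". HOST and COMMUNITY are placeholders
# and the function names are invented for this example.
HOST="${HOST:-em-nas}"
COMMUNITY="${COMMUNITY:-public}"

# Succeeds if walking the given subtree returns at least one "OID = value" line.
subtree_answers() {
    snmpwalk -v1 -c"$COMMUNITY" "$HOST" "$1" 2>/dev/null | grep -q ' = '
}

# Prints a one-line diagnosis and returns a Nagios-style exit code
# (0 = OK, 2 = CRITICAL, 3 = UNKNOWN).
check_readynas_subtree() {
    if ! subtree_answers system; then
        echo "UNKNOWN - no SNMP response from $HOST at all"
        return 3
    fi
    if ! subtree_answers enterprises.4526; then
        echo "CRITICAL - snmpd answers but enterprises.4526 is empty (sub-agent down?)"
        return 2
    fi
    echo "OK - enterprises.4526 is answering on $HOST"
    return 0
}
```

To use it as a standalone check, add a final line that calls check_readynas_subtree and exits with its return code.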
Hmmm... taking a shrewd guess, the ReadyNAS section of the MIB is probably implemented as a sub-agent, and that sub-agent has died. Let's have a poke around. Reading /etc/init.d/snmpd, sure enough, the script starts the usual snmpd and snmptrapd AND /usr/sbin/readynas-agent. Aha, so is this process running?

em-nas:~# ps axwu | grep [a]gent

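Nothing, so the sub-agent is gone. A quick solution is to restart the whole SNMP service, which starts readynas-agent again: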

em-nas:~# /etc/init.d/snmpd restart
em-nas:~# ps axwu | grep [a]gent
root     29772  0.1  1.3  9600 3168 ?        S    09:28   0:00 /usr/sbin/readynas-agent

Now check that the ReadyNAS MIB works again:

$ snmpwalk -v1 -cpublic em-nas enterprises.4526.18.7.1
SNMPv2-SMI::enterprises.4526. = INTEGER: 1
SNMPv2-SMI::enterprises.4526. = STRING: "Volume C"
SNMPv2-SMI::enterprises.4526. = STRING: "RAID Level X"
SNMPv2-SMI::enterprises.4526. = STRING: "ok"
SNMPv2-SMI::enterprises.4526. = INTEGER: 2837504
SNMPv2-SMI::enterprises.4526. = INTEGER: 2177024

Yep, that's got it. Now to follow up: why does readynas-agent crash?
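Until the root cause turns up, the manual fix could be automated as a stop-gap. The following is only a sketch of a watchdog that repeats the steps above: look for readynas-agent in the process list and, if it is missing, run the snmpd init script. The function names and the AGENT and INIT_SCRIPT variables are invented for this example, and I have not run it on the NAS itself.

```shell
#!/bin/sh
# Stop-gap watchdog sketch: restart snmpd when readynas-agent has vanished.
# AGENT and INIT_SCRIPT default to the paths seen in the post; the variable
# and function names are invented for this example.
AGENT="${AGENT:-/usr/sbin/readynas-agent}"
INIT_SCRIPT="${INIT_SCRIPT:-/etc/init.d/snmpd}"

# Succeeds if a process whose command line contains $AGENT is in the process list.
agent_running() {
    ps axwu | grep -F "$AGENT" | grep -v grep >/dev/null
}

# Restarts snmpd (which also starts the sub-agent) only when the agent is missing.
restart_if_agent_dead() {
    if agent_running; then
        return 0
    fi
    echo "readynas-agent not running, restarting snmpd"
    "$INIT_SCRIPT" restart
}
```

With a final line calling restart_if_agent_dead, this could be run from root's crontab every few minutes. It only papers over the crash, so the question of why readynas-agent dies still stands.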
