Sunday, October 10, 2010

Nagios monitoring Dell PE 2900 via SNMP

I decided I would like to monitor our new file server - you know, so if the RAID became degraded, I'd know... rather than lose two disks from a set like we did recently and um, lose a bit of data. Yeah, oops.

So... how hard can it be? Answer: quite hard for our Windows 2000 server. More on that later.

For Windows 2003, it wasn't too bad, but there were a few hoops to jump through. I'm documenting those hoops here for future reference:

Install OpenManage Server Administrator Managed Node (v6.3) ... ahah, but not so fast - it will probably force you to install new RAID firmware and drivers, so do that, of course, reboot... then here's the trick that got me first time around: Storage Management is deselected for installation by default, so you MUST choose a custom installation and for the love of god, select Storage Management for installation! Why would Dell do this??? Especially after making a song-and-dance that forced me to upgrade my RAID firmware... anyway...

Then to monitor via SNMP you need Windows SNMP installed (Start -> Settings -> Control Panel -> Add/Remove Programs, select "Windows Components" then "Management and Monitoring Tools", click "Details:" button and scroll down to "Simple Network Management Protocol" and make sure that's ticked. By default SNMP only allows polling from localhost (this is either good security, or absolutely stupid, depending on your point of view and level of caffeination). To allow SNMP polling from other hosts, go to the services control panel applet, find "SNMP Service", right-click, select "Properties", click the "Security" tab and either allow SNMP from all hosts, or just the hosts you choose.

Test that SNMP is working:

$ snmpget -v 1 -c public hostname .1.3.6.1.4.1.674.10893.1.20.140.1.1.2.1
SNMPv2-SMI::enterprises.674.10893.1.20.140.1.1.2.1 = STRING: "System"

(this is the name of the "disk label" for virtual disk 1). You can find a list of useful info on OpenManage SNMP here.

Great. From here I could write some simple SNMP checks for Nagios, and so long as the virtualDiskRollUpStatus (1.3.6.1.4.1.674.10893.1.20.140.1.1.19.x) comes back as 3 then we can assume we're all happy. But I thought maybe some helpful soul out there might have already written something more sophisticated for monitoring OpenManage, and they surely have - I settled on check_openmanage as a nice one.

So on the nagios server, I did this:

# cd /usr/local/libexec/nagios/
# wget http://folk.uio.no/trondham/software/check_openmanage-3.6.0/check_openmanage
# chmod +x check_openmanage
# ./check_openmanage -H my-server
OK - System: 'PowerEdge 2900 III', SN: 'XXXXXX1S', 2 GB ram (2 dimms), 2 logical drives, 4 physical drives
# vi /usr/local/etc/nagios/commands.cfg
# add these lines:
# 'check_openmanage' command definition

define command{

command_name check_openmanage

command_line $USER1$/check_openmanage -H $HOSTADDRESS$

}

Then edit the server's .cfg file to call the plugin:
define service{

use local-service ; Name of service

host_name my-server

service_description OpenManage Status

check_command check_openmanage

}



Then re-start nagios and wait till it polls, and see nice green output. Yay!

Show where Windows home directories are

In our AD, user's home directories are stored on various file servers. So when it's time to migrate them to a new file server, how do we determine who needs to get moved off the old one? ldapsearch to the rescue:


ldapsearch -x -LLL -E pr=2000/noprompt -h rov-dc -D Administrator@example.com -W -b 'cn=Users,dc=example,dc=com' -s sub homeDirectory | awk '$1 = /homeDirectory:/ {print $2}' | sort

Wednesday, October 6, 2010

Trend 10 fills disks

Trend Micro OfficeScan 10 offers us some compelling features, so we're upgrading from Trend 7 (stop laughing in the back there, we like old software!). We've found 2 significant gotchas:

Disk filler

We install the Trend server software (i.e. the part that farms out the new virus definitions to the clients on that network) in the default location - C:\Program Files\Trend Micro\ on our local file/print server and soon enough, it fills the entire C: progressively killing off services, until file and print services die, and the users start calling me. The culprit: C:\Program Files\Trend Micro\OfficeScan\PCCSRV\Apache2\logs\error.log has grown to 3 GB (yep, 3 gigabytes of error logs!). I'd love to tell you what was in that file, but Notepad won't open a file that big, and Wordpad wants to make a copy of it - on C:\temp I guess - before it opens it, making the problem even worse. So I just killed it, shrugged and moved on.

Weird Word Wackiness

OK, this is clearly a corner case, but it happened to three machines on one network. Trend 10 on the clients (two of them w2k, one XP), they try to open a Word document from a Samba file share, and...can't do it. With OpenOffice instead of Word, it works. With the documents in question copied to a Windows 2003 file server, it works. With Trend reverted to V7, it works. So: Trend 10 + MS Word 2003 + Samba file server = bizarre errors.

Wednesday, September 29, 2010

XMarks replacement

Sad news: XMarks is shutting down soon, which is a spew, since I found it really handy to sync my bookmarks across several computers and my iPhone.

Fortunately, Firefox Sync will sync my bookmarks (and history) across several computers all using Firefox. And the neato Firefox Home will enable me to access the bookmarks on my iPhone. So I'm happy-ish again.

Still a pity that XMarks is going away, though.

Monday, September 27, 2010

Set DRAC network config remotely

So I had a DRAC at a remote site that refused to play nicey-nice - I was pretty sure the default gateway was set wrong so it was unable to get out to anything beyond its LAN. I was able to SSH to a Unix host on the same network, and from there I could ssh to the DRAC in question, and do the following magical incantation:


racadm setniccfg -s 192.168.0.120 255.255.255.0 192.168.0.3


Those three sets of numbers are IP address, netmask and default gateway - setting the last one correctly... wait 10 seconds or so, and hey presto, DRAC plays nicey-nice with the network again.

A neat trick for when you can't get to the web UI.

Wednesday, September 8, 2010

Getting the track listing for a K3B project

This was harder than I expected. However...

Get your .k3b file, which is really a zip file:

$ unzip -l can-u-pick-em.k3b
Archive: can-u-pick-em.k3b
Length Date Time Name
--------- ---------- ----- ----
17 2106-02-07 17:28 mimetype
10623 2106-02-07 17:28 maindata.xml
--------- -------
10640 2 files

and extract the only useful part:

unzip -p can-u-pick-em.k3b maindata.xml > tmp.xml

That's an XML file that contains the project data. Then we feed it through an XSLT processor, using this stylesheet:


<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns="http://www.w3.org/1999/xhtml">

<xsl:output method="text" encoding="UTF-8"/>

<xsl:template match="/k3b_audio_project">
<xsl:apply-templates select="contents">
</xsl:apply-templates>
</xsl:template>


<xsl:template match="contents">
<xsl:apply-templates select="track">
</xsl:apply-templates>
</xsl:template>

<xsl:template match="track">
<xsl:value-of select="cd-text/artist"/>
<xsl:text> - </xsl:text>
<xsl:value-of select="cd-text/title"/>
<xsl:text>
</xsl:text>
</xsl:template>

</xsl:stylesheet>


like this:

$ xsltproc stylesheet.xslt tmp.xml
Eleni Mandell - Pauline
Neko Case - Star Witness

etc

EDIT!
Another option which I tried initially was using dcop to interrogate K3B while it has the project open - there are tons of suggestions out there to do this, but I could never get it to work. It turns out that dcop doesn't work in KDE4, and has been replaced with DBUS. Ah, so here's some examples using DBUS to talk to amarok - so maybe this will work... though this suggests otherwise

Thursday, September 2, 2010

WSUS, GPO and OU, oh my

I've been wrestling with WSUS - for testing, I only want to apply auto updates to a couple of test victims... err, I mean systems. So I thought I'd create an OU for WSUS, and a sub-OU called test, then create in that a security group, add a couple of test computers to that group. Then apply a GPO to the test OU and hey presto, it would all work. Not so! But along the way I discovered some handy tools to find out why not:

gpupdate /force - force the group policy to update from the DC right now
gpresult - show the set of policies that apply to this computer (and user)

I finally ended up moving the computer's account to a new OU (where the GPO is applied) and it all came good. Annoying, but do-able. Now, to get it detected by the WSUS server:

wuauclt.exe wuauclt /ResetAuthorization /DetectNow - forces the Windows Update agent to trot off to the update server right away. Of course, it doesn't then show up until you manually refresh the view on the WSUS admin console - took me a while to realise that.