Thursday, April 19, 2012

Nagios UPTIME errors for a Windows 2003 client

We have a Windows 2003 server - one of a farm of 5 similar machines - that suddenly started reporting errors in Nagios:

nagios# /usr/local/libexec/nagios/check_nt -H mq-citrix-5 -v UPTIME 
NSClient - ERROR: Could not get value

It looks like the issue was with the underlying Windows performance counters that the Nagios client (NSClient++) uses to get this info. This was useful reading: http://nsclient.org/nscp/discussion/message/1066

So I did this to verify we had the same issue:

cd "\Program Files\NSClient++"
nsclient++.exe /test
(error output) 

To fix (on w2k3) we need to force a counter rebuild:

cd \windows\system32 
lodctr /R 

I had to restart the NSClient++ service, then tested again from Nagios:

nagios# /usr/local/libexec/nagios/check_nt -H mq-citrix-5 -v UPTIME 
System Uptime - 0 day(s) 9 hour(s) 7 minute(s) 

And we are happy again. In hindsight, I guess this server crashing 6 times in an hour must have corrupted the counters. Wait a couple of minutes, and this hits my inbox:

** RECOVERY alert - mq-citrix-5/UPTIME is OK **

Yay!
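
For next time: the two check_nt outputs above are easy to tell apart in a script, so a small helper on the Nagios host could flag the "counters are broken" case separately from a host that is simply down. This is only a sketch. The function name and the three labels it prints are my own invention, and it assumes the messages look exactly like the ones shown in this post.

```shell
#!/bin/sh
# Hypothetical helper: classify one line of check_nt UPTIME output.
# The labels COUNTER_ERROR / OK / UNKNOWN are made up for this sketch.
classify_uptime_output() {
    case "$1" in
        *"Could not get value"*)
            # Same symptom as above: NSClient++ could not read the counter.
            echo "COUNTER_ERROR" ;;
        "System Uptime - "*)
            echo "OK" ;;
        *)
            echo "UNKNOWN" ;;
    esac
}

# Example use, with the plugin path and host from this post:
# out=$(/usr/local/libexec/nagios/check_nt -H mq-citrix-5 -v UPTIME)
# classify_uptime_output "$out"
```

A COUNTER_ERROR result would be my cue to try the lodctr /R rebuild before digging any further.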

Courier IMAP - migrating all users' email

So we have a project underway to move all our users off our old postfix+courier+squirrelmail system to Microsoft Exchange 2010. Now, you might think this would be easy, but you would be wrong.

Some bits are okay - getting a list of all users and their job titles, photos etc. from LDAP is easy - ldapsearch is a pretty powerful tool. But the bit that I assumed would be easiest of all - importing all their existing mail into Exchange - has proved a little more difficult.
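
As a rough illustration of the easy part, here is the sort of thing I mean: pull uid and title out of LDAP and flatten the LDIF into one line per user. The helper name ldif_to_tsv is made up, and it assumes each entry has a single uid and that values are short enough not to be line-wrapped or base64-encoded (so it is no good for photos).

```shell
#!/bin/sh
# Hypothetical helper: turn "ldapsearch -LLL" output (LDIF on stdin)
# into "uid<TAB>title" lines, one per entry.
# Assumes one uid per entry and plain, unwrapped attribute values.
ldif_to_tsv() {
    awk '
        /^uid: /   { uid = substr($0, 6) }      # text after "uid: "
        /^title: / { title = substr($0, 8) }    # text after "title: "
        /^$/       { if (uid != "") print uid "\t" title; uid = ""; title = "" }
        END        { if (uid != "") print uid "\t" title }
    '
}

# Example, using the same server and base DN as the searches in this post:
# ldapsearch -h ldap -x -b 'ou=People,dc=example,dc=com,dc=au' '(uid=*)' uid title -LLL | ldif_to_tsv
```

Entries without a title still get a line, just with an empty second column, which makes the gaps easy to spot.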

Exchange seems not to have any native tools to import Maildir (which is of course what we use), so I planned to use imapsync. Good theory, and a couple of useful blog posts point the way to setting up one user account as an administrator who can connect to any mailbox. But after several hours of flailing around, I failed. Here's what I tried:

In /etc/courier/authldaprc, set LDAP_AUXOPTIONS sharedgroup=group

In LDAP, use the (previously unused) sharedgroup attribute:
# ldapsearch -h ldap -x -b 'ou=People,dc=example,dc=com,dc=au' '(uid=migrate1)' sharedgroup -LLL
dn: uid=migrate1,ou=People,dc=example,dc=com,dc=au
sharedgroup: administrators

Then test:
# courieruserinfo migrate1
uid=10381
gid=100
home=/home/migrate1
authaddr=migrate1
authfullname=Email Migration User
maildir=
quota=
options=

Hmmm... options isn't set. OK, try the same with userdb:
# userdb migrate1 set options=group=administrators
# userdb -show migrate1
options=group=administrators
# courieruserinfo migrate1
uid=10381
gid=100
home=/home/migrate1
authaddr=migrate1
authfullname=Email Migration User
maildir=
quota=
options=

Still not set! Why not? Ahhhh, bugger, we're using PAM auth (in /etc/courier/authdaemonrc, I've set authmodulelist="authpam"), and if you read the documentation carefully enough:
"The authentication library has a facility for keep arbitrary “name=value”-type settings, called “options”, for individual accounts. This feature is only available with userdb, LDAP, MySQL, and PostgresSQL modules. Individual account options are not supported with system-based authentication modules (password/shadow files, or PAM)."

Well that explains why it doesn't work... now how do we fix that? I can see a few options, which I guess I'll be trying out in the next few weeks. More to come.
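
In the meantime, a quick check would have saved me those hours: look at authmodulelist and see whether any of the listed modules is one that supports account options. The sketch below does that. The function name is hypothetical, the module names (authuserdb, authldap, authmysql, authpgsql) are my mapping of the four back ends named in the documentation quote, and it assumes the setting sits on one line in the form authmodulelist="...".

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if the authmodulelist in the given
# authdaemonrc-style file names at least one module that supports
# per-account options, otherwise fail (exit 1).
supports_account_options() {
    # Take the last uncommented authmodulelist="..." line, minus the quotes.
    modules=$(sed -n 's/^authmodulelist="\(.*\)"$/\1/p' "$1" | tail -n 1)
    for m in $modules; do
        case "$m" in
            # Assumed names for the userdb, LDAP, MySQL and PostgreSQL modules.
            authuserdb|authldap|authmysql|authpgsql) return 0 ;;
        esac
    done
    return 1
}

# Example:
# supports_account_options /etc/courier/authdaemonrc || echo "options= will stay empty"
```

With authmodulelist="authpam" this fails, which matches the empty options= line above.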