Wednesday, March 14, 2012

The Mysterious Case of the HP ProCurve PoE switch(es)

Lots of our sites use oldish Dell switches. They're nothing fancy, but they're pretty cheap, and they've stood the test of time - many of them are more than 5 years old, and never miss a beat. They just work.

One of our newer sites has a pair of fancy-pants HP ProCurve 2910al-48G-PoE switches (48 port, gigabit, power over ethernet). These switches were sufficiently expensive that we started joking that they must have gold-plated connectors and platinum cases. Still, we figured, you get what you pay for - spendy switches == extra good, right? Sadly not.

The first unpleasant surprise was when they arrived - both had chassis that were noticeably bent - as if someone had taken the "curve" part of ProCurve too literally. One was dead on arrival - wouldn't even power on. We kept the working-but-bent one, as we needed it ASAP. HP shipped the replacement for the DOA one, and life went on...

A year later another one died - quite spectacularly. The PoE power supply failed, so while the switch kept on working, there was no power over ethernet, and any phones connected to it stopped working. I pulled the power cord to power cycle the unit, and when I restored power, there was an unpleasant crackling noise, and acrid smoke issued from the unit. You've never seen two sysadmins move so quickly!

Again HP shipped a replacement, which died within a month, again with a fault in the PoE power supply. HP support asked us to run "show tech all" but we found this from "show log" most useful:

W 01/25/90 22:59:43 00576 chassis: 50V Power Supply 1 is Faulted. Failures: 2
W 01/25/90 22:59:44 00071 chassis: Power Supply failure:  Supply: 1, Failures: 1
W 01/25/90 22:59:45 00578 chassis: Co-processor Unrecoverable fault on PoE controller 1

Yep, another dead PoE power supply. That was in 2010. Replacement shipped, life went on...
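
For anyone who wants to watch for these entries automatically, here is a rough Python sketch that picks power-related chassis faults out of "show log" output. The line layout (severity, date, time, event id, subsystem, message) is an assumption inferred only from the three lines quoted above, and the keyword list and the function name find_power_faults are made up for illustration, not anything HP documents.

```python
import re

# Assumed layout, inferred only from the three log lines quoted in this post:
# severity letter, MM/DD/YY date, HH:MM:SS time, event id, subsystem, message.
LOG_LINE = re.compile(
    r"^(?P<severity>[A-Z])\s+"
    r"(?P<date>\d{2}/\d{2}/\d{2})\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"(?P<event_id>\d+)\s+"
    r"(?P<subsystem>\w+):\s+"
    r"(?P<message>.*)$"
)

# Hypothetical keyword list, taken from the wording of the sample messages.
POWER_KEYWORDS = ("power supply", "poe")


def find_power_faults(log_text):
    """Return parsed chassis entries whose message mentions power or PoE."""
    faults = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # skip anything that does not fit the assumed layout
        entry = match.groupdict()
        message = entry["message"].lower()
        if entry["subsystem"] == "chassis" and any(
            keyword in message for keyword in POWER_KEYWORDS
        ):
            faults.append(entry)
    return faults


# The three lines from the post, used as sample input.
SAMPLE = """\
W 01/25/90 22:59:43 00576 chassis: 50V Power Supply 1 is Faulted. Failures: 2
W 01/25/90 22:59:44 00071 chassis: Power Supply failure:  Supply: 1, Failures: 1
W 01/25/90 22:59:45 00578 chassis: Co-processor Unrecoverable fault on PoE controller 1
"""

if __name__ == "__main__":
    for fault in find_power_faults(SAMPLE):
        print(fault["event_id"], fault["message"])
```

Matching on message keywords rather than on the event ids is a deliberate choice here: the ids above are the only ones this post shows, so there is no basis for treating them as a complete list.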

You guessed it... yesterday, I got a call to say that some of the phones at that site had stopped working. It wasn't too hard to guess the cause! The web UI showed no faults, but the port status display showed 5 ports were delivering PoE but had no ethernet link. Definitely not what I expect to see - once the phones are getting power, they boot and establish an ethernet link. So a 55 km drive to the site, and what do you know, another dead PoE power supply:

PoE fault lights on an HP ProCurve switch - a most unpleasant sight!

Just to be thorough, I also hooked up the LinkRunner to confirm what common sense was already telling me - yep, no power being delivered. Sigh. Before calling HP I checked the warranty status and it told me the warranty expires in 2108 - 96 years from now! I assumed this was an error, but when I called HP to order the replacement, I was told that it is correct - these switches are covered for life (I doubt I have another 96 years on this mortal coil, so probably somewhat beyond my life-span). So that's nice, I guess, given how unreliable they are.

Anyway, the nice lady at HP is shipping out another one, so I guess I'll go to the site again, and replace it yet again. I wonder how long this one will last.

In summary: nice features, expensive switches, totally unreliable.

I'm contemplating buying or making some sort of PoE monitoring device that we could monitor from Nagios, so we know when it fails.
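
A minimal sketch of what the Nagios side of that might look like, in Python. It flags the symptom described above: ports that report PoE delivery but have no ethernet link. How the per-port data is collected (SNMP, scraping the CLI, a hardware probe) is deliberately left out, and the names check_poe and main and the sample data are hypothetical. The only fixed part is the standard Nagios plugin convention of exit codes 0, 1, 2 and 3 for OK, WARNING, CRITICAL and UNKNOWN.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_poe(ports):
    """Evaluate per-port state and return (exit_code, status_line).

    `ports` maps a port name to a (delivering_power, link_up) pair.
    Collecting that data from the switch is outside this sketch.
    """
    if not ports:
        return UNKNOWN, "POE UNKNOWN - no port data"
    # The symptom from this post: PoE is reported, but the phone never links up.
    suspect = sorted(
        name for name, (power, link) in ports.items() if power and not link
    )
    if suspect:
        return CRITICAL, "POE CRITICAL - power but no link on: " + ", ".join(suspect)
    powered = sum(1 for power, _ in ports.values() if power)
    return OK, "POE OK - powered ports: %d, all with link" % powered


def main():
    # Hypothetical sample data standing in for a real poll of the switch.
    sample = {"1": (True, True), "2": (True, False), "3": (False, False)}
    code, line = check_poe(sample)
    print(line)  # Nagios shows the first line of output as the service status
    return code
```

A real plugin would finish with sys.exit(main()) so that Nagios sees the exit code. A device can briefly draw power before its link comes up (a phone that is still booting, for instance), so in practice this would probably want a few retries before alerting.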

6 comments:

  1. When PoE is needed I only recommend Cisco. Horrible to use but those buggers keep on ticking. HP do wonderful managed switches, but their PoE stuff just isn't up to any standard. But one does pay for HP's fantastic warranty.

  2. We have over 800 HP PoE switches in play and have had very minimal PoE issues with them. Granted, we keep a spare of at least 2 of everything (in many cases 5+) for a quick swap should something fail. In a year, we've replaced maybe 5 2910al-48G-PoE+ switches due to power issues. Keeping all this in mind, the power on our campus is not the most reliable... Lightning storms can take our power grid down, it seems.

  3. Glad to see I am not the only one with this problem. We have lost 12 of these switches this year alone: exact same problem, different buildings, different counties, some on UPS, some directly on conditioned power. Different power companies and grids.

  4. We've had 3 HP ProCurve 3800s with PoE issues, all brand new. These are 48 port PoE+ models. 2 of them (stacked) got a PoE error on bootup. Called our reseller, who checked them out and said HP reported they are aware of an "engineering issue" with them and had us send them somewhere that's not their normal switch repair depot. They overnighted us another 2 switches.
    Those worked for a couple of bootups, and now one of the two is reporting the same thing. All 6 PoE controllers report a self-test failure on bootup.
    Calling for another replacement...
    We have 2 other 3800s and 2 2910als with no issues so far.

  5. Hi there,
    Greetings to all. I'm Fahad, working in a small organization, and we have the same issue with our HP ProCurve 2910al-48G switch. This is the first time it has happened to us since I started my job here about 14 months ago.
    I'll be in touch. Good to see this problem written up; I found this post after searching for a long time.
    fady.s@live.co.uk

  6. Just had the PoE function on two 2620 48 PoE+ switches die on me. Went online to check whether it was common and found this. Calling HP now so I can use this so-called amazing warranty. This really should not have happened. FML.
