| Author |
Message |
JesusAli Xsan Master

Joined: 25 Jul 2008 Posts: 151
|
Posted: Mon Dec 14, 2009 5:57 pm Post subject: Halp! Can't get into the WebPAM Pro! |
|
|
Hello Everyone,
I just got a call from my Network Admin reporting a siren on the E+J and a red light on PD16.
I tried to log in with WebPAM Pro and had no luck; each attempt times out.
I have MDC02 connected over a serial connection. I opened a terminal session to the Promise and it is in Maintenance Mode. Running "phydrv" produces this:
| Code: | MAINTENANCE MODE@cli> phydrv
===============================================================================
PdId  Model        Type  CfgCapacity  Location       OpStatus    ConfigStatus
===============================================================================
1     ST3750640NS  SATA  698.49GB     Encl1 Slot1    Not usable  N/A
2     ST3750640NS  SATA  698.49GB     Encl1 Slot2    Not usable  N/A
3     ST3750640NS  SATA  698.49GB     Encl1 Slot3    Not usable  N/A
4     ST3750640NS  SATA  698.49GB     Encl1 Slot4    Not usable  N/A
5     ST3750640NS  SATA  698.49GB     Encl1 Slot5    Not usable  N/A
6     ST3750640NS  SATA  698.49GB     Encl1 Slot6    Not usable  N/A
7     ST3750640NS  SATA  698.49GB     Encl1 Slot7    Not usable  N/A
8     ST3750640NS  SATA  698.49GB     Encl1 Slot8    Not usable  N/A
9     ST3750640NS  SATA  698.49GB     Encl1 Slot9    Not usable  N/A
10    ST3750640NS  SATA  698.49GB     Encl1 Slot10   Not usable  N/A
11    ST3750640NS  SATA  698.49GB     Encl1 Slot11   Not usable  N/A
12    ST3750640NS  SATA  698.49GB     Encl1 Slot12   Not usable  N/A
13    ST3750640NS  SATA  698.49GB     Encl1 Slot13   Not usable  N/A
14    ST3750640NS  SATA  698.49GB     Encl1 Slot14   Not usable  N/A
15    ST3750640NS  SATA  698.49GB     Encl1 Slot15   Not usable  N/A
17    ST3750640NS  SATA  698.49GB     Encl2 Slot1    Not usable  N/A
18    ST3750640NS  SATA  698.49GB     Encl2 Slot2    Not usable  N/A
19    ST3750640NS  SATA  698.49GB     Encl2 Slot3    Not usable  N/A
20    ST3750640NS  SATA  698.49GB     Encl2 Slot4    Not usable  N/A
21    ST3750640NS  SATA  698.49GB     Encl2 Slot5    Not usable  N/A
22    ST3750640NS  SATA  698.49GB     Encl2 Slot6    Not usable  N/A
23    ST3750640NS  SATA  698.49GB     Encl2 Slot7    Not usable  N/A
24    ST3750640NS  SATA  698.49GB     Encl2 Slot8    Not usable  N/A
25    ST3750640NS  SATA  698.49GB     Encl2 Slot9    Not usable  N/A
26    ST3750640NS  SATA  698.49GB     Encl2 Slot10   Not usable  N/A
27    ST3750640NS  SATA  698.49GB     Encl2 Slot11   Not usable  N/A
28    ST3750640NS  SATA  698.49GB     Encl2 Slot12   Not usable  N/A
29    ST3750640NS  SATA  698.49GB     Encl2 Slot13   Not usable  N/A
30    ST3750640NS  SATA  698.49GB     Encl2 Slot14   Not usable  N/A
31    ST3750640NS  SATA  698.49GB     Encl2 Slot15   Not usable  N/A
32    ST3750640NS  SATA  698.49GB     Encl2 Slot16   Not usable  N/A |
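With this many near-identical rows, a missing entry is easy to overlook by eye. Here is a minimal Python sketch that checks the PdId column for gaps. It assumes PdIds are numbered contiguously across the enclosures, which this listing suggests but the thread does not confirm. The function names are hypothetical, and the sample is a trimmed copy of the rows above.

```python
import re


def parse_pd_ids(listing: str) -> list[int]:
    """Return the PdId values found in a phydrv-style listing."""
    ids = []
    for line in listing.splitlines():
        # Data rows start with an integer PdId followed by the model string;
        # the header, prompt, and separator lines do not match.
        match = re.match(r"\s*(\d+)\s+\S+", line)
        if match:
            ids.append(int(match.group(1)))
    return ids


def missing_pd_ids(ids: list[int]) -> list[int]:
    """Return PdIds absent between the smallest and largest value seen.

    Assumption: PdIds are contiguous, so any gap is a disk that is not listed.
    """
    if not ids:
        return []
    present = set(ids)
    return [n for n in range(min(ids), max(ids) + 1) if n not in present]


# Trimmed sample taken from the listing in this post.
SAMPLE = """\
PdId Model Type CfgCapacity Location OpStatus ConfigStatus
14 ST3750640NS SATA 698.49GB Encl1 Slot14 Not usable N/A
15 ST3750640NS SATA 698.49GB Encl1 Slot15 Not usable N/A
17 ST3750640NS SATA 698.49GB Encl2 Slot1 Not usable N/A
18 ST3750640NS SATA 698.49GB Encl2 Slot2 Not usable N/A
"""

if __name__ == "__main__":
    print(missing_pd_ids(parse_pd_ids(SAMPLE)))  # prints [16]
```

A gap only says that a disk is not listed; it says nothing about why, so treat it as a pointer for where to look next.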
In Xsan Admin the Volume looked fine. No errors were reported! They are closing the building, so this will have to wait until tomorrow.
I turned off the buzzer in the CLI and Stopped the Volume in Xsan Admin.
Any other advice for tomorrow morning?
Especially, is it safe to leave Maintenance Mode? |
|
|
 |
abstractrude Xsan Master

Joined: 13 Mar 2008 Posts: 881
|
Posted: Mon Dec 14, 2009 8:16 pm Post subject: |
|
|
Call support, but bringing your unit down and back up will probably clear the issue. Did you do a LUN rebuild? What happens when you type bga? Did the controller fail over?
Also try typing event in maintenance mode and see what you get. This sounds like a typical Promise crash. I have had 4-5 of these, where one day a unit just goes down and stops responding to I/O requests and such; a reboot brings it back to life. |
|
|
 |
JesusAli Xsan Master

Joined: 25 Jul 2008 Posts: 151
|
Posted: Tue Dec 15, 2009 1:31 am Post subject: |
|
|
Thanks abstract. Based on your advice in a previous post, I tried running rb and bga in the CLI, but both were rejected with a message that they cannot be run in Maintenance Mode.
I will call support. Now that the semester is done, it's time. I can also take care of that PD13 which previously went bad.
I might as well work on upgrading the Firmware at this point, too. |
|
|
 |
vicpache fully protected

Joined: 22 Oct 2008 Posts: 10
|
Posted: Thu Dec 17, 2009 5:00 pm Post subject: |
|
|
Hi Jesus,
From the sound of it, the controller is in maintenance mode for some reason.
The controller failed over and entered maintenance mode. It seems the other controller is up and running, which is why you still have access to the LUNs (the benefit of using redundant controllers).
The problem should be easy to resolve.
The drives report as not usable from the controller in question because it is in maintenance mode.
There are several possible reasons why one of the two controllers may enter maintenance mode:
-Firmware mismatch
-Broken AAMUX path from one controller to the respective disk when the LD (Logical Disk) is "Critical"
-Bad SAS cable going from the RAID head to the JBOD on the respective SAS domain (expected behavior)
-Other scenarios that are less likely
From maintenance mode you can issue the "event" command to see what events got triggered.
You can also connect to the other controller via serial prompt, telnet, ssh or http.
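Before trying each of those in turn, it can save time to see which management interfaces answer at all. A minimal Python sketch follows, assuming the conventional default ports (22 for ssh, 23 for telnet, 80 for http); the subsystem may be configured to use different ones. The function names and the address in the usage note are placeholders, not anything specific to the Promise unit.

```python
import socket

# Conventional default ports (an assumption; the unit may use others).
MANAGEMENT_PORTS = {"ssh": 22, "telnet": 23, "http": 80}


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def probe(host: str, ports=None, timeout: float = 3.0) -> dict:
    """Map each service name to whether its port accepted a TCP connection."""
    if ports is None:
        ports = MANAGEMENT_PORTS
    return {name: is_reachable(host, port, timeout) for name, port in ports.items()}
```

For example, `probe("192.0.2.10")` returns a dict such as `{"ssh": ..., "telnet": ..., "http": ...}`, where 192.0.2.10 is a documentation-only placeholder for the controller's management IP. An open port only shows that something is listening; it does not show that login will work.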
Do me a huge favor and give Promise Technology Technical Support a call. 408-228-1400 option #8
--
Best Regards,
Victor Pacheco
Manager, Field Application Engineering & Support
Promise Technology
580 Cottonwood Drive
Milpitas, Ca 95035
Office (408) 228-1441
Mobile (408) 202-6808
Technical Support - (408) 228-1400
http://www.promise.com |
|
|
 |
JesusAli Xsan Master

Joined: 25 Jul 2008 Posts: 151
|
Posted: Fri Dec 18, 2009 2:27 pm Post subject: |
|
|
Just to follow up on my HALP! yell.
I did in fact call Promise Tech Support and they were really great.
The Tech Mario M. talked me through the proper shutdown and boot-up procedures from maintenance mode. I then rebooted the E+J, and when it came back up I saw in the Subsystem Log that Physical Disk 16 had been generating repeating errors:
| Code: | -------------------------------------------------------------------------------
SeqNo: 1916 Device: PD 16
EventId: 0x000D0011 Severity: Minor
TimeStamp: Dec 13, 2009 21:50:12 DefaultId: 0x000D0011
SpecData: 000000000000000000 0000012020202020 2020202020202035 5144354453564B
Description: Command times out on physical disk
------------------------------------------------------------------------------- |
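When the log holds many entries like this one, tallying them per device makes a repeat offender stand out. Here is a minimal Python sketch that splits one record into its labeled fields and counts records by Device. It assumes every record uses the field labels visible in the entry above; the function names are hypothetical.

```python
import re
from collections import Counter

# Field labels as they appear in the log entry quoted in this post.
FIELDS = ("SeqNo", "Device", "EventId", "Severity",
          "TimeStamp", "DefaultId", "SpecData", "Description")
FIELD_RE = re.compile(r"\b(" + "|".join(FIELDS) + r"):\s*")


def parse_event(record: str) -> dict:
    """Split one event record into a {label: value} dict."""
    event = {}
    for line in record.splitlines():
        # re.split with one capture group yields
        # [prefix, label1, value1, label2, value2, ...].
        parts = FIELD_RE.split(line)
        for label, value in zip(parts[1::2], parts[2::2]):
            event[label] = value.strip()
    return event


def count_by_device(records) -> Counter:
    """Count how many records each Device value accounts for."""
    return Counter(parse_event(r).get("Device", "unknown") for r in records)


# The record from this post, copied verbatim.
SAMPLE = """\
SeqNo: 1916 Device: PD 16
EventId: 0x000D0011 Severity: Minor
TimeStamp: Dec 13, 2009 21:50:12 DefaultId: 0x000D0011
SpecData: 000000000000000000 0000012020202020 2020202020202035 5144354453564B
Description: Command times out on physical disk
"""

if __name__ == "__main__":
    event = parse_event(SAMPLE)
    print(event["Device"], "-", event["Description"])
```

Splitting on the known labels rather than on every colon keeps the timestamp (21:50:12) in one piece.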
If you notice in my post above, there is NO ENTRY for PD16!!! I didn't even see that!
Anyways, I then Exported a Subsystem report from the Promise WebPAM Pro interface (which was now working again, after the reboot) and submitted that to the Promise website where Mario M. had started a Support File for me.
Later that night a Promise Tech called my cell phone, but I didn't recognize the number and didn't answer. The next day at work I was called again and spoke with Stephen S., who wrapped up my issue and sent out new disk sleds, which are on the way now: a replacement for Physical Disk 16 (the disk is probably fine, but the AAMUX is bad) and one for Physical Disk 13, which had previously entered a "Stale" state. I had used the Command Line Interface to bring PD13 out of that state (a move not recommended by Xsanity board member abstractrude, and not favored by Promise Tech Stephen S.).
The consensus is that it is a problem with the AAMUX, which is the interface on the drive sled that plugs into the SATA connector. From what I was told, the AAMUX is what allows the two different Controllers to access the same disk.
Something went wrong on the AAMUX. One of the Controllers wasn't getting a response from the Disk, and since the two Controllers could not both see it, entering Maintenance Mode was a safety procedure.
And, since Controller 1 couldn't see PD16, Controller 1 thought something was wrong with itself and reported itself as bad, but probably wasn't. I have a Spare Controller onsite so I swapped it out anyway and will check later to see if it can boot the chassis with only (the previously reported as bad) Controller 1.
All in all, I just wanted to follow up to let everyone know how impressed and surprised I was by Promise Tech support. Very helpful and cheerful guys who are being very aggressive in solving my problems.
Makes me feel confident about recommending their hardware to my school!  |
|
|
 |
JesusAli Xsan Master

Joined: 25 Jul 2008 Posts: 151
|
Posted: Fri Dec 18, 2009 2:31 pm Post subject: |
|
|
Oh yeah, I almost forgot:
MY XSAN VOLUME IS UP AND WONDERFUL WITH ABSOLUTELY NO PROBLEMS.
I didn't lose a single byte of data!!! |
|
|
 |