Xsan Panic

dchadd's picture

My two MDCs are constantly panicking back and forth with each other. When both of our Xsan volumes are stopped, they don't do it.

I ran cvfsck -j, -wv, and -nv on both volumes, but I'm still having the same issue. Please help.

David
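For readers following along, the check sequence mentioned above can be sketched as a small dry-run helper. This is only an illustration: the flags `-j`, `-nv`, and `-wv` are the ones named in this post, the order (journal replay, read-only check, then repair) is an assumption about the usual practice, and the function name `cvfsck_sequence` is hypothetical. The helper only builds the commands; it does not run them.

```python
import shlex

def cvfsck_sequence(volume, repair=False):
    """Build (but do not run) the cvfsck commands for one volume.

    Assumption: the volume is stopped before any of these are run.
    """
    steps = [
        ["cvfsck", "-j", volume],   # journal step named in the post
        ["cvfsck", "-nv", volume],  # read-only, verbose check
    ]
    if repair:
        # Only add the writing pass when explicitly requested.
        steps.append(["cvfsck", "-wv", volume])
    return steps

if __name__ == "__main__":
    # "VIDEO" is the volume name that appears later in this thread.
    for argv in cvfsck_sequence("VIDEO", repair=False):
        print(shlex.join(argv))
```

Keeping the repair pass behind an explicit flag makes it harder to run a writing check by accident while the cause of the panics is still unknown.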

Interval Since Last Panic Report: 10800479 sec
Panics Since Last Report: 3
Anonymous UUID: 213DA0DD-F5BC-4381-95EF-28DCF9D34CC8

Fri Aug 9 00:35:29 2013
panic(cpu 1 caller 0x244b0b): "zalloc: \"kalloc.8192\" (296 elements) retry fail 3, kfree_nop_count: 0"@/SourceCache/xnu/xnu-1699.32.7/osfmk/kern/zalloc.c:1766
Backtrace (CPU 1), Frame : Return Address (4 potential args on stack)
0x10e3dc8 : 0x2203de (0x6b08cc 0x10e3de8 0x229fb0 0x0)
0x10e3df8 : 0x244b0b (0x6b2fd0 0x6b157a 0x128 0x3)
0x10e3e78 : 0x22629a (0x44911fc 0x1 0x17ecdc 0x3010001)
0x10e3ea8 : 0x2263e9 (0x1698 0x1 0x10e3ed8 0x2115e7)
0x10e3eb8 : 0x2115e7 (0x1698 0x3 0x10e3f28 0x213758)
0x10e3ed8 : 0x222d01 (0x10f8 0x10e3f14 0x10e3f0c 0x2c3f24)
0x10e3f08 : 0x214063 (0x15eaae00 0x5f7f690 0x5a5ddc0 0x15eaae00)
0x10e3f48 : 0x21b25b (0x15eaae00 0x0 0x0 0x0)
0x10e3f98 : 0x2b7bb7 (0x5efef94 0x1 0x5efefc4 0x8)
0x10e3fc8 : 0x2e60c7 (0x5efef90 0x1 0x10 0x15e6c610)

BSD process name corresponding to current thread: hwmond
Boot args: srv=1 serverperfmode=1

Mac OS version:
11G63

Kernel version:
Darwin Kernel Version 11.4.2: Thu Aug 23 16:26:45 PDT 2012; root:xnu-1699.32.7~1/RELEASE_I386
Kernel UUID: 859B45FB-14BB-35ED-B823-08393C63E13B
System model name: Xserve1,1 (Mac-F4208AC8)

System uptime in nanoseconds: 422541669703
vm objects:12685920
vm object hash entri:1517760
pv_list:1569792
kalloc.16:3698688
kalloc.64:17186816
kalloc.128:368746496
kalloc.256:1056768
kalloc.1024:71114752
kalloc.2048:10928128
kalloc.8192:2424832
vm pages:22670868
threads:1304544
vnodes:13325180
namecache:3863760
HFS node:6372352
HFS fork:2381456
buf.8192:4587520
ubc_info zone:3027360
vnode pager structur:1513680
Kernel Stacks:3555328
PageTables:20250624
Kalloc.Large:3506744

Backtrace suspected of leaking: (outstanding bytes: 308096)
0x22629a
0x22647e
0x1b49ab4
0x1b41bbe
0x1b0a8be
0x1b0c23b
0x1b0cd5d
0x1acd6dd
0x1ab1a74
0x32cf8c
0x322ac9
0x31aea0
0x31b4ad
0x5f3c2a
Kernel Extensions in backtrace:
com.apple.filesystems.acfs(457.8)[410E70E4-4FDD-44BD-BF7D-5C83B874E699]@0x1a9d000->0x1bf0fff
dependency: com.apple.iokit.IOStorageFamily(1.7.2)[9164AEE7-BA92-45A2-BA9C-638B980193F1]@0xa28000

last loaded kext at 357666649348: com.apple.filesystems.afpfs 9.8.1 (addr 0x17bf000, size 376832)
last unloaded kext at 207159313667: com.apple.iokit.IOAHCIFamily 2.0.8 (addr 0x1128000, size 45056)
loaded kexts:
com.logmein.driver.LogMeInSoundDriver 1.0.2
com.apple.filesystems.afpfs 9.8.1
com.apple.nke.asp_tcp 6.0.1
com.apple.filesystems.smbfs 1.7.2
com.apple.filesystems.autofs 3.0
com.apple.driver.AppleHWSensor 1.9.5d0
com.apple.driver.AppleUpstreamUserClient 3.5.9
com.apple.driver.AppleMCCSControl 1.0.33
com.apple.kext.ATIFramebuffer 7.3.2
com.apple.iokit.IOUserEthernet 1.0.0d1
com.apple.iokit.IOBluetoothSerialManager 4.0.8f17
com.apple.driver.AppleBMC 2.0.2
com.apple.driver.AppleMCEDriver 1.1.9
com.apple.Dont_Steal_Mac_OS_X 7.0.0
com.apple.driver.AudioIPCDriver 1.2.3
com.apple.driver.ApplePolicyControl 3.1.33
com.apple.driver.ACPI_SMC_PlatformPlugin 5.0.0d8
com.apple.ATIRadeonX1000 7.0.4
com.apple.driver.AppleLPC 1.6.0
com.apple.driver.Apple16X50ACPI 3.0
com.apple.driver.AppleSEP 1.5.0
com.apple.driver.AppleRAID 4.0.6
com.apple.driver.XsanFilter 404
com.apple.iokit.SCSITaskUserClient 3.2.1
com.apple.AppleFSCompression.AppleFSCompressionTypeDataless 1.0.0d1
com.apple.AppleFSCompression.AppleFSCompressionTypeZlib 1.0.0d1
com.apple.BootCache 33
com.apple.driver.AppleIntel8254XEthernet 2.1.3b1
com.apple.driver.AppleUSBHub 5.1.0
com.apple.driver.AppleFWOHCI 4.9.0
com.apple.driver.AppleIntelPIIXATA 2.5.1
com.apple.driver.AppleLSIFusionMPT 3.0.5
com.apple.driver.AppleEFINVRAM 1.6.1
com.apple.driver.AppleUSBEHCI 5.1.0
com.apple.driver.AppleUSBUHCI 5.1.0
com.apple.driver.AppleACPIButtons 1.5
com.apple.driver.AppleRTC 1.5
com.apple.driver.AppleHPET 1.7
com.apple.driver.AppleSMBIOS 1.9
com.apple.driver.AppleACPIEC 1.5
com.apple.driver.AppleAPIC 1.6
com.apple.driver.AppleIntelCPUPowerManagementClient 195.0.0
com.apple.nke.applicationfirewall 3.2.30
com.apple.security.quarantine 1.4
com.apple.security.TMSafetyNet 8
com.apple.driver.AppleIntelCPUPowerManagement 195.0.0
com.apple.security.SecureRemotePassword 1.0
com.apple.filesystems.acfsctl 457.8
com.apple.kext.triggers 1.0
com.apple.filesystems.acfs 457.8
com.apple.driver.AppleSMBusController 1.0.10d0
com.apple.iokit.IOFireWireIP 2.2.5
com.apple.iokit.IOSurface 80.0.2
com.apple.iokit.IOAudioFamily 1.8.6fc18
com.apple.kext.OSvKernDSPLib 1.3
com.apple.driver.AppleGraphicsControl 3.1.33
com.apple.driver.AppleSMC 3.1.3d10
com.apple.driver.IOPlatformPluginLegacy 5.0.0d8
com.apple.iokit.IONDRVSupport 2.3.4
com.apple.driver.IOPlatformPluginFamily 5.1.1d6
com.apple.driver.Apple16X50Serial 3.0
com.apple.iokit.IOSerialFamily 10.0.5
com.apple.kext.ATI1300Controller 7.3.2
com.apple.kext.ATISupport 7.3.2
com.apple.iokit.IOGraphicsFamily 2.3.4
com.apple.iokit.IOSCSIBlockCommandsDevice 3.2.1
com.apple.iokit.IOUSBMassStorageClass 3.0.3
com.apple.driver.AppleUSBComposite 5.0.0
com.apple.iokit.IOSCSIMultimediaCommandsDevice 3.2.1
com.apple.iokit.IOBDStorageFamily 1.7
com.apple.iokit.IODVDStorageFamily 1.7.1
com.apple.iokit.IOCDStorageFamily 1.7.1
com.apple.iokit.IOATAPIProtocolTransport 3.0.0
com.apple.iokit.IONetworkingFamily 2.1
com.apple.iokit.IOUSBUserClient 5.0.0
com.apple.iokit.IOFireWireFamily 4.4.8
com.apple.iokit.IOATAFamily 2.5.1
com.apple.iokit.IOSCSIParallelFamily 2.5.1
com.apple.iokit.IOSCSIArchitectureModelFamily 3.2.1
com.apple.iokit.IOUSBFamily 5.1.0
com.apple.driver.AppleEFIRuntime 1.6.1
com.apple.driver.AppleKeyswitch 1.0.5f4
com.apple.iokit.IOHIDFamily 1.7.1
com.apple.iokit.IOSMBusFamily 1.1
com.apple.security.sandbox 177.9
com.apple.kext.AppleMatch 1.0.0d1
com.apple.driver.DiskImages 331.7
com.apple.iokit.IOStorageFamily 1.7.2
com.apple.driver.AppleKeyStore 28.18
com.apple.driver.AppleACPIPlatform 1.5
com.apple.iokit.IOPCIFamily 2.7
com.apple.iokit.IOACPIFamily 1.4
Model: Xserve1,1, BootROM XS11.0080.B01, 4 processors, Dual-Core Intel Xeon, 2 GHz, 2 GB, SMC 1.11f5
Graphics: ATI Radeon X1300, ATY,RadeonX1300, PCIe, 64 MB
Memory Module: BRANCH 0 CHANNEL 0/DIMM 1, 512 MB, DDR2 FB-DIMM, 667 MHz, 0x80CE, 0x4D3339355436353533435A342D4345363120
Memory Module: BRANCH 0 CHANNEL 1/DIMM 2, 512 MB, DDR2 FB-DIMM, 667 MHz, 0x80CE, 0x4D3339355436353533435A342D4345363120
Memory Module: BRANCH 1 CHANNEL 0/DIMM 3, 512 MB, DDR2 FB-DIMM, 667 MHz, 0x80CE, 0x4D3339355436353533435A342D4345363120
Memory Module: BRANCH 1 CHANNEL 1/DIMM 4, 512 MB, DDR2 FB-DIMM, 667 MHz, 0x80CE, 0x4D3339355436353533435A342D4345363120
Network Service: Cashman LAN, Ethernet, en0
Network Service: Metadata LAN, Ethernet, en1
PCI Card: ATY,RadeonX1300, sppci_displaycontroller, Mezzanine
PCI Card: Apple 2 Port 4Gbps Fibre Channel Card, sppci_fibrechannel, Slot-1
PCI Card: Apple 2 Port 4Gbps Fibre Channel Card, sppci_fibrechannel, Slot-1
Parallel ATA Device: MATSHITACD-RW CW-8124
Fibre Channel Device: SCSI Target Device @ 0
Fibre Channel Device: SCSI Target Device @ 1
Fibre Channel Device: SCSI Target Device @ 2
Fibre Channel Device: SCSI Target Device @ 3
USB Device: USB to ATA/ATAPI Bridge, 0x152d (JMicron Technology Corp.), 0x2352, 0xfd100000 / 2
USB Device: Frontpanel Controller, apple_vendor_id, 0x8261, 0x3d100000 / 2
FireWire Device: built-in_hub, 800mbit_speed
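A panic report like the one above is easier to read once the zone list is sorted. The sketch below is a rough illustration, not a diagnostic tool: it assumes the `name:number` lines are kernel zone sizes in bytes, and the helper names (`parse_zones`, `rank_zones`, `parse_uptime_seconds`) are made up for this example. The excerpt is copied from the report in this post.

```python
import re

# Lines copied from the panic report above.
PANIC_EXCERPT = """\
System uptime in nanoseconds: 422541669703
vm objects:12685920
kalloc.16:3698688
kalloc.64:17186816
kalloc.128:368746496
kalloc.256:1056768
kalloc.1024:71114752
kalloc.2048:10928128
kalloc.8192:2424832
vm pages:22670868
PageTables:20250624
"""

def parse_uptime_seconds(text):
    """Return the reported uptime in seconds, or None if it is absent."""
    m = re.search(r"System uptime in nanoseconds:\s*(\d+)", text)
    return int(m.group(1)) / 1e9 if m else None

def parse_zones(text):
    """Collect 'zone name:size' lines into a dict of name -> size."""
    zones = {}
    for line in text.splitlines():
        # No space is allowed after the colon, so the uptime line is skipped.
        m = re.fullmatch(r"([A-Za-z_][\w. ]*?):(\d+)", line.strip())
        if m:
            zones[m.group(1)] = int(m.group(2))
    return zones

def rank_zones(zones):
    """Sort zones from largest to smallest."""
    return sorted(zones.items(), key=lambda kv: kv[1], reverse=True)

if __name__ == "__main__":
    uptime = parse_uptime_seconds(PANIC_EXCERPT)
    print(f"uptime: {uptime / 60:.1f} minutes")
    for name, size in rank_zones(parse_zones(PANIC_EXCERPT))[:3]:
        print(f"{name:12s} {size / 2**20:8.1f} MiB")
```

On this report the largest zone is kalloc.128 at roughly 352 MiB, on a machine with 2 GB of RAM and only about 7 minutes of uptime. Together with the com.apple.filesystems.acfs entry under "Backtrace suspected of leaking", that points toward the Xsan file system driver rather than hwmond, which merely happened to be the running process.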

abstractrude's picture

Have you tried starting up with a single controller, or isolated any hardware issue?

Also, what versions are you running?

-Trevor Carlson
THUMBWAR

singlemalt's picture

Are these FSM panics or kernel panics? And yeah, which OS and which version of Xsan you're running is kind of important.

dchadd's picture

Thanks for replying,

I tried starting up the controllers separately and it did the same thing.

These are kernel panics. The servers automatically reboot when they panic.

Xsan 2.3.2
OS 10.7.5

Still trying to rule out a fiber cable. Any other advice is greatly appreciated.

dchadd's picture

Output of cvfsck -nv on my VIDEO volume.

It seems to panic only when that volume is started.
[code]
mdc:~ root# cvfsck -nv VIDEO

BUILD INFO:

#!@$ Server Revision 3.5.0 Build 7443 Branch (457.8)
#!@$ Built for Darwin 11.0 i386
#!@$ Created on Thu Aug 23 16:33:18 PDT 2012

Created directory /tmp/cvfsck448a for temporary files.
Creating METAPOOL allocation check file.
Creating VIDEOPOOL001 allocation check file.

** NOTE ** Read Only Check.

File system journal will not be recovered.
The results may be inconsistent and mis-leading.

Super Block information.
FS Created On : Fri Mar 22 08:35:34 2013
Inode Version : '2.5' - XSan 2.2 named streams inode version (0x205)
File System Status : Clean
Allocated Inodes : 741376
Free Inodes : 18037
FL Blocks : 34
Next Inode Chunk : 0x8c4e
Metadump Seqno : 0
Restore Journal Seqno : 0
Windows Security Indx Inode : 0x5
Windows Security Data Inode : 0x6
Quota Database Inode : 0x7
ID Database Inode : 0xa
Client Write Opens Inode : 0x8

Stripe Group METAPOOL ( 0) 0x17495c0 blocks.
Stripe Group VIDEOPOOL001 ( 1) 0x1b4a4740 blocks.

Building Inode Index Database 741376 (100%).

Verifying NT Security Descriptors
Found 671 NT Security Descriptors: all are good

Verifying Free List Extents.

Scanning inodes 741376 (100%).

Sorting extent list for METAPOOL pass 1/1
Updating bitmap for METAPOOL extents 3994 ( 0%).
Sorting extent list for VIDEOPOOL001 pass 1/1
Updating bitmap for VIDEOPOOL001 extents 6704956 (100%).

Checking for dead inodes 741376 (100%).

Checking directories 3902 (100%).

Scanning for orphaned inodes 741376 (100%).

Verifying link & subdir counts 741376 (100%).

Checking free list. 741376 (100%).
Checking pending free list.

Checking Arbitration Control Block.

Checking METAPOOL allocation bit maps (100%).
Checking VIDEOPOOL001 allocation bit maps (100%).

File system 'VIDEO'. Blocks-457852736 free-118699659 Inodes-741376 free-18037.

File System Read-Only Check completed successfully.
mdc:~ root#
[/code]
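The numbers in the cvfsck output above can be cross-checked with a few lines of Python. This is a minimal sketch under stated assumptions: the two regular expressions are written to match the lines exactly as pasted here, and the function names `parse_stripe_groups` and `parse_summary` are hypothetical.

```python
import re

# Lines copied from the cvfsck output above.
CVFSCK_LINES = """\
Stripe Group METAPOOL ( 0) 0x17495c0 blocks.
Stripe Group VIDEOPOOL001 ( 1) 0x1b4a4740 blocks.
File system 'VIDEO'. Blocks-457852736 free-118699659 Inodes-741376 free-18037.
"""

def parse_stripe_groups(text):
    """Map stripe group name -> block count (converted from hex)."""
    pattern = r"Stripe Group (\S+) \(\s*\d+\) (0x[0-9a-fA-F]+) blocks"
    return {name: int(blocks, 16) for name, blocks in re.findall(pattern, text)}

def parse_summary(text):
    """Extract totals and free counts from the final summary line."""
    m = re.search(
        r"File system '([^']+)'\. Blocks-(\d+) free-(\d+) Inodes-(\d+) free-(\d+)",
        text,
    )
    if m is None:
        raise ValueError("summary line not found")
    blocks, blocks_free, inodes, inodes_free = (int(g) for g in m.groups()[1:])
    return {
        "volume": m.group(1),
        "blocks": blocks,
        "blocks_free_pct": 100.0 * blocks_free / blocks,
        "inodes": inodes,
        "inodes_free_pct": 100.0 * inodes_free / inodes,
    }

if __name__ == "__main__":
    groups = parse_stripe_groups(CVFSCK_LINES)
    summary = parse_summary(CVFSCK_LINES)
    print(groups)
    print(f"{summary['volume']}: {summary['blocks_free_pct']:.1f}% blocks free, "
          f"{summary['inodes_free_pct']:.1f}% inodes free")
```

The VIDEOPOOL001 size of 0x1b4a4740 converts to 457852736, which is the same figure as the "Blocks-" total in the summary line, so the summary appears to count only the data stripe group. About 26% of those blocks and about 2.4% of the allocated inodes are reported free.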

singlemalt's picture

OK, so the volume is fine and fairly new. My bet is FC hardware, then. If you've got AppleCare coverage on the MDCs, you could call AppleCare. They should be able to tell you how to get a core dump from the panic and have it analyzed. They've done it for me in the past.

dchadd's picture

No AppleCare. I rebuilt the volume when we went from Xsan 1 to 2. These are old Xserve RAID chassis and Xserves.

abstractrude's picture

So have you tried to isolate anything?

-Trevor Carlson
THUMBWAR

dchadd's picture

Pretty sure I got it working. I think it was the fiber cable on the metadata RAID LUN. It's been up for 20 minutes. I'll post back if I have any more issues. Thanks for your help!

dchadd's picture

Well, it worked for about 45 minutes and then panicked again, and it is now panicking after 7-10 minutes of uptime. I replaced that same cable. I'll let you know how it goes.

abstractrude's picture

Sounds about right. How did you isolate the issue?

-Trevor Carlson
THUMBWAR

dchadd's picture

I took a new SFP cable and swapped it in for each existing cable, one by one, until I found the cable causing the issue.

dchadd's picture

Running strong for two hours! :D

abstractrude's picture

Nice work!

-Trevor Carlson
THUMBWAR
