Does anyone know whether the block size used by the metadata matches the block size used by the volume? I'm asking for performance tuning of ZFS ZVOLs, where you can specify the block size at creation time.
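For context, this is the ZFS side of what I mean; the pool name, volume name, and sizes below are just hypothetical examples:

```shell
# Hypothetical: create a 100 GB zvol with a 128 KB block size on pool "tank".
# volblocksize can only be set at creation time, which is why I'm asking
# about the metadata block size up front.
zfs create -V 100G -o volblocksize=128K tank/xsanlun0

# Confirm what was set (it's read-only after creation):
zfs get volblocksize tank/xsanlun0
```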
I have a client with a small Xsan volume (about 8TB). They got a used Xserve RAID, and we split it into 2 LUNs and added them to the volume to expand the storage. We also ran a defrag afterwards.
Everything seems to work, except that FCPX is now very slow, especially during exporting. When we copy to/from the Xsan directly in the Finder, it's still fast, so we're not quite sure what the problem is. One thing we did notice: when FCPX works on a local external hard drive, the CPU load stays at 99% for the whole export. When working on the Xsan volume, however, the CPU load only hits 99% for about a quarter of the export. It sounds like something is preventing FCPX from working 100% the way it should.
Wondering if anyone has run into this before and can offer some insight. I am rebooting the whole thing tonight, and if that still doesn't fix it, then I'll have to backup, wipe, and re-do.
If I have most of my storage in a configuration where there is one RAID head with a bunch of JBODs attached so they all share a single controller and fibre port, is it still useful to let Xsan stripe across LUNs? If I didn't break the storage into smaller LUNs so Xsan can stripe them, I could be more economical and use RAID-6 with larger LUNs.
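To make the capacity trade-off concrete, here's a rough sketch with made-up numbers (a hypothetical 16-bay shelf of 4 TB disks, not anyone's actual config):

```shell
# Hypothetical 16-bay shelf of 4 TB disks.
disks=16
size_tb=4

# Option A: one large RAID-6 LUN (2 parity disks total).
raid6_usable=$(( (disks - 2) * size_tb ))

# Option B: four 4-disk RAID-5 LUNs so Xsan can stripe (1 parity disk each).
raid5_usable=$(( (disks - 4) * size_tb ))

echo "One RAID-6 LUN:    ${raid6_usable} TB usable"
echo "Four RAID-5 LUNs:  ${raid5_usable} TB usable"
```

Under these made-up numbers, the single RAID-6 layout yields 8 TB more usable space, which is the economy I'm weighing against any striping benefit.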
I have some RAID units that are run through an Atto SCSI-to-Fibre bridge. The bridge went down, so I shut down my SAN while it was being repaired. Now that I've put everything back, the LUNs that used to show up from the fibre bridge no longer appear in Xsan Admin, so the volumes can't mount. Interestingly, the disks do show up in Disk Utility as Xsan-formatted. Any ideas?
We have one SANBox 5602 in our facility, serving Mac Pro workstations with Atto Celerity FC-41 and 42 HBAs. It used to serve our Avid Unity with no issues. Last week I got rid of the Unity and upgraded all our systems to 10.8.4.
The problem is an invisible "quota": only the first 6 systems to be powered on get to see the storage. Any subsequent machines see nothing on the fabric, yet ALL the devices (switch and HBAs) on all ports report a good link, and the switch reports the clients as active and logged in to the fabric.
I already went through the entire fiber infrastructure, ensuring all HBAs have the latest firmware and drivers, ditto for the switch, and matching SFPs and fiber runs. The switch is fully licensed.
I tried a switch factory reset; set a fixed port type and speed (F for the hosts, the storage only takes FL); ensured only hosts have I/O Streamguard; tried putting everyone in 1 zone, and even disabling zoning altogether; I even tried Device Scan on targets.
Nothing made any difference. The only workaround I know is to power-cycle everything which just restarts the "quota" again.
I should disclose that I'm not running XSAN but metaSAN... but this is independent of any SAN software; if the underlying fabric doesn't work then nothing works (besides, I'm still considering XSAN :-) ).
Qlogic inspected the switch service dumps and couldn't find a single thing wrong with it. I'm at my wit's end, and I know most everyone here is experienced with similar fiber setups. Do you have any ideas?
So after weathering a full failure on one of my LUNs (well, I haven't really gotten through it, but I have all my LUNs back online as far as I can tell from WebPAM on all my Vtraks), my volumes will not mount, and I need some guidance, please!
The long and short of the story is that I had 1 of the 2 drives in my metadata & journal LUN fail at the same time as 4 of 6 drives in a data LUN. With only 4 GR spares available for that drawer, I think you all know what I was faced with. I have a Vtrak Ex610Fd system with 4 E-class and 4 J-class units, with 4 GR spares per E&J pair, all combined into 2 Xsan volumes.
I've attached a screen grab of my LUNs for reference if it helps.
Basically, Vtrak1 (my top E&J pair) had the failures. I successfully recovered the metadata/journal LUN, but the data LUN went offline, and I had no choice but to replace the bad disks and initialize it as a new LUN, duplicating the naming/settings that were there before. Promise support was of little help and told me my volume was lost because of that, which I don't buy, but that's why I'm turning to you all for help here.
I can download and attach whatever other logs/images you need if it helps, or try to explain more, but what I need is some help figuring out how to get my volume back online.
I think the exclamation point icons mean that Xsan can't see those LUNs, but all 4 subsystems are online and OK, so I don't know why I'm stuck here.
Also, I realize that everyone sets things up differently, but is it conceivable that my LUN schema provides for failure in this manner? What I mean is: is Xsan able to deal with a single LUN failing in a multi-LUN volume?
Hopefully that makes some sense?!?!!!
And thanks for anyone's time and effort in advance!
I've been reusing the same LUN numbers (0-6) across my fibre targets without problems until recently. I attached an additional target, intending to create a new Xsan volume on the same MDCs. This time, just having the new storage (Xyratex) on the same SAN knocks one of my other targets (Enhance) offline. I was able to stop that by moving the Xyratex LUNs to higher numbers. However, I still have to leave LUN 0 duplicated, because neither the Xyratex nor the Enhance even shows up on the Mac without a LUN 0 present.
I've never heard of any LUN numbering restrictions in the Xsan docs or anything I've read online. Is there some known reasoning behind what I'm experiencing? I'm still having issues here: even though the Enhance no longer gets bumped off, the Xyratex is ridiculously slow (~10MB/s) and reports errors.