Fiber: Only the first 6 clients can see the storage?!

DLpres's picture

We have one SANbox 5602 in our facility, serving Mac Pro workstations with ATTO Celerity FC-41 and FC-42 HBAs. It used to serve our Avid Unity with no issues. Last week I got rid of the Unity and upgraded all our systems to OS X 10.8.4.

The problem is an invisible "quota": only the first 6 systems to be powered on get to see the storage. Any machines booted after that see nothing on the fabric, yet ALL the devices (switch and HBAs) on all ports show a good link, and the switch reports that the clients are active and logged in to the fabric.

I already went through the entire fiber infrastructure: I made sure all HBAs have the latest firmware and drivers, ditto for the switch, and that the SFPs and fiber runs match. The switch is fully licensed.
I tried a switch factory reset; set a fixed port type and speed (F for the hosts; the storage only takes FL); ensured only hosts have I/O StreamGuard enabled; tried putting everyone in one zone, and even disabling zoning altogether; I even tried Device Scan on targets.

Nothing made any difference. The only workaround I know is to power-cycle everything, which just resets the "quota" so the next first 6 machines get in.

I should disclose that I'm not running Xsan but metaSAN. That shouldn't matter, though: this is independent of any SAN software, because if the underlying fabric doesn't work then nothing works (besides, I'm still considering Xsan :-) ).

QLogic inspected the switch service dumps and couldn't find a single thing wrong with it. I'm at my wit's end, and I know most everyone here is experienced with similar fiber setups. Do you have any ideas?

DLpres's picture

One more development... until now I've only had ATTO Celerity HBAs in the SAN. Today I tried to add an Apple (LSI) 4Gb HBA. As soon as I booted that client, BOOM - my SAN volume was gone from ALL clients. The clients thought it was forcefully ejected (well, it was). The volume and its storage chassis were completely MIA from Disk Utility on all clients until I restarted the storage.

My prime suspect is the storage, since a second storage chassis I installed does not suffer from any of these issues. I'm working with the vendor (slowly) to see if they can figure anything out (so far they can't...).

Has anyone ever seen anything like this?

xsanguy's picture

What kind of storage are you using?

DLpres's picture

Maxx Digital. It's an old fiber RAID that was mostly a one-off. Actual hardware is by QSAN.
At this point we've shipped the controller to Taiwan for warranty service; we'll see what they find.