Xsanity Sanity for Apple's Xsan and Final Cut Server.
  
Saturday, May 25 2013 @ 12:53 PM EDT
Metadata Latency, Is Our Xsan System Setup Correctly?

 
xsanman
fully protected
Joined: 23 Sep 2010
Posts: 11

Posted: Thu Sep 23, 2010 1:52 pm    Post subject: Metadata Latency, Is Our Xsan System Setup Correctly?

Hello all,

I've been tasked with troubleshooting our current Xsan environment, which suffers from high latency in the 3,000 ms range that occasionally breaks read/write access for users. So far the only fix has been to unmount the volume and restart the server.

What I would like to know is whether our Xsan system is set up correctly. We have two identical sets of arrays. Array 1 = 1 x Vtrak E Class with 2 x J Class expansions attached, and Array 2 = 1 x Vtrak E Class with 2 x J Class expansions attached.

The two arrays are not joined together with any SAS cables, though they all converge into the same fiber switch. We have metadata mirrored on the first two drives of the first array, and two dedicated metadata controllers. The two arrays are joined logically in software and are seen as one large production volume.

Is this set up correctly? Should we have metadata on the second array too, like a "multi-san" setup? Each shelf of 16 drives is set up in halves: the first two columns (8 drives) as one RAID 5 set, the second two columns (8 drives) as another RAID 5 set, and so on. Should all 16 drives have been set up as one LUN?

I have limited experience with Xsan but have done a lot of research to become more educated about Xsan deployments. I would like to know the answers to these questions so I can plan to rebuild the system when time and schedule permit.

Thanks in advance for any help.

mjsanders
Could work for Apple
Joined: 02 Nov 2005
Posts: 59

Posted: Fri Sep 24, 2010 6:12 am

At a glance your setup of arrays seems OK to me. It is not clear how the 8-disk RAID 5 LUNs are grouped into the Xsan volume (this is visible in Xsan Admin). If you want more advice we need this.

See http://support.apple.com/kb/HT1200 for common setups of Vtrak (not all are for Xsan!) and posts here, like this one: http://www.xsanity.com/forum/viewtopic.php?p=38493

If you have problems with latency, the first thing to check is the metadata network. (In Xsan the metadata connection from client to Metadata Controller (MDC) is over Ethernet.)
Typically a separate Ethernet switch is used for this.
The port on the Macs used for this metadata Ethernet should be secondary (to keep broadcasts and all other unneeded traffic off this network).

There are many posts here about this network setup.
Here is a deep discussion about redundancy. Between the lines you can learn a lot about the setups: http://www.xsanity.com/forum/viewtopic.php?t=8983

Second: Xsan is not ideal for many small files; it performs best with large files (like HD video). What type of files are used on your SAN?

Which test do you use to see the latency of over 3000 ms?

xsanman
fully protected
Joined: 23 Sep 2010
Posts: 11

Posted: Fri Sep 24, 2010 8:23 am

Hi MJ,

Thanks for taking the time to respond. Here is a picture of the volume information.



[Image: screenshot of the volume information, not preserved]

Our MDCs are on a closed, separate network on a dedicated switch, using separate interfaces on the Xserves. The high latency is seen via the CLI, running the 'latency-test' command in cvadmin. Based on our setup, shouldn't the two separate arrays be attached via SAS cable so they act as one unit? Or does it not matter, considering that the current layout requires two Vtrak E subsystems, which is more expensive? I'm still learning, so I'm asking a lot of why questions :)

Just curious as to why the person set up the SAN this way. Maybe there are advantages to two physically separate arrays connected via a fiber switch but acting as one large volume? Would this cause latency?

Our company does data collection and there are literally millions of files on the SAN: lots of small JPEGs from photo collection of roads, plus Lidar information.

Thanks again for your help.

mjsanders
Could work for Apple
Joined: 02 Nov 2005
Posts: 59

Posted: Fri Sep 24, 2010 12:23 pm

OK, so your metadata network seems well designed
Your Xsan Volume (ProductionData) is built out of many LUN's, which is normal.
The other thing to look for is the way data is written to the Pools (in your setup called Data-1 to Data-6), this is either round-robin, fill or balance.
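
As a rough illustration of the three allocation strategies named above, here is a minimal Python sketch of how each one might choose the pool for the next allocation. It is a simplified model, not Xsan's actual allocator: the function name `pick_pool`, the free-space list, and the tie-breaking are all assumptions made for the example.

```python
def pick_pool(strategy, free_space, last_index=-1):
    """Return the index of the pool that receives the next allocation.

    free_space: free bytes per pool, in volume order (hypothetical input).
    last_index: pool used by the previous allocation (round-robin only).
    """
    candidates = [i for i, free in enumerate(free_space) if free > 0]
    if not candidates:
        raise ValueError("no pool has free space")

    if strategy == "round-robin":
        # Take the next pool after the one used last time, wrapping around
        # and skipping pools that are already full.
        n = len(free_space)
        for step in range(1, n + 1):
            i = (last_index + step) % n
            if free_space[i] > 0:
                return i
    if strategy == "fill":
        # Keep writing to the first pool that still has room.
        return candidates[0]
    if strategy == "balance":
        # Write to the pool with the most free space.
        return max(candidates, key=lambda i: free_space[i])
    raise ValueError(f"unknown strategy: {strategy}")
```

With six data pools like Data-1 to Data-6, round-robin spreads new files evenly across the pools, fill concentrates them on the first pool until it is full, and balance favors whichever pool is emptiest.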

For best bandwidth/throughput with video I would recommend more LUNs in one pool (4 or 6), but given the type of files you use, bandwidth is not a likely bottleneck, so you should not bother to change this.

Watch out: many (most) changes in the config of an Xsan volume mean that all data is lost. (That's why most Xsans are not reconfigured often: backing up a few TB of data, reconfiguring, and restoring the same amount of data can take days.)
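
To put a number on "can take days", here is a back-of-the-envelope sketch in Python. The data size and sustained copy rate in the usage note are hypothetical, not measurements from this SAN, and the function names are made up for the example.

```python
def copy_hours(terabytes, mb_per_sec):
    """Hours needed to move `terabytes` of data at a sustained `mb_per_sec`."""
    total_mb = terabytes * 1_000_000  # decimal units: 1 TB = 1,000,000 MB
    return total_mb / mb_per_sec / 3600


def backup_and_restore_days(terabytes, mb_per_sec):
    """Days for a full round trip: copy out to backup, then copy back."""
    # The data moves twice; the reconfiguration time itself is not counted.
    return 2 * copy_hours(terabytes, mb_per_sec) / 24
```

For example, `backup_and_restore_days(20, 200)` (20 TB at a sustained 200 MB/s) comes out at roughly 2.3 days. Millions of small files usually copy well below the sequential rate of the storage, so a real run could take considerably longer.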


You are wondering why the setup is like that:
There is a limit to the number of Vtrak expansion units you can add to a controller (E-class). For video (high bandwidth) Apple recommends a maximum of 1 expansion unit; the hard limit is 4.
That many disks can provide a lot of MB/sec of bandwidth, and the controller and its FC connections will become the bottleneck.

So your setup of 2 controllers with each 2 expansion units seems a good compromise between price and performance for an Xsan with mostly small files.

The Xsan software combines the FC devices (LUNs), and can be tuned better if you have more LUNs, so your setup is more or less built by the book (apart from the pools of 2 LUNs; normally this should be 4 or 6).

I presume that all 4 FC ports on the Vtrak E class are used?

Read this guide if you are new to Xsan: http://images.apple.com/xsan/docs/Xsan_2_Admin_Guide_v2.2.pdf (I presume you use Xsan 2.2)

xsanman
fully protected
Joined: 23 Sep 2010
Posts: 11

Posted: Fri Sep 24, 2010 1:41 pm

MJ,

Thanks again for your help. You've been very helpful! Regarding the LUNs, does it matter that they were created in vertical columns? Drive numbers in the Promise array run from 1-16 across instead of down, which means each LUN is made up of drive numbers that are out of sequence.

From what you've said, it appears that having more LUNs in a pool increases performance. When would one use 4 vs 6 LUNs in a pool? I was aware that 4 is the limit for expansions, but I thought that since the controllers were designed to handle 4, throughput would not be an issue. We have a 4 x GigE card in the Xserve serving up AFP; the ports are bonded going into the LAN switch. I don't know whether a 10Gb card and a module for the switch would help end-user access performance or would be overkill.

I did notice that only two of the four FC ports are being used on the E class chassis. I saw two cables hanging down from the FC switch but don't know why. I will have to ask the person who helped set it up.

I read that one can tune the Xsan software to better handle small or large files and different file types. Would that be considered a change to the Xsan volume? The company currently has no backup of the data except a bunch of HDs that are used for archiving; there is no 'live' backup of the production data. It scares the hell out of me when it comes to changing anything in Xsan.

I would love to get another Xsan into production with 2TB drives in the Promise setup, copy this data over, and completely redo the current setup, but that might not happen for a while. BTW, all LUNs are RAID 5, which takes forever to rebuild. If it were up to me I would have done RAID 6 and sacrificed a little space. Any thoughts on this? Also, does the Promise support global spares?
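
The "little space" traded for RAID 6 can be worked out with simple arithmetic. A minimal Python sketch follows, assuming the 8-drive LUNs described in this thread; the drive size used in the usage note is hypothetical, and the helper names are invented for the example.

```python
# Number of drives' worth of capacity consumed by parity per RAID set.
PARITY_DRIVES = {"raid5": 1, "raid6": 2}


def usable_tb(level, drives, drive_tb):
    """Usable capacity of one RAID set, ignoring formatting overhead."""
    parity = PARITY_DRIVES[level]
    if drives < parity + 2:
        raise ValueError(f"{level} needs at least {parity + 2} drives")
    return (drives - parity) * drive_tb


def failures_tolerated(level):
    """How many drives in one set can fail without data loss."""
    return PARITY_DRIVES[level]
```

For an 8-drive LUN of hypothetical 1 TB drives, `usable_tb("raid5", 8, 1.0)` gives 7.0 and `usable_tb("raid6", 8, 1.0)` gives 6.0: one drive's worth of space per LUN in exchange for surviving a second drive failure, for instance during a long rebuild.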

I sent you a PM but don't know if you got it?

Thanks!

mjsanders
Could work for Apple
Joined: 02 Nov 2005
Posts: 59

Posted: Tue Sep 28, 2010 4:09 pm

It does not matter how the LUNs are 'aligned', vertical or horizontal.
Although it is possible to build a LUN from disks in different chassis (E and J), I heard an unconfirmed rumor that it is best for performance to keep the disks of a LUN inside the same chassis (all Apple sample scripts do this).

Inside a pool all data is striped over all LUNs, so having 4 or 6 LUNs in a pool means striping over 4 or 6 LUNs, meaning 4 or 6 I/Os (possibly multipathed over all available FC connections).

Each LUN should consist of at least 7 (or 5) disks (otherwise the combined throughput of the disks is the bottleneck), so 4 or 6 depends on your total number of disks/chassis.
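
The striping described above can be pictured with a small Python sketch. This is a generic RAID-0-style model, not Xsan's real on-disk layout: the function names are invented, and the stripe breadth (in file system blocks) and block size used in the usage note are assumed values.

```python
def lun_for_offset(offset_bytes, num_luns, stripe_breadth_blocks, block_size_bytes):
    """Index of the LUN that holds the byte at `offset_bytes` within a file."""
    stripe_bytes = stripe_breadth_blocks * block_size_bytes
    # Consecutive stripe-sized chunks rotate across the LUNs in the pool.
    return (offset_bytes // stripe_bytes) % num_luns


def luns_touched(file_size_bytes, num_luns, stripe_breadth_blocks, block_size_bytes):
    """How many LUNs a file of this size is spread over in this model."""
    stripe_bytes = stripe_breadth_blocks * block_size_bytes
    chunks = -(-file_size_bytes // stripe_bytes)  # ceiling division
    return min(chunks, num_luns)
```

With an assumed 16 KB block size and 4 LUNs in a pool, a 1 MB file touches only 1 LUN at a stripe breadth of 256 blocks (4 MB chunks) but 2 LUNs at a breadth of 32 (512 KB chunks), while a 100 MB file is spread over all 4 LUNs either way. That is one way to see why large files benefit from wide pools while small files mostly do not.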

If you connect all ports of the Vtrak, multipathing may make the SAN 2x faster.

If your Xserve (doing AFP resharing) is the only Xsan client, the dual-port FC card in the Xserve is the Xsan bottleneck.
In general the Ethernet and/or AFP process will not fill the bandwidth of the dual-port Fibre Channel card of the Xserve.
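
A quick nominal comparison of the two paths, as a Python sketch. The 4 Gb/s Fibre Channel port speed is an assumption (the thread does not state it), and the figures ignore encoding and protocol overhead, so real throughput will be lower on both sides.

```python
def aggregate_mb_per_sec(links, gbit_per_link):
    """Nominal combined rate of `links` parallel links, in MB/s."""
    # 1 Gb/s = 1000 Mb/s; divide by 8 to convert bits to bytes.
    return links * gbit_per_link * 1000 / 8


bonded_gige = aggregate_mb_per_sec(4, 1)  # 4 x GigE bonded to the LAN
dual_fc = aggregate_mb_per_sec(2, 4)      # dual-port FC, assumed 4 Gb/s per port
```

Under these assumptions the bonded Ethernet side tops out at a nominal 500 MB/s against 1000 MB/s for the FC card, which is consistent with the point that AFP over Ethernet is unlikely to saturate the Fibre Channel side.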

For tuning for small files, read this post: http://www.xsanity.com/forum/viewtopic.php?t=8343
Most important quote:
"The three things that seemed to make the biggest impact on performance was disabling ForcedReadAhead on the vtrak controllers, changing the read policy to ReadCache instead of ReadAhead, and dropping the stripe breadth on the metadata affinity / storage pool down from 256 to 32."

And this one (if you run AFP home folders, read it; I contributed some :)
http://www.xsanity.com/forum/viewtopic.php?t=4683

There are several performance issues with AFP resharing, depending on typical usage (file size, number of open files, number of concurrent users, whether ACLs are used, etc.). Search this forum for AFP resharing and start reading. Keep in mind that these issues can be very version dependent (both OS X and Xsan versions).

I guess that setting up a second Xsan is the best way to do two things at the same time:
- a chance to test your config (optimise the new Xsan)
- have your old one as backup (maybe rebuilding, i.e. erasing, it in between to optimise it after all data is copied to the new Xsan)

But you need the new hardware....

And yes, Vtrak supports global spares, optionally revertible.

xsanman
fully protected
Joined: 23 Sep 2010
Posts: 11

Posted: Tue Sep 28, 2010 5:02 pm

mj,

Regarding your suggestions:

I disabled ForcedReadAhead on the controller but had trouble locating the read policy and metadata affinity settings.

Also, would any of these changes be potentially destructive to the current production data/volume?

Thanks again so much for taking the time to help!


"The three things that seemed to make the biggest impact on performance was disabling ForcedReadAhead on the vtrak controllers, changing the read policy to ReadCache instead of ReadAhead, and dropping the stripe breadth on the metadata affinity / storage pool down from 256 to 32."

mjsanders
Could work for Apple
Joined: 02 Nov 2005
Posts: 59

Posted: Wed Sep 29, 2010 3:10 pm

You can change the Vtrak settings (ForcedReadAhead and ReadCache) without data loss; there may be a short pause if users are accessing files while you change the settings.

Changing the stripe breadth on the metadata pool is, I guess, data-destructive.

But you have a backup, don't you? :)

xsanman
fully protected
Joined: 23 Sep 2010
Posts: 11

Posted: Wed Sep 29, 2010 3:57 pm

mj,

I was able to locate the ReadCache setting under the logical drives. Our current settings are:

Read Policy = ReadAhead
Write Policy = WriteBack

What settings would you recommend?

We have a total of 13 logical drives (two separate E class subsystems, each with 2 x J class attached). One set has 7 logical drives, 1 for the metadata and 6 for the data; the other set has 6 for data. I would have to change the read policy one logical drive at a time. Is it going to be a problem that there would be a mix of read policies until all were changed? Lastly, do you recommend changing the read policy on the metadata logical drive too?

To answer your question: unfortunately we do not have a backup; our production data is our only copy. These 'challenges' are things I've been tasked with.

I'm trying to learn as much about the Xsan as possible since I was not involved with setting it up.

I can't thank you enough for your help so far!

All times are GMT - 5 Hours