Performance tuning a small Xsan

Sam Edwards:

Hi Folks,
I just set up a small Xsan and I'm pretty disappointed with the performance.
I have two Promise E610FDs that were direct-attached to two Red Hat Linux boxes running Autodesk Flame and Flare. Both have the latest firmware; one is an Apple-branded unit and the other is dark gray.
I used Apple's script to configure them for Xsan, so they are identical. One has the metadata and the other has the journal on its RAID 1. Each has two RAID 5s, one on each controller, full of the stock 1 TB Hitachi enterprise drives, with two spare drives per chassis.

I plugged both of them into a SANbox 5600 switch. The switch is zoned to send each FC channel from the RAIDs directly to one controller port on the workstations. There is also a zone that allows the two-channel MDCs to see all of the RAID controllers.
StreamGuard is configured for all of the computers but not for the arrays. The switch reports that all of the links are 4 Gb. I am using copper cables for the arrays and optical fibre for the computers.

Both Linux machines are connected to the switch with 4 FC ports each.

My MDCs are Mac minis with Promise SANLink adapters.

The Autodesk software used to report ~725 MB/second reads with one RAID direct-attached over 4 ports. Now it reports ~650 MB/second using both arrays. I was really counting on a bit of a speed boost from using two arrays, but that didn't happen.

Any ideas where to look for the bottleneck? I have never direct-attached both RAIDs to one Linux box, but I had always assumed that two would be much faster than one. Slower is not what I had in mind when I started this project.

One idea that strikes me would be to upgrade the drives. Those 1 TB drives are pretty long in the tooth; I believe they are all 7K1000s.

I've seen the suggestion on this site to use SSDs for metadata and journal. How big would they need to be? I'm assuming they would need to go in the Promise array, but maybe I could just get by with one pair...

Thanks in advance for your ideas.
Sam Edwards

xsanguy:

I combine metadata and journal on all of my Xsan installs.

Can you give me a screenshot of how your storage pools are configured?

What you probably have is a situation where you have 8 storage pools total (4 per chassis), but your Xsan config is striping across just 4 of those storage pools at once. You can change this, but you have to go Custom.

Are you set to Balance or Round Robin?

Finally, where are you getting your performance stats from?

Xsan, especially with very big files and multiple chassis, can often outperform a single RAID chassis DAS. I've gotten up to 2 GB/second on 8 Gbit FC with 4 (non-Promise) chassis.

Sam Edwards:

Hi xsanguy,
Thanks for your response.

I can get you a screenshot on Monday.
The Apple configuration script was run on each E610FD, creating:
1x two-disk RAID 1 for metadata
2x six-disk RAID 5s for data (each bound to a different controller)
2x spares
So I am using all of the storage pools; this is confirmed by the capacity, which is 20 TB.
I'm set to Round Robin.
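The 20 TB figure is easy to sanity-check, assuming the stock 1 TB drives and one disk's worth of parity per RAID 5:

```shell
# 2 chassis x 2 RAID 5 LUNs each x (6 disks - 1 parity) x 1 TB per disk
echo "$((2 * 2 * (6 - 1) * 1)) TB"
# prints "20 TB"
```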

Autodesk Flame has its own disk test utility in its preferences, and that application's performance is 95% of why I need this throughput.
What does Xsan mean by 'big files'? Flame deals primarily in frames of uncompressed 10-bit (or greater) HD DPX and EXR files, which range from 8.3 MB to 45 MB. I've been doing the tests in an 8.3 MB/frame project.

This performance is confirmed by the Kona System Test running on the unused MDC.

A Unix tech ran a few command-line tools that bore out these results; I can provide them on Monday.
He suggested we direct-attach the storage to try to isolate the issue to either the switch or the storage. I'll be checking that on Monday as well.
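For the command-line side, a rough sequential-throughput check with dd is the usual starting point. A sketch for a Red Hat client (GNU dd; in practice the test file would live on the Xsan volume's mount point, which varies per setup — /tmp is used here only so the commands run anywhere):

```shell
# Hypothetical test path; point this at the Xsan volume in real use.
TESTFILE=/tmp/xsan_dd_test.bin

# Write 256 MiB in 1 MiB blocks; conv=fdatasync flushes to disk before dd
# reports its rate, so the number isn't inflated by the page cache.
dd if=/dev/zero of="$TESTFILE" bs=1M count=256 conv=fdatasync

# Read it back. For an honest read number, drop the page cache first
# (as root): echo 3 > /proc/sys/vm/drop_caches
dd if="$TESTFILE" of=/dev/null bs=1M

rm -f "$TESTFILE"
```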

I have a case open with Promise.

Best Wishes,
Sam Edwards

Operate a small xSan in a road case.

xsanguy:

Hi Sam, you can be "using" all the storage pools without using all of them at once, if you will.

I'd try setting that array to "Balance" initially, and then we should take a look at how your storage pools are striped, and their stripe settings, in Xsan Admin.
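For reference, the allocation strategy and per-pool stripe settings live in the volume configuration that Xsan Admin writes out. In the classic StorNext ASCII .cfg form it looks roughly like this (the pool and disk label names here are made up):

```
# Global section of the volume config
AllocationStrategy  Balance        # Round (round robin), Fill, or Balance

# One StripeGroup per storage pool
[StripeGroup DataPool1]
  StripeBreadth  16                # in file system blocks
  Node  CvfsDisk2  0
  Node  CvfsDisk3  1
```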

I'll send you a PM with my contact info, I can take a look real quick and give you a much better idea of what's going on.

Sam Edwards:

Hey xsanguy,
That would be great. Samedwards at mac dot com.


xsanguy:

Hey Sam, dropped you an email. I think this lies in your storage pool config in Xsan Admin and your volume settings.

Sam Edwards:

Hey xsanguy,
I didn't see the email. Anyway, a Linux engineer reformatted the arrays with 256 KB stripes, using 7 disks for the RAID 5s, and now I'm getting over 1,350 MB/second, which is about what I'd hoped. He also updated the StorNext client software to 4.x to better match the Xsan 3 hosts.
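That number is actually close to the fabric ceiling, if you assume roughly 400 MB/s of usable payload per 4 Gb FC link:

```shell
# Four 4 Gb FC links from the client, ~400 MB/s usable each (assumed figure)
echo "$((4 * 400)) MB/s ceiling"
# prints "1600 MB/s ceiling"; the observed 1,350 MB/s is ~85% of that
```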
So for the time being I'm fine. I'm actually going to be away from the system for a couple of weeks.
I could probably use some help fixing up the metadata controllers some day down the line. Drop me an email if you'd be interested in doing a remote session to have a look.


xsanguy:

Resent and PM'd. Curious whether ACLs are enabled on your volume, and how your Red Hat hosts are bound. Give me a shout when you have a moment.

Sam Edwards:

Hey Ben,
I'll probably be back at that installation the week after next.
