cvcpSync - Sync your Xsan volumes the fast way.
The problem
We have a large Xsan volume of 28 TB that we want to sync every night to another Xsan volume of the same size. In the beginning, when the volume was smaller, rsync did the job. But now, with 28 TB, rsync is simply too slow to get the job done in one night.
The solution
I've made a script that uses rsync to get a list of all the changes and then uses cvcp to actually copy the data. This dramatically improves the speed. I tested with 145 GB of data: rsync did it in 70 minutes and the script did it in 40.
While testing this I noticed that the CPU and fibre could easily handle another cvcp command at the same time, so I changed the script to do just that. After that, the script did the same 145 GB in around 20 minutes.
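A rough sketch of the first half of that idea, not the actual cvcpSync code: ask rsync for a dry run and keep only the file names it would transfer. The function names are made up for illustration, and the exact dry-run output may differ between rsync versions.

```shell
#!/bin/sh
# Sketch only: NOT the real cvcpSync code. Function names are hypothetical.

# Drop directory entries (rsync prints them with a trailing slash),
# keeping only the paths of files that need to be copied.
files_only() {
    grep -v '/$'
}

# Print the paths, relative to SRC ($1), that rsync would transfer to DST ($2).
# --dry-run means nothing is copied at this stage.
list_changes() {
    rsync -a --dry-run --out-format='%n' "$1" "$2" | files_only
}
```

Something like list_changes /Volumes/MySan/Folder/ /Volumes/MyOtherLocation/Folder/ would then produce the list that the copy step works through.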
I now run a sync every night, and it takes three hours on average. The amount of data is of course different every night; at most it needs to transfer 1.5 TB. That takes a little longer, but it is still done by morning.
The script
The script is too big to simply copy and paste here. I've put it on a website where you can download it. For free, of course.
If you put the script in /usr/bin/ or /usr/sbin/, you can simply call the command cvcpSync instead of typing the full path every time. Make sure you run chmod 755 on it to make it executable.
The script has the following options (flags):
- -H Help.
- -d Debug mode.
- -r Do not perform a final rsync afterwards.
- -t Do not copy the .Trashes folder.
- -cX Use X multiple cvcp commands. Default is 3.
- -sX Sleep for X seconds between process checks. Default is 1.
EXAMPLE:
cvcpSync -c5 -s2 /Volumes/MySan/Folder/ /Volumes/MyOtherLocation/Folder/
This would use a maximum of 5 cvcp commands at the same time. When all 5 commands are running, it checks every 2 seconds whether one of them has ended.
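The way -c and -s work together can be sketched roughly like this. It is a minimal illustration under assumptions, not the real script: COPY_CMD is a hypothetical hook that defaults to a plain cvcp call with a source and a destination, and the function names are made up.

```shell
#!/bin/sh
# Sketch only: NOT the real cvcpSync code.
# COPY_CMD is a hypothetical hook; by default it runs cvcp SOURCE DEST.
COPY_CMD=${COPY_CMD:-cvcp}

# Echo only those PIDs from the arguments that are still running.
still_running() {
    for pid in "$@"; do
        if kill -0 "$pid" 2>/dev/null; then
            printf '%s ' "$pid"
        fi
    done
}

# Echo the number of arguments.
count_args() {
    echo "$#"
}

# copy_throttled MAX SLEEP SRC DST
# Reads relative paths on stdin and starts one background copy per path,
# never more than MAX at once. When all slots are busy it sleeps SLEEP
# seconds between checks, like -c and -s described above.
copy_throttled() {
    max=$1 pause=$2 src=$3 dst=$4
    pids=""
    while IFS= read -r rel; do
        pids=$(still_running $pids)
        while [ "$(count_args $pids)" -ge "$max" ]; do
            sleep "$pause"
            pids=$(still_running $pids)
        done
        $COPY_CMD "$src/$rel" "$dst/$rel" < /dev/null &
        pids="$pids $!"
    done
    wait    # let the last copies finish
}
```

With -c5 -s2 from the example this would be called as copy_throttled 5 2 /Volumes/MySan/Folder /Volumes/MyOtherLocation/Folder, with the list of changed files on standard input.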
-d : Debug mode
In debug mode you'll see a lot more messages appear on the screen. This is of course for debugging while writing the script, but it's very important for configuration as well.
-r : Do not perform a final rsync afterwards.
There is one problem with cvcp: it doesn't create folders.
So when I say copy /Volumes/mySan/myFolder to /Volumes/myRedundantSan/myRedundantFolder and myRedundantFolder doesn't exist, cvcp gives an error instead of creating the folder itself.
Because of this I have to create the folder before the data can be copied. This brings up another problem: I create this folder as root (or as the user running the script), so the folder has a different owner, permissions and date stamp. To solve this I run another rsync after cvcp is finished. If all went well, this final rsync only changes the owner, permissions and date stamp of these folders, as well as some resource forks.
The downside of this is that you run rsync twice. Those of us who know rsync know that it can take a while to build up a file list, and it has to do this again.
If owner, permissions and date stamp don't matter to you on your redundant SAN, you can turn this off with -r. If you, like me, need an exact copy of the data, then simply don't use -r.
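The two steps described here could look roughly like the sketch below. It is an illustration, not the code from the script: the function names are made up, and the article does not say which rsync flags the final pass really uses, so plain -a is an assumption.

```shell
#!/bin/sh
# Sketch only: NOT the real cvcpSync code. Function names are hypothetical.

# make_parent_dir DST REL
# Create the parent folder of DST/REL if it does not exist yet,
# because cvcp will not create it. The new folder gets the owner and
# permissions of whoever runs the script, which is why a fix-up follows.
make_parent_dir() {
    mkdir -p "$(dirname "$1/$2")"
}

# final_rsync SRC DST
# Final pass: let rsync put owner, permissions and time stamps right.
# The -a flag is an assumption; the real script may use other options.
final_rsync() {
    rsync -a "$1" "$2"
}
```

Skipping final_rsync is what -r amounts to in this picture: the data is there, but the folders keep the owner and dates they got from mkdir.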
-t : Do not copy the .Trashes folder
When copying a whole volume you copy all the trash as well. In our case, with the 28 TB, the .Trashes folder sometimes contains more than 1 TB. I figure that when people have thrown stuff away, I don't need to copy it. This saves me data and time. If that applies to you too, use the -t option.
If you really want an exact copy, trash included, then don't use the -t option.
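One way to picture what -t does is a filter on the list of changed paths. This is only a sketch with a made-up function name, and the real script may well do it differently, for example by passing --exclude to rsync when the list is built.

```shell
#!/bin/sh
# Sketch only: NOT the real cvcpSync code. The function name is hypothetical.

# Read relative paths on stdin and drop everything that is, or lives
# inside, a folder named .Trashes.
drop_trashes() {
    grep -v -E '(^|/)\.Trashes(/|$)'
}
```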
-c : Multiple commands
By default, cvcpSync uses a maximum of three commands at the same time.
If you want more or fewer, use -c[number] with however many your server can handle, for example -c2. See the configuration part for more details.
-s : Sleep
When all the commands are running, we need to check when the currently running commands have finished. By default the script checks every second.
If you only have large files, you could set this to a couple of seconds. If you have a lot of small files, I suggest you keep this at 1. See the configuration part for more details.
So it's simply -s4 to wait 4 seconds.
Configuration
It's important to configure the script. This is how I do it.
First of all, find your best configuration on a test server. If you have a test volume, use that. Needless to say, the server must have at least one Xsan volume mounted and have cvcp installed. Please do not put this into production right away.
Initially, run this script only in debug mode and with only one command, like this:
cvcpSync -d -c1 /Volumes/mySan/ /Volumes/myRedundantSan/myFolder
It's best to have Activity Monitor open while running this script.
During the first run, keep an eye on the CPU load and disk activity.
As said before, debug mode gives you a lot of information on the screen. Look for this string in the debug output: "all_running : All available commands are running." It means that the maximum number of commands has been reached and the script is checking whether a process has finished. If you see it a lot, consider an extra command.
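If the debug output scrolls by too fast to judge, it can be saved to a file and the string counted afterwards. A small sketch, assuming the output was captured in a log file; the function name and the log path in the usage note are made up.

```shell
#!/bin/sh
# Sketch only. The function name is hypothetical.

# count_all_running LOGFILE
# Print how many lines of the saved debug output contain "all_running",
# that is, how often every available command slot was busy.
count_all_running() {
    grep -c 'all_running' "$1"
}
```

For example, run cvcpSync -d -c1 /Volumes/mySan/ /Volumes/myRedundantSan/myFolder 2>&1 | tee /tmp/cvcpsync-debug.log and then count_all_running /tmp/cvcpsync-debug.log. The 2>&1 is there because I am not assuming which stream the debug messages go to.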
Most people will never use just one command (that's why the default is 3), so in the beginning you will always need an extra command. But it's important to see how your server is doing while running cvcpSync.
Remember: when you want to add another command, always check whether your CPU and fibre speed can handle it!
When your CPU or fibre can't handle any more commands, then use the -s option to wait longer.
This will in most cases help to reduce the CPU load as well.
But there is a downside. Say you have large and small files on your volume and three commands running simultaneously with a sleep of 5 seconds. When the maximum number of commands is reached, the script waits 5 seconds before it checks whether any processes have finished.
So it's possible that cvcpSync was only copying small files that finished within a second, yet we told it to wait 5 seconds. Imagine doing this for a whole volume: it could take some time.
You might also think that, because this script runs at night, you can kick your server and fibre on its tail and go full speed. I don't advise this. I reckon that when you don't need to kick your server on its tail, don't do it. Be nice to him, then he might be nice to you!
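To put a rough number on that: if every file is small, each batch of commands costs one full sleep. The sketch below is only a back-of-the-envelope upper bound, and both the function name and the figures are made up for illustration.

```shell
#!/bin/sh
# Sketch only: a rough upper bound, with hypothetical numbers.

# idle_seconds FILES COMMANDS SLEEP
# Seconds spent sleeping if every batch of COMMANDS small files
# costs one full SLEEP interval.
idle_seconds() {
    echo $(( $1 / $2 * $3 ))
}

idle_seconds 30000 3 5    # 30,000 small files, -c3 -s5
idle_seconds 30000 3 1    # the same files with -s1
```

For 30,000 small files and three commands this prints 50000 with a sleep of 5 (almost 14 hours of waiting) against 10000 with a sleep of 1 (under 3 hours).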
You can run this script automatically via launchd. I recommend you use Lingon to achieve this.
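If you'd rather write the launchd job by hand, a sketch could look like this. Everything specific in it is an assumption for illustration: the label com.example.cvcpsync, the 01:00 start time, the flags and the volume paths.

```shell
#!/bin/sh
# Sketch only. Label, schedule, flags and paths are examples, not from the article.

# write_cvcpsync_plist OUTFILE
# Write a launchd job that starts cvcpSync every night at 01:00.
write_cvcpsync_plist() {
    cat > "$1" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.example.cvcpsync</string>
    <key>ProgramArguments</key>
    <array>
        <string>/usr/bin/cvcpSync</string>
        <string>-c3</string>
        <string>-s1</string>
        <string>/Volumes/mySan/</string>
        <string>/Volumes/myRedundantSan/myFolder</string>
    </array>
    <key>StartCalendarInterval</key>
    <dict>
        <key>Hour</key>
        <integer>1</integer>
        <key>Minute</key>
        <integer>0</integer>
    </dict>
</dict>
</plist>
EOF
}
```

The file would normally go in /Library/LaunchDaemons/ and be loaded with sudo launchctl load followed by its path.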
Finally a word to the wise...
Although I haven't had any problems with the script the risk is still all yours. Whatever may happen while running this script: I'm not responsible! [Ed. Note: Neither is Xsanity!]
This script was made and tested on Mac OS X 10.5.5, on a machine running Xsan 1.4.2. It has not been tested on a machine running Xsan 2 or higher!
You'll need a working Xsan or StorNext environment. You'll also need cvcp itself, because the script calls this command many times.
I'm curious if this script works for you. Let me know!

Comments
I've got no reason to doubt
I've got no reason to doubt Jasper's skills as a programmer, but I do want to
remind everyone to use caution with scripts like this. Do as he recommends, and use
debug mode initially.
Please do post your experiences here!
---
Aaron Freimark
http://www.tekserve.com/vcard/af.vcf
Aaron Freimark
CEO, GroundControl
To add to this myself: Please
To add to this myself: Please don't run this on your metadata controller.
Really, use another server for this. We have a script server for this and that
works great.
What's better about doing
What's better about doing this compared to just using cvcp on its own? cvcp
has -t to use more threads, along with some other parameters to adjust the
resources used. I'm not trying to be obnoxious here. I'd really like to know if
your way is better, or just different.
I use 'cvcp -t 12 -uxy /Volumes/volname/Groups
/Volume/vol2name/Backups/Groups' to backup all group directories to
another volume. This copies updated files (-u) using 12 threads (-t),
retaining permissions (-x) and owner/group (-y). I wish cvcp
could create directories too, but specifying a top-level directory (Groups)
gets around that.
Thanks for sharing.
Hi,
As far as I know the -u option does not delete files from the target.
Also, I like to be able to use the -z option to keep the same modification
dates.
This is not possible with -u in cvcp. It's also not possible to copy tar files with
-u.
In short, I just want an rsync: a real and exact daily copy of my data, but
faster than rsync.
What about the Apple KB
What about the Apple KB entry that says not to copy media files via cvcp?
http://support.apple.com/kb/TA24864
Did you test that?
The Xsan command line tool cvcp does not copy file's extended attributes. Since some multimedia applications rely on media file's extended attributes for proper functionality, you should not use cvcp to transfer media files between Xsan volumes.
The file list that rsync
The file list that rsync makes does contain the resource forks.
cvcp copies these resource forks as well.
Also the final rsync at the end does copy these. And because they are so small
it's okay that rsync does it.
So far I haven't had problems with this.
The other solutions Apple suggests in that article (use the Finder or cp) are just
way too slow for us.
This is a wonderful thing. We
This is a wonderful thing. We are currently testing it with Xsan 2.1.1 in a 10.5.6 environment.
There is one request I would like to make, however. Would it be at all possible for you to add the functionality of an exclude list? We are backing up an Xsan of home directories and we would like to exclude some key items such as Library/Caches. Let me know what you think.
Thanks!
Will this only work from Xsan
Will this only work from Xsan volume to Xsan volume? I am syncing from our Xsan volume to a raid on a Linux machine. Any ideas?
I am syncing from an Xserve
I am syncing from an Xserve RAID to a SAN volume and it is working great. It seems to be much quicker than rsync, but I don't have any benchmarks to back that up.
Hi,
Glad it works!
And a very good idea.
I reckon you know/use the -t flag as well? Don't copy trashes?
Will something like this be enough for you? A don't copy Caches option?
Or do you want to be able to exclude more folders/files?
Cheers!
Jasper
Hi,
I'm really curious about your results.
Is it possible to run an rsync to compare time and throughput with
cvcpSync?
Jasper
If anyone is interested, I
If anyone is interested, I have modified this script to utilize rsync's exclude-from argument. You can create a file with files to exclude and pass it as an argument to the script.
Yes Please!
If it's any good I can put it on the site.
I just found a kbase article.
I just found a kbase article. You may want to be careful using the cvcp command.
I don't know a whole lot about this but it may be something to look at.
http://support.apple.com/kb/TA24864?viewlocale=en_US
@ morrisce - You must've
@ morrisce - You must've missed the earlier post re: this KBase article.
On its own, cvcp ignores resource forks, but this
script calls several other actors to play their respective parts (i.e. rsync).
It's my understanding (from
It's my understanding (from the posted KB article) that the extended attribute
limitation of cvcp was exclusive to Xsan 1.4. I just transferred an entire volume
(Xsan 2.2.2) without any issues. Can anyone else comment on this?
---
Peter G. Sengstock