Xsanity Sanity for Apple's Xsan and Final Cut Server.
  
Wednesday, June 19 2013 @ 04:00 AM EDT
Final Cut Server
Automated Archive between Final Cut Server and BakBone NetVault: A Case Study

Final Cut Server's Archive feature is, at best, a primitive attempt to solve a very difficult problem related to its assets. After all, the assets can't live on your primary storage device forever. One day they need to be archived. I call the feature primitive because it performs the simplest of functions: it moves the Primary Representation of an Asset to an Archive Device. This device is any kind of Final Cut Server device (local storage, Xsan volume, NFS share, etc.) with the added honor of being designated as an Archive Device. The corresponding Restore feature simply moves that Primary Representation back to its original device and path.

This last part is the real zinger. Who ever heard of an archive system that removes an archived file from its archive location? At best one should make a copy of the archived file if it needs to be restored.

There's not really much else to work with on the Final Cut Server side. For the majority of our deployments, therefore, we've had to think beyond this limitation to provide our customers with a fleshed out system for archive.


Initially, our suggestion was always a two-stage Hierarchical Storage Manager, or HSM. In these systems, the spinning-disk part of the system (stage one) is like a front porch on which you initially place the file. This spinning-disk storage is usually shared out on the network, and this is what we identify to Final Cut Server as the Archive Device. Final Cut Server then merrily places its files there and declares them archived. Then, according to rules you set within the HSM, the storage manager of the HSM quietly moves these files onto removable storage, usually a tape library (stage two). But here's the catch: the "stub," or base metadata of the file, stays on the spinning-disk storage, taking up only a few bytes. For all intents and purposes, the file looks like it's still there, but its data has been spirited away onto tape.

If Final Cut Server comes knocking for the file during a restore, the HSM can provide the actual file from the spinning disk, that is, if it hasn't moved the data essence onto tape. If only the stub is there, the HSM gets the tape where the data resides and feeds the data back onto the spinning disk. All this time, Final Cut Server simply tugs at the file until bytes start coming down the pipe, unaware of the fancy ruse that is being played under its nose to make it think that the file is actually there. And best of all, once Final Cut Server tries to delete the file from the spinning disk, as it wants to do, the HSM simply lies and says "yup, I've deleted it!" but then doesn't do so and giggles to itself.

Most importantly, HSMs allow regular ordinary end users using Final Cut Server to archive and restore files to and from their storage systems without one iota of intervention from archivists or admins. In fact, end users are usually completely oblivious to the monstrous equipment in the server room that drives these features. And well they should be. HSMs truly are "set it and forget it" solutions.

HSMs are, therefore, the best partnered storage for a complete archive solution for Final Cut Server. But guess what? They cost a lot of money. Besides, a lot of responsible facilities already have a lot of money and brain time (and usually a good deal of frustration and troubleshooting) invested in archival systems, complete with archive software.

Some of our customers chose BakBone's NetVault software, and sometimes on our suggestion. That's because we feel that NetVault is the most approachable system for all kinds of backup and archive strategies. It also has, by far, the most extensible command line interface of all the packages out there. Again, our opinion. I can't wait for the Ford vs. Chevy comments to begin.

But NetVault has none of this fancy stub file stuff in it. It does, however, have multiple-stage archiving, where you can place files onto virtual disk devices (or virtual tape libraries which literally mimic the operation of a tape library) before they go onto the real thing. You can also set age policies for these stages.

I felt that the integrator and support community needed a little push to realize that these kinds of archive solutions can be bolted onto Final Cut Server as well. The extra help that is needed is a few scripts that create an intelligent conversation between the two pieces of software. (I think they call this Middleware.) Remember, our goal here is an archival system for assets in a Final Cut Server catalog that end users, all by themselves and only using Final Cut Server, can initiate, without intervention.

So below you'll find four scripts, two that get fired from Final Cut Server, and two that get fired from BakBone NetVault, that create a completely automated archival system hanging off of Final Cut Server. These will work whether the assets are manually archived using the contextual menu's Archive selection, or whether they are archived via the Move to Archive response based on some fancy subscription that you've dreamed of.

The only differences between these scripts and the actual ones we deploy are:

1) These scripts are completely "street legal." They don't use any allegedly hidden features of Final Cut Server that may or may not be available for scripts. Therefore, those of you who are reading into this can freely tune the parts that seem obtuse at your own peril.

2) These scripts use rather primitive output files to keep track of NetVault jobs and their eventual status. Our deployments utilize a cute little sqlite database, which offers a little more confidence in case we need to research previous jobs or why jobs failed. Again, folks who know sqlite can substitute all the output commands (>>) for something more tasteful. And by the way, if you're not familiar with sqlite, take some time to learn it. It's open source (www.sqlite.org), already compiled into Mac OS X, and with a tiny bit of learning you can create some very robust and reliable database structures for your scripts.
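For the curious, here is a rough sketch of what swapping the >> output lines for sqlite might look like. It is not what our deployments run: the database path, the table, and the column names are all made up for illustration, and it only assumes the stock sqlite3 command line tool.

```bash
#!/bin/bash
# Sketch only: record jobs in sqlite instead of appending (>>) to a text file.
# The database path, table name, and column names are all hypothetical.
nvdb="${nvdb:-/tmp/fcs_netvault_jobs.sqlite}"

sql_quote() {   # double any single quotes so a value is safe inside '...'
    local s=$1 q="'"
    printf '%s' "${s//$q/$q$q}"
}

init_db() {
    sqlite3 "$nvdb" "CREATE TABLE IF NOT EXISTS jobs (
        job_title TEXT PRIMARY KEY,
        asset_id  INTEGER NOT NULL,
        path      TEXT NOT NULL,
        status    TEXT NOT NULL DEFAULT 'submitted'
    );"
}

record_job() {   # asset_id job_title path
    local asset_id=$1 job_title=$2 path=$3
    # The Asset ID goes into the SQL unquoted, so insist on a plain integer.
    case $asset_id in ''|*[!0-9]*) return 1 ;; esac
    sqlite3 "$nvdb" "INSERT OR REPLACE INTO jobs (job_title, asset_id, path)
        VALUES ('$(sql_quote "$job_title")', $asset_id, '$(sql_quote "$path")');"
}

job_for_path() {   # path
    sqlite3 "$nvdb" "SELECT job_title FROM jobs
        WHERE path = '$(sql_quote "$1")' LIMIT 1;"
}
```

Call init_db once, record_job wherever the text-file version appends a line, and job_for_path wherever it greps one back out.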

A little explanation is in order as to how these scripts work.

First, what we are accomplishing is a single-file archive job, launched for every asset that the end user chooses to archive, or for every asset that a subscription finds. You might exclaim, "That means hundreds or thousands of single-file archives cluttering up my beautiful NetVault catalog!" And my answer is yes, that's true, but NetVault can handle it just fine, and again, what we're after here is automation, not necessarily efficiency. During day to day operations, we want to allow end users to do this themselves, and we want to sit at our desks and watch the job requests come into NetVault without us having to lift a finger. We want to go to lunch at lunchtime and we want to go home at (heaven forbid) the same end of the day that the creatives do. If you're still with me, read on. If not, go back to making your archive jobs for the night.

You might suppose that we use the "Post-Archive Command" field from the Archive Device's configuration window (found in the Advanced Admin window). We don't, mostly because that command only passes one variable into the script, which is the full path of the archived file. That's nice, but not enough information to keep track of it for automation. Besides, we can construct the full path to the archived file with a little ingenuity. What we really want is that truly unique, never-repeated integer called the Asset ID. We use the Asset ID to drive a lot of the automation, mostly because it is unique in the entire catalog. Files can and often do have the same name on a large storage system, but the Asset ID ensures that we archive or restore _this_ instance of mommagotsprungfromjail.mov and not the one from the other folder.

So we fire this off with a subscription to a little-known field called Archive Status. This field is hard wired in Final Cut Server to only contain three values: Online, Offline or Unknown. If the Archive Status is Offline, that means that the Primary Representation of the asset has been archived, and if we subscribe to that when it changes to Offline, we are sure that the archive occurred moments ago. (Thanks again, Drew Tucker of APS, for this enlightenment.) The Archive Status field isn't added to any Metadata Groups: it's just sitting there in the soup of the database. So we add it to the Asset Filter group in order to subscribe to it. The subscription fires off a script response that in turn fires off our fcs_post_archive.bash script, and hands it four important variables:

"[File Name]" "[Stored On]" "[Location]" [Asset ID]

Notice that the first three get encased in "hard quotes." You can imagine hard quotes as well-made, hard-sided luggage. They make sure that spaces and other characters are preserved and don't get passed into the wrong passthru slots of the script. Asset ID doesn't need this because it's an integer and would never have a space.
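To make the luggage analogy concrete, here is a minimal sketch of how the receiving script might unpack those four slots. The function and variable names are illustrative only and are not lifted from the real fcs_post_archive.bash.

```bash
#!/bin/bash
# Sketch only: unpack the four values handed over by the script response.
# Function and variable names are hypothetical.
parse_fcs_args() {
    if [ "$#" -ne 4 ]; then
        echo "expected 4 arguments, got $#" >&2
        return 1
    fi
    file_name=$1    # [File Name], may contain spaces
    stored_on=$2    # [Stored On], the device's name
    location=$3     # [Location], the path within the device
    asset_id=$4     # [Asset ID], always an integer

    # Refuse anything that is not a plain integer.
    case $asset_id in
        ''|*[!0-9]*) echo "bad Asset ID: $asset_id" >&2; return 1 ;;
    esac
}

# Quoted, a name with spaces stays in one slot:
parse_fcs_args "momma got sprung.mov" "Xsan Volume" "/Media" 1234 &&
    echo "file=$file_name device=$stored_on location=$location id=$asset_id"
```

Without the quotes, "momma got sprung.mov" would arrive as three separate arguments, the count check would fail, and the script would bail out instead of archiving the wrong thing.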

When a file is archived, Final Cut Server places it in the archive device, in a root-level folder that bears the Device ID of the device that it was originally on. Final Cut Server then replicates the entire path to the file from the root of the device as well. For example, in our lab Final Cut Server, the Xsan Volume is device number 7. A file that used to reside in

/Volumes/Xsan_Volume/Media/mommagotsprungfromjail.mov

would reside on the archive device thusly:

/Volumes/Archive_Device/7/Media/mommagotsprungfromjail.mov

The issue is that we can't get a device's Device ID passed to us. Really, we can't. I tried everything. Yes, some of you know another way, but remember, street legal here. So fcs_post_archive.bash has a simple if-elif statement that takes the [Stored On] variable (which is the _name_ of the device) and converts it to the Device ID. In order to flesh out this statement with all the possibilities, just archive one file from each of your other devices and look at the number of the folder that Final Cut Server makes in the archive device. Do this repeatedly until you have a complete list, and create a separate elif line for each device.

If your memory is good, you'll recall that Final Cut Server assigns Device IDs in order of creation, with the first six devices made at the time of installation:

1 Proxies
2 Version
3 Edit Proxies
4 Library
5 Watchers
6 Media

This means your first "hand made" device starts with ID 7, and so on.
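The name-to-ID conversion described above might be sketched like this. IDs 1 through 6 come from the list above and 7 is our lab's Xsan volume; the device name "Xsan Volume" is an assumption, so substitute whatever names your own devices carry.

```bash
#!/bin/bash
# Sketch only: translate a device *name* ([Stored On]) into its Device ID.
# IDs 1-6 are the install-time devices; 7 is the lab example, and the name
# "Xsan Volume" is an assumption. Add one elif line per device you create.
device_id_for() {
    local stored_on=$1
    if [ "$stored_on" = "Proxies" ]; then
        echo 1
    elif [ "$stored_on" = "Version" ]; then
        echo 2
    elif [ "$stored_on" = "Edit Proxies" ]; then
        echo 3
    elif [ "$stored_on" = "Library" ]; then
        echo 4
    elif [ "$stored_on" = "Watchers" ]; then
        echo 5
    elif [ "$stored_on" = "Media" ]; then
        echo 6
    elif [ "$stored_on" = "Xsan Volume" ]; then
        echo 7
    else
        # Better to stop than to guess at a path on the archive device.
        echo "unknown device: $stored_on" >&2
        return 1
    fi
}
```

Failing loudly on an unknown device name is deliberate: a wrong Device ID means a wrong path, and a wrong path means NetVault is asked to archive a file that isn't there.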

Once the Device ID is determined, we mix it with the other passed variables to construct the path to the file. We then use the Asset ID to create NetVault Selection Set and Backup Job Title names: truly unique names that can never be repeated. We then quickly output a line of text to a file so we can "remember" the association of the NetVault Job Title to the file's path. This is for restore, if necessary. Don't worry about "touch"ing these output files, because during the first run, these files should be made automatically.
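Roughly, that step could look like the sketch below. The archive root, the "fcs_archive_" prefix, and the "title|path" line format are assumptions made for illustration; nvcatalogpath is the variable that points at the output file.

```bash
#!/bin/bash
# Sketch only: build the archived file's path and a per-asset NetVault name.
# archive_root, the name prefix, and the "title|path" line format are
# assumptions; nvcatalogpath points at the output file.
archive_root="${archive_root:-/Volumes/Archive_Device}"
nvcatalogpath="${nvcatalogpath:-/tmp/nv_catalog.txt}"

build_archive_path() {   # device_id location file_name
    local device_id=$1 location=$2 file_name=$3
    # Trim one leading and one trailing slash so the pieces join cleanly.
    location=${location#/}
    location=${location%/}
    if [ -n "$location" ]; then
        printf '%s/%s/%s/%s\n' "$archive_root" "$device_id" "$location" "$file_name"
    else
        printf '%s/%s/%s\n' "$archive_root" "$device_id" "$file_name"
    fi
}

remember_job() {   # asset_id archive_path
    local asset_id=$1 archive_path=$2
    local job_title="fcs_archive_${asset_id}"
    # >> creates the file on first use, so no touch is needed.
    printf '%s|%s\n' "$job_title" "$archive_path" >> "$nvcatalogpath"
    echo "$job_title"
}

# Example:
# path=$(build_archive_path 7 "/Media" "mommagotsprungfromjail.mov")
# title=$(remember_job 1234 "$path")
```

Because the name is derived from the Asset ID, two files that share a file name still get two distinct selection sets and job titles.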

We then root into the NetVault server. Yes, in order for this to work, the Final Cut Server and the NetVault server have to be "rooted" to each other, with RSA-key SSH logins, so they can command each other to do things without login passwords. If you are now screaming "but the security risks!" I would reply, yes, there are a few, but these are internal servers that are rooting into each other via RSA-key, which requires root access to generate the keys in the first place. So go have a smoke, then come back and read on.

Because Final Cut Server runs its scripts as the user you installed under (usually 501), you must root from that user to the true root user (0) of the NetVault server. (NetVault runs its scripts and only accepts commands as true root (0), always.) In turn, the NetVault server has to be rooted from its true root (0) to the user you installed under (usually 501). This way, it will be able to add its lines of text to the output files.

For those of you unfamiliar with how to create RSA-key based SSH between two OS X machines, there is a lovely one-pager that explains it all here:

http://www.bashcurescancer.com/setting_up_ssh_keys_for_access_without_password.html
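Once the keys are in place, it's worth confirming that the login really is passwordless before any automation leans on it. A minimal sketch; the function name is ours and the host name in the example is a placeholder:

```bash
#!/bin/bash
# Sketch only: check that key-based SSH works without a password prompt.
# BatchMode=yes makes ssh fail outright instead of asking for a password,
# which is exactly what an unattended script needs to find out.
can_ssh_without_password() {   # user host
    local user=$1 host=$2
    ssh -o BatchMode=yes -o ConnectTimeout=5 "${user}@${host}" true
}

# Example (netvault.example.com is a placeholder host name):
# can_ssh_without_password root netvault.example.com && echo "trust is in place"
```

Run it in both directions: as the Final Cut Server user toward root on the NetVault server, and as root on the NetVault server back toward the Final Cut Server user.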

We issue commands to NetVault to create a selection set based on this one file, then a job that archives this one file.

Now, over to the NetVault side. The two scripts that the NetVault server is responsible for firing are simple "reporting" scripts that dump lines of information into text files back on the Final Cut Server. These scripts need to be placed into the directory where NetVault likes to launch its scripts: /usr/netvault/scripts. nv_jobid.bash should be launched at the beginning of each archive or restore job, and nv_jobstatus.bash should be launched at the end. All these scripts do is take environment variables about the NetVault job and bring them back over to the Final Cut Server for analysis. They actually get launched by configuring them as Pre and Post Scripts. These get configured in the Advanced Options tab in the NVAdmin GUI, then get saved as an Advanced Options set. Please note that fcs_post_archive.bash has a variable called nvbackupadvoptset which contains the name of the Advanced Options set. Since we ask for these advanced options when we submit the job, we are guaranteed that nv_jobid.bash and nv_jobstatus.bash get fired at the right times, and that we can track the job as it progresses.

Back to fcs_post_archive.bash now. The rest of this script waits for the lines about the job to be passed from the NetVault server, which eventually leads to the deletion of the file from the archive device (since it has been safely archived at that point) and then exiting.
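The waiting part can be pictured as a polling loop like the one below. This is a sketch under assumptions: the "title|status" line format, the SUCCEEDED status string, and the timings are invented for illustration and may not match what the real scripts or NetVault write.

```bash
#!/bin/bash
# Sketch only: poll a status file until the NetVault post script reports on
# our job. The "title|status" line format, the SUCCEEDED string, and the
# default timings are assumptions.
wait_for_job() {   # job_title status_file [timeout_seconds] [interval_seconds]
    local job_title=$1 status_file=$2 timeout=${3:-3600} interval=${4:-10}
    local waited=0 status
    while [ "$waited" -lt "$timeout" ]; do
        if [ -f "$status_file" ]; then
            # Match the title field exactly and keep the most recent status.
            status=$(awk -F'|' -v t="$job_title" \
                '$1 == t { s = $2 } END { if (s != "") print s }' "$status_file")
            if [ -n "$status" ]; then
                echo "$status"
                if [ "$status" = "SUCCEEDED" ]; then return 0; else return 1; fi
            fi
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    echo "TIMEOUT"
    return 2
}

# Example: only remove the file once NetVault says the job succeeded.
# wait_for_job "$job_title" "$status_file" >/dev/null && rm -f -- "$archive_path"
```

The timeout matters. Without one, a job that never reports back leaves the script spinning forever, and the file it is guarding sits on the archive device with nobody the wiser.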

Finally, the last script, fcs_pre_restore.bash, does indeed get summoned from the Pre-restore Command field in the Archive device's configuration window. This script receives only one variable from Final Cut Server: the full path of the asset's Archive Copy Representation (this is what the Primary Representation is called when it has been archived). The script then "looks up" the NetVault backup job based on the path of the file (from the line of text inside the file located at nvcatalogpath), and then immediately submits a job to NetVault asking to restore this file. Please note that this script also invokes an Advanced Options set to make sure that NetVault fires off the nv_jobid and nv_jobstatus scripts. Restore job Advanced Options Sets _are_ different than Backup Advanced Options Sets. You need to make them for both archive and restore jobs, and call them appropriately in both fcs_post_archive and fcs_pre_restore.
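The "look up" step might be sketched as follows, assuming (hypothetically) that each line in the file at nvcatalogpath reads "job_title|full_path". The real scripts may lay the line out differently.

```bash
#!/bin/bash
# Sketch only: find the NetVault job title recorded for an archived file.
# Assumes each catalog line is "job_title|full_path", a made-up format.
lookup_job_title() {   # archive_path catalog_file
    local archive_path=$1 catalog_file=$2
    # Pass the path through the environment so awk takes it literally, compare
    # the path field exactly, and keep the last match in case the same file
    # was archived more than once.
    ARCHIVE_PATH="$archive_path" awk '
        BEGIN { p = ENVIRON["ARCHIVE_PATH"] }
        { i = index($0, "|") }
        i > 0 && substr($0, i + 1) == p { t = substr($0, 1, i - 1) }
        END { if (t != "") print t; else exit 1 }
    ' "$catalog_file"
}

# Example, where $1 is the path Final Cut Server passes to the script:
# job_title=$(lookup_job_title "$1" "$nvcatalogpath") || exit 1
```

An exact comparison on the whole path is what keeps _this_ mommagotsprungfromjail.mov from being confused with the one in the other folder.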

A similar while loop waits for the report that the job was successful before exiting, at which point Final Cut Server discovers the file right where it thought it was all the time, and restores it back to its original location. Nifty!

Please enjoy these scripts and mod them up as you see fit. We should obviously state at this point that they are offered with no warranties whatsoever as to their usefulness or accuracy, and Meta Media assumes no responsibility whatsoever for lost data as a result of their use. But at the same time you should also know that slightly more sophisticated variants of these scripts are running happily at several of our clients' sites.

Final Cut Server scripts:

http://www.metamediatech.com/Scripts/fcs_post_archive.bash
http://www.metamediatech.com/Scripts/fcs_pre_restore.bash

NetVault scripts:

http://www.metamediatech.com/Scripts/nv_jobid.bash
http://www.metamediatech.com/Scripts/nv_jobstatus.bash



Comments
Authored by: JonThompson on Friday, April 10 2009 @ 03:00 PM EDT
Is there then a restore function? I assume that FCS reports it as a file missing
(although I've not actually deleted an archived file, so I'm not sure how that
works.)

Does it then change that asset to unknown?

If so, you could have a script that runs and pulls the files from the backup, you
would essentially have your own HSM for FCS.
Authored by: MattG on Saturday, April 11 2009 @ 12:29 AM EDT
Actually, the fcs_pre_restore.bash script does the restore perfectly. It gets launched from the Pre-restore Command field inside your Archive device's config window. Yes, when archived, the asset reports its Archive Copy as missing. But when you select Restore from the Asset's contextual menu, FCS launches the fcs_pre_restore.bash command which then commands NetVault to return the file to the archive device, and then the script exits. And lo and behold, Final Cut Server sees it there and restores it to its original location. Indeed, very HSM-like behavior. Or, more like a Doug Henning magic trick to me...

---
Matt Geller
Meta Media™ Creative Technologies
Consulting & Integration | Proactive & Reactive Maintenance
Xsan Integration Specialists

Authored by: aaulich on Monday, April 13 2009 @ 05:37 AM EDT
Hello, Matt, great article! I tried something similar with Archiware PresSTORE and published a short article about it at http://www.andre-aulich.de/en/perm/connecting-final-cut-server-to-archiware-presstore-concept-overview My Final Cut Server-side scripts write the paths of assets, which they try to archive or restore, into temporary text files, which a dedicated launchd job reads once a minute. This way you can separate the PresSTORE server from the FCSvr system, because PresSTORE and the launchd job can run on a second machine. No hassle with ssh keys, especially after operating system upgrades. My scripts are pretty basic and need some optimization, but I guess, they work similar to your NetVault solution. But again, great article, and it's always good to have more than one option to do things. Thanks for this, André
Authored by: CharlieM on Monday, April 13 2009 @ 05:36 PM EDT
Great article, Matt. Keep them coming!
Authored by: mw10dot1 on Wednesday, April 15 2009 @ 01:45 PM EDT
Thanks for this Matt.
I have a question and i do not mean to sound picky. But what happens when you
restore a file and then send it back to archive? Do you just end up with two
copies on tape? Then when you try to restore which one gets restored?

Michael
  • Automated Archive between Final Cut Server and BakBone NetVault: A Case Study - Authored by: MattG on Wednesday, April 15 2009 @ 03:07 PM EDT
  • Productions and Assets
    Authored by: om_nick on Monday, April 20 2009 @ 03:57 AM EDT
    There are also scripts available at www.matrixstore.net that:

    - Archive entire productions with single click. This will take a production and all associated metadata and archive it to a MatrixStore Cluster.
    - Restore assets on demand. No tape management, no personnel wasting cycles managing storage.
    - An option to leave assets in the archive upon restore.
    - Backs up FCSvr DB and proxies to an FTP server (or MatrixStore)

    All the above scripts are 'street legal' and will ensure that your data and your metadata are protected and available online.

    Managing tape to ensure you always (5yrs minimum) have two authentic copies of your data is costly and not quite the hands free endeavour being mooted here.

    http://www.matrixstore.net/2008/05/15/final-cut-server-and-matrixstore/
    Authored by: mrmacguy on Thursday, November 05 2009 @ 03:37 PM EST
    Very helpful writeup! But I'm following your instructions and not getting the
    script to fire off.

    1) I added "Archive Status" to the Asset Metadata category.
    2) I created a new Response that runs my specified script with the parameters
    3) I created a new Subscription watching for the Archive Status = "Offline"
    and "Trigger if changed" checked.

    When I archived an asset, my Archive device script fired off, but the
    subscription script did not. Did I miss a step? What's the best way to
    troubleshoot this?

    Thanks!