| Author |
Message |
kittonian partially protected

Joined: 17 Jul 2012 Posts: 6
|
Posted: Tue Jul 17, 2012 8:47 pm Post subject: MOV File Corruption on XSAN 2.3 |
|
|
We're seeing very strange, and very damaging, file corruption in ProRes MOV files residing on XSAN volumes. While the files can still be opened without issue, individual frames within the MOV are being corrupted (skips, garbled images, etc.).
Unfortunately this seems to happen at random, and it doesn't look like a problem with one specific volume: I created a brand new volume and saw the problem rear its head within a week.
Every time we have shut down the system and brought it back online, I have run cvfsck -jv, then cvfsck -nv, and, if any issues were reported (which has only happened once), cvfsck -wv. All the LUNs are online and I don't see any errors in the logs.
Apparently this has been around for quite a long time: the owner of this company used to work for Apple in the iTunes unit back in 2006, and they saw the same issue there. It looks like Apple isn't telling anyone about it and hasn't ever addressed it, but I'd love to hear any ideas/solutions from the community.
This only seems to happen with ProRes MOV files. All MPG files are just fine.
Here's a rundown of our XSAN hardware:
(2) Mac Mini MDCs running Lion Server with 2xSSD drives in a RAID 1 mirror with a TrendNet 1GB USB ethernet adapter and a Promise SANLink
Brocade DCX-8518 Fiber Switch
(6) Proware 42-bay chassis with dual 2GB RAID controllers and dual 1100 watt power supplies
(20) ATTO Faststream 8550 SAS->Fiber RAID controllers
(40) 24-bay JBODs
Almost all of the clients are Mac Pros running Lion (a few are running Snow Leopard), and we've got 8 Windows 7 machines running StorNext FX client software. |
|
|
 |
brianwells Xsan Master

Joined: 22 Oct 2008 Posts: 80
|
Posted: Tue Jul 17, 2012 10:52 pm Post subject: |
|
|
I've seen this issue only once in all the years we have used Xsan. It was back in June 2008. Analysis of the DV25 movie file showed that it had a .DS_Store file written right in the middle of a video frame.
The best we could figure is that the corruption seemed to correspond with a pool filling up and our metadata controllers subsequently crashing. It appeared to be an isolated incident.
Last year I had a few occasions where RPL data appeared to be stomped by extended attributes, but this was easy to fix and it hasn't happened recently. |
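The kind of analysis described above (finding a foreign file embedded inside a movie file) can be roughly sketched in Python. This is only an illustration, not the method that was actually used back then: it assumes that a .DS_Store file starts with the 8 bytes `00 00 00 01` followed by `Bud1`, and the function name `find_signature` is made up for this example.

```python
# Assumption: a .DS_Store file begins with these 8 bytes (0x00000001, then "Bud1").
DS_STORE_MAGIC = b"\x00\x00\x00\x01Bud1"


def find_signature(path, signature=DS_STORE_MAGIC, chunk_size=1 << 20):
    """Return every byte offset at which `signature` occurs in the file."""
    offsets = []
    overlap = len(signature) - 1  # bytes carried over so boundary-spanning hits are found
    tail = b""
    base = 0  # file offset of the first byte of `tail + chunk`
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            buf = tail + chunk
            start = 0
            while True:
                idx = buf.find(signature, start)
                if idx == -1:
                    break
                offsets.append(base + idx)
                start = idx + 1
            # Keep only the last `overlap` bytes; too short to hold a full match twice.
            tail = buf[-overlap:] if overlap else b""
            base += len(buf) - len(tail)
    return offsets
```

A hit in the middle of a movie's media data would be a strong hint that another file's contents landed in the wrong blocks, although an 8-byte pattern can in principle also occur by chance in compressed video.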
|
|
 |
kittonian partially protected

Joined: 17 Jul 2012 Posts: 6
|
Posted: Wed Jul 18, 2012 11:55 am Post subject: |
|
|
That's definitely interesting, and I thank you for sharing your experience. Unfortunately in our instance it's not coinciding with a full volume, nor is anything crashing or having issues as far as I can tell.
Our situation is not an isolated incident. It is occurring on a daily basis, to the point that XSan itself is now being questioned and we are weighing whether to scrap this entire project and begin again with completely different SAN software (not something I really want to do at this point unless absolutely necessary). |
|
|
 |
abstractrude Xsan Master

Joined: 13 Mar 2008 Posts: 860
|
Posted: Wed Jul 18, 2012 1:22 pm Post subject: |
|
|
| Are you using native extended attributes? Other than that feature, Xsan 2.3 has been extremely stable in my experience. |
|
|
 |
kittonian partially protected

Joined: 17 Jul 2012 Posts: 6
|
Posted: Wed Jul 18, 2012 3:16 pm Post subject: |
|
|
| Nope, extended attributes are disabled. Only ACLs are enabled for each volume. |
|
|
 |
kittonian partially protected

Joined: 17 Jul 2012 Posts: 6
|
Posted: Wed Jul 18, 2012 6:57 pm Post subject: |
|
|
Here's some more detailed info.
What we are seeing is called macro blocking. It happens sporadically and there is no rhyme or reason to where/when it will occur, but here are the details (all files are ingested/encoded identically).
ProRes 4:2:2 HQ
extension MOV
YUV 10-bit color space
chroma subsampling
iframe only encoding
HD or SD material
Final Cut Codec (mac specific)
Decoded/Encoded with Quicktime via Final Cut Pro 7 as well as exporting with Quicktime directly and Compressor on occasion
After the file is ingested and placed on an XSan volume, it goes to QC. Once it has passed QC (i.e. all is well across all frames and audio) it sits on the XSan volume (i.e. no one touches it). Finally, before it gets delivered to the client, the audit team looks at it and they are seeing the macro blocking frame corruption.
Bear in mind that it doesn't happen on every file and one day the file could be fine and the next it has macro blocked frames. |
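One way to pin down when a file changes between QC and audit is to record a checksum at the moment QC passes and re-check it later. The following Python is a minimal sketch under that assumption, not an existing tool: `record`, `verify`, and the JSON manifest layout are hypothetical names invented for this example.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large movie files need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def record(paths, manifest_path):
    """Store size, mtime and SHA-256 for each file (run when QC passes)."""
    manifest = {}
    for p in map(Path, paths):
        st = p.stat()
        manifest[str(p)] = {
            "size": st.st_size,
            "mtime": st.st_mtime,
            "sha256": sha256_of(p),
        }
    Path(manifest_path).write_text(json.dumps(manifest, indent=2))
    return manifest


def verify(manifest_path):
    """Return (path, reason) for every file that no longer matches the manifest."""
    manifest = json.loads(Path(manifest_path).read_text())
    problems = []
    for name, old in manifest.items():
        p = Path(name)
        if not p.exists():
            problems.append((name, "missing"))
            continue
        if sha256_of(p) != old["sha256"]:
            st = p.stat()
            meta_changed = st.st_size != old["size"] or st.st_mtime != old["mtime"]
            reason = (
                "content and size/mtime changed"
                if meta_changed
                else "content changed, size/mtime unchanged"
            )
            problems.append((name, reason))
    return problems
```

If the content hash changes while size and modification time stay the same, that would suggest nothing rewrote the file through the normal file system path, which points more toward storage than toward an application. Running `verify` daily would also narrow down the day on which a given file went bad.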
|
|
 |
brianwells Xsan Master

Joined: 22 Oct 2008 Posts: 80
|
Posted: Thu Jul 19, 2012 9:49 pm Post subject: Could be After Effects |
|
|
I just remembered another situation that resulted in corrupted video files on our Xsan volume. We had some clips being referred to in an After Effects project while they were also being used in Final Cut Pro by someone else.
This should not be a problem, as both users are just reading the same files. However, it turned out that After Effects was writing out XMP metadata to the video files, and these writes were occasionally corrupting the files when another user was also using them. We fixed the problem by disabling the writing of XMP metadata in the After Effects preferences.
We had a similar problem when an After Effects user overwrote a video file that was being used in Final Cut Pro. The solution was to have the user write the new video file elsewhere and use the Finder to overwrite the shared video file.
The only information we were able to get from Adobe is that their products do not really support being used on a shared volume like Xsan. |
|
|
 |
kittonian partially protected

Joined: 17 Jul 2012 Posts: 6
|
Posted: Sat Jul 21, 2012 5:37 pm Post subject: |
|
|
| Thanks again for the notes. Unfortunately no one at the company uses After Effects, though that issue is a big one from what I've read. This is a case of a file being fine one day and then, after residing on the XSan volume for some period of time, showing frame corruption in the form of macro blocking. |
|
|
 |
ogminlo Xsan Master

Joined: 29 May 2008 Posts: 149
|
Posted: Sun Jul 22, 2012 8:26 am Post subject: |
|
|
| You noted that the files might be fine one day and seemingly corrupt the next. When they are showing the macro-blocking, is it consistent across clients? Can you park on a frame, step back and forth, and see the same blocking pattern frame by frame? Is this visible in multiple apps: does it show in QTPlayer, FCP, and/or the output of an I/O card to a broadcast monitor? |
|
|
 |
kittonian partially protected

Joined: 17 Jul 2012 Posts: 6
|
Posted: Mon Jul 23, 2012 2:21 am Post subject: |
|
|
| ogminlo wrote: | | You noted that the files might be fine one day and seemingly corrupt the next. When they are showing the macro-blocking, is it consistent across clients? Can you park on a frame, step back and forth, and see the same blocking pattern frame by frame? Is this visible in multiple apps: does it show in QTPlayer, FCP, and/or the output of an I/O card to a broadcast monitor? |
Yes, it is consistent across not only Mac clients but Windows as well, and the problems are visible using any client (FCP, Quicktime, VLC, etc.). |
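Since the blocking looks the same on every client and in every player, the damage is presumably in the stored bytes rather than in decoding. If a known-good copy of an affected file exists (for example from a backup), comparing the two byte for byte shows exactly where, and how much, data differs. Below is a minimal Python sketch of that comparison; `differing_ranges` is a name invented for this example.

```python
def differing_ranges(path_a, path_b, chunk_size=1 << 20):
    """Return (start, end) byte ranges, end exclusive, where the two files differ.

    If one file is longer, the extra bytes count as a differing range.
    """
    ranges = []
    run_start = None  # start offset of the current run of differing bytes
    offset = 0        # file offset of the first byte of the current chunk
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if not a and not b:
                break
            if a == b:
                # Fast path: identical chunk closes any run still open.
                if run_start is not None:
                    ranges.append((run_start, offset))
                    run_start = None
            else:
                for i in range(max(len(a), len(b))):
                    same = i < len(a) and i < len(b) and a[i] == b[i]
                    if not same and run_start is None:
                        run_start = offset + i
                    elif same and run_start is not None:
                        ranges.append((run_start, offset + i))
                        run_start = None
            offset += max(len(a), len(b))
    if run_start is not None:
        ranges.append((run_start, offset))
    return ranges
```

The sizes and offsets of the differing ranges can be informative: damage that always comes in the same block size, or at regular intervals, would hint at a particular layer of the storage stack, though that interpretation depends on the volume's stripe configuration.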
|
|
 |
keithkoby Xsan Master

Joined: 04 Apr 2006 Posts: 140
|
Posted: Mon Aug 06, 2012 8:28 am Post subject: |
|
|
Does the macro blocking only appear in a small quadrant of the picture on the affected frame? Say, like a byte's worth of data?
I'm going to wager that storage is your problem.
You're going to need to use the command in cvadmin (can't remember what it is) to see if all of the affected files are on the same storage pool.
We had a very similar problem to what you're describing. However, ours seems to have been with a different manufacturer's storage. A drive went somewhat bad (it didn't report a problem to the RAID controller correctly) in one of our LUNs that was in a storage pool dedicated to rendering. The problem wouldn't show itself in 50 Mb/s codecs, only in 100 Mb/s and higher. 100 Mb/s files were usually OK except for growing file captures. ProRes HQ files were particularly susceptible to the problem. Sometimes it would be difficult to catch on 2D HD work, but it would be very obvious on 3D stereoscopic (when viewing in 3D) because it would cause a visual mismatch between left eye and right eye.
I can't recall at the moment the exact description of what was wrong with the failing drive. Perhaps it was spinning ever so slightly too slowly, but it didn't report an RPM error, so it just kind of fludged up the write when presented with ProRes HQ or the growing-file, non-pre-allocated writes.
If you have a good storage vendor, they'll replace the drives for you. Promise makes a device that copies block for block and is perfect for this type of situation, so you can replace the entire storage pool (or a particular LUN) with fresh drives. (This will require an additional chassis while the copy happens...)
After you find out which storage pool is storing the bad files, test by making the storage pool have an affinity to something else so that your files won't write there. See if the problem goes away after you take it out of the equation. |
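Once the storage pool for each file has been looked up by hand, the "are all the bad files on one pool?" question becomes a simple tally. The Python below is only a sketch of that bookkeeping: it does not query Xsan in any way, it assumes each file maps to a single pool, and the function name plus every file and pool name in the usage example are hypothetical placeholders.

```python
from collections import Counter


def corruption_by_pool(file_to_pool, corrupt_files):
    """Return {pool: (corrupt_count, total_count)} for every pool seen.

    file_to_pool: dict mapping file path -> storage pool name (collected by hand).
    corrupt_files: iterable of paths known to show macro blocking.
    Corrupt files with no known pool are ignored.
    """
    totals = Counter(file_to_pool.values())
    bad = Counter(file_to_pool[f] for f in corrupt_files if f in file_to_pool)
    return {pool: (bad[pool], totals[pool]) for pool in totals}


if __name__ == "__main__":
    # Hypothetical placeholder data, not taken from any real volume.
    pools = {
        "clipA.mov": "VideoPool1",
        "clipB.mov": "VideoPool1",
        "clipC.mov": "VideoPool2",
        "clipD.mov": "VideoPool2",
    }
    for pool, (bad, total) in corruption_by_pool(pools, ["clipC.mov", "clipD.mov"]).items():
        print(f"{pool}: {bad} of {total} files corrupt")
```

If nearly all corrupt files land on one pool while other pools hold plenty of clean files of the same codec, that pool (and the LUNs behind it) becomes the obvious candidate for the affinity test described above.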
|
|
 |
pyhcheung JBOD

Joined: 31 Mar 2013 Posts: 1
|
|
|
 |
Gerard Been around the blocks

Joined: 19 May 2009 Posts: 28
|
Posted: Wed Apr 03, 2013 2:30 pm Post subject: |
|
|
I have a similar situation:
Within the environment, we are running:
Xsan v2.2.2
A single Promise RAID, 16 TB of storage, in a RAID 6 configuration.
Two Mac Pros acting as MDCs
Qlogic 5600 fiber switch
All fiber & NFS-reshare stations are running 10.6.8
All machines are bound to Open Directory
Using ACLs & no extended attributes.
Randomly, files rendered out (whether via After Effects, Final Cut Pro, etc.) become corrupted. When I try to play one with QuickTime, I get:
"The movie could not open. The resource map is incorrect". So the file is unusable in whatever program (AE, FCP) it is being accessed from.
If I try to change the file format, say to .mp4, I get the same error message. So I have to resort to pulling the file(s) from a previous backup (and hoping a bad copy wasn't what got saved) or asking the user to re-render the file.
The odd thing, as detailed in previous posts here, is that the file will be corrupt and then fine hours later. No rhyme or reason.
In my location, we use After Effects heavily, so I am going to disable the XMP metadata option within After Effects and see how that goes. |
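For a file that refuses to open at all, a quick structural check of the container can help tell a truncated or overwritten file from one with damaged frame data. A QuickTime file's data is a sequence of atoms, each starting with a 4-byte big-endian size and a 4-byte type (a size of 1 means a 64-bit size follows, a size of 0 means the atom runs to the end of the file). The Python sketch below walks only the top-level atoms; `top_level_atoms` is a name invented for this example, and the check makes no claim about what QuickTime itself tests before reporting the error above.

```python
import os
import struct


def top_level_atoms(path):
    """Return a list of (type, offset, size) for each top-level atom.

    Raises ValueError if a header is truncated or a size runs past the file end.
    """
    atoms = []
    file_size = os.path.getsize(path)
    offset = 0
    with open(path, "rb") as f:
        while offset < file_size:
            f.seek(offset)
            header = f.read(8)
            if len(header) < 8:
                raise ValueError(f"truncated atom header at offset {offset}")
            size, kind = struct.unpack(">I4s", header)
            header_len = 8
            if size == 1:
                # 64-bit extended size stored right after the type field.
                ext = f.read(8)
                if len(ext) < 8:
                    raise ValueError(f"truncated extended size at offset {offset}")
                size = struct.unpack(">Q", ext)[0]
                header_len = 16
            elif size == 0:
                # Atom extends to the end of the file.
                size = file_size - offset
            if size < header_len or offset + size > file_size:
                raise ValueError(f"atom at offset {offset} has implausible size {size}")
            atoms.append((kind.decode("latin-1"), offset, size))
            offset += size
    return atoms
```

A healthy self-contained movie would normally list a `moov` atom and an `mdat` atom among the results. A missing `moov`, or a ValueError partway through, would suggest the file was cut short or partly overwritten, which is a different failure from the frame-level macro blocking discussed earlier in the thread.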
|
|
 |
|