r3datamanager checksums really that important?

Juan Qi An

Member
Joined
Apr 1, 2009
Messages
20
Reaction score
0
Points
0
hi, i'm currently working as a data wrangler/3rd AC for a rental house and just need some advice on data transfer. we did some tests in the studio to figure out the fastest but surest way to transfer the footage off the red drives.

the r3d data manager has the option of checksums, but the data transfer rate seems to be a lot slower compared to the normal drag-n-drop way. we also tried creating a read-only disk image via disk utility.

we shot 30 mins of footage (about 36gb worth) and the timings are as follows:

*all transferred on a brand new macbook pro

the r3datamanager took a total of 32mins including 9mins for the checksums.

dragging and dropping the footage into the mac took just 9 minutes

while the disk utility option took about 15 minutes including about 6 minutes for the verification process.
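
For comparison, the three timings above can be turned into rough average throughput figures. This is only a back-of-the-envelope sketch: it assumes "36gb" means 36,000 MB (decimal units) and that the quoted minutes are total wall-clock times. The function name is made up for illustration.

```python
def throughput_mb_per_s(size_gb, minutes):
    """Average transfer rate in MB/s, treating 1 GB as 1000 MB."""
    return size_gb * 1000 / (minutes * 60)


# Timings quoted in the post, in minutes, for about 36 GB of footage.
timings_min = {
    "r3datamanager (incl. checksums)": 32,
    "drag and drop": 9,
    "disk utility image (incl. verification)": 15,
}

for method, minutes in timings_min.items():
    print(f"{method}: {throughput_mb_per_s(36, minutes):.1f} MB/s")
```

Under those assumptions the three methods work out to roughly 19, 67 and 40 MB/s respectively.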

there were no problems at all with any of the three ways of transferring, which brings me to my point: are the checksums in the r3datamanager really that important? because i'm sure skipping that process will save me a lot of time, especially on location.

forgive me, as the camera came in barely a week ago and both my boss and i are fairly new to this whole system. we're just trying to find the best and fastest solution for data transfer.

any help would be much appreciated!
 
To be honest, I have been doing the drag and drop thing for over a year now and never had a corrupt file because of that. Unless someone comes up with real-life stories about corrupt data because of drag and drop I think I will keep doing so - the 3 times slower penalty is just too much for me. In that time I can do more important things, like making 2 more backups.

I do check that all data sizes are the same though. And I do think it is a good idea to render out dailies on the set - a file you can render into a quicktime is a good file for me.
 
If the project is funded by yourself, by all means, skip the checksum. It's your risk. When the project is funded by someone else, do you have the right to increase the risk of corrupted data just to save some time? Make it your client's choice...that way they decide on the risk factor.

My suggestion to the thousands of clients and shows that we have worked on is "Checksum, Checksum, Checksum." If they choose not to, they have made that informed decision.

The time lost is nothing compared to the hassle of the insurance claim.

John DeBoer
Director of HD Sales
SIM Video International
 
If the project is funded by yourself, by all means, skip the checksum. It's your risk. When the project is funded by someone else, do you have the right to increase the risk of corrupted data just to save some time? Make it your client's choice...that way they decide on the risk factor.

But the question was: how important is it? Or in other words: what exactly is the risk? Just doing checksums for the sake of being safe can also be a waste of time and resources.
 
I'm going to go out on a limb and say, checksums are not important.

I have been running these kinds of transfers since the advent of P2. I came to the conclusion that the processing time was too great for the average on-set transfer.

I developed a specific method that I use on set:

I copy footage using verbose copy command in the OSX Terminal and carefully review the output to be sure that each file is transferred from each mag to two places, master and backup.

I always copy directly from the original storage to master and backups locations and never from master to backup.

Once the copy is complete I check all file sizes between all three drives, and I play back or scrub all of the _H proxies on the Master Drive.

At that point I carefully delete the original media and return it to the camera.
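
The file-size comparison in this method can also be scripted. What follows is a minimal sketch in Python, not the poster's actual procedure (he describes doing it by hand from Terminal output); the function names are hypothetical. It walks the original media and one copy, then reports files that are missing from the copy or differ in size.

```python
import os


def file_sizes(root):
    """Map each file's path (relative to root) to its size in bytes."""
    sizes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            sizes[os.path.relpath(full, root)] = os.path.getsize(full)
    return sizes


def compare_sizes(source_root, copy_root):
    """Return (missing, mismatched) relative paths in copy_root vs source_root."""
    src = file_sizes(source_root)
    dst = file_sizes(copy_root)
    missing = sorted(p for p in src if p not in dst)
    mismatched = sorted(p for p in src if p in dst and src[p] != dst[p])
    return missing, mismatched
```

Running `compare_sizes` once against the master and once against the backup covers the "all three drives" check. Note that a size comparison cannot detect a copy whose bytes differ while the length stays the same; that case is what a checksum is for.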

Now, are checksums mathematically superior to this method? Yes. There is an astronomically lower probability of failure if all files are summed and compared.

But keep in mind file size is a type of checksum that is fairly difficult to compromise. The copying program itself contains cyclic redundancy checks on the operating system level that add an invisible layer of mathematical security. It's very difficult to compromise a copy and not compromise the amount of data transferred.

In other words, if you are monitoring the transfer carefully you will know when it fails.

Disagree with this? Great... Show us two files with different checksums but the same file size that both load up just fine through the quicktime component.

Now maybe I'm losing a few potential clients who insist that checksums are the only way to know that the bits on one medium are exactly the same as the ones on another. I can only counter that by saying that in my experience the real problem, and the reason why tapeless recording is so notorious, is the prevalence of human error in the process.

In general, automated systems reduce the amount of human vigilance involved in the transfer and thus increase the probability of failure in the entire human/machine mechanism. Likewise, very long transfer times due to time spent calculating checksums wreak havoc on the workflow and also increase the probability of human failure. Managing digital media is a skill, just like using your hands in a changing tent. Checksums are an unnecessary time expenditure and a false sense of security.

Just my opinion.

IBloom
 
My suggestion to the thousands of clients and shows that we have worked on is "Checksum, Checksum, Checksum." If they choose not to, they have made that informed decision.

If you have had thousands of clients and transferred footage for them with checksums, then on how many occasions has the checksum failed? This I would really like to know.
 
I have been running these kinds of transfers since the advent of P2. I came to the conclusion that the processing time was too great for the average on-set transfer.

Now maybe I'm losing a few potential clients who insist that checksums are the only way to know that the bits on one medium are exactly the same as the ones on another. I can only counter that by saying that in my experience the real problem, and the reason why tapeless recording is so notorious, is the prevalence of human error in the process.

Ian:

I have been involved with several insurance companies and bond companies on this exact issue. Yes, I would probably trust you (based upon reputation and upon your very exacting posts on this forum) with my own show. However, if you were working on a show supplied by us, be prepared for me to "strongly suggest" to production that you run a checksum.

As insurance companies are starting to ask questions, I would bet that you will shortly see the major production insurance companies writing a checksum clause into their policies.

As for the time factor.....Chris Parker just finished the pilot for the CBS series US Attorney. Three cameras, two full time, doing not only a checksum, but rendering out footage as well to confirm that all is well. These steps are being skipped on many shows out there, and it will only create problems. We already hear major producers stating "no RED on my set". This is caused by people not being diligent in their workflow procedures, as well as by producers not wanting to pay for the DIT.

I heard yesterday of a DIT that just dumps the footage to a drive and sends it to post.....and not even rendering short bits to confirm the footage is fine, let alone using a checksum. This is asking for trouble, and then when it occurs, the camera gets blamed.

As for having a file showing this problem, I do have one, but am under an NDA not to show it. My question to the insurance company execs was as follows: Can this theoretically happen? The answer is yes. Same as running film through a mag to check for scratches.....9 times out of 10 it is fine, but that one time............on that one "money" shot......why always on the money shot????

Everyone needs to know, for this camera to succeed on the large productions, steps cannot be skipped due to time constraints, or there not being a fully qualified DIT. This will only lead to problems, and affect the usage of RED on shows. Yes, some steps can be avoided with little risk, but I doubt anyone will say no risk.

On a personal note, Ian, will you be at NAB? We should hook up for a brewski....



John DeBoer
Director of HD Sales
SIM VIDEO Worldwide
 
But the question was: how important is it? Or in other words: what exactly is the risk? Just doing checksums for the sake of being safe can also be a waste of time and resources.


I still say, let your client know the risks......and suggest every way possible for the risks to be avoided.

I have worked on more RED shows than most, including being the Technical Producer on the largest RED show ever (64 REDs), and I would never allow it to be skipped on a shoot.

John DeBoer
Director of HD Sales
SIM VIDEO INTERNATIONAL
 
i do a mix...

two with checksum
one with drag and drop
one with copy and space (will go to the Posthouse and be backed up there too)
redcine---checking
output critical shots
compare number of shots and files size
are all the proxys there
did the camera number the files right (several camera shootings)
---if no copy that footage,rename with renamer for mac and recreate proxys...leave the original.
load H-proxys in FCP...mess a little around with them
TC to the audiotrack (red channel 2 via Lockit)

Reference Audio via Comtec to channel one...original soundfiles on sounddevice plus onset backup on cf cards+ 3 backups afterwards

Dailies:
not dailies in the classical way.
load and cut around with synced audio in FCP.

for DOP
REDCine MBpro with 24" Apple Cinema Display

End of day...
4 backups on different harddrives which will have different locations over night...

if the airport in Paris burns down, i still have 2 back ups
 
Well, of course, as the program author, I'm going to put my 2 cents in.

First, it seems like you could do some optimizations on your system. 32 minutes is a bit much for 36GB. However, remember that a proper checksum procedure is 3 file reads. So if it takes 9 minutes to read it once, it should take 27 minutes to read it 3 times. However, in our experience with a speedy setup, you can get it to right around 20.
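
To make the "3 file reads" concrete, here is a minimal sketch of a checksummed copy in Python. It is not how R3D Data Manager is implemented; it only assumes the three reads are: hash the source, read the source again to copy it, then hash the destination. The function names and the 1 MB chunk size are arbitrary choices for illustration.

```python
import hashlib
import shutil


def md5_of(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verified_copy(src, dst):
    """Copy src to dst and confirm both files hash to the same value."""
    source_sum = md5_of(src)       # read 1: hash the original
    shutil.copyfile(src, dst)      # read 2: read the original again to copy it
    copy_sum = md5_of(dst)         # read 3: read the copy back and hash it
    if source_sum != copy_sum:
        raise OSError(f"checksum mismatch copying {src} to {dst}")
    return source_sum
```

One caveat with a sketch like this: reading the destination straight after writing it may be served from the operating system's cache rather than from the disk, so it says less about what actually landed on the drive than a check made after remounting would.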

The advantage of R3D Data Manager over other programs generating checksums is that we save the checksums to be referenced later. So at any point in time you can ensure your data is exactly the same as it was on the mag. If it doesn't match up, you should move to a different backup (you did create 2 copies on 2 different filesystems and put them in 2 geographically different locations, correct?).
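
The idea of saving checksums for later reference can be sketched as a simple manifest file. The JSON layout and the function names below are assumptions made for illustration, not the format R3D Data Manager actually writes.

```python
import hashlib
import json
import os


def md5_of(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(root, manifest_path):
    """Record an MD5 for every file under root, keyed by relative path."""
    sums = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            sums[os.path.relpath(full, root)] = md5_of(full)
    with open(manifest_path, "w") as f:
        json.dump(sums, f, indent=2, sort_keys=True)
    return sums


def verify_manifest(root, manifest_path):
    """Return relative paths that are missing or no longer match the manifest."""
    with open(manifest_path) as f:
        expected = json.load(f)
    bad = []
    for rel, digest in sorted(expected.items()):
        full = os.path.join(root, rel)
        if not os.path.isfile(full) or md5_of(full) != digest:
            bad.append(rel)
    return bad
```

Writing the manifest once from the mag and keeping it outside the footage folder lets any backup be re-checked weeks later with `verify_manifest`; an empty list means every file still matches.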

Rendering items is an option to see if the files are valid files. However, that would seem to defeat the point of being speedy. The poster said he is using a macbook pro, so rendering times are not going to be fast.

The thing to remember about R3D Data Manager is that it is meant to be a one click system. You can set it up and walk away - there's no need to watch the progress as it happens. The next version will have a text message option - if your computer has an internet connection, it will text you when it's done and let you know pass/fail.

John brings up the excellent point that R3D Data Manager is the only software that I'm aware of that fulfills bond company requirements. That's an important point when you are playing with someone else's footage. In reality, when you get that call from the editor about a missing shot or incomplete transfer, what will you stand on to say it was complete? On every show I've been on I've received that call from the editor, and in every case I was able to point to logs saying it transferred to our systems correctly. In every case it was because someone forgot about some footage (either to transfer it from the shuttle drive to the edit raid, or the shuttle drive went to the wrong place), and in every case I suggested they use R3D Data Manager to transfer footage in the future.

As for what exactly the risk is: the one perfect take. That is the risk you are taking by not ensuring that every copy is a valid copy. As the data manager, you must take every step to ensure that your data is valid at all times. Otherwise, why shoot? If you can't guarantee that a copy is exact, don't pull the camera out of the case. You'll save a ton of time that way. :)

Every modern hard drive will have hundreds to thousands of read errors per minute of sustained read. Your 36Gb of footage had at least 9000 read errors during a single copy. 99% of those are caught by the internal hard drive ECC. But can you guarantee that all of them were? And how do you know it was written correctly or that the ECC was generated correctly?

Finder is notorious for silently dropping errors. It will not report if the hard drive suddenly fills up, or if there is a read/write error, among other things. Since writing this software I have heard from literally dozens of people who were having trouble with Finder copies. It really is amazing how often Finder will fail, especially with larger copies.

Ian - obviously you are a more advanced computer user. You probably have your system tuned in, do regular checks of your drives, etc. However, most people do not. I've had people call me out on my software, telling me it was crap and that other methods were better - because my software was always failing their transfers. Then a week later they reply that their new hard drive failed or the card reader was not up to spec. Turns out that R3D Data Manager was identifying a problem that other software was missing or didn't check at all.

In addition, most people on set simply don't have time to read the output of complex copy commands or check data sizes. After a 14 hour day, all I want to do is go home, that's for sure. So R3D Data Manager has a simple success/fail - and if you want more info, you can check the report. If you want even more info, you can see the activity log.

But more importantly, R3D Data Manager removes a lot of area for human error. Select the root of the Red Media on set, and it will copy all the footage, every time. It will make sure everything was copied, and copied correctly. It will do 90% of the checks you should do as a DAS/DIT in an automated fashion, with one click. This way, you can click copy and do other more important tasks.
 
Hi Cuneyt:

That system seems perfect to me....check sum was done on 2 drives, and you have done several backups and redundant backups.


John DeBoer
Director of HD Sales,
SIM VIDEO INTERNATIONAL
 
If you have had thousands of clients and transferred footage for them with checksums, then on how many occasions has the checksum failed? This I would really like to know.

On shows I've worked on personally, I've had it fail at least 8 times. I've transferred over 100 terabytes of footage to date. Each of the fails was on clients' hard drives, most of them being "brand new drives".

I estimate that R3D Data Manager has transferred over 1.5 petabytes of footage to date. So you can figure out the error rate there if you like.

To add to the list that Kaya posted, I also do a zero-out of all destinations before I copy a single file. This way the drive will know which blocks are bad on the drive from the start and use other blocks to write around them. It will take 4 to 6 hours on a drive, but that process alone has identified bad drives that are not worth putting valuable footage on.
 
Hi Cuneyt:

That system seems perfect to me....check sum was done on 2 drives, and you have done several backups and redundant backups.


John DeBoer
Director of HD Sales,
SIM VIDEO INTERNATIONAL

i did forget, i use two different checksum softwares and different brands of the harddrives to spread the risk
 
On shows I've worked on personally, I've had it fail at least 8 times. I've transferred over 100 terabytes of footage to date. Each of the fails was on clients' hard drives, most of them being "brand new drives".

To add to the list that Kaya posted, I also do a zero-out of all destinations before I copy a single file. This way the drive will know which blocks are bad on the drive from the start and use other blocks to write around them. It will take 4 to 6 hours on a drive, but that process alone has identified bad drives that are not worth putting valuable footage on.

The hard drives (not RED DRIVES) in the post process are the biggest problem. Brand new drives scare me......their failure rate is quite high.

John DeBoer
Director of HD Sales
SIM VIDEO INTERNATIONAL
 
i did forget, i use two different checksum softwares and different brands of the harddrives to spread the risk

Oh - that reminds me of my favorite saying: Data management is really Risk Management. There is risk everywhere, but as the DAS/DIT, it is your job to mitigate it as best as possible.

As for the 2 different checksum softwares - I'm not sure how much risk that avoids. In R3D Data Manager we use industry-standard checksum processes to create the checksums. There shouldn't be any difference between a checksum created in R3D Data Manager and a checksum created with other standard tools, provided the file was read correctly both times. In fact, that's another point about R3D Data Manager: you don't have to have the program to confirm your files in the future. Just using the md5 command provided with every Mac since OS X will generate the same md5 checksum (again, provided a correct read of the same file both times).
 
I had 3 people message me about zeroing out a drive. Here's how to do it on a Mac:

Using Disk Utility, select the drive, then the Erase tab. Under Security Options select "zero out drive". It will take a long time, but it's worth it.
 
every software can hang or something. if i run two programs in parallel on a decent machine i hope to minimize another source of risk...but that the checksums are almost the same...yep, you're right.
 
Okay this is getting interesting. At least now we are getting an idea of the risk that's there. Somebody correct me if I'm wrong but if I do the math:

Let's say 8 failures in 100 terabytes of footage. The feature films I have worked on shoot, let's say, 4 terabytes. So if I back up to just one drive, the expected number of failures would be 8/100 x 4 = 0.32. Which would mean roughly a 1 in 3 chance something goes wrong.

But that's only if I copy to one drive. Now the tricky part of my math: if I make two copies from the master to separate drives what is the chance BOTH copies have an error in it? Let's say 3000 shots for this feature. If both drives have one shot that has a copy error then there is a 1 in 3000 chance that error is in the same shot, right? But the chance is 1 in 3 so we have to divide by 3 so that means over a period of 30 days with two cameras there is a 1 in 9000 chance that a single shot is unrecoverable because of a copy error.
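
Since the post asks for the math to be checked, here is one way to run the numbers. This is a sketch under loud assumptions, none of which come from measured data: failures arrive independently at the quoted rate of 8 per 100 TB (a Poisson model), each failure damages exactly one of the 3000 shots, all shots are equally likely to be hit, and the two copies fail independently of each other.

```python
import math

FAILURES_PER_TB = 8 / 100   # rate quoted in the thread
FOOTAGE_TB = 4              # size of the example feature
SHOTS = 3000                # number of shots in the example


def p_at_least_one(expected_count):
    """Poisson probability of one or more events given the expected count."""
    return 1 - math.exp(-expected_count)


# Expected number of failed files on a single copy of the whole feature.
expected_per_copy = FAILURES_PER_TB * FOOTAGE_TB

# Chance that a single copy contains at least one failure.
p_copy_has_error = p_at_least_one(expected_per_copy)

# Chance that one particular shot is damaged on one copy.
p_shot_bad = p_at_least_one(expected_per_copy / SHOTS)

# Chance that at least one shot is damaged on BOTH independent copies.
p_shot_lost = 1 - (1 - p_shot_bad ** 2) ** SHOTS

print(f"expected failures per copy: {expected_per_copy:.2f}")
print(f"P(a copy has an error):     {p_copy_has_error:.3f}")
print(f"P(same shot bad on both):   {p_shot_lost:.2e}")
```

Under these assumptions a single copy has about a 27% chance of containing an error, a little under the 1 in 3 above, and the chance of losing a shot on both copies comes out on the order of 1 in 30,000 rather than 1 in 9000. The independence assumption is the weak point: a bad card reader or a bad read of the source would hit both copies at once.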

Now, this is very much in theory; in reality I guess other things come into play. As said before: faulty card readers, bad hard disks, quirky firmware on CF cards, which could make the risk on a certain shoot much higher, up to the point where it is almost inevitable. But that's where a DIT's skills come into play: detecting faulty hardware and keeping up to date with developments.

Don't get me wrong, I don't think checksum software is a bad idea. I was just wondering about the risk.
 