Jon Thomasberg
[Set aside for future Rankings List info here]
That is undoubtedly an immense Cinebench score! Once you get your GTX 580 try running your R3Ds through Premiere Pro - I have a feeling you will get real-time playback at full-res (i.e. 4K). Matching your monitor's resolution, which I am guessing is 2560x1440/1600, should be a piece of cake!
I built a similar system. ......
In RCX Pro Beta 8, even after I maxed out the performance settings, I could not get it to use more than 40% load on my CPU cores. Most of the time it fluctuated between 32% and 40%. Also, the load was not equally distributed among all 12 threads, so many of the threads were at ~10% utilization. Memory usage never came close to saturation on either system RAM or the video RAM on my GTX 580. So it looks like something in the RCX Pro code is limiting it from taking advantage of all the horsepower.
Urgh. That is a major disappointment. Why the hell isn't RCX leveraging all the power it can? Why no GPU/CUDA support? Because of RED Rocket, I guess.
Whoa whoa... I mean, yes, that's a logical reason NOT to implement GPU/CUDA (or even proper multi-threading support), but I think the real reason is solely complexity. When it comes to using GPUs for general processing, both AMD/ATi and nVidia have their own way of doing things, which sucks because it's not a one-code-is-efficient-on-all-graphics-hardware situation, meaning RED would have to write different sets of code for ATi and nVidia (bleh.) And when it comes to multi-threading/multi-CPUs, it's actually way more difficult to write code that can split itself up properly/efficiently between the different cores/threads.
I think the easiest/quickest thing that RED should do is render different clips on different cores/threads. That way, each core/thread is handling its own clip(s) and it'd be kind of a brute-force way of using multi-CPUs. Not efficient, but still effective. I know in Windows you can set the affinity of applications pretty easily, so theoretically, if you ran multiple instances of RCXp and set each one to a different core/thread, it should work pretty well. But of course that's a bit of a pain in the ass from the end-user perspective... But still, it'd be 4 or 8 times faster than just setting up a single batch transcode on one core (it also assumes you have more than one clip that needs transcoding and they all use the same look settings.)
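To make the "one batch of clips per core" idea concrete, here is a minimal sketch of just the bookkeeping: dealing a clip list out into one batch per instance. The function name and the clip names are made up for illustration, and nothing here talks to RCXp itself.

```python
def split_into_batches(clips, workers):
    """Deal clips round-robin into at most `workers` batches, one per instance."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    batches = [[] for _ in range(workers)]
    for index, clip in enumerate(clips):
        batches[index % workers].append(clip)
    # Drop empty batches when there are fewer clips than workers.
    return [batch for batch in batches if batch]


# Made-up clip names, purely for illustration.
clips = [f"A001_C{n:03d}.R3D" for n in range(1, 11)]
for core, batch in enumerate(split_into_batches(clips, 4)):
    print(f"instance {core}: {batch}")
```

Round-robin keeps the batch sizes within one clip of each other, but it ignores clip length, so one long clip can still leave the other instances idle at the end.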
Writing code to multithread is not that hard when you have discrete frames. You just fire off a separate thread for each of several frames. There is no interframe compression that I know of.
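A rough sketch of that "one thread per frame" pattern, assuming frames really are independent. `decode_frame` below is a hypothetical stand-in, not an actual SDK call, and in Python specifically the threads only run in parallel if the real decoder is native code that releases the interpreter lock.

```python
from concurrent.futures import ThreadPoolExecutor


def decode_frame(frame_number):
    # Hypothetical stand-in for a real per-frame decode call.
    return f"frame {frame_number} decoded"


def decode_range(first, last, workers=8):
    """Decode frames first..last (inclusive) on a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() hands frames out to idle threads but returns results in frame order.
        return list(pool.map(decode_frame, range(first, last + 1)))


results = decode_range(0, 23)
print(len(results), "frames;", results[0], "...", results[-1])
```

Because no frame depends on another, the only coordination needed is putting the results back in order, which `map()` already does.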
There is a question I have, however: can an SDK caller call the SDK multiple times for the same R3D in separate threads, to decode the frames faster? It may be that the SDK prevents that sort of activity.
Most of the work in decoding R3D is the j2000 decode, which is not very easy to do on a GPU (CUDA or OpenCL). The demosaic is very doable on a GPU, but it wouldn't speed things up much, since it's not the hard part.
You are free however to decode multiple clips simultaneously yourself. Just use redline commands. Not GUI but doable.
I have tripled my speed this way. So someone with an old i7 920 can 'out-benchmark' a less savvy SB-E owner, as far as bulk transcoding goes!!
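For anyone who wants to script the "several command-line transcodes at once" approach, a minimal sketch follows. It only shows the launching part: the actual redline arguments are deliberately left out, and `placeholder_command` is a hypothetical stand-in that runs a harmless Python one-liner so the example works anywhere. Swap in your own command line per clip.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_all(commands, max_parallel=3):
    """Run each command (a list of arguments), at most max_parallel at a time.

    Returns the exit codes in the same order as the commands.
    """
    def run(cmd):
        return subprocess.run(cmd).returncode

    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(run, commands))


def placeholder_command(clip):
    # Stand-in only: a real setup would return the transcode command for `clip`.
    return [sys.executable, "-c", f"print('transcoding', {clip!r})"]


if __name__ == "__main__":
    clips = ["clip_a.R3D", "clip_b.R3D", "clip_c.R3D"]  # made-up names
    codes = run_all([placeholder_command(c) for c in clips], max_parallel=3)
    print("exit codes:", codes)
```

Each transcode is its own process, so the threads here do nothing but wait on them. How many to run at once is something to find by experiment, since disk throughput can become the limit before the CPU does.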
-Les Dittert
One downside to this: after using this workstation, my decked-out iMac 27" seems pathetically slow in comparison.