
Render Farming

Dominik Muench

Hi guys,

I was wondering if anyone could point me towards a program or tool that lets me set up several Macs on a network as a render farm? Or even Macs and PCs together, if that's possible?

I do my rendering with Clipfinder.

Thanks
 
You would have to install OS X on the PCs as Hackintoshes in order to use Apple Qmaster on them, assuming they have Intel processors.

-michael zaletel
(shooter)
 
thanks guys

Jake: Oh cool, I worked with the Rubber Monkey guys on a feature last year; I'll check out that program for sure.

Michael: Yep, they are all Intel CPUs.

Dusty: So, the same files on all computers, with the same settings, pointed to the same directory? And that should do it?
 
You want the R3D source to be on one computer/drive and set the destination drive to be the same computer/drive and leave that computer out of the cluster. Make sure that drive is mounted on the desktop (finder preferences) of each qmaster node.

-michael zaletel
(shooter)
 

If the computer where your destination drive is setup is "Controller and Services" why do you need to leave it out of the cluster?
 

This may not be necessary in some instances but here are two reasons:

1. Applications like Shake require an absolute path to the file depending upon the setting for UNC in your .h file. The path to a hard drive on "another" computer on the network is the same for all "other" computers on the network but different for the computer that holds that hard drive. The host computer may fail in the cluster because it cannot find the file using the given network path.

2. Because you will be reading from and writing to the computer containing the R3D file, you will see overall performance gains if that computer is not also working on the compression or rendering.

-michael zaletel
(shooter)
 

Thanks Michael. I look forward to testing this. I have indeed noticed quirky inconsistencies with QMaster and maybe it is because I am including my host in the cluster.
 

The first issue can be avoided if you're not working with a file on the boot volume. A non-boot volume called "Project" will mount at /Volumes/Project. If you then share this entire drive (and name the network share the same as the drive), other computers will also mount it at /Volumes/Project.
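The volume-naming rule above can be sketched as a quick sanity check (a hypothetical illustration; "Project" is just an example name): macOS mounts both non-boot volumes and network shares under /Volumes, so matching the share name to the volume name makes every node resolve the same absolute path.

```python
def mount_path(name):
    """macOS mounts non-boot volumes and network shares at /Volumes/<name>."""
    return "/Volumes/" + name

# Hypothetical names: the host's local drive is "Project", and the network
# share is given the same name, so a stored path is valid on every node.
host_sees = mount_path("Project")    # local, non-boot volume on the host
nodes_see = mount_path("Project")    # same drive, mounted over the network

print(host_sees == nodes_see)                     # True: paths agree everywhere
print(mount_path("ProjectShare") == host_sees)    # False: a mismatched share name breaks it
```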

The second issue tends not to have all that much impact; reading from a hard drive and pushing data over a network is not a particularly CPU-intensive operation these days. With most small clusters, having an extra render node will benefit you more than having a "dedicated" server. (Of course, you could just serve the files off of a G5 or other older computer that couldn't contribute to the actual transcoding effort anyway.)
 
I've just posted a detailed thread collecting additional troubleshooting tips in one place:

http://reduser.net/forum/showthread.php?t=30266


-michael zaletel
(shooter)

Once again, thanks for taking the time. Unfortunately, this made my situation worse. I ended up going back to the way I had it, and at least I can use it, albeit a bit quirky at times. In short, I can only get a managed cluster to work at all, even after going through every single item on the checklist twice. Whether my target drive is mounted on the desktop of the other machines or only on my controller, there is no difference in performance. Also, if I include my controller in the cluster or leave it out, still no difference in performance.

This is NOT a slam on Michael's post. I am simply putting my experience out there for others to digest.
 
Hi Dan:

So what exactly is happening in the Batch Monitor at this point? Waiting...? Error -42? Are all nodes participating in the job?

Remember that Compressor splits a compression task into equal parts, so it's only as strong as the weakest link. In other words, one slow node will slow the entire job down. You're better off having only fast nodes on when using Compressor. This isn't necessary with Shake, because Shake divvies up rendering by frames, so even the slowest node can contribute without slowing down the overall job. If you've turned off Qmaster on your slower computers and your fast nodes are all contributing to the job (segments), you should see performance gains. Let me know if not.

For example, I have a couple of old G4's on my network. If those are active when I use Compressor, the job is SOOO SLOOOW, but if I disable those nodes, it's really fast. When doing Shake work, I turn everything on; even though some nodes only do 1 frame per 10 seconds vs. 2 frames per second for an 8-core, that little bit still helps.

One more thing: I've found you can only output DPX or other single-file image sequences with Shake when using Qmaster, because it has a real problem with QuickTime (that's the reason for the "error creating QuickTime files" message in Batch Monitor).
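Michael's contrast between the two schedulers can be sketched with a toy model (the speeds below are hypothetical, echoing his 1-frame-per-10-seconds G4 vs. 2-frames-per-second 8-core figures): Shake hands out one frame at a time to whichever node is free, so even a very slow node still shaves time off the job.

```python
import heapq

def shake_job_time(num_frames, frames_per_sec):
    """Toy model of frame-at-a-time dispatch: each node pulls the next frame
    as soon as it finishes its current one; return the overall finish time."""
    nodes = [(0.0, 1.0 / fps) for fps in frames_per_sec]  # (time free, secs per frame)
    heapq.heapify(nodes)
    finish = 0.0
    for _ in range(num_frames):
        free_at, per_frame = heapq.heappop(nodes)
        done = free_at + per_frame
        finish = max(finish, done)
        heapq.heappush(nodes, (done, per_frame))
    return finish

fast_only = shake_job_time(1000, [2.0])       # one 8-core alone: 500.0 s
with_g4 = shake_job_time(1000, [2.0, 0.1])    # adding a slow G4: 480.0 s
print(fast_only, with_g4)
```

The slow node quietly absorbs a few dozen frames while the fast one churns through the rest, which is why Michael leaves everything on for Shake jobs.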

Let me know more and I'll try to help.

-michael zaletel
(shooter)
 

I am currently not using Shake, so I have only the Compressor boxes checked (including "Managed") on all nodes in the cluster. There are a total of 10 nodes: 4 Nehalem Mac Pros, 2 Intel 8-cores, and 4 quad-core PowerPC G5s circa 2006. All computers are involved in each task, and the first 50-60% of each job FLIES. At that point the "remaining time" indicator starts to go in reverse, until I have to pause the job and then resume it (sometimes that allows it to finish quickly) or cancel it altogether. I have done everything on the checklist you provided, but I get the best results with a managed cluster run from one computer ("Controller and services") while all the others are services only.

Another thing I'm confused about is managed clusters vs. QuickClusters. I want to set up an easy system whereby all stations can utilize Qmaster rendering power without a big setup hassle. Any ideas? And by the way, you are going WAY above and beyond the call of duty here, Michael, so "call Apple Support" is a valid response.
 
Hi Dan:

No problem at all.

1. You do not want to check the "Managed" boxes under Services on ANY node. That's what the QuickCluster with services is for. You only want to check the "Share" boxes; in your case, just the top box for distributed processing for Compressor.

2. On your primary controller node, you want to use "QuickCluster with services", meaning the top radio button in Qmaster preferences.

3. On ALL of the other nodes, click the "Services only" radio button under "Share this computer as", then ONLY check the "Share" checkbox (NOT "Managed") under Services for Compressor distributed processing.

Are you saying you already tried it the way outlined above and it did not work?

Finally, the G5's are the bottleneck for Compressor. Try turning off services on those 4 G5's and see if your job runs faster with only the other six; my guess is that it will. As I said, if you're running Shake, those can pitch in, but if not, they may actually slow things down. Compressor splits the encoding job into 10 equal parts; if the Intel machines can do their parts 3 times faster than the G5's, Compressor ends up waiting on the 4 G5's to finish. If Compressor instead splits the job into only 6 parts and all 6 finish at around the same time, the entire job completes more quickly.
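Michael's arithmetic can be checked in a few lines. This is a toy model with made-up speeds (six Intel nodes, each 3x the speed of a G5), assuming Compressor hands every node an equal share and waits for the slowest to finish:

```python
def compressor_job_time(total_work, speeds):
    """Toy model of equal-split scheduling: each of the N nodes gets
    total_work / N units, and the job finishes when the slowest node does."""
    segment = total_work / len(speeds)
    return max(segment / s for s in speeds)

intels = [3.0] * 6   # hypothetical speeds, in work units per minute
g5s = [1.0] * 4

print(compressor_job_time(60, intels + g5s))   # 10 nodes: each G5 segment takes 6.0 min
print(compressor_job_time(60, intels))         # 6 fast nodes only: about 3.3 min
```

Dropping the four slow nodes nearly halves the wall-clock time even though fewer machines are working, because nobody has to wait on a G5 segment.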

Let me know if the above does not resolve your problem.


ADDED: When you click Submit in Compressor and the screen comes up where you choose the cluster, choose the QuickCluster node you created and DO NOT check the box that says "Include unmanaged services on other computers". I realize all of this may not be intuitive, but that's how QuickClusters work.

-michael zaletel
(shooter)
 
I need to add a couple more things to this.

1. Your QuickCluster will only show up in Apple Qadministrator under Clusters on the computer that is set up as the QuickCluster controller. If you open Apple Qadministrator from any other computer, you will not see it there, BUT you will see it AND be able to select it from the Cluster drop-down in Compressor after you click Submit. BTW, I just noticed it is not even possible to check the "Include unmanaged..." box in that window once you select a QuickCluster.

2. Compressor attempts to optimize for each job, figuring in data transfer, combining segments, etc. Since you have some REALLY fast machines, if you are only trying to convert a 3GB file to H.264, Compressor will NOT use your other cluster nodes to speed up the job; it will likely choose your fastest 8-core Mac and assign the entire job to that computer. Try giving Compressor a harder task, like converting a 15GB file to H.264 or ProRes, and you will see it kick into gear and pull in all the nodes. One reason for this logic is that Qmaster is built for workgroups, so it expects more jobs to be coming any second. Even though it could do a 20-minute job in 5 minutes by involving all nodes, it figures: why tie up those other computers? It keeps them free for the next job.
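Apple doesn't publish the heuristic, so the following is only a toy cost model (every rate and constant below is made up) of the trade-off Michael describes: distributing pays off only when the encode time saved outweighs the transfer and coordination overhead.

```python
def should_distribute(job_gb, nodes, encode_gb_min=1.0,
                      transfer_gb_min=5.0, overhead_min=3.0):
    """Toy cost model: compare encoding on one node vs. fanning out, where
    distribution adds network transfer time plus a fixed setup overhead."""
    local = job_gb / encode_gb_min
    distributed = (job_gb / (encode_gb_min * nodes)   # parallel encode
                   + job_gb / transfer_gb_min         # moving segments around
                   + overhead_min)                    # job setup and recombining
    return distributed < local

print(should_distribute(3, 10))    # False: a 3GB job stays on one fast node
print(should_distribute(15, 10))   # True: a 15GB job is worth fanning out
```

With these invented numbers, the fixed overhead swamps a small job but is amortized away on a big one, matching the 3GB-vs-15GB behavior described above.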

Hope this helps you and others. I'm also hoping others will add to this and correct anything I've said that isn't 100% accurate.

Thanks,

-michael zaletel
(shooter)
 
The QuickCluster method you detail does not work here. I set it up as you described and submit the job, and it goes through the motions as if everything is working correctly but never makes any progress (the progress bar says unknown time remaining, and Batch Monitor says 1 job submitted). The only way I can make anything work is to set up a managed cluster with both "Share" and "Managed" checked on all computers, and "Services only" checked on all computers except the controller, which has "Controller and services" checked. This method works flawlessly, but I just want to be sure I'm getting everything I can out of my Compressor tasks. For now, I'll stick with this setup because it works, but I'll continue to experiment. Something tells me all of this is going to change for the better when Final Cut Studio 3 and Snow Leopard come out.
 
One other tip: you can add the network volume to your Login Items (System Preferences -> Accounts -> Login Items) by simply dragging and dropping it there. This way you don't have to manually remount the shared volume on each computer after a restart.

-michael zaletel
(shooter)
 