
Help: $5K i7 machines beating $20K dual-Xeons?

Tom Lowe

I'm getting some strange results here, working in After Effects and Premiere.

My $5K i7 machines (not even overclocked) are making mincemeat out of my $20K dual Xeon systems. The i7 machines are about 2x or 3x faster. And the dual-Xeon machines also have way, way more RAM. My main dual Xeon machine here has 256GB ram, plus a Titan, while the i7 machine has only 64GB ram and a GTX 680. Yet, the i7 is much faster for Adobe tasks.

What the heck is going on here? The main tasks I have been using are motion-tracking/stabilization in AE and editing/rendering in PP.

Any ideas why I am seeing this kind of performance discrepancy?
 
i7 3960x vs 2690 Xeon Octocore. Source drives are not the issue, because they are roughly the same.
 
I saw the same results a couple of months back at a friend's post house. He spent a bomb on some big machines. Fire breathers! Similar specs to what you're talking about.
I was helping out on some shots next to him in After Effects on my laptop, and he was mighty pissed off when we did some back-to-back renders. Sometimes the laptop was quicker rendering out the same shot.
Not sure if it's a software or hardware issue.

We both have Smoke as well, and my laptop can render out as quick as his big Xeon octo-core, sometimes quicker.

Actually, my laptop outperforms my maxed-out 12-core Mac tower on some render tasks.

If the i7 laptops are flying, I'm really wondering how these new Mac towers are going to perform.
 

Tom,
are you using the same video card?
My dual 2670 is faster than the i7 when rendering R3D files.
 
As I mentioned, the dual-Xeon machines have the same or superior GPUs, so it can't be that.

Is it possible that the processors are designed for very different tasks -- ie, the i7 is designed for general computing, while Xeons are made mainly for servers with databases, large numbers of queries from the web, etc?
 

It's certainly that too.
I must say that when I use the Premiere CC Warp Stabilizer my i7 is faster: it has a higher clock speed than my Xeon (3.2GHz i7 vs 2.6GHz Xeon), and the effect is obviously not optimized for dual CPUs. After Effects and Premiere are poorly multithreaded.
You can see this by opening Task Manager.
If you only edit native R3D files, dual Xeons make a difference.
As for the new REDCINE-X build 21, the dual Xeon is faster.
 
Multi-threading apps ...


Are AE and PPro optimized for multi-threading? .... if the app isn't specifically written for multi-core, multi-threading processing you're not going to see any benefits no matter how many CPUs or RAM you throw at it - the main difference in performance will be clock speed ... are AE and PPro still fundamentally 32 bit apps or have they been re-architected to take advantage of 64 bit processing?

A single 3960X i7 runs much faster than a single 2690 Xeon - ask any gamer.

Can someone from Adobe confirm the underlying code architecture of AE and PPro and whether they take advantage of multi-core CPUs and multi-threaded processing?
 
I've seen this too... we were all blown away when our HP Core I7 mobile Workstations (~$6k each, 8 Core/32GB RAM) were outperforming $12k Z820's (16 core/96GB RAM) at some image processing tasks. Looking at resource usage we came to the conclusion that it was the clock speed advantage on the 8770w's compared to the z820's that made the difference for tasks that weren't able to leverage all those cores.
 
Open up task manager and check the CPU usage on both when rendering out the same project. You can right click on the cpu graph to show logical cores also if you want to be super thorough. It should show you how many of the cores it's using.
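For anyone who wants to turn the Task Manager check into numbers, here is a rough sketch. It assumes you have already jotted down per-core utilization percentages by hand or from some monitoring tool; the function names and the sample readings are made up for illustration. Summing the load is more reliable than counting busy graphs, because the OS scheduler can bounce a single thread between cores.

```python
from typing import Sequence


def busy_core_equivalents(per_core_percent: Sequence[float]) -> float:
    """Total load expressed as 'number of fully busy cores'.

    A single-threaded task that the scheduler spreads over many cores
    still adds up to roughly 1.0 here.
    """
    return sum(per_core_percent) / 100.0


def count_busy_cores(per_core_percent: Sequence[float], threshold: float = 50.0) -> int:
    """Count logical cores at or above `threshold` percent load."""
    return sum(1 for load in per_core_percent if load >= threshold)


# Hypothetical readings from an 8-logical-core machine during a render.
sample = [95.0, 10.0, 5.0, 5.0, 0.0, 5.0, 0.0, 0.0]
print(busy_core_equivalents(sample))  # total load in core equivalents
print(count_busy_cores(sample))       # cores that look genuinely busy
```

If the first number stays near 1 or 2 on a 16-core box while rendering, the extra cores are mostly idle and clock speed is what you are really paying for.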
 
Just curious, in AE have you gone into the preferences and selected "Render Multiple Frames Simultaneously"? I always make sure to allocate the maximum amount of RAM even though it may say it is using fewer cores... I just get better results. Give that a shot and you should be destroying the regular i7. Otherwise, for any single-threaded operation, the faster CPU will beat the one with more cores and a lower clock.
 
Most of the performance issues in Tom's case point to the multi-threadedness (or lack of it) in the software being used. Motion tracking and stabilization in AE, as well as many other tasks that compute away on your footage, are not multithreaded or at least don't scale well to more than a few threads. Rendering in AE does scale quite well; however, you do have to make sure you allocate enough RAM and that the software is told to use multiple threads or render frames simultaneously. Even with that, there are bottlenecks. Codec modules, the pieces that assemble the rendered images and often do the specific encoding, are rarely multi-threaded. This applies to both QuickTime and DNxHD as well as other popular ones. One of the reasons they are not multithreaded is that they can only assemble frames into the encoded video file in serialized order. So it becomes a sequential process that rarely scales beyond a thread or two.
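To picture the render-then-encode bottleneck described above, here is a minimal structural sketch. It is not how AE or any codec module is actually written; `render_frame` and `encode_in_order` are hypothetical stand-ins. Frames are rendered by a pool of workers, but the encoder still has to consume them one at a time, in order.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Iterable


def render_frame(index: int) -> bytes:
    """Stand-in for an expensive per-frame render that is independent of other frames."""
    return f"frame-{index:04d}".encode()


def encode_in_order(frames: Iterable[bytes]) -> bytes:
    """Stand-in for a codec module: appends frames strictly in sequence."""
    out = bytearray()
    for frame in frames:
        out += frame + b"\n"
    return bytes(out)


def render_and_encode(frame_count: int, workers: int = 4) -> bytes:
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map hands results back in input order, even when
        # individual frames finish out of order.
        rendered = pool.map(render_frame, range(frame_count))
        # The encode step is a single loop, so it runs on one thread
        # no matter how many workers rendered the frames.
        return encode_in_order(rendered)


print(render_and_encode(3))
```

However many workers you add to the render stage, the encode loop stays serial, which is the part that "rarely scales beyond a thread or two." (Python threads are used here only to show the structure; a real renderer would use native threads or processes.)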

The i7 3960X is a beast and one of the best CPUs Intel has released to date. Core for core, it trounces the latest and greatest Xeon CPUs as it has a similar cache implementation and fewer cores running at higher clock speed. The Xeons do have advantages and under the right conditions will be the superior CPU. Generally speaking, even though many applications are becoming multi-threaded or multi-CPU aware, few can really utilize those abilities and even fewer still can make use of it on any grand scale. With current software, it makes far more sense to purchase a single 6 or 8 core CPU running at a higher clock speed and load it up on RAM. Dual Xeons are very useful for highly parallel computational tasks like 3D rendering and other visualizations. However, even in that regard, it makes more sense to go with dual 6 or 8 core and shy away from the latest dual 10 and 12 core boxes.

Intel doesn't make it, but I think the sweet spot for our industry would be an 8-core 3.6GHz CPU with a capability to do a 4.4GHz boost when needed.

I know I'm starting to drift off topic a bit, but this is why Apple has shifted to single-CPU with the new Mac Pro. It fits the market's needs, even though most people don't see it that way. And for those considering a new Mac Pro, don't be too quick to buy that 12-core monster. The 8-core may be a much better fit. There's a 10-core CPU that is a perfect in-between for a lot of us, but it doesn't seem Apple will offer that, at least not at first.


I have the same thing going on here. My 3960X based PC that I built nearly 2 years ago is a monster. I have $15K in my Z820 with dual 2687W CPUs and 128GB RAM and the 3960X system beats the piss out of the Z820 in many applications. Z820 has a Titan GPU, I have a GTX690 in the 3960X system and 64GB RAM. On paper, it's less than half the system and 1/4 the price. In reality, it's the superior system for Photoshop, general work, 3D modeling, rigging and animation, etc.. The Z820 is the superior system in Resolve, editing in Premiere, rendering from Lightwave, Modo, Maya… Two very different systems and they perform better at different tasks.
 
i7 Sandy Bridge dominates the Premiere Pro benchmarks because clock speed and disk speed matter far more than # of cores.

http://ppbm5.com/DB-PPBM5-2.php


AE rendering benchmarks show that dual Xeon E5-2687W's are actually slower than a single 3960X even at stock speeds.


[Image: after effects.png (AE rendering benchmark chart)]


With the 3960X overclocked to 4.6+ Ghz, the gap widens.

A friend of mine dropped $20+K on a tricked out Xeon setup a few months ago. He now deeply regrets it because my overclocked 3960X (4.8 Ghz) runs laps around it.

As Jeff said, the Xeon has its uses. The 3960X is indeed still a beast. Just know what you're getting into.
 
As others have pointed out, Premiere Pro and After Effects simply do not scale with number of cores after 6 or 8 for a lot of usage scenarios. The 3960X would have a ~15% higher clock speed (depending on Turbo conditions) to go with the faster memory bandwidth per core. The single threaded performance advantage alleviates a major bottleneck. Overclocking only takes things further ahead. Also, I have found GK104 based GPUs (i.e GTX 680, 770 etc.) to consistently outperform GK110 GPUs (i.e. Titan, 780) in Premiere Pro. I would guess for the same reason as well - the GTX 770 runs at 1.05 GHz while 780 is stuck at 0.85 GHz, and Premiere Pro really does not use beyond the 1536 shaders, if that much. (That said, Speedgrade and Resolve do speed up on Titan) Furthermore, Premiere Pro's Mercury Engine is single precision compute oriented, which does not benefit from GK110's compute features. Finally, the drivers for GK104 are more mature and Premiere more optimised for it at this point.

Just goes to show, all the fastest hardware in the world would do nothing if your software does not utilize it. I would have to say, at this point in time the best system for Adobe CC would run Core i7 4960X (preferably overclocked) with GTX 770 with fast (>2133 MHz) 64 GB RAM and of course a fast storage subsystem.

I do hope in the future Adobe CC - and other post related apps - scale up to dozens of cores. Some things do already, but not enough.
 

I think they'd be silly not to be working their asses off towards that end; Intel's MIC processors [Many Integrated Cores] are just over the horizon, already being shown off at tradeshows and whatnot.
 

Jeff,

So are you saying the 12 core is poorer value or the 8 core may perform better than the 12 core ? I'm assuming this is process or speed of the 8 core being faster than the 12 core ?

thanks,

Dave
 
Not every computing task is capable of running efficiently on multiple cores. Logically... tasks that are highly sequential (C relies on the output of B which relies on the output of A) can't be executed simultaneously. Luckily, in imaging there are many opportunities for efficient multiprocessing - but there will always be some things that will run faster on a single really fast processor than they will on many not as fast processors.
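The sequential-dependency point is usually put in numbers with Amdahl's law. Below is a back-of-the-envelope sketch. The clock speeds are the 3.2GHz and 2.6GHz figures quoted earlier in the thread; the core counts (6 vs 16) and the parallel fractions are assumptions picked for illustration, and treating throughput as proportional to clock speed is a crude simplification.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Amdahl's law: speedup over one core when only part of the work parallelizes."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def relative_throughput(clock_ghz: float, cores: int, parallel_fraction: float) -> float:
    """Crude score: clock speed times the Amdahl speedup (assumes equal work per clock)."""
    return clock_ghz * amdahl_speedup(parallel_fraction, cores)


# Assumed workloads: a mostly serial task vs a highly parallel one.
for p in (0.5, 0.95):
    i7 = relative_throughput(3.2, 6, p)     # assumed single 6-core i7
    xeon = relative_throughput(2.6, 16, p)  # assumed dual 8-core Xeon
    print(f"parallel fraction {p}: i7 {i7:.2f}, dual Xeon {xeon:.2f}")
```

Under these assumptions the higher-clocked 6-core comes out ahead when half the work is serial, and the 16-core box only pulls ahead once nearly all of the work can run in parallel.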
 

I'm reserving all final judgement until Apple shows exactly which configurations will be available. We know of a couple, but word on the street is there will be one or two more. At this stage, based on the available info, both the 8 core and 12 core are useful configurations and it really comes down to the software you will be running. The 8-core is clocked at 3GHz, but Intel will also be releasing a 3.2GHz version of it in the coming weeks. The 12-core clocks in at 2.7GHz. It offers 25% more cache per thread than the 6-core, which can help to offset the difference in clock speed. Or at least help to offset it for processes that are bus-intensive where the cache can make a significant difference. The 12-core would be the superior choice for large relational database applications or visualizations or renderings that continuously draw upon a finite set of repetitive data.

Adobe Premiere does a fair job at using multiple threads and actually handles stacked pipeline threading or "HyperThreading" quite well. Even with that, it tends to top-out around 10 to 12 threads, regardless of how many cores/threads you may have. So if the system is going to be primarily used for editing in Premiere and other apps that are not multi-threaded, the 8-core, or even the faster-clocked 3.5GHz 6-core, could be a much better choice than dumping a ton more money into a 12-core. OTOH, DaVinci Resolve doesn't handle the HyperThreading as gracefully and often scores better with it turned off, as do several rendering apps on the market. But Resolve also scales better to additional cores than Premiere does and for a Resolve workstation, I would most likely look more to the 12-core. I would want to benchmark both or see what others say before making a decision though, especially if the 12-core is significantly more expensive.

This is an area where I think Intel's pricing structure works against them and works against the industry. Sure, the complexity is greater and production yields are quite a bit lower on the 12-core chips than the 6-core chips, making the 12-core chips out to be quite a bit more expensive. Unfortunately, that leads people to believe that the 12-core is always the superior choice and is going to excel at every task.

There are of course real benefits to having a 10 or 12 core system if you really know how to use it and actually run software that can use it. You can always have Maya rendering in the background on 6 cores/12 threads while you continue to work in Premiere using another 10 or 12 threads without the two interfering too much. Once again, I bring up the whole thing about knowing what you're doing because the thread scheduling in current desktop OSes is crap. That is partially to blame on the OS, but in reality it still comes back to how this software is written. Windows actually makes this process a bit easier to some extent because within the GUI you can directly access CPU/thread affinity and priority settings for individual apps, whereas under OS X or Linux you have to drop into the terminal. On Windows, I made a small app several years ago to automate the assignment of affinity and thread priority to apps as I choose.
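As an illustration of the affinity idea (this is not the utility mentioned above, just a sketch with made-up helper names), the following computes the CPU bitmasks you would need to give a background renderer one block of logical CPUs and an editor another. It assumes logical CPUs are numbered from 0; how Hyper-Threading siblings map onto those numbers varies by OS and is not handled here.

```python
from typing import Iterable, List, Tuple


def affinity_mask(logical_cpus: Iterable[int]) -> int:
    """Build a bitmask with one bit set per logical CPU index."""
    mask = 0
    for cpu in logical_cpus:
        if cpu < 0:
            raise ValueError("CPU indices must be non-negative")
        mask |= 1 << cpu
    return mask


def split_cpus(total: int, first_count: int) -> Tuple[List[int], List[int]]:
    """Split logical CPUs 0..total-1 into two contiguous groups."""
    if not 0 <= first_count <= total:
        raise ValueError("first_count must be between 0 and total")
    return list(range(first_count)), list(range(first_count, total))


# Hypothetical 24-logical-CPU machine: 12 for the renderer, 12 for the editor.
render_cpus, edit_cpus = split_cpus(24, 12)
print(format(affinity_mask(render_cpus), "X"))  # hex mask for the renderer
print(format(affinity_mask(edit_cpus), "X"))    # hex mask for the editor
```

On Windows, a hexadecimal mask like this is what the `start /affinity` command expects; on Linux, Python's `os.sched_setaffinity` takes the set of CPU indices directly instead of a mask.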

I suppose this is delving into some of my "secret sauce" that proliferates through my workflows and I've posted an older 32bit version of my little utility here on the forums not too long ago.

Personally, I'm hoping Apple will release a 10-core Mac Pro, since Intel makes a couple very compelling 10-core Xeon chips. However, I'm of the mind that the 8-core is going to be the sweet spot for many, even most users. 3GHz is a decent speed and the Turbo can offer a nice boost on top of that, but one of the real advantages here to the 8-core is the 25MB cache, so just a fuzz more than double of what comes on the 6-core and not much less than what is included on the 12-core.

I'll see how it all prices out, but based on the info I have right now and since I'll be ordering my first Mac Pro ASAP without any real-world benchmarks to compare, I'm planning to go for the 8-core with dual D700s (possibly dual D600s) and the 1TB of flash storage. I'll put in 64GB RAM, either aftermarket or Apple, not sure. I would prefer 128GB and I know Apple will offer that at some point too. The problem is that 32GB 1866MHz RDIMMs just don't exist where we can buy them yet...
 
First you need to post more details about your system. Motherboard? Cards are in which slots?

Multiprocessor systems behave differently than single-processor systems when PCI-E cards such as GPUs, RAID controllers, Rockets, etc. are in use, so it takes a little understanding to configure them properly.
The processors must share access to the source data. Each processor is linked to specific PCI-E slots, and if it needs to access data behind the other CPU's PCI-E allocation, that traffic must go over the QPI link, which slows things down.
In theory the dual QPI links on E5-series processors are plenty fast, but in reality they slow things down: CUDA memory bandwidth is cut in half if the card is not in a slot tied to the CPU that has the application's affinity.


Is NUMA enabled in the BIOS? Is Hyper-Threading disabled? Are C-states set to C0? Is PowerCfg set to High performance?
Are you using a Rocket or software decode?

For certain tasks that are not multi-threaded, R3D software decode being one of them, a higher-clocked single processor will be faster; for most tasks in AE this is not the case. The Warp Stabilizer was poorly threaded. The Oct. update was supposed to address this, but I haven't tested to see if it did improve. Are you running the latest CC update?

$20k is ridiculously overpriced for a dual 2690 machine, so I would hope you would get some support from your vendor.



