AMD and NVIDIA have long been battling it out to offer the best professional GPU for CAD workstations, one that performs well in both 3D rendering and the viewport. NVIDIA have typically been at the forefront, polishing their CUDA technology, which is widely supported by GPU rendering software. AMD have been playing catch-up for some time, and with GPU rendering software offering little support for OpenCL, many 3D artists opt for NVIDIA when it comes to GPU rendering.
GPU rendering aside, both AMD and NVIDIA offer strong products that can handle large, complex scenes filled with millions of polygons in 3D applications such as 3ds Max and Maya. But NVIDIA’s CUDA technology, thanks to its simplicity, was quickly and easily integrated into GPU rendering applications such as NVIDIA’s own iray, V-Ray RT and Octane. Because CUDA code can be pre-compiled, scenes load quickly onto the GPU, something that is not currently possible with AMD and OpenCL.
Chaos Group’s V-Ray RT was one of the few rendering solutions to show interest in both CUDA and OpenCL upon release, and it continues to support both platforms in V-Ray 3.0. Unfortunately, early tests of AMD hardware showed slow rendering performance and often ended with V-Ray RT crashing. As the specifications of the AMD GPUs promised good performance, the problem lay with the drivers rather than the hardware. Since V-Ray RT supports various hardware configurations and multiple software platforms, AMD initially struggled to support it and to match the performance of NVIDIA. We asked Chaos Group to share their thoughts on rendering with AMD GPUs:
We’re happy to help AMD meet customer demands with our commitment to Open CL support on V-Ray RT. – Chaos Group
With continuing support from Chaos Group, AMD have been busy: along with the newly released AMD FirePro W9100 they have also released their latest 14.20 driver. We test the AMD FirePro W9100 professional GPU and look at how it performs in Chaos Group’s V-Ray RT.
The W9100 has 2,816 stream processors (AMD’s equivalent of NVIDIA’s CUDA cores). Although this is a step up from the previous W9000, the core clock speed has dropped slightly from 975MHz to 930MHz. With it comes the new Hawaii XT GL architecture, which brings support for OpenCL 2.0 and packs an impressive 5 TFLOPS of single-precision performance. A much wider 512-bit memory bus delivers 320GB per second of memory bandwidth, meaning large amounts of data can be read very quickly. That is very much needed, as the W9100 comes with a whopping 16GB of GDDR5 memory to fill. The demand for 4K production is also increasing, and the card’s six display outputs are capable of running up to six 4K displays.
| Specification | AMD FirePro W9100 | AMD FirePro W9000 | AMD FirePro W8000 |
|---|---|---|---|
| Core Clock (MHz) | 930 | 975 | 900 |
| VRAM | 16GB GDDR5 | 6GB GDDR5 | 4GB GDDR5 |
| Architecture | Hawaii GCN 1.1 | Tahiti GCN 1.0 | Tahiti GCN 1.0 |
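The headline figures quoted for the W9100 can be sanity-checked with some simple arithmetic. The sketch below is only a back-of-the-envelope check, not an official AMD formula: it assumes each stream processor can retire one fused multiply-add (counted as 2 floating-point operations) per clock, and the function names are our own. The per-pin memory data rate is not quoted in this review, so it is derived from the bandwidth and bus width rather than assumed.

```python
# Back-of-the-envelope check of the W9100 figures quoted in this review.
STREAM_PROCESSORS = 2816   # from the review
CORE_CLOCK_MHZ = 930       # from the review
BUS_WIDTH_BITS = 512       # from the review
BANDWIDTH_GB_S = 320       # from the review

# Assumption: one fused multiply-add per stream processor per clock,
# counted as 2 floating-point operations.
FLOPS_PER_CLOCK = 2


def peak_tflops(stream_processors, clock_mhz, flops_per_clock=FLOPS_PER_CLOCK):
    """Theoretical single-precision peak in TFLOPS."""
    return stream_processors * flops_per_clock * clock_mhz * 1e6 / 1e12


def implied_pin_rate_gbps(bandwidth_gb_s, bus_width_bits):
    """Per-pin data rate (Gbit/s) implied by total bandwidth and bus width."""
    return bandwidth_gb_s * 8 / bus_width_bits


w9100_tflops = peak_tflops(STREAM_PROCESSORS, CORE_CLOCK_MHZ)
w9100_pin_rate = implied_pin_rate_gbps(BANDWIDTH_GB_S, BUS_WIDTH_BITS)

if __name__ == "__main__":
    print(f"Peak FP32: {w9100_tflops:.2f} TFLOPS")
    print(f"Implied memory data rate: {w9100_pin_rate:.1f} Gbit/s per pin")
```

Under that assumption the peak works out slightly above the rounded 5 TFLOPS figure, and the 320GB per second bandwidth corresponds to 5 Gbit/s per pin across the 512-bit bus.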
Rendering performance benchmark in V-Ray RT
We put the W9100 to the test in 3ds Max 2014 with V-Ray 3.00.07. In addition to the W9100, the workstation supplied by Armari Ltd comes with an Intel Core i7 4930K.
In our tests the FirePro W9100 rendered this interior benchmark scene (courtesy of Chaos Group) in 2 minutes and 55 seconds. We rendered the same scene using the NVIDIA GTX TITAN, which is based on the Kepler GK110 architecture and delivers similar rendering speed to the NVIDIA Quadro K6000, NVIDIA’s flagship professional workstation GPU. Although the W9100 is around 40% slower than the GTX TITAN, the results show a great improvement on previous AMD GPUs and driver versions.
Next we tested a studio lighting scene (courtesy of Chaos Group) that has 13 million polygons, to see if geometry-heavy scenes perform differently. The W9100 rendered the scene in 2 minutes and 44 seconds and the GTX TITAN rendered it in 1 minute 49 seconds using CUDA. In this test the W9100 is around 34% slower, in the sense that the GTX TITAN finished in about 34% less time. Results vary from scene to scene, as textures, geometry and lights all affect how the GPU performs. Since the GTX TITAN is pretty much on par with the NVIDIA Quadro K6000 professional CAD GPU in terms of rendering performance, this makes the W9100 a viable alternative to NVIDIA.
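A "percent slower" figure depends on which render time is used as the baseline, so it is worth making the arithmetic for the studio scene explicit. The short sketch below uses only the two render times quoted above; the helper names are our own and purely illustrative.

```python
def to_seconds(minutes, seconds):
    """Convert a render time given as minutes and seconds to seconds."""
    return minutes * 60 + seconds


def time_saved_pct(slow_s, fast_s):
    """How much less time the faster card needs, relative to the slower card."""
    return (slow_s - fast_s) / slow_s * 100


def extra_time_pct(slow_s, fast_s):
    """How much longer the slower card takes, relative to the faster card."""
    return (slow_s - fast_s) / fast_s * 100


# Studio lighting scene, render times from the review.
w9100_s = to_seconds(2, 44)
titan_s = to_seconds(1, 49)

saved = time_saved_pct(w9100_s, titan_s)
extra = extra_time_pct(w9100_s, titan_s)

if __name__ == "__main__":
    print(f"GTX TITAN needs {saved:.1f}% less time than the W9100")
    print(f"W9100 takes {extra:.1f}% longer than the GTX TITAN")
```

Measured against the W9100's time, the gap is roughly a third; measured against the GTX TITAN's time, the W9100 takes about half as long again. Both describe the same pair of results.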
OpenCL Accelerated Parallel Processing (APP) technology
With support for OpenCL 2.0, applications can run on both the GPU and CPU simultaneously, and AMD refer to this as Accelerated Parallel Processing (APP) technology. The W9100 fully supports APP, so V-Ray RT can see both the CPU and GPU as rendering devices.
In our tests, the render time for the interior scene dropped from 2 minutes 55 seconds to 1 minute 54 seconds when we combined the W9100 GPU with the i7 4930K CPU. We have seen similar technology in NVIDIA’s iray, which can also combine both GPU and CPU for rendering.
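To put the combined CPU and GPU result in perspective, the sketch below computes the speedup from the two interior-scene times and estimates what the CPU contributes. The estimate rests on a simplifying assumption of ours, not something measured in these tests: that the throughputs of the two devices simply add up with no overhead. The implied CPU-only time is therefore a rough indication only.

```python
def to_seconds(minutes, seconds):
    """Convert a render time given as minutes and seconds to seconds."""
    return minutes * 60 + seconds


# Interior scene, render times from the review.
gpu_only_s = to_seconds(2, 55)   # W9100 alone
combined_s = to_seconds(1, 54)   # W9100 + i7 4930K

# Speedup from adding the CPU as a second rendering device.
speedup = gpu_only_s / combined_s

# Assumption: device throughputs (scenes per second) are additive, so
# 1/combined = 1/gpu + 1/cpu. Solve for the CPU's share and CPU-only time.
cpu_share = 1 - combined_s / gpu_only_s
implied_cpu_only_s = 1 / (1 / combined_s - 1 / gpu_only_s)

if __name__ == "__main__":
    print(f"Speedup with CPU added: {speedup:.2f}x")
    print(f"CPU share of combined throughput: {cpu_share:.1%}")
    print(f"Implied CPU-only render time: {implied_cpu_only_s:.0f} s")
```

Under the additive assumption, the CPU accounts for roughly a third of the combined throughput, which would put a CPU-only render of this scene at well over five minutes.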
The fact that the CPU managed to knock a minute off the W9100 render time shows that recent CPUs are becoming quite powerful in this regard. However, GPUs have the upper hand, as multi-CPU configurations quickly become very expensive to build, whereas it is much more cost-effective to pair a low-end CPU with multiple GPUs in a single workstation.
Final frame rendering
Due to the bounced light, an interior scene is difficult to render, and we wouldn’t expect a clean result that matches the production renderer. This type of scene would benefit from a multi-GPU setup rather than a single card. Alternatively, it could be rendered using the production renderer with Brute Force + Light Cache. V-Ray RT uses a Brute Force + Brute Force method, so interior renders such as these will be slow in comparison to the production renderer. Until we see Brute Force + Light Cache in V-Ray RT GPU, final frame rendering on a single GPU is best left for small products in a studio environment.
Interactive performance in Active Shade mode
Where the W9100 really impressed us was with interactivity in Active Shade mode. We were able to spin, pan and zoom around in the viewport at a good frame rate (FPS) with no lag whilst getting fast GPU rendering. This allowed us to continue working on the scene and get instant feedback on lighting and materials.
Professional CAD GPUs are designed to perform better than consumer gaming GPUs in 3D software. For example, other benchmark tests have shown that the AMD FirePro and NVIDIA Quadro GPUs offer better viewport performance in wireframe mode. However, in Autodesk 3ds Max we have seen a huge improvement in viewport shading, and due to these changes high-end consumer gaming GPUs can give better viewport performance than professional GPUs in shaded mode or Nitrous Realistic mode.
In the past, when we tested GPUs such as the NVIDIA GTX TITAN, we experienced a lot of lag when using Active Shade in GPU mode and were unable to continue working on the scene whilst the render was updating. This is because a gaming GPU such as the GTX TITAN renders fast but does not offer Active Shade viewport performance as good as that of a professional workstation GPU. This can be slightly improved by lowering the quality settings of V-Ray RT so that it doesn’t try to eliminate noise so quickly, but with the W9100 we didn’t need to do this. It performs really well, with no lag, using the default settings in V-Ray RT GPU. See below a video screen grab of 3ds Max showing the W9100 in action.
It is clear that AMD now offer a solution that caters for both fast GPU rendering and the handling of large 3D scenes. Up until now this is something only NVIDIA could offer, with the likes of the Quadro K6000 and its high number of CUDA cores and good viewport performance. The W9100 combined with a CPU using APP technology offers excellent additional performance, with the freedom to combine various CPU options.
Instant feedback on lighting and materials with no lag in V-Ray RT GPU makes the W9100 a great GPU to use in Active Shade mode. Most scenes will not use the full 16GB, though, so you may end up paying for a lot of GPU memory that you never need. Perhaps this is where the upcoming W8100 would be a better choice, as it has similar performance to the W9100 but comes with just 8GB of GPU memory.