Since the introduction of GPU accelerated rendering, the demand for faster GPU computing has increased. The NVIDIA Quadro range is the industry standard for 3D applications, as its robust performance, specialised drivers and large amounts of video memory have long been a necessity when dealing with large 3D scenes. However, as GPU accelerated rendering has become increasingly important to 3D artists, many have discovered that the gamer-oriented GeForce GTX range, which favours raw CUDA speed over professional features, can offer high levels of rendering performance. GTX gaming cards have generally not needed large amounts of on-board memory, but they offer a large number of CUDA cores at high clock speeds designed for high performance gaming.

Why do we need CUDA?

CUDA is NVIDIA's parallel computing platform. It makes computational tasks take much less time by splitting the work into many small pieces that the GPU's cores process simultaneously. In games, the result is that you see much more detail at any one time. This same technology is also used in GPU accelerated rendering to speed up ray tracing calculations.
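To illustrate why ray tracing suits this kind of parallel hardware, here is a minimal sketch in Python. It is not CUDA code: a small thread pool stands in for the GPU's cores, and the names (shade_pixel, render_serial, render_parallel) and the toy sphere scene are our own assumptions for illustration. The point it shows is that each pixel can be computed without reference to any other, so the work can be divided freely.

```python
from concurrent.futures import ThreadPoolExecutor

WIDTH, HEIGHT = 16, 8  # tiny image, purely illustrative


def shade_pixel(index):
    """Return 1 if the ray through this pixel hits a unit sphere, else 0."""
    x, y = index % WIDTH, index // WIDTH
    # Map the pixel centre onto an image plane spanning [-2, 2] x [-1, 1].
    u = (x + 0.5) / WIDTH * 4.0 - 2.0
    v = (y + 0.5) / HEIGHT * 2.0 - 1.0
    # Orthographic ray along z: it hits the sphere when u^2 + v^2 <= 1.
    return 1 if u * u + v * v <= 1.0 else 0


def render_serial():
    # One pixel after another, as a single core would do it.
    return [shade_pixel(i) for i in range(WIDTH * HEIGHT)]


def render_parallel(workers=4):
    # No pixel depends on any other, so the work can be shared out.
    # Threads here are a stand-in for CUDA cores, not a speed demonstration.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(shade_pixel, range(WIDTH * HEIGHT)))


if __name__ == "__main__":
    print(render_serial() == render_parallel())
```

Both functions produce the same image. A GPU renderer applies the same idea across thousands of cores rather than a handful of threads.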

Renderers such as Octane, V-Ray RT and iray draw their power from CUDA technology, and the total number of CUDA cores on a GPU has become an important factor when choosing which GPU to buy. Quadro cards such as the recently announced Quadro K6000 have a high price tag, which may make them a difficult purchase for 3D artists working with a constrained budget. This leaves many stuck for choice, as it becomes a tough call to decide whether memory or speed is more important. Typically, if you do a lot of GPU rendering and require quick feedback then a GTX card would suit. If you work with a lot of heavy geometry that demands a large amount of memory then a Quadro would be the better option. But what if you need both?

GeForce GTX TITAN

The GeForce GTX TITAN is equipped with 6GB GDDR5 memory and delivers high performance with a whopping total of 2688 CUDA cores. In our benchmark tests we found that, on average, it was only 15% slower at completing a render than the GeForce GTX 690. The GTX TITAN is a single GPU, so almost matching the performance of a dual-GPU card is impressive.

We asked Sean Killbride, Technical Marketing Manager at NVIDIA, what he thinks about the GTX TITAN and why it has been a success in the CG industry.

“The GTX TITAN provides a great opportunity for users who need high performance and high memory, but don’t require the additional professional features or certifications of the Quadro professional line. TITAN certainly has the potential to allow 3D professionals to get the maximum benefit out of GPU based ray-tracing.”

The Kepler GK110 family

The TITAN is based on NVIDIA’s latest chip, the Kepler GK110. The GeForce GTX 780, the Tesla K20/K20X and the newly released Quadro K6000 are all built on the same chip, but each has a different purpose.

Specification      GTX 780     GTX TITAN   Quadro K6000   Tesla K20X
CUDA Cores         2304        2688        2880           2688
Core Clock (MHz)   863         837         ~900           732
Boost Clock (MHz)  900         876         N/A            N/A
Memory             3GB GDDR5   6GB GDDR5   12GB GDDR5     6GB GDDR5
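
As a rough way to compare the figures in the table above, the sketch below estimates each card's theoretical single-precision peak from its CUDA core count and core clock. It assumes one fused multiply-add (two floating point operations) per core per clock and ignores boost clocks, memory bandwidth and renderer efficiency, so treat the results as a ceiling rather than a benchmark. The function name peak_tflops is ours, and the Quadro K6000 clock is the approximate value from the table.

```python
# Core counts and core clocks (MHz) copied from the table above.
# The Quadro K6000 clock is approximate (~900 MHz).
CARDS = {
    "GTX 780": (2304, 863),
    "GTX TITAN": (2688, 837),
    "Quadro K6000": (2880, 900),
    "Tesla K20X": (2688, 732),
}


def peak_tflops(cuda_cores, core_clock_mhz):
    """Theoretical single-precision peak in TFLOPS.

    Assumes each core completes one fused multiply-add
    (counted as 2 floating point operations) per clock cycle.
    """
    flops = 2 * cuda_cores * core_clock_mhz * 1e6
    return flops / 1e12


if __name__ == "__main__":
    for name, (cores, clock) in CARDS.items():
        print(f"{name}: {peak_tflops(cores, clock):.2f} TFLOPS")
```

On these assumptions the GTX TITAN works out at roughly 4.5 TFLOPS and the GTX 780 at just under 4 TFLOPS, which is consistent with the two cards performing similarly in practice.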

The GTX 780 has less memory at 3GB and is also around £300 cheaper than the GTX TITAN. Performance-wise the two cards are very similar, so the GTX 780 is effectively a TITAN LE: a lighter and slightly less powerful option.

The Quadro K6000 offers an amazing 2880 CUDA cores combined with 12GB of GDDR5 memory. It will be the first CAD industry standard GPU that matches, if not surpasses, the performance of the other GK110 GPUs that are currently available. Compared with the GTX TITAN, the Quadro K6000 may give you little increase in GPU rendering speed unless your scenes regularly exceed 6GB. However, what you will have is a certified GPU that not only renders 3D scenes quickly but also handles large amounts of data in both rendering and the viewport.

The Tesla K20/K20X is a dedicated compute GPU which is not designed to drive a display. It is intended to be paired with a second GPU, so that the Tesla does all the rendering while the other GPU handles the display graphics. This can be seen in NVIDIA’s Maximus technology, where the Tesla K20 or K20X is paired with a Quadro K-series card such as the K5000 or K6000.

Q&A

We thought we would take a moment to answer some of the important questions asked by our readers on GPU accelerated rendering.

The GTX TITAN supports double-precision. Does switching this on in the NVIDIA control panel make my renders any faster?

MintViz: At the moment double-precision is not supported by any GPU renderer and it is possible that it never will be. This is because double-precision is designed for complex simulations typically seen in scientific research. For now GPU rendering does not need it, and it would in fact reduce overall performance as it uses more memory, cache and bandwidth. Single-precision is sufficient for the type and size of scenes we see today.

Cards such as the GeForce GTX 690 have two GPUs, each with its own memory. Will the total memory available be combined?

MintViz: Graphics cards that have two GPUs, each with its own memory, are seen as two separate GPUs, and the entire scene must fit onto each one. The GTX 690, for example, has a maximum of 2GB of memory on each GPU.
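
The rule can be sketched in a few lines of Python. This is only an illustration under the assumption that the renderer uses every GPU you give it and loads the whole scene onto each; the function names usable_scene_memory_gb and scene_fits are hypothetical and do not come from any renderer.

```python
def usable_scene_memory_gb(per_gpu_memory_gb):
    """Largest scene (in GB) that fits when every GPU must hold the whole scene.

    Memory is not pooled across GPUs, so the limit is set by the
    GPU with the least memory, not by the sum.
    """
    if not per_gpu_memory_gb:
        raise ValueError("at least one GPU is required")
    return min(per_gpu_memory_gb)


def scene_fits(scene_gb, per_gpu_memory_gb):
    """True if a scene of scene_gb fits on every GPU in the list."""
    return scene_gb <= usable_scene_memory_gb(per_gpu_memory_gb)


if __name__ == "__main__":
    gtx_690 = [2, 2]   # two GPUs with 2GB each
    gtx_titan = [6]    # one GPU with 6GB
    print(scene_fits(3, gtx_690))    # 3GB scene on a GTX 690
    print(scene_fits(3, gtx_titan))  # 3GB scene on a GTX TITAN
```

In other words, a GTX 690 behaves like a 2GB card for scene size, not a 4GB one, while a 3GB scene fits comfortably on a single GTX TITAN.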

NVIDIA Maximus technology combines a Tesla K20 with a Quadro, which is expensive. Can I use GeForce instead?

MintViz: You can set up your workstation to use multiple GeForce GPUs for rendering. Once they are connected, in NVIDIA iray you can choose which GPUs to use in the hardware resources dialog in render setup. In V-Ray RT you can use the Choose OpenCL Devices tool, which is found under Start Menu > Programs > Chaos Group > V-Ray RT Adv for 3ds Max > Select OpenCL devices for V-Ray RT.

My graphics cards are set up using SLI. How does this affect the performance of GPU accelerated rendering?

MintViz: SLI is not required for GPU rendering. Applications such as iray, V-Ray RT and Octane recognise each GPU without SLI. In fact, using SLI can reduce GPU rendering performance.

In this article you haven’t mentioned AMD GPUs. Why not?

MintViz: Even though on paper the performance of some high-end AMD GPUs, such as the AMD Radeon HD 7990, matches that of the NVIDIA GeForce GTX TITAN, software support for OpenCL GPU accelerated rendering is limited. As a result, tests have shown slow performance compared to NVIDIA CUDA technology.

Our verdict

Each GK110 GPU will deliver roughly the same rendering speed, with the GTX TITAN and Quadro K6000 coming out on top. Because the entire scene must fit into GPU memory, many users have to optimise their workflow. But keep in mind that, behind the scenes, clever optimisations can reduce a 5GB to 6GB scene to only 1GB on the GPU, because only the information needed for rendering is loaded. This makes the GTX TITAN a worthy contender for all-round GPU accelerated rendering: it delivers high performance at a much lower price than the Quadro K6000 and comes with 6GB of memory to handle larger scenes.