The world around us is filled with billions of light rays emitted by the sun or by artificial light sources. As light hits objects and surfaces, rays bounce, break, and reflect in various ways until they eventually reach our eyes; it is this complex interaction that creates our view of reality. Ray tracing is a lighting technique for 3D graphics that mimics this real-world approach. While it produces the most realistic results, the process has traditionally been too computationally expensive for real-time 3D graphics.
It is widely used today for creating ultra-realistic renders for advertising and movies, but in these cases it can take hours to generate each frame, even on today’s massively powerful compute servers. Ray tracing is a buzzword that’s generating huge amounts of excitement (or hype, depending on your point of view) as the way forward for real-time graphics. In this article, we look at ray tracing and what is making it a reality now.
Shrinking the problem
In a 3D game, the world is made up of objects which, combined, consist of millions of triangles. The most basic function of ray tracing is to send out a ray and follow its path through the 3D world until it reaches the first object, to determine how that point should be lit. Testing even a single ray against every object in the scene for an intersection is too computationally expensive and power-hungry to be practical in real time.
To use ray tracing, we therefore need to solve this problem.
This can be done by building a ray tracing acceleration structure. To do this, we draw a box around our game world, which we then subdivide into smaller boxes, and those boxes into yet smaller boxes, until each small box contains a manageable number of triangles. We call this the scene hierarchy structure, and it is this that reduces the problem to something current graphics processors can handle efficiently.
It works because when we send a ray into the game world, we first check it against the biggest box: does the ray hit our world at all? If yes, we check the next level of smaller boxes. At this stage, we find that our ray has penetrated some boxes but not others. We therefore eliminate the misses and dive deeper only into the boxes the ray has hit, until we reach the triangles themselves and find our intersection.
This hierarchical structure enables us to find the nearest intersection of ray and triangle without having to test every triangle in the scene. It greatly simplifies the problem, so it can be done much faster.
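The traversal described above can be sketched in a few lines of Python. This is a simplified software model for illustration only, not Imagination’s hardware: the box test is the standard "slab" intersection test, and the hierarchy here is just two leaves under one root.

```python
# Illustrative sketch of hierarchical ray traversal (not real GPU hardware code).
# A node's box is tested first; its children are only visited if the box is hit.

class Box:
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi  # min/max corners, e.g. (x, y, z) tuples

    def hit_by(self, origin, direction):
        """Slab test: does the ray intersect this axis-aligned box?"""
        t_near, t_far = 0.0, float("inf")
        for o, d, lo, hi in zip(origin, direction, self.lo, self.hi):
            if abs(d) < 1e-12:              # ray parallel to this slab
                if o < lo or o > hi:
                    return False
                continue
            t0, t1 = (lo - o) / d, (hi - o) / d
            if t0 > t1:
                t0, t1 = t1, t0
            t_near, t_far = max(t_near, t0), min(t_far, t1)
        return t_near <= t_far

class Node:
    def __init__(self, box, children=(), triangles=()):
        self.box, self.children, self.triangles = box, children, triangles

def traverse(node, origin, direction, hits):
    """Collect triangles from leaf boxes the ray actually reaches."""
    if not node.box.hit_by(origin, direction):
        return                       # whole subtree eliminated with one test
    if node.children:
        for child in node.children:
            traverse(child, origin, direction, hits)
    else:
        hits.extend(node.triangles)  # only these need exact ray/triangle tests

# Two leaves: a ray travelling along +x reaches the left box only.
left  = Node(Box((0, 0, 0), (1, 1, 1)), triangles=["tri_A"])
right = Node(Box((0, 5, 0), (1, 6, 1)), triangles=["tri_B"])
root  = Node(Box((0, 0, 0), (1, 6, 1)), children=(left, right))

candidates = []
traverse(root, origin=(-1.0, 0.5, 0.5), direction=(1.0, 0.0, 0.0), hits=candidates)
print(candidates)  # ['tri_A'] — tri_B's box was never opened
```

The payoff is in the `return` on a missed box: one cheap box test culls every triangle inside that subtree, which is why traversal cost grows roughly logarithmically with scene size rather than linearly.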
After the geometry processing phase, where the hardware does the work of animating objects, we feed those triangles into a specialized piece of hardware called the scene hierarchy generator, which generates the acceleration structure described above. We have also added some specialized ray/box/triangle testers: dedicated fixed-function hardware that traces the ray through this acceleration structure and locates the intersections of rays and triangles. Doing all of this in dedicated hardware is much faster and more area- and power-efficient than using software-programmable pipelines.
So, what’s the next step once the hardware has determined that a ray has hit a triangle? We trigger a fragment shader: a small program that determines the colour at that specific location on that triangle, a step that is fundamentally similar to traditional rendering. From this fragment shader program, we can then send more rays out into the 3D world, and as this process is repeated, it builds up our ray traced scene.
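The hit → shade → spawn-more-rays loop can be sketched as a toy recursive tracer. This is purely illustrative: it loops over every sphere instead of walking an acceleration structure, and the material model is a hypothetical base colour plus a single reflectivity value.

```python
# Toy recursive tracer sketching the hit -> shade -> spawn-more-rays loop.
# Illustrative only: real hardware traverses the acceleration structure
# rather than testing every object, and real shaders are far richer.
import math
from dataclasses import dataclass

MAX_DEPTH = 3  # stop following reflection rays after a few bounces

@dataclass
class Sphere:
    center: tuple
    radius: float
    colour: tuple
    reflectivity: float = 0.0

def hit_distance(sphere, origin, direction):
    """Distance along a normalised ray to the sphere, or None on a miss."""
    oc = tuple(o - c for o, c in zip(origin, sphere.center))
    b = 2.0 * sum(d * x for d, x in zip(direction, oc))
    c = sum(x * x for x in oc) - sphere.radius ** 2
    disc = b * b - 4.0 * c          # direction assumed normalised, so a == 1
    if disc < 0:
        return None
    t = (-b - math.sqrt(disc)) / 2.0
    return t if t > 1e-4 else None

def trace(scene, origin, direction, depth=0):
    # Find the nearest hit (the hardware testers' job in the article).
    nearest = min(((hit_distance(s, origin, direction), s) for s in scene
                   if hit_distance(s, origin, direction) is not None),
                  default=None, key=lambda pair: pair[0])
    if nearest is None:
        return (0.0, 0.0, 0.0)                   # ray escaped: background
    t, sphere = nearest
    colour = sphere.colour                       # the "fragment shader" result
    if sphere.reflectivity > 0 and depth < MAX_DEPTH:
        point = tuple(o + t * d for o, d in zip(origin, direction))
        normal = tuple((p - c) / sphere.radius for p, c in zip(point, sphere.center))
        d_dot_n = sum(d * n for d, n in zip(direction, normal))
        bounce = tuple(d - 2 * d_dot_n * n for d, n in zip(direction, normal))
        reflected = trace(scene, point, bounce, depth + 1)   # spawn a new ray
        k = sphere.reflectivity
        colour = tuple((1 - k) * c + k * r for c, r in zip(colour, reflected))
    return colour

scene = [Sphere((0, 0, 0), 1.0, (1.0, 0.0, 0.0))]
print(trace(scene, (0, 0, -5), (0, 0, 1)))   # hits the red sphere: (1.0, 0.0, 0.0)
print(trace(scene, (0, 5, -5), (0, 0, 1)))   # misses: background (0.0, 0.0, 0.0)
```

Note how shading and tracing call each other: each hit can spawn further rays, and the depth limit is what keeps the recursion bounded, just as a real-time engine budgets the number of bounces.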
The coherency problem
But now we have a new problem. We are sending lots of rays into the scene, but how do we process them all efficiently? We need to fetch boxes and triangles from our acceleration structure in memory, and each ray triggers a fragment program every time it hits an object.
Unfortunately, rays are erratic things, and they don’t necessarily travel in the same direction. In technical terms, we describe this as incoherency, which is problematic: non-coherent data access is bad for modern GPUs. It’s a bit like trying to look up information in an alphabetical Rolodex when the names are given to us in completely random order – cue lots of flicking back and forth, taking up precious time and consuming precious energy.
Worse still, as rays bounce around randomly, they also hit different objects and triangles, which need to be coloured and shaded differently, triggering different shader programs. However, GPUs like to process shaders in parallel. This is precisely where GPUs are strong: their ability to process data in a massively parallel way gives them an advantage over other processors such as CPUs, because their arithmetic logic units (ALUs) are single instruction, multiple threads (SIMT) in nature. If every ray triggers a different shader, this does not map well onto a GPU, as it would require a multiple instruction, multiple threads (MIMT) architecture, which is inefficient in both silicon area and power.
A solution is the coherency engine developed by Imagination Technologies, which tracks rays and finds order among the chaos of all the rays in a scene.
If you look at the images below, the light rays may initially look random. Look more closely, however, and you’ll note that there is actually coherency.
To explain, note how some parts of the object reflect the same yellow object. Despite the seeming chaos, some rays are going in the same direction and hitting similar objects. Our coherency engine looks for this, sorting these rays into groups and thus making them easier for the GPU to process. This is the ‘magic’, as we regain data access and execution efficiency and thus also lower power and bandwidth in our processing.
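The sorting idea can be sketched as binning rays by their rough direction so that similar rays are dispatched together. This is a crude software illustration of the concept only; the actual coherency engine is dedicated hardware inside the GPU and far more sophisticated than a coarse grid.

```python
# Illustrative sketch of coherency sorting: bin rays by rough direction so
# that similar rays can be processed together as one coherent batch.
from collections import defaultdict

def direction_bin(direction, steps=4):
    """Quantise a normalised direction component-wise into a coarse grid cell."""
    return tuple(min(int((d + 1.0) / 2.0 * steps), steps - 1) for d in direction)

def group_rays(rays, steps=4):
    groups = defaultdict(list)
    for ray in rays:
        groups[direction_bin(ray["dir"], steps)].append(ray)
    return groups  # each group can now run as one coherent SIMT batch

rays = [
    {"id": 0, "dir": (0.0, 0.0, 1.0)},
    {"id": 1, "dir": (0.05, 0.02, 0.99)},  # nearly the same direction as ray 0
    {"id": 2, "dir": (1.0, 0.0, 0.0)},     # heading somewhere else entirely
]
groups = group_rays(rays)
print(len(groups))  # 2: rays 0 and 1 share a bin, ray 2 gets its own
```

Rays in the same bin tend to touch the same boxes and triangles in the acceleration structure and to trigger the same shaders, which is exactly what SIMT hardware wants.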
The benefits of hybrid rendering
So that’s great: we can now trace rays and do it efficiently. However, as we said before, reality bounces billions of rays around to make the images our eyes see, so even with all our efficiency gains, creating an entire scene using ray tracing would still be problematic. The answer? Hybrid rendering.
While traditional rasterized rendering does a good job today, it struggles with spatial interactions such as lights/shadows, reflections and refractions – precisely the complex things which ray tracing is good at. With hybrid rendering, we grab the best of both worlds, using rasterization for simple objects and then blasting rays from our shaders to selectively create a limited number of spatial ray traced queries to create hyper-realistic shadows, lighting effects, and accurate reflections. By using this hybrid approach, we massively reduce the number of rays we need to trace, which finally brings us into the realm of real-time performance.
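The division of labour can be sketched per pixel: rasterization supplies the base colour, and rays are issued only where a spatial query is actually needed. A minimal sketch, with a hypothetical `cast_shadow_ray` callback standing in for the ray tracing hardware:

```python
# Illustrative sketch of hybrid rendering: rasterisation supplies the base
# colour per pixel, and only selected effects (here, shadows) issue rays.
def shade_pixel(raster_colour, needs_shadow_query, cast_shadow_ray):
    """Combine a rasterised colour with an optional ray traced shadow query."""
    if not needs_shadow_query:
        return raster_colour                  # pure rasterisation: zero rays
    if cast_shadow_ray():                     # one ray instead of a shadow map
        return tuple(0.4 * c for c in raster_colour)  # darken shadowed pixel
    return raster_colour

lit  = shade_pixel((1.0, 1.0, 1.0), True, cast_shadow_ray=lambda: False)
dark = shade_pixel((1.0, 1.0, 1.0), True, cast_shadow_ray=lambda: True)
print(lit, dark)  # (1.0, 1.0, 1.0) (0.4, 0.4, 0.4)
```

The key point is the budget: only pixels that genuinely need a spatial query spend a ray, which is what keeps the total ray count within real-time limits.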
Ray tracing on a phone: is it really possible?
The simple answer is yes. The GPUs in today’s smartphones have made massive strides since smartphones first launched, not only in feature set but also in real-world achievable performance. Indeed, premium smartphones are already breaching the 1 TFLOPS compute barrier, previously the exclusive arena of dedicated gaming consoles. The real question here, then, is efficiency. Smartphones depend on battery life, and as ray tracing is more efficient than traditional rendering methods, it stands a good chance of being quickly added to the mobile experience.
Using the innovations described above, Imagination is making efficient ray tracing possible. In smartphones, the cost of faking shadows and reflections in games with traditional rasterization is already very high. In a modern game engine, such as Unity or Unreal, shadows are generated using cascaded shadow maps. This requires rendering the scene geometry multiple times and writing shadow maps out to memory as lookup tables, all of which costs cycles and bandwidth, consuming significant GPU and system power.
With ray tracing, we send a single ray out to the light source, and if that ray hits anything but the light, we know the fragment is in shadow. This is far simpler using our streamlined hyper-optimized ray tracing solution and is hence a lower power solution than the pre-processing required by cascaded shadow maps.
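The shadow test itself is simple enough to sketch directly: fire one ray from the surface point toward the light, and any hit on the way means the point is in shadow. An illustrative software model using sphere occluders (the hardware, of course, uses the acceleration structure):

```python
# Illustrative sketch of a ray traced shadow test: fire one ray from the
# surface point toward the light; any hit along the way means shadow.
import math

def blocks(sphere_center, sphere_radius, origin, target):
    """Does a sphere block the segment from origin to target?"""
    seg = tuple(t - o for t, o in zip(target, origin))
    length = math.sqrt(sum(s * s for s in seg))
    d = tuple(s / length for s in seg)                 # normalised direction
    oc = tuple(o - c for o, c in zip(origin, sphere_center))
    b = 2.0 * sum(x * y for x, y in zip(d, oc))
    c = sum(x * x for x in oc) - sphere_radius ** 2
    disc = b * b - 4.0 * c
    if disc < 0:
        return False
    t = (-b - math.sqrt(disc)) / 2.0
    return 1e-4 < t < length       # hit must lie between the point and the light

def in_shadow(point, light, occluders):
    return any(blocks(c, r, point, light) for c, r in occluders)

light = (0.0, 10.0, 0.0)
occluders = [((0.0, 5.0, 0.0), 1.0)]   # a sphere sitting between point and light
print(in_shadow((0.0, 0.0, 0.0), light, occluders))   # True: fragment is shadowed
print(in_shadow((5.0, 0.0, 0.0), light, occluders))   # False: clear path to light
```

Unlike a shadow map, there is no pre-pass and nothing written to memory: the answer for each fragment is a single yes/no query against the scene.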
When analyzing our prototype ray tracing hardware from 2016, we found that for shadows, reflections and other techniques, the power consumed was often less than half that of the equivalent rasterized technique, and the resulting quality was much higher. The realization here is that a complex yet “fake” technique consumes more power than the simplicity of ray tracing, which gives far more realistic results, making it not only suitable but desirable for modern premium smartphones.
AI and super-resolution
While smartphone-based ray tracing is one option, we are equally excited by the rising popularity of cloud gaming, bolstered by the growth of 5G networks and edge computing. Here again, the bandwidth and power efficiencies enabled by our ray tracing architecture are likely to be critical.
Innovation is always needed to deliver more for less and, as such, we are very excited by the rapid advances in AI processing. Neural networks can be combined with ray tracing to offer even higher efficiency. For example, as we trace only a relatively small number of rays for the sake of efficiency, we may get noisy results back. Neural networks show great promise for noise reduction, filling in the missing details using learned “intelligence”. Again, this mirrors how reality works, as our brain also fills in many gaps left by our limited human visual system.
Another great potential concept is super-resolution. This again leverages the power of neural networks, this time to intelligently learn how to fill in missing details to allow GPUs to render at a lower resolution and thus raise performance and reduce power consumption, while still maintaining visual quality.
There is no doubt, then, that real-time ray tracing has a bright future, making this an exciting time for anyone interested in 3D graphics. As it’s based on real-world physics, ray tracing offers the highest levels of realism, but it also delivers great efficiency versus the hacks and approximations we have been using up to now. Low-power rasterized graphics, pioneering ray tracing work, and continued innovation in AI and neural networks all combine to help take graphics to the next level.
Kristof Beets is senior director of product management for PowerVR at Imagination Technologies, having held various developer, product, and marketing roles in the company for over 19 years.