This is the first part of a two-part series of posts on Single Pass Stereo (SPS). I wrote them to explain how Single Pass Stereo works and to raise awareness of its problems. In this first post, I will describe SPS at a high level, provide insights into how it works and discuss its advantages and disadvantages. Lastly, I will offer my opinion on whether it is worth implementing, as asked in the title of this post.
The second part, Single Pass Stereo: wrong depth cues, discomfort and potential risks, tackles the problem that SPS has with certain Head Mounted Displays (HMDs). It shows the problem in detail, provides actual code for anyone to test, and explains why many games ship with it and what can be done to prevent it.
What is Single Pass Stereo
When SPS can help us
What are the cons of using SPS
Alternatives and final verdict
Let’s start. Is it worth it?
Too long; didn’t read: Yes and no… it depends! Be careful about which Head Mounted Displays you enable it on!
Disclaimer: This post relies on many abstractions and omits most low-level details, so advanced readers should expect simplifications. Some OpenGL terminology will also be used.
What is Single Pass Stereo
Single Pass Stereo is a technology introduced by NVIDIA that aims to improve the performance of Virtual Reality (VR) and Augmented Reality (AR) applications (collectively, XR) by exploiting the way XR applications are rendered. It is available for DirectX and OpenGL and is integrated in Unreal Engine and Unity.
The simplest way to approach rendering, and the one most XR applications use, is to process the scene twice for every frame: once for the left eye and once for the right eye. There is only a very slight difference between the left and right views (as shown in the next image), and this is what NVIDIA is trying to exploit with the Single Pass Stereo technology.
As we can see in the image above, the views are slightly offset along the X axis, assuming that X points to the right, Y points up and -Z is forward from the eye to the scene. Looking at what each eye actually sees, there is a great overlap between the drawn objects, which results in sub-optimal performance because redundant work is performed twice per frame. To find the redundant work, we have to look at a top-level view of the rendering pipeline.
1. The frame starts.
2. The geometry is sent to the GPU or is already stored there.
3. For every eye (usually first the left and then the right):
   1. Process the initial geometry (this can include animation, new geometry generation etc.)
   2. Figure out which pixels need to be drawn from the resulting geometry.
   3. Process the resulting pixels.
4. The frame ends.
5. The images for both eyes are sent to the user.
The step that SPS tries to optimize is 3.1 (processing the initial geometry for each eye), because the resulting geometry is the same for both eyes. Essentially, there are 2 geometry passes and 2 pixel passes per frame, and SPS reduces the geometry passes to 1.
Notice that Single Pass Stereo reduces the processing time for geometry while leaving the pixel processing unchanged. This is the most important thing to remember, and it is the source of frustration for many developers who have implemented or enabled SPS but don’t see any improvement.
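The difference between the two approaches can be sketched as a toy render loop. This is only an illustration of the pass counts described above, not real rendering code: the function names and the stage labels are hypothetical, and the "pixels" stage lumps rasterization and pixel processing together.

```python
def render_frame(single_pass_stereo):
    """Return the list of stages executed for one stereo frame (toy model)."""
    stages = []
    eyes = ["left", "right"]
    if single_pass_stereo:
        # One geometry pass produces the positions for both eyes.
        stages.append("geometry(left+right)")
        for eye in eyes:
            stages.append(f"pixels({eye})")
    else:
        # Classic approach: the whole pipeline runs once per eye.
        for eye in eyes:
            stages.append(f"geometry({eye})")
            stages.append(f"pixels({eye})")
    return stages


def count_stages(stages, prefix):
    """Count how many executed stages start with the given prefix."""
    return sum(1 for stage in stages if stage.startswith(prefix))


for enabled in (False, True):
    stages = render_frame(enabled)
    print("SPS on: " if enabled else "SPS off:", stages)
```

In this model the number of pixel stages stays at two whether SPS is enabled or not; only the geometry stage count changes.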
When SPS can help us
As mentioned before, the geometry processing passes are reduced to 1 per frame, so SPS can help ONLY when the geometry processing is heavy; that is, when the application is geometry or bandwidth bound. That’s it. That’s the only case where SPS can improve an application’s frametime.
If the application uses very detailed models or generates a lot of geometry, then the frametime can drop by about 20% to 35% at most. Expecting a 50% reduction is not realistic because there is still the second pixel pass for the right eye, and we can expect a bit of background processing by the GPU even if the geometry passes are reduced to one.
If an application does heavy post processing, with different image effects, it will see no benefit at all. That’s why many developers and users become frustrated when they enable SPS and see no performance gains: either because they misunderstood the specification or because they don’t know what SPS is, they expect to gain something from it, but their application does not fall into the specific case that SPS is trying to optimize.
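A back-of-the-envelope model makes the bound easier to see. The sketch below assumes, as a simplification that is mine and not something NVIDIA documents, that the geometry and pixel costs per eye simply add up and that SPS has no overhead of its own; the function names and the millisecond figures are hypothetical.

```python
def frametime(geometry_ms, pixel_ms, single_pass_stereo):
    """Frame time under a simple additive model: N geometry passes plus 2 pixel passes."""
    geometry_passes = 1 if single_pass_stereo else 2
    return geometry_passes * geometry_ms + 2 * pixel_ms


def reduction(geometry_ms, pixel_ms):
    """Fraction of the frame time saved by enabling SPS in this model."""
    before = frametime(geometry_ms, pixel_ms, single_pass_stereo=False)
    after = frametime(geometry_ms, pixel_ms, single_pass_stereo=True)
    return (before - after) / before


# Hypothetical workloads: (geometry ms per pass, pixel ms per eye).
for label, geometry_ms, pixel_ms in [
    ("geometry heavy", 3.0, 2.0),
    ("post-processing heavy", 0.5, 4.5),
]:
    print(f"{label}: {reduction(geometry_ms, pixel_ms):.0%} shorter frame")
```

With 3 ms of geometry work and 2 ms of pixel work per eye, the model goes from 10 ms to 7 ms, a 30% reduction. With 0.5 ms of geometry and 4.5 ms of pixel work, it only goes from 10 ms to 9.5 ms. The saving approaches 50% only when the pixel cost approaches zero, which is why a 50% reduction is not a realistic expectation.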
What are the cons of using SPS
A small caveat in the specification
The biggest disadvantage of Single Pass Stereo is that it only works with HMDs that have co-planar views and symmetric projections. There is one very small caveat in the specification, repeated in section 2.3 of the “Single Pass Stereo Programming Guide for OpenGL” found in the VRWorks documentation, which states:
Only the “x” coordinate is expected to differ between these views, so the “yzw” coordinates for secondary position are also obtained from the “gl_Position” and writes to “gl_SecondaryPositionNV.yzw” are ignored. (Source: https://www.khronos.org/registry/OpenGL/extensions/NV/NV_stereo_view_rendering.txt)
Because the geometry pass is reduced to 1, the position for the right eye is produced in the same geometry pass as the left. When NVIDIA hardware reads the right eye position, the position it actually processes is the LEFT one with the RIGHT x coordinate. The extension’s authors assume that the projections for both eyes are symmetric and that the displays are co-planar, so the geometry position differs between the eyes only along the X axis. When the assumption holds, the results are correct, and this was true for most first-generation headsets; it is no longer true for many newer ones! For HMDs that use asymmetric projections for their eyes, the position along the Y screen axis differs too. So, the geometry for the right eye ends up at the correct x screen coordinate but at the wrong y screen coordinate.
Also quoting the readme of the gl_stereo_view_rendering project in the VRWorks samples:
Single Pass Stereo can render two views simultaneously with the limitation that the resulting coordinates can only differ in the X component. This is sufficient for VR on HMDs which have one display for both eyes or two displays which are coplanar. (Source: readme of the gl_stereo_view_rendering project in the VRWorks samples.)
Multi-View Rendering lifts these limitations and can render multiple views with arbitrary projections.
This is a big deal because most of the new headsets do not have a single display for both eyes or their displays are not co-planar and they don’t use symmetric projections. This is done in order to maximize their field of view or achieve higher resolution etc.
Let’s see an example with the view and projection matrices taken from an HTC Vive HMD which has co-planar views but asymmetric projections for the eyes. The near and far planes for the projection matrices are 0.02 and 100.0 for both eyes.
Taking the point (1.0, 1.0, 1.0), which sits 1 unit along every axis, we multiply it by the view and projection matrices of each eye and get the following results.
Multiplication with left projection and view: (-0.7631, -0.7175, 1.0205)
Multiplication with right projection and view: (-0.8246, -0.7192, 1.0205)
The x coordinate is different, as expected, but the y coordinate is different too! This is something that SPS can’t handle, because it discards the y, z and w coordinates of the right eye and reuses the ones from the left eye.
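The same effect can be reproduced with a few lines of arithmetic. The sketch below is a minimal Python emulation of the x-only substitution, not NVIDIA’s implementation. The helper names (`frustum`, `clip_position`, `sps_secondary`), the frustum extents and the eye offset are all hypothetical values chosen for illustration; they are not the Vive matrices used above. It builds a glFrustum-style projection per eye, computes the clip-space position for each eye, and then emulates SPS by keeping only the right eye’s x.

```python
def frustum(left, right, bottom, top, near, far):
    """Off-axis perspective projection (same layout as the classic glFrustum)."""
    return [
        [2 * near / (right - left), 0.0, (right + left) / (right - left), 0.0],
        [0.0, 2 * near / (top - bottom), (top + bottom) / (top - bottom), 0.0],
        [0.0, 0.0, -(far + near) / (far - near), -2 * far * near / (far - near)],
        [0.0, 0.0, -1.0, 0.0],
    ]


def clip_position(projection, eye_x, point):
    """Project a point for an eye shifted by eye_x along X (co-planar views, no rotation)."""
    view_space = [point[0] - eye_x, point[1], point[2], 1.0]
    return [sum(row[i] * view_space[i] for i in range(4)) for row in projection]


def sps_secondary(left_clip, right_clip):
    """Emulate SPS: the right eye keeps its own x, but y, z and w come from the left eye."""
    return [right_clip[0]] + left_clip[1:]


NEAR, FAR = 0.02, 100.0
HALF_IPD = 0.032          # hypothetical half interpupillary distance
POINT = (0.5, 0.5, -2.0)  # -Z is forward, so this point is in front of the viewer

# Hypothetical asymmetric frusta: the vertical extents differ between the eyes.
left_proj = frustum(-0.028, 0.024, -0.022, 0.022, NEAR, FAR)
right_proj = frustum(-0.024, 0.028, -0.021, 0.023, NEAR, FAR)

left_clip = clip_position(left_proj, -HALF_IPD, POINT)
right_clip = clip_position(right_proj, +HALF_IPD, POINT)
sps_clip = sps_secondary(left_clip, right_clip)

print("true right eye y:", right_clip[1])
print("SPS right eye y: ", sps_clip[1])
```

With these made-up numbers the two y values differ, so the emulated SPS output places the point at the wrong height for the right eye. In this model, when both frusta share the same bottom and top extents and the views are co-planar, the y, z and w components agree between the eyes and the substitution is harmless.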
I’ve prepared a Visual Studio project and provide a prebuilt binary to test if you will have any problems using Single Pass Stereo with your headset.
Wrong per-eye calculations
Another thing that can (very easily) go wrong is the set of pixel calculations that depend on the view, with lighting being the most prominent example.
When building a single-screen graphics engine, the view matrix is passed from the vertex shader to the fragment shader in order to simplify the lighting math and get correct results for the specular components etc. If VR support is required for a single-screen engine, the necessary changes are trivial. The code can stay exactly the same as in the single-screen engine and simply run the whole process twice, passing a different view matrix each time. There are no shader changes and everything else stays the same. This is where things go wrong when implementing SPS in such an engine (which is the common case).
When an engine like the above runs with SPS, the view matrix is passed only once because the geometry pass is reduced to 1. This means that only the left view matrix is taken into account for the calculations, even for the right eye. Everything in the right eye that uses the view matrix will therefore be incorrect! This extends to every attribute passed to the fragment shader that is specific to each eye.
This can be solved by using the gl_Layer attribute in the fragment shader, which is provided by NVIDIA and helps determine the eye that is currently being processed. To get correct per-eye fragment calculations, all the attributes for the left and right eye have to be passed to the fragment shader at the same time, and gl_Layer is then used to determine which ones to use. And that brings us to the next thing that can go wrong.
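To make the problem and the fix concrete, here is a small emulation in Python of what such a fragment shader would do; it is not GLSL and not taken from any engine. The `layer` argument stands in for gl_Layer, and the function names, positions and shininess value are hypothetical. It computes a Blinn-Phong style specular term, which depends on where the eye is.

```python
import math


def normalize(v):
    """Return v scaled to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return [c / length for c in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def specular(surface_pos, normal, light_pos, eye_pos, shininess=32.0):
    """Blinn-Phong specular term; it depends on the eye position."""
    to_light = normalize([l - s for l, s in zip(light_pos, surface_pos)])
    to_eye = normalize([e - s for e, s in zip(eye_pos, surface_pos)])
    half_vector = normalize([a + b for a, b in zip(to_light, to_eye)])
    return max(dot(normalize(normal), half_vector), 0.0) ** shininess


def shade_fragment(layer, surface_pos, normal, light_pos, eye_positions):
    """Both eyes' data are passed in; `layer` (standing in for gl_Layer) picks one."""
    return specular(surface_pos, normal, light_pos, eye_positions[layer])


# Hypothetical scene: a surface facing the viewer and a light off to the side.
EYES = [(-0.032, 0.0, 0.0), (0.032, 0.0, 0.0)]  # index 0 = left, 1 = right
SURFACE, NORMAL, LIGHT = (0.0, 0.0, -1.0), (0.0, 0.0, 1.0), (0.5, 0.0, 0.0)

wrong_right = specular(SURFACE, NORMAL, LIGHT, EYES[0])  # left eye data reused
correct_right = shade_fragment(1, SURFACE, NORMAL, LIGHT, EYES)
print("right eye specular with left eye data:", wrong_right)
print("right eye specular selected by layer: ", correct_right)
```

The two printed values differ, which is the visible error in view-dependent effects when only the left eye’s data reaches the fragment stage. The cost of the fix is also visible in the signature: `shade_fragment` has to receive the data for both eyes.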
Performance degradation due to attribute heavy shaders
One of the first tips that one will encounter when searching for shader optimization is: reduce the data flow between shader stages to the bare minimum. Having huge chunks of data sent between the different shader stages can really nuke an application’s performance. When the data transferred between the shaders exceeds the capacity of the L1 cache in the GPU processors, things can really start to slow down because of the latency required to retrieve it.
Recalling the changes required for correct per-eye calculations with SPS, we can see that they are the exact opposite of this optimization tip. Passing all the attributes for both eyes at the same time can put an application into the situation described above, resulting in a slowdown instead of a speedup! Using SPS can improve the application’s frametimes in terms of geometry processing efficiency, but it can slow the application down if the data required per eye is a bit on the heavy side. Profile your application!
My one major concern with Single Pass Stereo is that it works only with HMDs that use the same symmetric projection matrix for both eyes. If the projections are not symmetric, then it will produce the wrong image for the right eye. I don’t care about the work I have to do to get my engine running with it; I care about showing the correct images to my users.
This can be bad for our perception of our surroundings. Our brain tries to fit the pieces together and perform stereopsis with wrong input data, which can result in nausea or even accidents caused by the wrong perception of the world. This is reinforced by the comments of a Reddit user (to whom I provided the answer), who reported being unable to drive correctly in a racing game and having frequent headaches while using SPS. After disabling it, their perception of the world returned to normal. Quoting: “So I play a racing game […] and they have implemented SPS as an option. I noticed that their reflections are not correctly rendered in the sim and that’s fine. However, after playing around with it a bit i also felt like certain things just… lacked depth”.
Google Cardboard, Samsung GearVR and Oculus Rift all have symmetric projections for the eyes, and SPS works correctly with them. All Vive products, Valve Index, Varjo products and XTAL use asymmetric projections. I can’t comment on Pimax products as I haven’t researched those.
Don’t be scared if you have been using Single Pass Stereo with an unsuitable HMD until now. Our brain is so marvelous that it will readjust after we stop providing it with the “wrong” input. Check the “Backwards brain bicycle” and the “Upside down goggles” by George M. Stratton.
In games, very few will meet the criteria to actually see big improvements by using SPS but many professional applications will benefit from it. Profile your application, read the documentation and if you are not sure, then experiment with it!
Alternatives and final verdict
There are some alternatives available, developed mainly by Oculus: OVR_multiview and OVR_multiview2. The first extension solves the problem of the wrong image for the second eye, but it brings with it a whole lot of restrictions, such as no tessellation control or evaluation shaders and no geometry shaders. OVR_multiview2 was introduced in order to lift one limitation of the first extension, but it is still not usable for programs that rely heavily on the restricted features.
NVIDIA introduced another extension to remove the tessellation and geometry shader limitations of OVR_multiview: EXT_multiview_tessellation_geometry_shader. Now that’s good news! That is, until we look up which graphics cards support this extension in the OpenGL Hardware Database and see that only 0.19% of the cards can operate with it, and they are all professional grade and very expensive. So, that leaves ordinary consumers (even the enthusiasts) out of the potential customer base.
Even though the OpenGL landscape may seem a bit grim, the multiview extensions have better availability in Vulkan: if we search for “VK_KHR_multiview” in gpuinfo, the support is at almost 60%. I don’t know if this accounts for the fact that this extension has been part of core Vulkan from version 1.1 onward.
So, my final verdict is that Single Pass Stereo can be worth it and can help some applications (because 20-30% of the frametime is a huge boost), but be extra careful about the HMD that the application will be displayed on.
I would love to hear about your experiences with SPS and if you find any inaccuracies, please do tell me!
Until next time,
Stay positive and look after yourselves!
UPDATE 8-Aug-2020: Added the link to the project I have prepared to locally test the SPS projection problem.
UPDATE 13-Aug-2020: Reformatted for the inclusion of the second part.