Single Pass Stereo: wrong depth cues, discomfort and potential risks

This is the second part of a two-post series on Single Pass Stereo (SPS). The posts were written to provide more information on how Single Pass Stereo works and to raise awareness of its problems. In this post, I tackle the problem that SPS has with certain Head Mounted Displays (HMDs): I show the problem in detail, provide actual code for anyone to test, explain why many games ship with it and discuss what can be done to prevent it.

In the first post, “Single Pass Stereo: is it worth it?”, I describe SPS at a high level, provide insights into how it works, discuss its advantages and disadvantages and offer my opinion on whether it is worth implementing.

Contents

Introduction
What is the problem with SPS
Investigation and proof
Consequences
Potential risks
What can be done

References
Appendix A: how to check the SPS problem in Unity
Appendix B: the projection matrices of Valve Index

Introduction

This will be a brief introduction. I recommend reading the first post to gain a better understanding of the technology and then hopping back here for a lengthier discussion of its problem.

The objective of Single Pass Stereo [1] is to reduce the processing time of each frame and accelerate XR applications. This is done by exploiting an inherent property of the human visual system: what each eye sees is slightly offset along the X axis, i.e. the line that extends from one eye to the other. The image of the right eye is a little to the right of what the left eye sees.

This technology was designed following the first modern generation of HMDs, such as the Oculus DK1, and Google’s hype about smartphones as VR headsets. The design of these devices made it easy for Single Pass Stereo to work, since they had either one big display or two co-planar displays. Unfortunately, this does not hold true for newer HMDs, such as the HTC Vive and the Valve Index, which are widely adopted, as shown in the previous post.

Update 17/9/2020: The newly announced Oculus Quest 2 by Facebook will not have any problems because it uses a single display for both eyes.

What is the problem with SPS

The problem with Single Pass Stereo lies in the very assumption it was built on: the exploited property of the human visual system held for the first generation headsets but does not hold for most new ones. The new headsets come with displays that are not co-planar or use asymmetric projections in order to increase the field of view. This creates the need to track where SPS is enabled, because sometimes it will work and sometimes it will only “mostly” work.

The problem that this variance in display systems creates is that the right eye view rendered with SPS enabled is not exactly the same as the one rendered with SPS disabled. By doing the maths from point to image, one can observe that the point coordinates for the left and right eye differ not only on the X axis but on the Y (up) screen-space axis too. The X axis is the line formed from one eye to the other, while the Y axis is perpendicular to it, pointing upwards. This creates a second level of feature disparity (there is already a disparity on X and now there is one on Y) that slowly messes with our perception of depth. The problem is present in many games and applications because SPS is implemented in many game engines, such as Unity and Unreal.
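To make the maths from point to image a bit more concrete, here is a minimal Python sketch. It is not the code of my project and it rests on several assumptions: a simplified model of SPS in which the right eye reuses the Y, Z and W clip coordinates computed for the left eye, a symmetric frustum with tangents of ±1, an IPD of 64 mm and a made-up canting angle of 5 degrees. None of these numbers are measured from a real headset, and the helper names are hypothetical.

```python
import math


def mat_vec(m, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def projection_from_tangents(left, right, bottom, top, near, far):
    """OpenGL-style off-axis projection built from frustum tangents."""
    return [
        [2.0 / (right - left), 0.0, (right + left) / (right - left), 0.0],
        [0.0, 2.0 / (top - bottom), (top + bottom) / (top - bottom), 0.0],
        [0.0, 0.0, -(far + near) / (far - near), -2.0 * far * near / (far - near)],
        [0.0, 0.0, -1.0, 0.0],
    ]


def eye_view(offset_x, cant_deg):
    """Head-to-eye matrix for an eye shifted on X and rotated (canted) about Y."""
    c = math.cos(math.radians(cant_deg))
    s = math.sin(math.radians(cant_deg))
    return [
        [c, 0.0, -s, -c * offset_x],
        [0.0, 1.0, 0.0, 0.0],
        [s, 0.0, c, -s * offset_x],
        [0.0, 0.0, 0.0, 1.0],
    ]


def right_eye_vertical_error(point, cant_deg, ipd=0.064):
    """Difference in right-eye NDC Y between the assumed SPS model and ground truth."""
    # Illustrative frustum (assumption): symmetric tangents, near 0.1, far 30.
    proj = projection_from_tangents(-1.0, 1.0, -1.0, 1.0, 0.1, 30.0)
    left_clip = mat_vec(proj, mat_vec(eye_view(-ipd / 2, +cant_deg), point))
    right_clip = mat_vec(proj, mat_vec(eye_view(+ipd / 2, -cant_deg), point))
    true_y = right_clip[1] / right_clip[3]  # ground truth: full right-eye transform
    sps_y = left_clip[1] / left_clip[3]     # assumed SPS: Y and W shared with the left eye
    return sps_y - true_y


point = [0.5, 0.5, -2.0, 1.0]  # a point 2 m in front of the head, up and to the right
print(right_eye_vertical_error(point, cant_deg=0.0))
print(right_eye_vertical_error(point, cant_deg=5.0))
```

Under these assumptions, the co-planar case gives no vertical error, while the canted case gives a non-zero one. The reason is that rotating an eye about the Y axis changes its view-space depth, and therefore the W that Y is divided by.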

Investigation and proof

For this purpose, I have created a GitHub project, “Single Pass Stereo with OpenVR”, that shows this problem and lets anyone test locally with their own HMD whether they will face it in applications using SPS. The project is based on Valve’s OpenVR sample with OpenGL, which I extended to support Single Pass Stereo. To build the project locally, download the OpenVR SDK, open the project “hellovr_opengl” and simply replace the main file with the one I provide in the repository. I also offer a prebuilt binary for Windows 10 so that it can be run without a development environment.

With the rendering system of Valve’s sample as ground truth, extending it with Single Pass Stereo should surface any problems that the technology might have. The project contains three rendering modes that can be changed on the fly:
1. The ground truth “Default” rendering mode.
2. The “Single Pass Stereo” rendering mode.
3. The “Combined” mode where both modes are run overlaid for each frame to further spot the differences.

When the “Combined” mode is used, the scene is drawn twice: first with “Single Pass Stereo” and then with the “Default” mode. In this specific mode, the scene is given a red tint when drawn with SPS in order to easily distinguish the differences. The project does not require the use of controllers and one simply has to use the keyboard to change modes. By pressing the key D, the “Default” mode is exclusively on. By pressing the key S, the “Single Pass Stereo” mode is exclusively enabled and by pressing the key A, the “Combined” mode is used. Using these modes, we can observe the scene and specific features very fast without the need to remove the headset.

As was pointed out in the previous post and at the start of this one, the right eye image is offset on the Y axis too. Below is a gif that shows both left and right eye images while alternating between the “Default” and “Single Pass Stereo” modes. The headset used is a Valve Index, placed on a rigid desk surface to remove any vibrations caused by the head.

Both left and right eye images. The right side is slightly moving up and down when changing between “Default” and “Single Pass Stereo” modes.

Turning our attention to the right eye, the image is clearly moving up and down, which proves that there is indeed an offset between the generated images when comparing Single Pass Stereo with the ground truth. Moving on, below is another gif that shows a view of the scene in the “Combined” mode. The red tinted cube sides are drawn with Single Pass Stereo while the white ones are the ground truth.

The eye views using the “Combined” mode. The fact that both red and white cube faces are visible shows that SPS is not producing the same result as the ground truth.

The left eye image is wholly rendered with a red tint, which means that the SPS and ground truth results are exactly the same. Depending on the hardware, the scene might be wholly white, but that will not change the fact that the left eye image is correct with Single Pass Stereo.

The right eye image is where the problem becomes more apparent. Not only are there both white and red tinted cube faces, but they also do not stay the same when the direction of the HMD changes! This complicates the matter further because the problem is not consistent across the scene; it is an error that changes depending on where the user is looking, which can change the position of things even right in front of them.

The problem of Single Pass Stereo is not just a glitch in the view and projection matrices; it produces visible artifacts.

My project is on GitHub for everyone to check.

For people who want to test this in Unity, I have directions at the end of the post.

Consequences

Stereopsis and human visual system

The consequences of such a problem are not very obvious; otherwise it would have been fixed already. Also, I am not an optometrist, so I will discuss things based on my knowledge of stereo vision. I encourage any optometrists to jump in and add to or correct what I say!

The problem is that one eye sees the correct view and the other does not. The right eye sees the image slightly offset on the Y axis. It wouldn’t be much of a problem if the offset was on the X axis. Our brain already expects a disparity on the X axis and that’s why many people don’t even set their IPDs correctly; the image is just a bit blurry. But the offset on the Y axis causes a new family of problems related to depth perception.

The way we understand depth in the space around us is by using simple trigonometry (actually not that simple, but this is the essence). The brain sees something from two slightly different views (left and right eye) and then uses them to determine its depth. Depth perception is not millimeter accurate and we can’t correctly measure how far away something is, but we can contrast it with the rest of the scene and get the relative positions of everything around us. That’s why most people have difficulty guessing how far away something is. For more information, you can search for “stereopsis” on Google Scholar and see research from the 70s, and even how stereopsis works in cats and toads, or quickly check it on Wikipedia.
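The “simple trigonometry” can be illustrated with the textbook triangulation formula for two parallel pinhole cameras. This is only an idealized sketch and not a model of the human visual system; the function name and the numbers are assumptions chosen for illustration.

```python
def depth_from_disparity(baseline_m, focal_px, x_left_px, x_right_px):
    """Depth of a feature seen by two parallel pinhole cameras.

    baseline_m: distance between the two cameras (the "IPD") in meters.
    focal_px:   focal length expressed in pixels.
    x_*_px:     horizontal image position of the same feature in each view.
    """
    disparity = x_left_px - x_right_px
    if disparity <= 0:
        raise ValueError("feature must have a positive horizontal disparity")
    return baseline_m * focal_px / disparity


# A feature 32 pixels apart between the two views, with a 64 mm baseline.
print(depth_from_disparity(0.064, 1000.0, 520.0, 488.0))
```

Note that only the horizontal positions enter the formula. In this idealized setup the two views are expected to agree vertically, so a vertical offset has no depth that could explain it.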

Throughout our life, our brain has gotten used to how things look from our eyes and what the differences between the two images are. The eyes stay in the same position and their X axis stays the same. What SPS does is shift the images on the Y axis, and by doing so it provides an image that is altered in a way the brain is not expecting. This causes our brain to enter a state of adjustment in order to adapt to the new input and manage to extract meaningful information from it.

Wrong depth cues and discomfort

Entering this state of adjustment, the eyes are being strained because they are moving a lot more in order to find a comfortable position and provide correct input to the brain. This causes irritation and can lower the time that one stays with a headset on. To give you an analogy, imagine running and, some days, the ground is 1cm higher under your right foot, wherever you are stepping. Maybe you will not notice it from the start but you will surely feel that something is wrong and it will affect your performance.

While in the state of adjustment, we can deduce that the depth cues that are produced are not correct. Our perception of the scene is not correct and there are times when the user will think that an interaction should have happened but it didn’t. For example, a moving ball that the user should have caught but didn’t, although the brain is 100% sure that it should be in their hand.

Personal experience: I was playing the Vertigo Remastered game (spoiler alert for the first boss) on my Valve Index and I reached the first boss. My weapon was an energy sword and the boss was shooting energy balls from its eyes and had tentacles trying to hit me. The tactic was to cut the tentacles and deflect the energy balls back at it. When a tentacle was coming from my left side, I was hitting it 100% of the time, but when it was coming from my right side, I was missing most of my slashes. At the time I was just frustrated. When it was time to deflect the balls, I had the same problem again! I was missing most of the balls from my right side! I thought that this was no coincidence and, after stumbling upon the creator’s Twitter account, I learned that they use SPS and that the game engine is Unity. My aim is not to bash the creator, because the game is great, but to point out facts about SPS from personal experience.

Another person [2] has experienced something similar, which I mention in the first post but will also quote here: “So I play a racing game […] and they have implemented SPS as an option. I noticed that their reflections are not correctly rendered in the sim and that’s fine. However, after playing around with it a bit i also felt like certain things just… lacked depth”.

This is not something random and there are people that are more sensitive to this change than others.

Potential risks

This section is purely theoretical and is based on my personal experience of using a VR application with SPS on an incompatible HMD for about 4 months. You can skip this section if you want.

For these 4 months, I was spending half of my day in an HTC Vive Pro, developing and playing. When I was not wearing the HMD, real life didn’t seem so real; it was as if everything around me was not correctly placed. I had a medical eye check but everything was fine.

One day there was a weird bug where an overlaid mesh was correctly displayed in the left eye but a little displaced in the right. The cause was that the base mesh was drawn using SPS but the overlay was not, and that’s when it hit me. It was not the overlaid mesh that was displaced; it was actually the base mesh, because of SPS. I disabled SPS from then on and after about one month my depth perception was back to normal.

If other people have experienced something similar, I invite you to talk about it!

In real life, our visual system is in its default mode and everything is working fine. When entering a VR application with SPS enabled on an HMD that does not have co-planar displays, the visual system fires up the adjustment mode in order to survive in this new world. After a while, it still hasn’t completely adjusted but it is more comfortable with the input. When entering real life again, the input changes back to the normal one and the visual system needs to readjust. Although the adjustment will not take long, it still needs time, depending on the time spent in VR. The longer the exposure to the wrong input, the longer the time to readjust.

With half of my time spent in VR with wrong SPS and the other half in real life, my brain was always in a state of adjustment because there was no longer a ground truth. The inputs were always changing and there was no time to finally settle on something. Losing depth perception is a big deal and can be the cause of many problems, trivial or not, because it is critical to how we interact with our surroundings.

What can be done

It should be obvious that this is a problem and that it needs to be solved before it spreads to even more applications and more users are exposed to it.

The first solution that I can think of, short of discarding SPS completely, is to check which HMD the application is running on and enable or disable SPS accordingly. This requires some maintenance because the check needs to be updated every time a new headset enters the market.

This is an easy task for most games because the creators use third-party game engines such as Unity, Unreal, Godot etc.; the burden falls on the developers of those engines, who should perform this check in the background. The game developers using these engines will not have to worry about it but should be given a warning informing them of what to expect when enabling Single Pass Stereo.

Unity and Unreal, specifically, use the Single Pass Instanced (or Stereo Instancing) rendering technique [3] [4] [5]. This technique does manually what NVIDIA performs in the graphics driver when Single Pass Stereo is enabled. It is done in order to have some of the benefits of Single Pass Stereo, and it does provide a boost in performance, but it is still slower than SPS. Using this technique, and setting up the scenes in a specific way, does not create the artifacts of SPS because the positions are produced correctly (there is no pruning of coordinates by the graphics driver). However, how can we know whether Unity/Unreal (and any other engine) enable NVIDIA’s Single Pass Stereo instead of Stereo Instancing, when available, to get the best performance possible while producing wrong images along the way? We need transparency in this matter so that both developers and players have correct expectations.

Developers with in-house engines should confirm which HMDs their application targets. If the target is the Oculus Quest, then SPS can simply be implemented and forgotten because the headset has co-planar displays. If their audience is the whole Steam user base, then I have bad news. They will either have to disable it altogether or perform the HMD check I mentioned above. Again, they should profile their application and see whether they really benefit from SPS or whether it is just extra cost, as I described in the previous post.

To help remedy the added work of checking every single headset there is, HMD manufacturers could state whether their HMDs are compatible with Single Pass Stereo or not. This would remove the need for engine developers to manually test all devices, or to ask their users to do so, and would create a more open environment between users and manufacturers.
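As a complement to keeping a list of headsets, the HMD check could in principle be derived from the matrices that the runtime reports at startup (in OpenVR, for example, the projection and eye-to-head matrices can be queried per eye). The following Python sketch assumes the simplified model in which SPS may only change the X clip coordinate between the two eyes; the function name and the tolerance are hypothetical, and this is not how any particular engine does it.

```python
def only_x_row_differs(view_proj_left, view_proj_right, tol=1e-6):
    """Return True if the two per-eye view-projection matrices differ only in
    the row that produces the X clip coordinate (row 0).

    Both arguments are 4x4 matrices given as lists of rows, each being
    projection * head-to-eye for one eye. If any of the Y, Z or W rows
    differ, sharing those coordinates between the eyes gives a wrong image.
    """
    for row in (1, 2, 3):
        for col in range(4):
            if abs(view_proj_left[row][col] - view_proj_right[row][col]) > tol:
                return False
    return True


def choose_stereo_mode(view_proj_left, view_proj_right):
    """Pick a rendering mode name based on the check above."""
    if only_x_row_differs(view_proj_left, view_proj_right):
        return "single_pass_stereo"
    return "default"
```

A check like this would not need updating for every new headset, but it relies on the runtime reporting matrices that match the real optics, so I would still treat it as a starting point and not as a guarantee.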

The second solution is to disable SPS completely and use the NVIDIA Multiview technology [6] that was introduced with the Turing generation. This technology removes the limitation of SPS, effectively solving the problem, and supports rendering more than 2 views simultaneously. This can be great for HMDs that use more than 2 displays to display the world. Multiview is available in OpenGL through extensions [7] [8] [9] and has been core since Vulkan 1.1 [10]. Of course, DirectX already has it.

The third “solution” is to require an option to enable/disable SPS in every application’s settings, which just patches the problem. The game engines will somehow need to expose the setting and the users will have the option to enable or disable Single Pass Stereo. There should be an informational window explaining to the user what to expect.

I call on everyone that is developing an engine with SPS, to take action to solve this problem. I call on everyone that uses an application with SPS to contact the developers and inform them of this problem. This needs to be a joint movement to ensure the safety of everyone.

The current VR technology is full of compromises and assumptions of how the visual system works in order to produce a “good enough” result (see minimum resolution and minimum display refresh rate). Let’s, at least, not show the wrong things to our users.

Thank you for reading this far and I hope that you stay safe in these hard times for the world.

References

[1] https://developer.nvidia.com/vrworks/graphics/singlepassstereo
[2] https://www.reddit.com/r/vive_vr/comments/ehg13j/question_about_single_pass_stereo/
[3] https://developer.oculus.com/documentation/unity/unity-single-pass/
[4] https://docs.unity3d.com/Manual/SinglePassInstancing.html
[5] https://docs.unrealengine.com/en-US/Platforms/VR/DevelopVR/VRPerformance/index.html#vrinstancedstereo
[6] https://developer.nvidia.com/vrworks/graphics/multiview
[7] https://www.khronos.org/registry/OpenGL/extensions/OVR/OVR_multiview.txt
[8] https://www.khronos.org/registry/OpenGL/extensions/OVR/OVR_multiview2.txt
[9] https://www.khronos.org/registry/OpenGL/extensions/EXT/EXT_multiview_tessellation_geometry_shader.txt
[10] https://www.khronos.org/registry/vulkan/specs/1.2-extensions/man/html/VK_KHR_multiview.html

Appendix A: how to check the SPS problem in Unity

This problem can be easily observed in Unity by following the steps below. If you don’t observe any problems, then your HMD is compatible with SPS.

  1. Create a new Unity project.
  2. Download the SteamVR plugin and import it following its directions.
  3. Go to: File -> Build Settings -> Player Settings -> XR Plugin Management -> OpenVR where you can see the option “Stereo Rendering Mode”.
  4. Select “Single Pass Instanced” to enable SPS.
  5. Open the Scene: Assets -> SteamVR -> InteractionSystem -> Samples -> Interactions_Example.unity.
  6. Press Play and wear your HMD.

Two things are the most obvious in this scene. The first is the flaming torch. While observing the base of the flame, close one eye at a time and notice that you don’t see the corresponding parts in each eye. The view of it is different. Also, there is a clear irritation when looking directly at it with both eyes.

The second one is at the other side of the scene, on the big target at which we can throw the things from the tables. When something hits the target, a ribbon-like effect is shown that causes the same irritation as the flame. Perform the same procedure, one eye at a time, to realize that this has a problem too.

Now, disable Single Pass Stereo from the setting in step 3 and perform the same checks again. Everything will be rendered normally.

Appendix B: the projection matrices of Valve Index

The following are the projection matrices for both eyes, taken from SteamVR using a Valve Index with the near clipping plane at 0.1 and the far clipping plane at 30. The values are rounded to the first digit that differs between the two matrices.

The left and right projection matrices for a Valve Index which are, indeed, asymmetric.
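For anyone who wants to read such matrices themselves, the sketch below recovers the frustum tangents from a projection matrix, assuming the OpenGL-style convention with row-major indexing `m[row][col]` and `-1` at `m[3][2]`. The helper names are hypothetical and the sample matrix is built from made-up tangents; these are not the Valve Index values.

```python
def tangents_from_projection(m):
    """Recover (left, right, bottom, top) frustum tangents from a projection matrix.

    Assumes m[0][0] = 2 / (right - left), m[0][2] = (right + left) / (right - left)
    and the same pattern for the vertical terms in row 1.
    """
    left = (m[0][2] - 1.0) / m[0][0]
    right = (m[0][2] + 1.0) / m[0][0]
    bottom = (m[1][2] - 1.0) / m[1][1]
    top = (m[1][2] + 1.0) / m[1][1]
    return left, right, bottom, top


def is_asymmetric(m, tol=1e-6):
    """A frustum is off-axis when either of the third-column terms is non-zero."""
    return abs(m[0][2]) > tol or abs(m[1][2]) > tol


# Made-up example: a frustum that extends further to the left than to the right.
example = [
    [2.0 / 2.1, 0.0, -0.3 / 2.1, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, -1.0, -0.2],
    [0.0, 0.0, -1.0, 0.0],
]
print(tangents_from_projection(example))
print(is_asymmetric(example))
```

Comparing the recovered tangents of the left and right eye shows directly where the two frusta differ.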

13 thoughts on “Single Pass Stereo: wrong depth cues, discomfort and potential risks”

  1. Your article is very interesting, as I’ve been using valve Index with Nvidia card (2080Ti) for past year.

    Something has often felt “off” with stereoscopy in some applications, but not in others. This has left me feeling clumsy after couple of hours in Index, especially in personal space.

    I assumed it was an issue with the reprojection system that adjusts for canted displays on index. Your information is most revealing, thanks! Is there anyway to determine which of my applications are using SPS?

    • Thanks a lot! I am glad I got it sorted for you!

      I don’t really know how to determine it for an arbitrary application. A first thought that comes to mind is to get a graphics debugging program like RenderDoc or NVIDIA Nsight Graphics and capture a frame of the application. SPS works with layered framebuffers (in OpenGL at least), so you could search for them in the captured data.

      • Have you looked at the Index’s optics in depth?

        I’ve asked Doc_OK (Oliver Kreylos) but he doesn’t have access to an Index to test.

        It’s interesting that Valve still haven’t published their “Deep dive” on Optics a year on from launch, and the Large (wide) face gasket was withdrawn before consumer launch but both large and small provided with early Dev kits?

      • I haven’t looked at the Index’s optics in depth because I don’t know what to look for 😛

        From the Deep Dive series, we can already take a hint of how the displays are arranged. I could try to interpret the contents of the view and projection matrices but that would not give us the complete picture of what is happening.

      • I appreciate your reply. I’m super interested to know more / why about valve’s choice of canted optics, dual element lenses, the field of view and display limitations (black edges appearing at minimum eye relief). Questions I may never get answers to (I have asked valve) 🧐

      • I am really looking forward for the optics deep dive too. I thought we could get a better look at it with the disassembly post but we didn’t get any info about the setup of the lenses and displays. Like, why do we still have the glare effect? Is it so hard to reduce it? Or did they reduce it but the setup for the reduced SDR increased it and remained the same? Many questions!

      • It’s tricky getting information from Valve, I had some communication with one of their audio engineers last summer when I was writing my article on index ear ergonomics for Tony’s Skarredghost blog. But since then…radio silence! All very mysterious (I guess they might just be super busy!)

      • Thanks for your kind words. I had some great feedback from index users, and was doing some work with VR Cover before the coronavirus pandemic hit, to try and get a wide face gasket to market with a skin compatible cushion. Its all been pushed back like everything during this strange year!
