Viewing Photos and Videos in VR Explained! (Part 1)

Preface

In the last two years there has been an explosion in both consumer and professional-level 3D photography equipment. But along with that rapid development there are limited industry standards, creating a confusing consumer experience.

If you get the chance to view a VR180* photo on a virtual reality headset, you’ll understand why there is so much excitement. And if you’ve viewed a “VR photo or video” but were unimpressed, I am certain it’s because you were misled by one of the many poor substitutes that claim to be VR but aren’t.

* more on that standard later

Selection of VR180 Cameras available in 2020

But back to the growth of 3D photography… Combine it with the steady growth of Virtual Reality headsets (which have their own issues with divergent standards) and things become even more confusing.

An attempt at clarifying terms:

Throughout these posts, I’m going to try to avoid using the terms “2D” and “3D” when I talk about these technologies. When used to describe this technology, the terms are overused, inaccurate, and confusing. Instead, I’ll use the more accurate descriptions monoscopic and stereoscopic (learn more in my previous post).

I know it is a mouthful, but TL;DR: monoscopic is regular photography (where a single image is viewed by both eyes, including panoramas) and stereoscopic is where there is a different image for each eye in order to simulate depth.

And specifically, my purpose is to describe the advantages and challenges of viewing these images on Virtual Reality headsets like the Oculus Quest, Valve Index, or Windows Mixed Reality Head Mounted Devices (HMDs).

Over the last 200 years there have been many other ways to view these images (from 3D TVs to the Nintendo 3DS to the 3D movies of the 1950s to the Stereoscopes of the late 1800s). But put those experiences aside. It’s 2021 and we’re focusing on the incredible power of VR!

Viewing traditional monoscopic content in VR

Let’s start with the basics and assume you just want to view a traditional photograph or video on your VR headset.

Viewing a traditional 2D image in Virtual Reality

Viewing traditional photos can be great in VR, but even in this simple case, you’ll immediately notice any limitations of the VR hardware. This may be due to the resolution of the image, the color quality and resolution of the headset’s screens, or even the quality of the lenses. The good news is that these limitations are significantly less noticeable when we start to make use of stereoscopic content.

But even in the best case, the visual experience of viewing a traditional photograph or movie in VR isn’t likely to be better* than viewing the same image as a physical printed photograph or as a video on a high quality television.

* maybe not better, but perhaps more accessible

Monoscopic Photography in VR: What are the benefits?

Viewing 2D 360° in VR

To step things up, you can view a 360° photo or video in VR. The big advantage here is the increased field-of-view (FOV). A good analogy would be to call this a “personal IMAX experience”.

Viewing a 360° image in Virtual Reality

Unfortunately, the immediate challenge in creating a quality visual experience with 360° monoscopic video in VR is the resolution of the image you’re viewing.

Imagine a 4k television with a 1 meter (3 feet) wide screen which is 1 meter away from you. To get the same pixel density all the way around you in a 360° video, we’d need an enormous source image.

circumference = 2πr ≈ 6.28 meters (for r = 1 meter), and 6.28 meters × 3840 pixels per meter ≈ 24,000 pixels wide

That’s the equivalent of a 24k television! Of course, what we usually end up with instead is a much smaller source image. The current best case is usually an 8k video (which is still huge), and that video is stretched to fit across the full 360 degrees.
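The arithmetic above can be sketched in a few lines of Python. This is only a rough illustration: like the estimate in the text, it treats the flat television as if it were bent along the circle around the viewer, and the function names are my own, not from any library.

```python
import math

def pixels_for_full_circle(screen_width_m, screen_pixels, viewing_distance_m):
    """Pixels needed to wrap a full 360° circle at the screen's pixel density.

    Assumption: the flat screen is treated as lying on the circle around the
    viewer, which is the same simplification the estimate in the text makes.
    """
    pixels_per_meter = screen_pixels / screen_width_m
    circumference_m = 2 * math.pi * viewing_distance_m
    return circumference_m * pixels_per_meter

def density_ratio(source_pixels, needed_pixels):
    """Fraction of the target pixel density that a given source width delivers."""
    return source_pixels / needed_pixels

needed = pixels_for_full_circle(1.0, 3840, 1.0)
print(round(needed))                          # 24127
print(round(density_ratio(7680, needed), 2))  # 0.32
```

Under these assumptions, an 8k-wide source delivers roughly a third of the horizontal pixel density of that 4k television.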

Stereographic Projection*

Depending on the goals, this two dimensional image could be mapped to a three dimensional cylinder, sphere, or hemisphere.

* don’t confuse stereographic with stereoscopic… we are still focused on a single image for both eyes. In this case, the prefix stereo (from the Greek for “solid”) refers to mapping a solid shape, the sphere, onto a flat image.

Types of 360 mapping (cylinder, sphere, hemisphere).

Cylinder: Using a cylinder allows for a simple mapping that just wraps a traditional 2D image around the viewer. It’s easy to create from 2D photographs, but does not result in a particularly great visual experience in VR because (1) it leaves the ceiling and floor open, and (2) the pixels at the top and bottom are farther away from the camera, which usually breaks the desired effect. It might look good for a distant skyline along the horizon, but with anything nearby it is difficult to hide the distortion.

Sphere: Mapping an image to a sphere is the most common approach and generally what people mean when they say “360° photos and videos”. Unfortunately, it’s also what many people mean when they say “3D photos and videos” or “VR photos and videos”. I find this fact both confounding and frustrating. Yes, it is true you can view the video in VR, but it has no depth and is a sad and disappointing substitute for what is actually possible and truly amazing about stereoscopic 360° video when viewed in VR.

Calling 360° monoscopic video “VR Video” or “3D Video”.

I digress.

There are some nuances in spherical mapping around how to translate a 2D image onto a spherical surface. This is more commonly raised in discussions of map projections. To visualize this easily, consider the well understood polar distortions of the Mercator projection and apply that to how a 2D .mp4 video file might get mapped onto a sphere.

Hemisphere: Finally, mapping the image to a hemisphere (or dome) gives that traditional “IMAX” experience and is not particularly common in VR because you don’t see anything below the horizon line. Of course the advantage is that it cuts the rendering requirement in half.
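To make the spherical mapping nuance a little more concrete, here is a minimal sketch. It assumes the equirectangular layout (longitude across the width, latitude down the height), which is a common choice for 360° files but is my assumption here, not something stated above, and it is not the Mercator projection used as the analogy. The function names and axis conventions are hypothetical.

```python
import math

def equirect_pixel_to_direction(x, y, width, height):
    """Map a pixel of an equirectangular image to a unit direction vector.

    Assumed convention: x runs left to right over longitude -180°..+180°,
    y runs top to bottom over latitude +90°..-90°, +z is forward, +y is up.
    """
    lon = (x / width - 0.5) * 2 * math.pi
    lat = (0.5 - y / height) * math.pi
    return (
        math.cos(lat) * math.sin(lon),  # right
        math.sin(lat),                  # up
        math.cos(lat) * math.cos(lon),  # forward
    )

def horizontal_stretch(y, height):
    """How much wider a pixel row is stored than it appears on the sphere,
    relative to the row at the equator. Grows rapidly toward the poles."""
    lat = (0.5 - y / height) * math.pi
    return 1 / math.cos(lat)

# The center of the image looks straight ahead.
print(equirect_pixel_to_direction(3840, 1920, 7680, 3840))  # (0.0, 0.0, 1.0)
```

In this layout every row stores the same number of pixels, yet rows near the top and bottom cover a much smaller circle on the sphere. That is the same kind of polar distortion the map analogy points at: at 60° latitude a row is already stored at about twice the width it needs.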

360° Challenges

360 Challenges: Immersion without Depth

In VR, all these monoscopic formats improve the sense of immersion by increasing the field of view, but because both eyes still are viewing the same image, they lack the sense of depth most users expect when they experience VR.

360 Challenges: Field of View

Another important consideration for 360° visualization is that at any point, the majority of the scene (70% or more) is out of the viewer’s field-of-view (FOV). The scene is behind them.

Current VR headsets offer up to 110° FOV

Humans have a horizontal FOV of about 120°, and current VR headsets offer a FOV between 95° and 110°. So even in the best case, two thirds of the image is unseen.

For a 360° photograph in VR this might actually be ideal. The viewer will experience a fully rendered scene in every direction and they have the agency to decide which direction they want to look.
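As a quick check on those numbers, here is a small sketch that compares a headset’s horizontal FOV against the full 360°. It only considers the horizontal direction, and the function name is my own.

```python
def unseen_fraction(headset_fov_deg, scene_deg=360.0):
    """Fraction of a 360° scene (horizontally) that falls outside the headset's FOV."""
    return 1 - headset_fov_deg / scene_deg

for fov in (95, 110):
    print(fov, round(unseen_fraction(fov) * 100))
# 95 74
# 110 69
```

So with the FOV range quoted above, roughly 69% to 74% of the horizontal sweep is out of view at any moment.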

However, for a 360° video, what is unseen remains lost to time (or until the user rewinds to watch it again from a different perspective).

This creates a set of challenges for storytelling. For example:

360° Challenges: Nausea

As a result of the above limitations, it’s not surprising that spherical 360° video has seen primary success for capturing unscripted entertainment, like documenting events and athletics. (As you may have heard, 360° video is especially popular with the GoPro crowd.)

Unfortunately, when viewing GoPro-like experiences, there is a comfort challenge. We all know that some people are particularly sensitive to motion sickness when the camera is in motion but they are not (or vice versa). This sensation is significantly heightened when the visual motion fills your FOV (like in VR). We can limit the nausea as long as:

Which means many of the things that 360 video is so good for are the same things that can make VR video the equivalent of the Vomit Comet.

360° Challenges: Bandwidth

Even further, there is the technical challenge that the video feed should be 8 times bigger than 4k video and is 3 times bigger than what is in the user’s field of view at any point in time. A high resolution video might look amazing, but when posted and transferred online, the large file size reduces the audience. (Remember how frustrating it was to wait for a high quality video to complete buffering 10 years ago? That’s where many are now with 360° video.) The alternative is a low resolution video stretched across a 360 degree sphere… which is a blurry mess of a visual experience.

360 Challenges: Stitching

There’s one more important challenge with all types of 360 photos and videos… and that goes back to how they are created. In order to capture images in all directions, the camera must have multiple lenses pointed in different directions. The images from these lenses overlap and the pixels must be merged or stitched together.

In order for this stitch to look good, the photographer (or their software) must decide what is the “default distance” to use when merging pixels from multiple cameras. Due to the effects of perspective, an object that is not in the center of the field of view will have a different horizontal or vertical position depending on how far away it is.

This sample image shows geometrical registration and stitching lines in panorama creation. (Wikipedia)

Usually this default stitching depth is set to 2-3 meters away. But if an object is significantly closer to or farther away from the camera and appears along the seams (where the camera images were stitched together), the image can appear warped or distorted.
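A rough way to get a feel for this is to estimate how far an object on the seam lands from where the stitch expects it. The sketch below assumes two lenses a small distance apart and an object centered between them. The 3 cm lens spacing and the 7680 pixel output width are hypothetical values I picked for illustration, not specifications of any real camera.

```python
import math

def parallax_deg(baseline_m, distance_m):
    """Angle between the two lenses' lines of sight to a point centered between them."""
    return math.degrees(2 * math.atan(baseline_m / (2 * distance_m)))

def seam_error_pixels(baseline_m, stitch_distance_m, object_distance_m, output_width_px):
    """Approximate misalignment at the seam, in pixels of a 360° wide output image."""
    error_deg = abs(
        parallax_deg(baseline_m, object_distance_m)
        - parallax_deg(baseline_m, stitch_distance_m)
    )
    return error_deg * output_width_px / 360.0

# Hypothetical setup: lenses 3 cm apart, stitched for 2.5 m, 7680 px wide output.
for distance_m in (0.5, 2.5, 10.0):
    print(distance_m, round(seam_error_pixels(0.03, 2.5, distance_m, 7680)))
```

With these made-up numbers, an object at the stitching distance lines up, while an object half a meter away is off by dozens of pixels. Distant objects are off by much less than very close ones, which fits the experience that nearby objects on a seam cause the most visible warping.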

Left: GoPro Max (2 fisheye lenses)
Right: Panono 360° (36 fixed-focus lenses)

At a minimum, a 360° photo requires two 180° fisheye lenses pointed in opposite directions. In this case there is only a single seam that needs stitching (but it is at the edge of the lenses’ visual field, which can be another source of distortion). For improved resolution, many professional 360° cameras will use many more lenses. The result is higher detail, but it can lead to more stitching issues.

360° Monoscopic Content in VR: What is it good for?

Continue to: Viewing Photos and Videos in VR Explained! (Part 2)

John Pile Jr is a Principal Software Engineer Lead at Microsoft (AR/VR/HoloLens) and author of 2D Graphics Programming for Games. More info on LinkedIn.