In the last few years, Lytro has pivoted away from consumer-facing digital cameras toward high-end production cameras and tools, with a large part of the company’s focus on the ‘Immerge’ light-field camera for VR. In February, Lytro announced it had raised another $60 million to continue developing the tech. I recently stopped by the company’s offices to see the latest version of the camera and the major improvements in capture quality that come with it.

The first piece of content captured with an Immerge prototype was the ‘Moon’ experience, which Lytro revealed back in August of 2016. This was a benchmark moment for the company, a test of what the Immerge camera could do.

Now, to quickly familiarize yourself with what makes a light-field camera special for VR, the important thing to understand is that light-field cameras shoot volumetric video. So while the basic cameras of a 360-degree video rig output flat frames of the scene, a light-field camera is essentially capturing enough data to recreate the scene as complete 3D geometry as seen within a certain volume. The major advantage is the ability to play the scene back through a VR headset with truly accurate stereo and allow the viewer to have proper positional tracking inside the video, both of which result in a much more immersive experience, or what we recently called “the future of VR video.” There are also more advantages of light-field capture that will come later down the road when we start seeing headsets equipped with light-field displays… but that’s for another day.

Lytro’s older Immerge prototype (note that many of the optical elements have been removed) | Photo by Road to VR

So, the Moon experience captured with Lytro’s early Immerge prototype did achieve all those great things that light-field is known for, but it wasn’t good enough just yet. It’s hard to tell unless you’re seeing it through a VR headset, but the Moon capture had two notable issues: 1) it had a very limited capture volume (meaning the space within which your head can freely move while keeping the image intact), and 2) the fidelity wasn’t there yet; static objects looked great, but moving actors and objects in the scene exhibited grainy outlines.

So Lytro took what they learned from the Moon shoot, went back to the drawing board, and created a totally new Immerge prototype, which solved those problems so effectively that the company now proudly says their camera is “production ready” (no joke: scroll to the bottom of this page on their website and you can submit a request to shoot with the camera).

The new, physically larger Immerge prototype brings a larger capture volume, which means the viewer has more freedom of movement inside the capture. And the higher quality cameras provide more data, allowing for greater capture and playback fidelity. The latest Immerge camera is about four times larger than the prototype that captured the Moon experience. It features a whopping 95-element planar light-field array with a 90-degree field of view. Those 95 elements are larger than on the precursor too, capturing higher quality data.

I got to see a brand new production captured with the latest Immerge camera, and while I can’t talk much about the content (or unfortunately show any of it), I can talk about the leap in quality.

The move from Moon to this new production is substantial. Not only does the apparent resolution feel higher (leading to sharper ‘textures’), but the depth information is more precise, which has largely eliminated the grainy outlines around non-static scene elements. That improved depth data has something of a double-bonus on visual quality, because sharper captures enhance the stereoscopic effect by creating better edge contrast.

Do you recall early renders of a spherical Immerge camera? Purportedly due to feedback from early productions using a spherical approach, the company decided to switch to a flat (planar) capture design. With this approach, capturing a 360-degree view requires the camera to be rotated to individually shoot each side of an eventual pentagonal capture volume. This sounds harder than capturing the scene all at once in 360, but Lytro says it’s easier for the production process.
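
As a rough illustration of the rotate-and-shoot approach, the sketch below works out how far the camera would turn between shots and how much adjacent shots would overlap. It assumes a regular five-sided layout and treats each shot as a simple 90-degree wedge (the field of view quoted above); the function name is hypothetical, and Lytro has not described its actual shooting procedure in this detail.

```python
def planar_rig_coverage(sides: int, fov_deg: float) -> tuple[float, float]:
    """Return (rotation between shots, overlap between adjacent shots) in degrees.

    Simplifying assumption: the sides form a regular polygon and each shot
    covers a wedge of `fov_deg` centered on that side's outward direction.
    """
    rotation_step = 360.0 / sides        # angle the camera turns between shots
    overlap = fov_deg - rotation_step    # shared angle between neighboring shots
    return rotation_step, overlap


# Five sides (pentagonal volume) and a 90-degree field of view, per the article.
step, overlap = planar_rig_coverage(sides=5, fov_deg=90.0)
print(f"rotate {step:.0f} degrees between shots, {overlap:.0f} degrees of overlap")
```

Under those assumptions the camera would turn 72 degrees between shots, leaving 18 degrees of overlap at each seam for stitching neighboring captures together.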

The size of the capture volume has been increased significantly over Moon, though it can still feel limiting at this size. While you’re well covered for any reasonable movements you’d make while your butt is planted in a chair, if you take a large step in any direction, you’ll still leave the capture volume (causing the scene to fade to black until you step back inside).

And, although this has little to do with the camera, the experience I saw featured incredibly well-mixed spatial audio which sold the depth and directionality of the light-field capture in which I was standing. I was left very impressed with what Immerge is now capable of capturing.

The new camera is impressive, but the magic is not just in the hardware; it’s also in the software. Lytro is developing custom tools to fuse all the captured information into a coherent form for dynamic playback, and to aid production and post-production staff along the way. The company doesn’t succeed just by making a great light-field camera; it’s responsible for creating a complete and practical pipeline that actually delivers value to those who want to shoot VR content. Light-field capture provides a great many benefits, but it needs to be easy to use at production scale, something Lytro is focusing on just as heavily as the hardware itself.

All in all, seeing Lytro’s latest work with Immerge has further convinced me that today’s de facto 360-degree film capture is a stopgap. When it comes to cinematic VR film production, volumetric capture is the future, and Lytro is on the bleeding edge.

Ben is the world's most senior professional analyst solely dedicated to the XR industry, having founded Road to VR in 2011—a year before the Oculus Kickstarter sparked a resurgence that led to the modern XR landscape. He has authored more than 3,000 articles chronicling the evolution of the XR industry over more than a decade. With that unique perspective, Ben has been consistently recognized as one of the most influential voices in XR, giving keynotes and joining panel and podcast discussions at key industry events. He is a self-described "journalist and analyst, not evangelist."
  • J.C.

    Replacing 360 degree video with this seems awesome, but how much drive space does one of these recordings take up? If one 10-minute experience eats up the lion’s share of someone’s cellphone storage, they won’t replace 360 degree video. I’m assuming here that this software is aimed at cellphone VR, but even if it’s aimed at Vive/Rift headsets, a 50gb+ file for a “video” would be hard to swallow.

    • crash alpha1

      unless it’s streaming, hmm

      • Guygasm

        Hence OTOY’s big push with streaming lightfield tech.

      • J.C.

        Well, let’s break that down then, yes?

        If a 10 minute experience with this tech ate up 50GB, then you’d have to be able to stream 1gb/minute. Do you have that sort of internet connection at home? Do you think the general public does? Do you think that sort of connection is going to become commonplace ANYTIME soon?

        Just saying “streaming” doesn’t magically sort out the bandwidth issue.

        • > Do you think that sort of connection is going to become commonplace ANYTIME soon?

          Hum, yes actually. According to Wikipedia, South Korean ISPs are deploying 1 Gb/s connections at $20/month. That would give several GB per minute. We can use this to see where Europe/US will be in a few years’ time.

        • Ethan James Trombley

          That’s only about 16mb/s which isn’t that much. Also you use the 1gb/min as a solid example but no one knows the actual size.

        • TokenLiberal

          I have gigabit at home, 1000mb – and i’ve speed tested it, and I “NEVER” get 1gb/m on any sites.

          I’m way faster than most places I go.

          • Jesterchat

            Just a question, are you also considering the bit/byte difference? So a 1000 Mb/s (megabits) connection will get you, at 100% capacity, only 125 MB/s (megabytes)? I’ve overlooked this more than once myself.

            But yeah, it’s gonna depend on the upload rates of the servers and everything in between too.

          • TokenLiberal

            yeah of course. i’ve never seen anywhere close to 125MB down. More like 20, at most.

        • Matt

          That’s totally incorrect. You’re thinking in terms of watching a regular video. In a regular video you might watch the whole thing from start to finish. But in a 3D lightfield experience it would be like you are watching only sections of a video. e.g. if you don’t explore the bedroom then you won’t need to stream that section of a 3d house. Therefore, the servers will only stream what you need to see based on where you are in the scene and where you are looking.

          Yes the video lightfields could potentially be huge, but you don’t need to send the whole file down, only what the 3D glasses consume. Lightfields are also very compressible. It doesn’t consume much more bandwidth than a skype video call.

          • J.C.

            Aaaand you seem to think they can stream your data with so little latency that it can send ONLY the frames you need, RIGHT THEN. Hate to break it to you, but that’s impossible, even with fiber. It could, perhaps, make some broad assumptions about where you could actually move your head to and send you a PORTION of the data, but that’s still going to involve a rather large chunk of ALMOST redundant data.
            It’d require some hefty work on the server’s part, no matter what, and that’d be for ONE SINGLE STREAM. How many customers could a single server handle?

          • Matt

            It isn’t perfect. There is a long way to go to get a perfect experience. But it’s rapidly getting better. What we have currently actually works better than you imagine.

    • Doctor Bambi

      Mobile VR headsets currently wouldn’t be able to take full advantage of a light field recording seeing as they have no positional tracking.

    • epukinsk

      You don’t need the entire 50gb file when you are displaying the light field on the device. That’s just the raw data that gets synthesized into a light field. The light field itself is highly compressible. 24 frames of video can be compressed smaller than 24 individual photos, because there is repetition from frame to frame. Similarly a light field can be highly compressed because there is repetition between different angles. A totally matte surface should theoretically compress down to the same size as a flat image, since each point is the same color from every angle.

      Otoy has confirmed that light fields are highly compressible. Something like 4x a 360-degree video, not 100x.

      • Diego Galvez Pincheira Xango

        The volumetric information changes little on a frame-by-frame basis unless you are at the sea, with the movement of the waves. Theoretically you only need the basic information and the information from the moving actors/environment

      • chuan_l

        You’re making the assumption that —
        Apart from developing the camera optics , that Lytro somehow has a magical way to process , synthesize the light data from all those images in real time. Even the Nvidia Iray stuff takes many hours on a DGX-1 cluster to generate a single frame with limited viewpoints.

        • epukinsk

          It’s a two part process. They first process the images into a light field offline, in however long it takes. During playback you just have to reproject the light field which isn’t particularly more computationally intensive than rendering a JPEG.

          Rendering a JPEG is similarly much faster than compressing a raw image into one.

    • chuan_l

      Maybe it’s not the best comparison —
      But Nozon does a similar thing with rendering out all the viewpoints ( for translation & rotation ) inside a 1m square cube. The demo they had on the HTC Vive was about 6 Gb of data , compressed for a single frame.

  • Lucidfeuer

    They seem to be drifting sideways more and more. The Lytro camera and immerge were amazing projects, the Immerge was the future, but now it seems their project make less and less sense.

    • Ian Shook

      Every time I hear about a new lytro upgrade they halve the FOV. Pretty soon it’ll be a 1deg FOV but It’ll be the best degree of data you’ve ever seen.

    • kontis

      No, technically this one makes a ton of sense, because it does what light field actually needs to work and is made the same way academia did this decade+ ago. It’s a well-known proven method. Lytro Immerge was a sci-fi concept that looked cool and that’s it. It would not be possible to capture a large volume with it.

      • Lucidfeuer

        So yeah that’s loser shit. Going steps back to an irrelevant decade old method instead of a multi-billions highly potential device. Maybe for research or cinema studio professionals but even then I fail to see the use of a one sided-lightfield camera.

        • CHEASE

          It won’t be hard to just splice 5 different shots together. It might be a challenge for a character to cross a threshold between two shots, but those kinds of challenges have always been part of filmmaking. That’s what makes filmmaking technique interesting; how to fool the viewer into thinking they are seeing something that is seamless. It has always involved compromise. Better than not doing it at all.

          • Lucidfeuer

            I disagree: when technology implementation is accessible to creatives, they can create new techniques without disturbing the creative flow. When “nobody is doing it at all”, it’s certainly never for a lack of people with ideas and endeavors but because the tech or implementation is shit. And this is such a case.

          • CHEASE

            How do you know it’s shit? It’s brand new. This is the only company with one of these. Another bottleneck is VR headset ownership, which will determine to a large degree how much investment there is.

          • Lucidfeuer

            My job doesn’t involve waiting naïvely or hypocritically for something obvious to happen. I could see their Lytro Immerge camera having been used since it’s their best tech, but nobody did…

          • CHEASE

            Ignoring the strange first sentence, if it was their best tech, and no one used it, why are you wanting them to build another one like it?

          • Lucidfeuer

            To build a better one that is more likely to be used rather than a worse machine?

  • Ian Shook

    This is really exciting and I wish I could experience this.

  • I would be curious to try a demo of a video like that…

    • benz145

      If you have a PSVR you can try this experience which is a volumetric video (though I don’t believe it is technically a light-field capture):

      • I looked that over very carefully, and I can see that it is a series of flat video elements (filmed in 3D) that float in a scanned set. It’s a clever illusion but not volumetric video.

        I seem to recall an earlier company who did 2 really impressive VR demos on the Dk2 using the same technique, but their name escapes me. I wouldn’t be surprised if they developed that video for Sony.

        • chuan_l

          You’re thinking of Kite & Lightning —
          The technique is called “billboarding” from games , and typically used to represent far away objects. As stereoscopy is only able to be perceived to a certain distance you can get away with 2d elements on the current displays.

          • You are correct, sir! Kite and Lightning! Love their stuff.

            It is worth noting that if you move the billboard object just right, you can get away with video elements that get VERY close to the end user.

            In “Senza Peso” they had a MASSIVE floating head that appeared right next to the user, filled up the entire frame of my goggles! It still looked 3D because it had good shading and moved rapidly into view and out again. If you hit the viewer fast and hard, you can get away with otherwise flat video elements. There just isn’t the time for the brain to realize how flat they really are.

          • Patrick Hogenboom

            Also, our brain is hard-wired to see faces as 3d objects. Just think of the rotating mask illusion. I bet it would have worked far less effectively with anything other than a face.

          • brandon9271

            “Senza Peso” was very impressive! It actually took me a little while to figure out that it was real time 3d with billboarded live action elements. As long as my brain is fooled, I don’t care how they do it :)

          • Patrick Hogenboom

            I understand, but as a developer I very much do care ;)

          • brandon9271

            I actually care too, the geek in me is curious and fascinated with the tech. :) I’m super excited about this volumetric 360 video wizardry. It’s definitely “the future”

  • They are shooting for the high tech film and archival industry. GBs and throughput don’t matter when big money, in house and detail is paramount. I love my Illum and have been an immersive VR producer since the 90s. Low end accessibility is amazing now but saturated. Pushing the high end is interesting…

    • Sponge Bob

      saturated ???

      how ?

  • Pretty awesome stuff. Can’t wait to try some of their upcoming videos. Great article.
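
The bandwidth figures debated in the comments above mix gigabytes per minute, megabits per second, and megabytes per second, so a quick back-of-the-envelope conversion may help. The sketch below is illustrative only: the 50 GB, 10-minute file is the commenter's hypothetical rather than a known Lytro figure, the helper names are made up for this example, and decimal units (1 GB = 1000 MB, 1 byte = 8 bits) are assumed.

```python
def required_mbit_per_s(size_gb: float, minutes: float) -> float:
    """Sustained bitrate (megabits/s) needed to stream `size_gb` gigabytes in `minutes`."""
    megabits = size_gb * 1000 * 8          # GB -> MB -> megabits (decimal units)
    return megabits / (minutes * 60)       # spread over the playback time in seconds


def mbit_to_mbyte(mbit_per_s: float) -> float:
    """Convert megabits/s to megabytes/s (8 bits per byte)."""
    return mbit_per_s / 8


def gb_per_minute(mbyte_per_s: float) -> float:
    """Convert a megabytes/s throughput into gigabytes per minute."""
    return mbyte_per_s * 60 / 1000


# Hypothetical from the thread: a 10-minute experience that takes 50 GB.
print(f"50 GB in 10 min needs {required_mbit_per_s(50, 10):.1f} Mbit/s")

# The '1 GB per minute' figure used later in the thread.
one_gb_min = required_mbit_per_s(1, 1)
print(f"1 GB/min is {one_gb_min:.1f} Mbit/s, or {mbit_to_mbyte(one_gb_min):.1f} MB/s")

# A gigabit line at full capacity versus the ~20 MB/s one commenter reports seeing.
print(f"1000 Mbit/s tops out at {mbit_to_mbyte(1000):.0f} MB/s")
print(f"20 MB/s delivers {gb_per_minute(20):.1f} GB per minute")
```

Under these assumptions, the 50 GB hypothetical works out to 5 GB per minute (about 667 Mbit/s), while the 1 GB per minute figure used in the thread corresponds to about 133 Mbit/s, or roughly 16.7 MB/s. None of this says anything about the real size of a compressed light-field stream, which the article does not state.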