Eye-tracking—the ability to quickly and precisely measure the direction a user is looking while inside a VR headset—is often talked about within the context of foveated rendering, and how it could reduce the performance requirements of XR headsets. And while foveated rendering is an exciting use-case for eye-tracking in AR and VR headsets, eye-tracking stands to bring much more to the table.

Updated – May 2nd, 2023

Eye-tracking has long been talked about as a distant technology for XR, but the hardware is finally becoming increasingly available to developers and customers. PSVR 2 and Quest Pro are the most visible examples of headsets with built-in eye-tracking, along with the likes of the Varjo Aero, Vive Pro Eye, and more.

With this momentum, in just a few years we could see eye-tracking become a standard part of consumer XR headsets. When that happens, there’s a wide range of features the tech can enable to drastically improve the experience.

Foveated Rendering

Let’s start with the one many people are already familiar with. Foveated rendering aims to reduce the computational power required for displaying demanding AR and VR scenes. The name comes from the ‘fovea’—a small pit at the center of the human retina which is densely packed with photoreceptors. It’s the fovea which gives us high resolution vision at the center of our field of view; meanwhile our peripheral vision is actually very poor at picking up detail and color, and is better tuned for spotting motion and contrast. You can think of it like a camera which has a large sensor with just a few megapixels, and another smaller sensor in the middle with lots of megapixels.

The region of your vision in which you can see in high detail is actually much smaller than most people think—just a few degrees across the center of your view. The difference in resolving power between the fovea and the rest of the retina is so drastic that, without your fovea, you couldn’t make out the text on this page. You can see this easily for yourself: if you keep your eyes focused on this word and try to read just two sentences below, you’ll find it’s almost impossible to make out what the words say, even though you can see something resembling words. The reason people overestimate the foveal region of their vision seems to be that the brain does a lot of unconscious interpretation and prediction to build a model of how we believe the world to be.

SEE ALSO
Abrash Spent Most of His F8 Keynote Convincing the Audience That 'Reality' is Constructed in the Brain

Foveated rendering aims to exploit this quirk of our vision by rendering the virtual scene in high resolution only in the region that the fovea sees, then drastically cutting down the complexity of the scene in our peripheral vision where the detail can’t be resolved anyway. Doing so allows us to focus most of the processing power where it contributes most to detail, while saving processing resources elsewhere. That may not sound like a huge deal, but as the display resolution and field-of-view of XR headsets increase, the power needed to render complex scenes grows quickly.
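To make the idea concrete, here's a minimal sketch of how a renderer might pick a resolution fraction for each region of the screen based on its angular distance from the tracked gaze point. The tier boundaries and fractions are invented for illustration; real systems tune these carefully and typically blend between regions to hide the seams:

```python
# Illustrative foveation tiers: (max angle from gaze in degrees, resolution fraction).
# These thresholds are made up for the example, not taken from any real headset.
FOVEATION_TIERS = [
    (5.0, 1.0),     # foveal region: render at full resolution
    (15.0, 0.5),    # near periphery: half resolution
    (180.0, 0.25),  # far periphery: quarter resolution
]

def resolution_fraction(angle_from_gaze_deg: float) -> float:
    """Return the fraction of full rendering resolution for a screen
    region at the given angular distance from the user's gaze."""
    for max_angle, fraction in FOVEATION_TIERS:
        if angle_from_gaze_deg <= max_angle:
            return fraction
    return FOVEATION_TIERS[-1][1]

print(resolution_fraction(3.0))   # 1.0  (inside the foveal region)
print(resolution_fraction(30.0))  # 0.25 (deep in the periphery)
```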

Eye-tracking of course comes into play because the system needs to know, quickly and with high precision, where the center of the user's gaze is at all times in order to pull off foveated rendering. While it's difficult to do this without the user noticing, it's possible and has been demonstrated quite effectively on recent headsets like Quest Pro and PSVR 2.

Automatic User Detection & Adjustment


In addition to detecting movement, eye-tracking can also be used as a biometric identifier. That makes eye-tracking a great candidate for multiple user profiles across a single headset—when I put on the headset, the system can instantly identify me as a unique user and call up my customized environment, content library, game progress, and settings. When a friend puts on the headset, the system can load their preferences and saved data.

Eye-tracking can also be used to precisely measure IPD (the distance between one’s eyes). Knowing your IPD is important in XR because it’s required to move the lenses and displays into the optimal position for both comfort and visual quality. Understandably, many people don’t know their IPD off the top of their head.

With eye-tracking, it would be easy to instantly measure each user’s IPD and then have the headset’s software assist the user in adjusting the headset’s IPD to match, or warn users that their IPD is outside the range supported by the headset.
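As a rough sketch of how that might work, assuming the tracker reports pupil positions in millimeters in the headset's reference frame, and using an invented 58–72mm adjustment range for the example:

```python
import math

def measure_ipd_mm(left_pupil_mm, right_pupil_mm):
    """Estimate IPD as the distance between tracked pupil centers.
    In practice you'd average over many frames while the user looks
    at a distant target, so the eyes are close to parallel."""
    return math.dist(left_pupil_mm, right_pupil_mm)

def check_ipd(ipd_mm, supported_range_mm=(58.0, 72.0)):
    """Compare a measured IPD against the headset's mechanical
    adjustment range (the range here is illustrative)."""
    lo, hi = supported_range_mm
    if ipd_mm < lo or ipd_mm > hi:
        return f"Warning: IPD {ipd_mm:.1f}mm is outside the supported {lo:.0f}-{hi:.0f}mm range"
    return f"IPD {ipd_mm:.1f}mm is within the supported range"

# Pupils at ±31.5mm from center -> 63mm IPD, comfortably in range.
print(check_ipd(measure_ipd_mm((-31.5, 0.0, 0.0), (31.5, 0.0, 0.0))))
```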

In more advanced headsets, the whole measure-and-adjust process can be automatic and invisible to the user—IPD can be measured in the background, and a motorized adjustment can move the lenses into the correct position without the user needing to be aware of any of it, as on the Varjo Aero, for example.

Varifocal Displays

A prototype varifocal headset | Image courtesy NVIDIA

The optical systems used in today’s VR headsets work pretty well but they’re actually rather simple and don’t support an important function of human vision: dynamic focus. This is because the display in XR headsets is always the same distance from our eyes, even when the stereoscopic depth suggests otherwise. This leads to an issue called vergence-accommodation conflict. If you want to learn a bit more in depth, check out our primer below:

Accommodation

Accommodation is the bending of the eye’s lens to focus light from objects at different distances. | Photo courtesy Pearson Scott Foresman

In the real world, to focus on a near object the lens of your eye bends to make the light from the object hit the right spot on your retina, giving you a sharp view of the object. For an object that’s further away, the light enters your eye at a different angle, and the lens again must bend to ensure the light is focused onto your retina. This is why, if you close one eye and focus on your finger a few inches from your face, the world behind your finger is blurry. Conversely, if you focus on the world behind your finger, your finger becomes blurry. This is called accommodation.

Vergence

Vergence is the inward rotation of each eye to overlap each eye’s view into one aligned image. | Photo courtesy Fred Hsu (CC BY-SA 3.0)

Then there’s vergence, which is when each of your eyes rotates inward to ‘converge’ the separate views from each eye into one overlapping image. For very distant objects, your eyes are nearly parallel, because the distance between them is so small in comparison to the distance of the object (meaning each eye sees a nearly identical portion of the object). For very near objects, your eyes must rotate inward to bring each eye’s perspective into alignment. You can see this too with the same finger trick as above: this time, using both eyes, hold your finger a few inches from your face and look at it. Notice that you see double images of objects far behind your finger. When you then focus on those objects behind your finger, you’ll see a double image of your finger.

The Conflict

With precise enough instruments, you could use either vergence or accommodation to determine the distance of the object a person is looking at. But the thing is, both accommodation and vergence happen in your eye together, automatically. And they don’t just happen at the same time—there’s a direct correlation between vergence and accommodation, such that for any given measurement of vergence there’s a directly corresponding level of accommodation (and vice versa). Since you were a little baby, your brain and eyes have formed muscle memory to make these two things happen together, without thinking, anytime you look at anything.
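That correlation is easy to see with a little geometry: for interpupillary distance p and a fixation distance d, both the vergence angle and the accommodation demand are functions of the same distance, so measuring one tells you the other:

```latex
% Vergence angle (theta) and accommodation demand (A, in diopters)
% for fixation distance d (meters) and interpupillary distance p (meters):
\theta = 2\arctan\!\left(\frac{p}{2d}\right)
\qquad
A = \frac{1}{d}
```

As the next paragraphs explain, headsets break this pairing: the optics lock A in place while θ continues to vary with the virtual scene.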

But when it comes to most of today’s AR and VR headsets, vergence and accommodation are out of sync due to inherent limitations of the optical design.

In a basic AR or VR headset, there’s a display (which is, let’s say, 3″ away from your eye) which shows the virtual scene, and a lens which focuses the light from the display onto your eye (just like the lens in your eye would normally focus the light from the world onto your retina). But since the display is a static distance from your eye, and the lens’ shape is static, the light coming from all objects shown on that display is coming from the same distance. So even if there’s a virtual mountain five miles away and a coffee cup on a table five inches away, the light from both objects enters the eye at the same angle (which means your accommodation—the bending of the lens in your eye—never changes).

That comes into conflict with vergence in such headsets, which—because we can show a different image to each eye—is variable. Being able to adjust the image independently for each eye, such that our eyes need to converge on objects at different depths, is essentially what gives today’s AR and VR headsets stereoscopy.

But the most realistic (and arguably, most comfortable) display we could create would eliminate the vergence-accommodation issue and let the two work in sync, just like we’re used to in the real world.

Varifocal displays—those which can dynamically alter their focal depth—are proposed as a solution to this problem. There are a number of approaches to varifocal displays, perhaps the simplest of which is an optical system where the display is physically moved back and forth from the lens in order to change focal depth on the fly.

Achieving such an actuated varifocal display requires eye-tracking because the system needs to know precisely where in the scene the user is looking. By tracing a path into the virtual scene from each of the user’s eyes, the system can find the point that those paths intersect, establishing the proper focal plane that the user is looking at. This information is then sent to the display to adjust accordingly, setting the focal depth to match the virtual distance from the user’s eye to the object.
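Here's a minimal sketch of that calculation, assuming the tracker reports a ray origin and direction per eye. Since noisy gaze rays rarely intersect exactly in 3D, it takes the midpoint of their closest approach as the fixation point:

```python
import numpy as np

def estimate_fixation_point(origin_l, dir_l, origin_r, dir_r):
    """Return the approximate 3D fixation point as the midpoint of
    closest approach between the two gaze rays, or None if the rays
    are near-parallel (i.e. the user is looking far away)."""
    d1 = dir_l / np.linalg.norm(dir_l)
    d2 = dir_r / np.linalg.norm(dir_r)
    w0 = origin_l - origin_r
    b = d1 @ d2
    d, e = d1 @ w0, d2 @ w0
    denom = 1.0 - b * b              # d1 and d2 are unit vectors
    if denom < 1e-9:                 # near-parallel rays: treat focus as infinity
        return None
    t1 = (b * e - d) / denom         # parameter along the left gaze ray
    t2 = (e - b * d) / denom         # parameter along the right gaze ray
    p1 = origin_l + t1 * d1
    p2 = origin_r + t2 * d2
    return (p1 + p2) / 2.0

# Eyes 64mm apart, both aimed at a point 0.5m straight ahead:
l, r = np.array([-0.032, 0.0, 0.0]), np.array([0.032, 0.0, 0.0])
target = np.array([0.0, 0.0, 0.5])
print(estimate_fixation_point(l, target - l, r, target - r))  # ≈ [0, 0, 0.5]
```

The focal depth to feed the varifocal mechanism is then just the distance from the eyes to that point.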

SEE ALSO
Oculus on Half Dome Prototype: 'don't expect to see everything in a product anytime soon'

A well implemented varifocal display could not only eliminate the vergence-accommodation conflict, but also allow users to focus on virtual objects much nearer to them than in existing headsets.

And well before we’re putting varifocal displays into XR headsets, eye-tracking could be used for simulated depth-of-field, which could approximate the blurring of objects outside of the focal plane of the user’s eyes.
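A naive version of that blur might simply scale with the dioptric difference between an object's depth and the eye-tracked focal plane, along these lines (constants invented for the example):

```python
def blur_radius_px(object_depth_m, focus_depth_m, px_per_diopter=12.0, max_px=16.0):
    """Approximate blur radius for simulated depth-of-field: blur grows
    with the dioptric distance between the object and the focal plane
    reported by eye-tracking. Both constants are illustrative."""
    defocus_diopters = abs(1.0 / object_depth_m - 1.0 / focus_depth_m)
    return min(px_per_diopter * defocus_diopters, max_px)

# Fixating at 2m: a mug at 0.3m gets heavy blur, an object at 1.9m stays sharp.
print(blur_radius_px(0.3, 2.0))  # 16.0 (clamped)
print(blur_radius_px(1.9, 2.0))  # ~0.32
```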

As of now, there’s no major headset on the market with varifocal capabilities, but there’s a growing body of research and development trying to figure out how to make the capability compact, reliable, and affordable.

Foveated Displays

While foveated rendering aims to better distribute rendering power between the part of our vision where we can see sharply and our low-detail peripheral vision, something similar can be achieved for the actual pixel count.

Rather than just changing the detail of the rendering on certain parts of the display vs. others, foveated displays are those which are physically moved (or in some cases “steered”) to stay in front of the user’s gaze no matter where they look.

Foveated displays open the door to achieving much higher resolution in AR and VR headsets without brute-forcing the problem by trying to cram pixels at higher resolution across our entire field-of-view. Doing so would not only be costly, but would also run into challenging power and size constraints as the number of pixels approaches retinal resolution. Instead, foveated displays would move a smaller, pixel-dense display to wherever the user is looking based on eye-tracking data. This approach could even lead to higher fields-of-view than could otherwise be achieved with a single flat display.

A rough approximation of how a pixel-dense foveated display looks against a larger, much less pixel-dense display in Varjo’s prototype headset. | Photo by Road to VR, based on images courtesy Varjo

Varjo is one company working on a foveated display system. They use a typical display that covers a wide field of view (but isn’t very pixel dense), and then superimpose a microdisplay that’s much more pixel dense on top of it. The combination of the two means the user gets both a wide field of view for their peripheral vision, and a region of very high resolution for their foveal vision.

Granted, this foveated display is still static (the high resolution area stays in the middle of the display) rather than dynamic, but the company has considered a number of methods for moving the display to ensure the high resolution area is always at the center of your gaze.
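Whatever the mechanism, the control logic is conceptually simple: keep the inset centered on the gaze, clamped to whatever travel range the hardware allows. A trivial sketch, with an invented travel range in degrees from center:

```python
def foveal_inset_target(gaze_deg, travel_deg=(25.0, 20.0)):
    """Where to steer a small, pixel-dense foveal display so it tracks
    the user's gaze, clamped to the mechanism's (illustrative) travel
    range. Angles are (horizontal, vertical) degrees from center."""
    (gx, gy), (tx, ty) = gaze_deg, travel_deg
    return (max(-tx, min(tx, gx)), max(-ty, min(ty, gy)))

# Gaze 10° right and 30° up: horizontal steering follows the gaze,
# vertical clamps at the 20° travel limit.
print(foveal_inset_target((10.0, 30.0)))  # (10.0, 20.0)
```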

Continued on Page 2: Better Social Avatars »




Ben is the world's most senior professional analyst solely dedicated to the XR industry, having founded Road to VR in 2011—a year before the Oculus Kickstarter sparked a resurgence that led to the modern XR landscape. He has authored more than 3,000 articles chronicling the evolution of the XR industry over more than a decade. With that unique perspective, Ben has been consistently recognized as one of the most influential voices in XR, giving keynotes and joining panel and podcast discussions at key industry events. He is a self-described "journalist and analyst, not evangelist."
  • Raphael

    Yes, eventually games will have a whole new dimension of realism/interaction. Natural voice input, AI powered characters, characters will track your eyes. The games we play now are very limited in terms of input/interaction realism. Facial expression detection will also make its way into games eventually.

    • Lucio Lima

      Yes, the future of VR is very promising!

    • impurekind

      And most of it is going to be realised because of VR specifically.

    • NooYawker

      I hope this all happens before I die!

      • Raphael

        December… can you hang around until then?

        • NooYawker

          I’ll try!

    • You have a keen eye for these things, Donatello :)

    • Zantetsu

      Ha ha the article was re-posted. I saw the Raphael baby icon which I haven’t seen in years and was like “wow who woke up the baby”. But now I see that the comment is 5 years old :)

      • XRC

        Where has everyone gone? Real quiet these days…

  • Lucio Lima

    Very interesting!

  • impurekind

    Well this all sounds like great stuff so hopefully they can get a properly reliable and effective solution working in the not too distant future.

  • moogaloo

    As someone who is cross-eyed I am a bit concerned by stuff that uses both eye positions in concert, like the focal depth stuff. I hope they build something in to not do this if one eye feels like doing its own thing? If not it could potentially ruin VR for me and millions of others.

    • Lucidfeuer

      I think that’s quite the opposite: eye-tracking enables all sorts of adjustments for astigmatic, cross-eyed, visually impaired users, etc…that are not possible with just the screen.

      • kontis

        Exactly, but there are also some pessimistic hypotheses that it could convince the brain that everything is okay and to stop trying to correct it, which would be undesirable.

        • Lucidfeuer

          Strictly talking optics, I don’t see how it could trick the brain without actually correcting vision.

          • Konchu

            I remember at least one thread with a person with monocular vision not getting depth info from VR like they do in the real world. And I bet this variable depth focus will help simulate it for those people.

            But I do have a friend who has never been able to see 3D in movies, stereograms, the old Virtual Boy, etc., but VR is amazing for them. So I can somewhat understand the fear that some immersion tech could ruin something for some people. I still think it will do more good than bad, and I imagine it will be fairly easy to disable some things as long as it’s not detrimental to game performance etc. AKA if they start using variable focus for a culling boost, it might make those experiences harder to render without it.

      • Guest

        It cannot track saccading and adjust for individual variants. It’s just a marketing wet-dream to collect VC money!

      • Coffs

        Only if the eye tracking does each eye individually. If it’s basing the calcs on one eye, then the other eye gets screwed…

  • Lucidfeuer

    Oh, so here’s something that I’m pretty sure they’ll do, but this has nothing to do with functionality or rendering: “Intent & Analytics” is the only reason why they invest the extra cents to have eye-trackers.

    • Raphael

      Very impressed with your record for 2018 flappy. All but one of your statements is either half, three-quarters or fully dumb. You are consistent.

      • Graham J ⭐️

        And you call everyone “flappy”. Takes dumb to know dumb.

        • Raphael

          No flappy, I don’t call anyone else flappy.

      • Lucidfeuer

        And you’re still an eloquent genius with convincing counter-arguments.

    • brubble

      Well then seeing that you probably won’t buy one, why are you here? Welcome to the internet, where your precious f*cking “privacy” doesn’t exist. Give it a rest man.

      • Sandy Wich

        Telling people to give privacy a rest is pathetic. If you want to give up caring about your own basic self respect and human rights then do it. But don’t shame people for caring about sociopaths spying on you and getting away with it cus they have money/influence/word it nicely.

        • brubble

          Oh no, the big bad companies might know what you’ve been looking at and hit you with ads and marketing?! Once these evil sociopaths compile enough info on you they’ll confer in their secret deep underground lair to decide if you’re important enough to blast with their top secret information brainwashing raygun and force into buying Charmin over Royale.
          Basic self respect??? Human rights??? Pffft Really? Please do explain your preposterous, misguided hyperbole. You couldn’t be any more melodramatic? Watch out man! The unmarked black van is circling your neighbourhood. Gimme a break. Bahahaha. Tool.

    • kontis

      Not true. Eye-tracking is crucial in improving the quality of the experience (which is currently insufficient). Without these kinds of improvements many people will not want to use HMDs.

      They have to do the eye-tracking because they have no other choice. Analytics is more like a super enticing bonus for them and maybe a reason to give it a higher priority, but that’s all.

      • Lucidfeuer

        Yes right, these companies have no agendas and don’t care about data and money…eye-tracking is as crucial as pass-through AR, untethered wireless, inside-out tracking, hand tracking etc…yet I’m ready to bet we’ll see the priority being put on eye-tracking even though foveated rendering is not usable yet…

        • Raphael

          There is no counter-argument to stupidity flappy. People without logic or reasoning are “always right”.

          Once again you have an entirely negative/cynical opinion on the motivations of VR developers.

          “yet I’m ready to bet we’ll see the priority being put on eye-tracking even though foveated rendering is not usable yet…”

          Your bets are worth less than the shit from your botty flappy.

          Nvidia and Oculus along with other companies are developing EYE-TRACKING. PRIMARY USE is for foveated rendering. Your idiotic paranoia about eye-tracking being used for NSA surveillance or advertising just goes to show how utterly stupid you are.

    • Sandy Wich

      It’s not the only reason, but it’s a big one.

      People who don’t see what this is really going to be used for 5-10 years down the line… Idk about em.

  • Doctor Bambi

    Eye tracking can also help with redirected walking which I think will become a more important area of interest when full 6DOF standalone gets here.

    It’s amazing to me how much promise eye tracking holds for VR and AR. And it’s why I personally think Gen 2 headsets won’t launch without it. Even if it’s not quite accurate enough for foveated rendering, there are still plenty of benefits to be had in simpler use cases.

  • Jistuce

    Am I the only one concerned at how low the windshield ranks on that car cabin heatmap?

    • doug

      Car was still, in a lab.

      • Jistuce

        See, I assumed the windshield was just omitted from the data set so that all the other stuff would be visible. But that was boring, so I took the image at face value instead. (Also why are people checking the rearview mirror but not the front windshield?)

  • bud

    Great content road to vr team!!, not overlooked as just another article imo..

    Much appreciated, good job, nice.

    thanks,

  • Alexander Grobe

    Good article. However, I was missing the use of eye tracking for redirected walking in VR using saccades and eye blinks.

  • NooYawker

    I remember watching an episode of 20/20. John Stossel was doing a story about advertisers using eye tracking technology to see what people find interesting. They had him watch a Tab commercial with a girl on a beach and yea.. wasn’t hard to predict what he was looking at. But this was close to 40 years ago and I was amazed they could do such a thing. And after 40 years they finally found a consumer use for eye tracking.
    For the young folks:
    20/20 was a news program similar to 60 minutes
    Tab: the first diet soda
    John Stossel: a promising young reporter before he went insane and became a libertarian.

    • brandon9271

      What’s wrong with libertarians? :)

      • Who knows…

        • Zpfunk

          I’m glad someone made that reference. Eye Tracking for advertisement is the main inspiration for the push into the Next Lvl of virtual reality. Good article, but the author must have overlooked that use case. I believe it will be used in much the same way, although with our current internet based consumer culture, here in the United States, that information will most definitely be leveraged against the consumer. In my opinion. Eventually it may be impossible to look the other way.

      • Robert Gordon

        They don’t believe in the government as God, and giving all their money to the powerful god-complex trolls in power to create more regulations to benefit the few billionaires that buy these regulations and limit competition and salaries; oppressing everyone else, sending jobs to the countries with the lowest regulations/wages is best for them?

  • doug

    If Google is empowered knowing what ads people will click on, just wait until a company knows what you looked at.

  • Psuedonymous

    Missing is THE most important application of 3D pupil tracking for VR: real-time lens correction. Currently, lens correction assumes a single fixed pupil position, while in reality the rotation of your eyeball causes your pupil to physically translate side-to-side, up and down, and even forward and back slightly. Even if it remains within the eyebox, the distortion correction shader will only provide the correct view for one pupil position. By tracking the pupil, the correct distortion correction can be used all the time.

    • Kev

      I wonder if that could be used in some way to help people with low vision where they use a headset with cameras and the display tries to make all the appropriate corrections for them.

    • Zantetsu

      I believe that StarVR did this with the StarVR One. I was super excited about that tech oh about 5 years ago but they never released it in a consumer level headset, never made the tech available for wide review, and basically fell off the face of the Earth. Like a lot of VR since then unfortunately :(

    • Sven Viking

      This also becomes increasingly important with larger FOVs.

  • nipple_pinchy

    Eye-tracking/foveated rendering is going to allow for those lower power standalone, wireless headsets to come to market and be affordable for more people.

  • MarquisDeSang

    Yet PC Looser race will never be interested in VR, because it does not look and feel like tv games.

  • oompah

    wonderful
    this is the future & the right path
    to pursue in VR tech
    Combine it with Cloud VR streaming
    and then MAGIC

  • Sandy Wich

    I can’t wait to see the ad placements that track what I’m looking at and then they’ll, “cater the experience to my liking! <3", by forcibly filling my FacebookTM games with them.

  • dk

    can I get 8 reasons why it’s not in consumer headsets yet….also hand tracking although the unreasonably expensive vive pro might have that

  • Great editorial

  • Damien King-Acevedo

    > if you keep your eyes focused on this word and try to read just two sentences below

    Uh… I think there might be something wrong with your eyes; I couldn’t read two paragraphs below, but two sentences was fine.

  • VR4EVER

    Hands-off gaze-input will be groundbreaking for all the handicapped persons out there. A friend of mine loves vTime just for that. Should generally be a mandatory input-scheme, IMHO.

  • Scot Fagerland

    Ben, thanks for the great research and analysis. I am a patent attorney doing prior art research for a client in the field of enhanced vision, and this article was just the starting point I needed.

    • benz145

      We’re here to be a resource, glad this was helpful!

  • Byaaface

    Awesome article!

  • “Active Input” is not always the best thing to do. We never use our eyes to activate objects in real life, so it’s weird when we can do that in VR. At most, it can support the activation of objects, e.g. you look at menu items to highlight them, but then you need a click on your controller to confirm the selection

    • Ben Lang

      I’m in agreement generally, but also willing to be convinced otherwise.

      It’s a potentially useful capability, but I’m waiting to see someone figure it out in a way that feels good. There’s a real chance that it’s the kind of thing that we’ll become completely accustomed to after a few hundred hours of use (like a mouse or keyboard), but that’ll require a smart and consistent application of the feature.

      • Tabp

        Controllerless active input allows you to use the headset without controllers, which is a big deal in many circumstances, especially productivity. Meanwhile, in games input modes would tend to be purpose-specific. Yes, consistent “interaction language” is important and we’ll see the best implementation spread until everyone thinks it’s obvious to do it that way. Agreed with the speed benefits mentioned in the article.

        Even if you’re like “use the controllers because dev says so” users will be like “but I need to play with a drink in one hand and something else in the other hand.”