Vision Pro is built entirely around hand-tracking, while Quest 3 is controller-first but also supports hand-tracking as an alternative input for some content. But which headset has better hand-tracking? You might be surprised at the answer.

Vision Pro Hand-tracking Latency

With no support for motion controllers, Vision Pro’s only motion-based input is hand-tracking. The core input system combines hands with eyes to control the entire interface.

Prior to the headset’s launch we spotted some footage that allowed us to gauge the hand-tracking latency at somewhere between 100 and 200ms, but that’s a pretty big window. Now we’ve run our own test and can more precisely put Vision Pro’s hand-tracking latency at about 128ms on visionOS beta v1.1.1.

Here’s how we measured it. Using a screen capture from the headset that shows both the passthrough hand and the virtual hand, we counted how many frames elapse between when the passthrough hand moves and when the virtual hand moves. We used Apple’s Persona system for hand rendering to eliminate any additional latency that could be introduced by Unity.

After sampling a handful of tests (pun intended), we found this to be about 3.5 frames. At the capture rate of 30 FPS, that’s 116.7ms. Then we add Vision Pro’s known passthrough latency of about 11ms, for a final result of about 127.7ms of photon-to-hand-tracking latency.
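The arithmetic above can be sketched in a few lines. The helper name below is ours, purely for illustration; the figures are the ones reported in this test:

```python
# Hypothetical helper reproducing the article's arithmetic:
# total latency = (frames of lag / capture FPS) + passthrough latency.
def motion_to_photon_ms(lag_frames: float, capture_fps: float,
                        passthrough_ms: float) -> float:
    """Convert an observed frame lag into total photon-to-hand-tracking latency (ms)."""
    tracking_ms = lag_frames / capture_fps * 1000.0
    return round(tracking_ms + passthrough_ms, 1)

# Vision Pro: ~3.5 frames of lag in a 30 FPS capture, plus ~11ms of passthrough latency.
print(motion_to_photon_ms(3.5, 30.0, 11.0))  # → 127.7
```

Note that the capture rate (30 FPS here) only limits the resolution of the measurement, roughly ±33ms per frame, which is why we averaged across several samples.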

We also tested how long it takes between a passthrough tap and the resulting virtual input (to see if full skeletal hand-tracking is slower than simple tap detection), but we didn’t find any significant difference in latency. We also tested in different lighting conditions and found no significant difference.


Quest 3 Hand-tracking Latency

How does that compare to Quest 3, a headset which isn’t solely controlled by the hands? Using a similar test, we found Quest 3’s hand-tracking latency to be around 70ms on Quest OS v63. That’s a substantial improvement over Vision Pro, yet actually using the headset would make you think Quest 3’s hand-tracking latency is even lower. It turns out some of the perceived latency is masked.

Here’s how we found out. Using a 240Hz through-the-lens capture, we ran the same kind of motion test as we did with Vision Pro to find out how long it takes between the motion of the passthrough hand and the motion of the virtual hand. That came out to 31.3ms. Combined with Quest 3’s known passthrough latency of about 39ms, that makes Quest 3’s photon-to-hand-tracking latency about 70.3ms.

When using Quest 3, hand-tracking feels even snappier than that result suggests, so what gives?

Because Quest 3’s passthrough latency (about 39ms) is roughly three-and-a-half times that of Vision Pro (about 11ms), more of the total latency is hidden before you ever see your own hand move. The gap you actually perceive, between your passthrough hand moving and your virtual hand moving, is just 31.3ms (compared to 116.7ms on Vision Pro).
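That masking effect can be sketched as a one-line subtraction (the function name is ours, for illustration): what you perceive is the total latency minus the passthrough delay, since the passthrough delay applies equally to your real hand and is therefore invisible in the comparison.

```python
# Hypothetical sketch: the lag you actually see between your passthrough hand
# and the virtual hand is the total latency minus the shared passthrough delay.
def perceived_gap_ms(total_latency_ms: float, passthrough_ms: float) -> float:
    return round(total_latency_ms - passthrough_ms, 1)

print(perceived_gap_ms(70.3, 39.0))   # Quest 3    → 31.3
print(perceived_gap_ms(127.7, 11.0))  # Vision Pro → 116.7
```

So even though Quest 3’s end-to-end figure is only about half of Vision Pro’s, its higher passthrough latency hides a larger share of the hand-tracking lag, which is why it feels snappier still.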

– – — – –

An important point here: latency and accuracy of hand-tracking are two different things. In many cases, they may even have an inverse relationship. If you optimize your hand-tracking algorithm for speed, you may give up some accuracy. And if you optimize it for accuracy, you may give up some speed. As of now we don’t have a good measure of hand-tracking accuracy for either headset, outside of a gut feeling.



Ben is the world's most senior professional analyst solely dedicated to the XR industry, having founded Road to VR in 2011—a year before the Oculus Kickstarter sparked a resurgence that led to the modern XR landscape. He has authored more than 3,000 articles chronicling the evolution of the XR industry over more than a decade. With that unique perspective, Ben has been consistently recognized as one of the most influential voices in XR, giving keynotes and joining panel and podcast discussions at key industry events. He is a self-described "journalist and analyst, not evangelist."
  • Eren

    Did I miss it or has there still not been a review of the vision pro?

  • Dragon Marble

    Great analysis. It should settle the debate.

    Most Vision Pro users may not realize this if their experience with hand tracking is restricted to operating UI. Even when you drag the window around, you may not notice the latency unless you bring the window really close.

  • g-man

    For AVP there seem to be three latencies at play – passthrough, arm/hand segmentation, and hand tracking. Most of the time you’re looking at video of your hand so the latency appears to be ~11ms but there’s the cutout around it that slightly lags behind. I’d be interested to know the latency of that. It’s surprisingly fast.

    Although Q3’s passthrough isn’t as good I’d like to see the same approach taken there, with your passthrough hand and cutout being the usual thing you see and a rendered hand only when necessary. I haven’t tried an AVP but I’m sure seeing your real hands adds considerably to presence.

    • Ben Lang

      Yes, good point. The occlusion mask latency is really low and quite accurate. It’s easy to see when stress testing it with fast motions, but with regular usage like tapping and swiping (which is most of the usage on AVP) it works impressively well.

  • Stephan @ Ghost Medical

    The big issue that fewer people are discussing is that it currently seems impossible to interact with 3D objects naturally with your hands on the Vision Pro. The canned gesture-based controls do work pretty well in the headset, but this is limiting, especially when users want to reach out and grab an object. So the Vision Pro can’t do things that every VR headset with motion controllers has been able to do since the original Rift. This means no headshots on virtual zombies with virtual guns, no virtual surgeries, no virtual sculpting, nothing that requires virtual hands in a fully immersive experience.

    That’s a bummer of a shortcoming since the fidelity and processing power is the best I’ve ever seen in an untethered headset.

    • Ben Lang

      I don’t think it’s ‘impossible’, it’s just the default interaction system Apple has gone with. Developers can customize their apps to rely on full skeletal hand-tracking.

      Most hand-tracking apps that allow you to directly interact with 3D objects are generally also using a glorified pinch gesture, but don’t lock some of the axes of rotation as is commonly the case with AVP.

  • Jeff

    There are so many core decisions that were made for Vision that will prove to be wrong down the road – a lot of people just don’t want to admit it yet. Hand-tracking-only is at the top of that list. If you’re going to make it the exclusive input option, it needs to be basically perfect. There is so much in XR that simply can’t be done with any level of precision with hand tracking. A pinch simply cannot always be seen by the headset, period, end of story. Fingers will be occluded, and wrist rotation cannot be calculated precisely enough.

    Even the highly praised passthrough has major issues that came from bad decisions, such as prioritizing a “clean” picture over accuracy. They simply do not correct scale and perspective nearly as well as Meta because they wanted to avoid warping, but they could have harnessed the sheer horsepower of their chip to achieve better accuracy with less warping than Meta’s solution.

    The latter will probably be addressed with updates, but I expect Apple to dig in their heels with hand tracking. It’s like removing hardware buttons and ports on their other devices- they want to streamline input and justify showing up super late by doing it differently and “better”. I want (and we need) Apple to succeed here so I hope they are willing to humble themselves and admit that the many years of research and experience show that deeper experiences need hardware inputs and feedback.

    • ViRGiN

      Meta achieved a lot with their $300-$400-$500 price range, Apple achieved nothing with their starting at $3500.

      what a boring device, equally as bulky as Quest 3, but even heavier and more uncomfortable.

  • ViRGiN

    i measured the latency on my valve index, and it’s in the lower 700 ms using quadruple lighthouse tracking.
    i’m so glad i’ve ended up with Meta.

  • f3flight

    I really wonder what accuracy AVP gives. I found through testing that Quest 3 does not have good enough accuracy to handle precise finger movements (needed to play shorter notes and faster rhythms on a virtual music instrument), but it sounds like even if Apple could have better accuracy potentially, it would still feel worse due to higher lag (100ms is unacceptably high to perform music live).

  • Poze Ar

    Great article!

    Can you please comment on how latency relates to frames per second? At 40ms latency, 1000 / 40ms might suggest 25fps for the Quest 3 passthrough cameras. But that seems far too low.

    What is the correct way to calculate FPS of the Quest 3 passthrough cameras?

    • Ben Lang

      Filming a high speed video through the lens with a stopwatch in the passthrough view could probably answer that question sufficiently.

      As I think you’re getting at, the FPS of the cameras themselves may be different from the update rate of the display (90Hz by default on Quest 3).