Apple today released ‘Spatial Personas’ in public beta on Vision Pro. The newly upgraded avatar system can now bring people right into your room. We got an early look.

Much has been said about Apple’s Persona avatar system for Vision Pro. Whether you find them uncanny or passable, one thing is certain: it’s the most photorealistic real-time avatar system built into any headset available today. And now Personas is getting upgraded with ‘Spatial Personas’.

But weren’t Personas already ‘spatial’? Let me explain.

Sorta Spatial

At launch, the Persona system allowed users to scan their faces into the headset to create a digital identity that looks and moves like the user, thanks to the bevy of sensors in Vision Pro. When doing a FaceTime call with another Vision Pro user (or users), each Persona’s head, shoulders, and hands would be shown inside a floating box.

Image courtesy Apple

While this could feel like face-to-face talking at times, the fact that they were contained within a frame (which you can move or resize like any other window) made it feel like they weren’t actually standing right next to you. And that’s not just because of the frame, but also because you weren’t actually sharing the same space as them. They couldn’t walk right up to you for a high-five, because they’d be stuck in the window on your screen.


Now with Spatial Personas (released in beta today on the latest version of visionOS), each person’s avatar is rendered in a shared space without the frame. When I say ‘shared space’, I mean that if someone takes a step toward me in their room, I actually see them come one step closer to me.

Previously the frame made it feel sort of like you were doing a 3D video chat. Now with the shared space and no frame, it really feels like you’re standing right next to each other. It’s the ‘hang out on the same couch’ or ‘gather around the same table’ experience that wasn’t actually possible on Vision Pro at launch.

And it’s really quite compelling. I got a sneak peek at the new system in a Vision Pro FaceTime call with four people (though up to five are supported total), all using Spatial Personas. You’ll still only see their head, shoulders, and hands but now it really feels like a huddle instead of a 3D video chat. It feels much more personal.

Spatial Personas Are Opt-in

To be clear, the ‘video chat’ version of Personas (with the frame) still exists. In fact, it’s the default way that avatars are shown when a FaceTime call is started. Switching to a Spatial Persona requires hitting a button on the FaceTime menu.

And while this might seem like a strange choice, I actually think there’s something to it.

On the one hand, the default ‘FaceTime in Vision Pro’ experience feels like a video chat. In everyday business we’re all pretty used to seeing someone else on the other side of a webcam by now. And even though this is more personal than an audio-only call, it’s still a step away from actually meeting with someone in person.

Spatial Personas is more like you’re actually meeting up in person, since you can actually feel the interpersonal space between you and the other people in this shared space. If they walk up and get a little too close, you’ll truly feel it in the same way you would if someone stood too close to you in real life.

So it’s nice to have both of these options. I can ‘video chat’ with someone with the regular mode, or I can essentially invite them into my space if the situation calls for a more personal meeting.

And Spatial Personas aren’t just for chatting. Just like regular Personas, you can use SharePlay while on FaceTime to watch movies and play games together (provided you both have a supported app installed).

Take Freeform for instance, Apple’s collaborative digital whiteboard app. If you launch Freeform while on a FaceTime call with Spatial Personas, everyone else will be asked to join the app, which will then load everyone in front of the whiteboard.

Everything is synchronized too. Anyone else in the call can see what you’ve put on the whiteboard and watch in real time as you add new photos or draw annotations. And just as easily, anyone can physically walk up to the board and interact with it themselves.

When it comes to shared movie viewing on Apple TV on Vision Pro, Spatial Personas unlock the feeling of sitting on the same couch together, which wasn’t quite possible with the headset at launch. Now when you watch a movie with your friends you’ll be sitting shoulder to shoulder with them, which feels very different than having a window with their face in it floating near the video you’re watching.

It’s possible to stream many flat apps to anyone in the FaceTime call while using Spatial Personas, but for 3D or interactive content, developers will need to specifically implement the feature.

That’s somewhat problematic though because it’s difficult to know exactly which apps support Spatial Personas or even SharePlay for that matter. As of now, you have to scroll all the way to the bottom of an app’s page to see if it supports SharePlay (unless the developer mentions it in the app’s description). And even then this doesn’t necessarily mean it supports Spatial Personas.

The Little Details

Apple also thought through some smaller details for Spatial Personas, perhaps the most interesting of which is ‘locomotion’.

Room-scale locomotion is essentially the default. If you want to move closer to a person or app, you just physically walk over to it. But what happens if it’s outside the bounds of your physical space? Well, instead of directly moving yourself virtually, you can actually move the whole shared space closer to or farther from you.

You can do this any time, in any app, and everyone else will see your new position reflected within their space, keeping everything synchronized.

Apple also made it so that when two Spatial Personas get too close together, they will temporarily revert to just looking like a floating contact photo. I think this is probably because they want to avoid possible harassment or trolling (i.e., you want to annoy someone so you phase your virtual hand right through their virtual face, which is uncomfortable both visually and from an interpersonal space standpoint).

The headset’s excellent spatial audio is of course included by default, so everyone sounds like they’re coming from wherever they’re standing in the room, and their voices actually sound like they’re in your room (based on the headset’s estimate of what the acoustics should sound like). And if you move to a fully immersive space like an ‘environment’, the spatial audio transitions to that new acoustic environment—so for instance you can hear people faintly echoing in the Joshua Tree environment because of all the rock surfaces nearby. Hearing the acoustics fade from being inside your own room to being ‘outside’ in an environment is a subtle bit of magic.

Image courtesy Apple

And last but not least, it’s possible to have a mixed group of FaceTime participants. For instance, you could have people using an iPhone, an Android tablet (yes, you can FaceTime with people on non-Apple devices), a normal Persona, and a Spatial Persona all at once. SharePlay in that case will also work between those formats (except non-Apple devices) as long as the app supports it. In cases with apps that are Vision Pro native, the iPhone user would get a notification that their device isn’t supported.

– – — – –

Spatial Personas is a big upgrade to Apple’s avatar system, but the company maintains the whole Persona system is still in ‘beta’. Presumably that means there are more improvements yet to come.

Ben is the world's most senior professional analyst solely dedicated to the XR industry, having founded Road to VR in 2011—a year before the Oculus Kickstarter sparked a resurgence that led to the modern XR landscape. He has authored more than 3,000 articles chronicling the evolution of the XR industry over more than a decade. With that unique perspective, Ben has been consistently recognized as one of the most influential voices in XR, giving keynotes and joining panel and podcast discussions at key industry events. He is a self-described "journalist and analyst, not evangelist."
  • Adrian Meredith

    You have to wonder what on earth Meta is playing at. They’ve been showing this off for years now and there’s no sign of them shipping it. Instead we’re stuck with those horrible avatars.

    • Ondrej

      No, Meta has never shown anything like this.

      Meta is doing hundreds of scans in a multimillion dollar studio and turning that data by experts into a few sophisticated examples. We don’t even know how much time and effort it takes for a single codec avatar.
      Their results look much better, but at what cost?

      Apple shipped it in a consumer device. All integrated.

      This is a world of difference.

      • Sad.
        But true.

      • Dragon Marble

        Makes no difference for me. I have no one to share this experience with. When something requires >$7000 (at least two Vision Pros), it’s a “consumer device” in name only.

        • Yeah, but rich consumers are gonna use it! lol

    • Rogue Transfer

      Latency. That’s the main issue they remarked on in the recent research video online about Meta Codec Avatars. They said that earlier versions, like Codec Avatars 1.0 (the one they say worked on standalone) and 2.0, both suffer from a disconnect due to the delay from all the processing needed to render even their simpler versions of Codec Avatars.

      They say that finally, with the use of four RTX 4090s locally in a PC workstation to render them quickly enough, they have the feeling of the other person’s Codec Avatar being real-time and present. There are also some other issues to do with initial avatar scanning (they require someone to do 65 expressions one after another) that are deal-breakers for launching it.

      • So who’s runnin’ the show over there that this *still* hasn’t been fixed …??
        Boz …?
        If that’s the case, ohhhhh boy ….

  • Charles U. Farley

    Did they make them less creepy? Or did they just bring the creepy into 3D?

    • foamreality

      Is it actually 3D though? Or just 2D without a frame? The article doesn’t explain this. I suspect it’s the latter. Lame.

  • Y’know what kills me about all of this …??

    What do you think multiplayer VR games are:
    you sharing space with other avatars.
    What’s described in this article is precisely that, only in *AR* ….
    Heck, we don’t even have that “Other Avatars In Your Home Environment” thing yet that Zuck promised us THREE YEARS AGO ….

    This has nothing to do with “technical ability”.
    All the tools are now, and *have* been, in place for a long while.
    But tragically, this is Meta’s MO all over:
    fantastic tech, but it just sits there, unused, rotting away.
    And that makes me friggin’ INSANE ….
    []^ (

    • Charles U. Farley

      You’re already insane.

      • I resemble that remark ….
        It just so happens that I am N-O-T crazy!

    • ApocalypseShadow

      This is actually true. Facebook could do this currently. But they’ve been dangling the carrot in front of consumers telling them it’s in the future. You can only sell dreams so long until you have to produce something.

      Apple is like, “here it is.” And it just works. Even in beta. Even connecting with iPhone, Android phone or tablet users. Just like, “Here’s clear pass through” day one. “Here’s good hand tracking” day one. “Here’s a lot of useful apps you already use on your phone that you can use on your headset” day one.

      This is what Zuckerberg has feared. Actual competition from a company that already has a huge platform of content, a huge amount of followers. And, hardware and technical know-how. This isn’t Pico or PC or console VR that lacks many things that Apple has and will have and improve on it.

      It’s expensive now. But version 2 won’t be. It’ll do almost everything this first model does. It’s why Zuckerberg has downplayed Vision Pro. He can say that this is better or that is better on Quest. But he’s got to produce. Facebook is using gamers as a stepping stone to get to where cellphones users are. Apple is already there with cellphone users, can sell a more expensive device to their base and the masses. And is already offering useful things beyond the cellphone in their headset. Both want to be that next paradigm shift. But Apple is going directly at the masses. Not through gamers to get there.

      Facebook better get cracking and get those realistic avatars out. Or they’ll be left behind.

      • Hear, hear!!
        []^ )

      • foamreality

        The one single reason apple is so successful (and its beyond belief that not a single tech company on earth has emulated it) can be summed in one word: Polish.

        They release software and hardware that is polished. It’s not hard. But Meta and all the PCVR competitors are too dumb to polish any of their software, even several years after release. Apple knows that people want things to work. And to work well. They don’t want half-arsed gimmicks and tech demos that look cool for 5 seconds until you realise they don’t really work properly and have no useful purpose, which is what every other tech company focuses on. People will pay 3500 dollars just for a bit of polish. To know things are going to work more or less as expected.

  • Ondrej

    Hopefully, Apple not allowing competitors to use eye tracking in their social apps will be challenged sooner than later.

    Of course they use the shameless “privacy” excuse. They are acting like a dictatorship justifying censorship with “only we know what’s good for our citizens”.

    • gothicvillas

      I expected nothing less. Apple is as far left as it gets.

  • wheeler

    There’s a funny pattern here. Over the years, Meta has tried to promote an XR vision/ideal encompassing certain XR / “spatial computing” features which we’ve known that Apple has been working on (hinted at through rumors/leaks over the years–and now confirmed with Apple actually releasing and rapidly iterating on those features). And yet even with Meta’s 10 year head start, Meta has failed to execute on these features and Quest is still a kids gaming console with an otherwise bad XR interface.

    They clearly received some intel on what Apple was doing and threw together some snazzy promotions and tech demos to try and claim that vision as their own, but ultimately failed to implement them in time (or what they have implemented just stinks). They have neither the hardware nor the software to pull it all together.

    Meanwhile Apple is actually making it happen. VR enthusiasts are criticizing Apple for not embracing immersive VR gaming, but Apple is probably looking at that market (with its extremely lopsided level of investment to return, and low retention despite high market penetration and accessibility) and thinking “why would we care about that?”

  • Dragon Marble

    For me, this is the same as Meta’s codec avatar demo: cool but irrelevant. I am already crazy enough to purchase a Vision Pro for myself. Who am I going to watch a movie together with?

    You see, just because one of them is a “consumer product” and the other a “technical demo” makes no practical difference for 99.9% of people.

    • Arno van Wingerde

      I think you should see this in a long-term perspective, when the AVS arrives at 1000-1500 or so and some of your friends or colleagues will also have one. This would be possible with the Quest 3 today, at a lower quality, if Meta would actually take time to develop the software.

      • That’s all it is, man: SOFTWARE.
        That’s what makes it so bloody frustrating!!

      • Blaexe

        Without eye and face tracking these realistic avatars wouldn’t work, at least today and maybe ever. So no, it wouldn’t be possible on Quest 3 today.

        • foamreality

          Could work on Quest Pro but not possible on Quest 3. Pro is a far superior product except for the lower chipset. Such a shame the Q3 didn’t have eye and face tracking. Also, non-OLED headsets are truly terrible; I don’t understand why anyone would buy one. Breaks immersion more than any other spec.

  • Wow, we were expecting it for WWDC; instead they already launched this feature. It’s very cool.

  • ViRGiN

    It’s cool, it’s really cool, but how is Valve ever supposed to catch up?
    Competition is great, it forces everyone to make better products, except for Valve. Valve’s idea of connecting people is low-poly shapes of robots with disconnected hands, like Rayman.