NVIDIA, one of the tech sector’s power players, is pushing the Universal Scene Description protocol as the foundation of interoperable content and experiences in the metaverse. In a recent post the company explains why it believes the protocol, originally invented by Pixar, fits the needs of the coming metaverse.

Though the word metaverse is being used these days as a catchall for pretty much any multi-user application, the truth is that the vast majority of such platforms are islands unto themselves that have no connectivity to virtual spaces, people, or objects on other platforms. The ‘real’ metaverse, most seem to agree, must have at least some elements of interoperability, allowing users to seamlessly move from one virtual space to the next, much like we do today on the web.

To that end, Nvidia is pushing Universal Scene Description (USD) as the “HTML of the metaverse,” as the company put it in a recent post.

Much like HTML forms a description of a webpage—which can be hosted anywhere on the internet—and is retrieved and rendered locally by a web browser, USD is a protocol for describing complex virtual scenes which can be retrieved and rendered to varying degrees depending upon local hardware capabilities. With a ‘USD browser’ of sorts, Nvidia is suggesting that USD could be the common method by which virtual spaces are defined in a way that’s easy for anyone to decipher and render.

“The most fundamental standard needed to create the metaverse is the description of a virtual world. At Nvidia, we believe the first version of that standard already exists. It is Universal Scene Description (USD), an open and extensible ecosystem for describing, composing, simulating, and collaborating within 3D worlds, originally invented by Pixar Animation Studios,” write Nvidia’s Rev Lebaredian and Michael Kass.

“[USD] includes features necessary for scaling to large data sets like lazy loading and efficient retrieval of time-sampled data. It is tremendously extensible, allowing users to customize data schemas, input and output formats, and methods for finding assets. In short, USD covers the very broad range of requirements that Pixar found necessary to make its feature films.”

Indeed, CGI pioneer Pixar created USD to make collaboration on complex 3D animation projects easier. The company open-sourced the protocol back in 2015.

USD is more than just a file format for 3D geometry. Not only can USD describe a complex scene with various objects, textures, and lighting, it can also include references to assets hosted elsewhere, property inheritance, and layering functionality which allows non-destructive editing of a single scene with efficient asset re-use.
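
The layering idea can be illustrated with a small toy model. The sketch below is not the USD API: the `Layer` class, the `resolve` function, and the chair data are all hypothetical, and real USD composition involves more rules than a simple strongest-layer-wins lookup. It only shows why an override layer can change a scene without touching the layer beneath it.

```python
from dataclasses import dataclass, field


@dataclass
class Layer:
    """A named set of 'opinions': prim path -> {property: value}. (Hypothetical type.)"""
    name: str
    opinions: dict = field(default_factory=dict)


def resolve(layers, prim_path):
    """Compose one prim from a strongest-first list of layers.

    For each property, the strongest layer holding an opinion wins;
    weaker layers fill in whatever the stronger ones leave unset.
    """
    resolved = {}
    for layer in reversed(layers):  # apply weakest first so stronger layers overwrite
        resolved.update(layer.opinions.get(prim_path, {}))
    return resolved


# Base asset as delivered by one team (illustrative data only).
base = Layer("chair_base", {
    "/Chair": {"color": "brown", "height_m": 0.9, "mesh": "assets/chair_mesh"},
})

# A second artist's layer that overrides the color and nothing else.
override = Layer("look_pass", {"/Chair": {"color": "red"}})

composed = resolve([override, base], "/Chair")
print(composed)
print(base.opinions["/Chair"]["color"])  # the base layer itself is unchanged
```

Because the override lives in its own layer, removing that layer restores the original look, and the same base asset can be reused in other scenes with different overrides.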


While Nvidia thinks USD is the right starting point for an interoperable platform, the company also acknowledges that “USD will need to evolve to meet the needs of the metaverse.”

On that front the company laid out a fairly extensive roadmap of features that it’s working on for USD to successfully serve as the foundation of the metaverse:

In the short term, NVIDIA is developing:
  • glTF interoperability: A glTF file format plugin will allow glTF assets to be referenced directly by USD scenes. This means that users who are already using glTF can take advantage of the composition and collaboration features of USD without having to alter their existing assets.
  • Geospatial schema (WGS84): NVIDIA is developing a geospatial schema and runtime behavior in USD to support the WGS84 standard for geospatial coordinates. This will facilitate full-fidelity digital twin models that need to incorporate the curvature of the earth’s surface.
  • International character (UTF-8) support: NVIDIA is working with Pixar to add support for UTF-8 identifiers to USD, allowing for full interchange of content from all over the world.
  • USD compatibility testing and certification suite: To further accelerate USD development and adoption, NVIDIA is building an open source suite for USD compatibility testing and certification. Developers will be able to test their builds of USD and certify that their custom USD components produce an expected result.
In the longer term, NVIDIA is working with partners to fill some of the larger remaining gaps in USD:
  • High-speed incremental updates: USD was not designed for high-speed dynamic scene updates, but digital twin simulations will require this. NVIDIA is developing additional libraries on top of USD that enable much higher update rates to support real-time simulation.
  • Real-time proceduralism: USD as it currently exists is almost entirely declarative. Properties and values in the USD representation, for the most part, describe facts about the virtual world. NVIDIA has begun to augment this with a procedural graph-based execution engine called OmniGraph.
  • Compatibility with browsers: Today, USD is C++/Python based, but web browsers are not. To be accessible by everyone, everywhere, virtual worlds will need to be capable of running inside web browsers. NVIDIA will be working to ensure that proper WebAssembly builds with JavaScript bindings are available to make USD an attractive development option when running inside of a browser is the best approach.
  • Real-time streaming of IoT data: Industrial virtual worlds and live digital twins require real-time streaming of IoT data. NVIDIA is working on building USD connections to IoT data streaming protocols.
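
To give a sense of what the geospatial item above involves, here is a minimal sketch of the standard conversion from WGS84 latitude, longitude, and height to Earth-centered Cartesian (ECEF) coordinates. It is not Nvidia's schema or any part of USD; the function name is an assumption made for illustration, and only the WGS84 ellipsoid constants are taken from the standard.

```python
import math

# WGS84 ellipsoid constants.
WGS84_A = 6378137.0                    # semi-major axis, meters
WGS84_F = 1.0 / 298.257223563          # flattening
WGS84_E2 = WGS84_F * (2.0 - WGS84_F)   # first eccentricity squared


def geodetic_to_ecef(lat_deg, lon_deg, height_m=0.0):
    """Convert WGS84 geodetic coordinates to ECEF x, y, z in meters.

    Hypothetical helper for illustration; not part of any USD library.
    """
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    # Radius of curvature in the prime vertical at this latitude.
    n = WGS84_A / math.sqrt(1.0 - WGS84_E2 * math.sin(lat) ** 2)
    x = (n + height_m) * math.cos(lat) * math.cos(lon)
    y = (n + height_m) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - WGS84_E2) + height_m) * math.sin(lat)
    return x, y, z


# Two ground-level points a tenth of a degree of longitude apart on the equator.
p1 = geodetic_to_ecef(0.0, 0.0)
p2 = geodetic_to_ecef(0.0, 0.1)
print(p1)
print(p2)
print(math.dist(p1, p2))  # straight-line (chord) distance in meters
```

A flat local coordinate system would place both points at the same height, while the converted coordinates place them on a curved surface, which is the effect a large digital twin has to account for.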

Nvidia isn’t alone in its belief that USD has an important role to play in the coming metaverse. The idea has also taken hold to some extent at the newly formed Metaverse Standards Forum—of which Nvidia and thousands of other companies are members—which has also pointed to USD as a promising foundation for interoperable virtual spaces and experiences.


Ben is the world's most senior professional analyst solely dedicated to the XR industry, having founded Road to VR in 2011—a year before the Oculus Kickstarter sparked a resurgence that led to the modern XR landscape. He has authored more than 3,000 articles chronicling the evolution of the XR industry over more than a decade. With that unique perspective, Ben has been consistently recognized as one of the most influential voices in XR, giving keynotes and joining panel and podcast discussions at key industry events. He is a self-described "journalist and analyst, not evangelist."
  • VR Geek

    Give me a Unity to Universal Scene Description and I will make a build in that format today.

    • Christian Schildwaechter

      There is a Unity USD format package, still in preview, providing C# USD bindings that allow direct import of USD as game objects etc. and export of Unity meshes and materials (don’t know about scenes) to USDz, with a rather inactive GitHub repository.

      • And no Android support thus making it useless on most VR/AR devices

  • kontis

    The HTML of Metaverse is the easy part. Practically the most common 3D format is FBX, not glTF or USD, so we will see how the adoption will work. But this is more of a concern of devs and content creators, not users.

    The real nightmare will be the HTTP of the Metaverse – the protocols.
    And I’m not talking about the technical challenge. We already did it in web 1.0, and then it was ruined with web 2.0.
    I’m talking about “politics”.

    If a web browser was invented in 2010, we… wouldn’t have web browsers or websites, because the idea to freely browse unmoderated, uncurated web is completely illegal according to ToS of iOS, Google Play or basically any modern walled garden ecosystem controlled by big corporation, because 1. it’s bad for monetization (“tax” avoidance), 2. it’s bad for content control (and they are all control freaks) + the online lynch mobs literally demand censorship.

    So why would they ever think about allowing such thing? There will be some basic multiplatform support and content migrations, but they will never allow the actual idea of metaverse to come to fruition. They will only use it as a buzzword.
    And real users also don’t want decentralized/federated open platforms. Centralized ones are superior for exposure, because you target the whole thing with low effort, without fragmentation problems, which means more likes/follows/subs -> satisfaction -> adoption rate.

    • Christian Schildwaechter

      FBX is an undocumented, proprietary format owned by Autodesk that can be used via the FBX Extension SDK, another proprietary blob owned by Autodesk that other programs can use to import or export it. It has become a sort of standard format for mesh objects through historic support by commercial apps, but both its closed management in contrast to e.g. the open glTF from Khronos Group (OpenGL, WebGL, Vulkan, OpenXR and many more) and its limited scope make it unsuitable as the “HTML for the metaverse”.

      USD was designed to enable not only the exchange but also non-destructive collaboration on the same data with mechanisms to extend it, and e.g. allows loading and rendering only parts of the scene, which makes it very useful for complex virtual worlds. All that complexity and extendability makes it a beast which will need to be optimized to be usable in lightweight clients like web browsers, but the fact that Apple and Adobe picked a compressed version called USDz as the base for their AR development should help with adoption.

      As for the “nightmare”: this is a wild mashup of rather unrelated issues of technology, competing standards and politics. Moderation is usually related to country policies, for example you will find no country that allows freely posting child pornography, planning acts of terrorism or conspiring to commit murder online, making the idea of an “unmoderated, uncurated web” legally impossible, no matter the TOS, and mostly a straw man argument. Companies tend to overshoot and set much stricter limits than legally required, often to cover their backs from becoming liable for their users’ activities.

      This has nothing to do with the technology, this is a side effect of humans living in societies with parts of them consistently behaving like a*holes. There is no wild conspiracy completely limiting your net access to ensure monetization, you can take any browser on iOS, all of which are forced to use Apple’s WebKit/Safari engine on the very locked down iOS platform, and still use the very secure, built-in HTTPS implementation to go on the dark web and order all of the assault weapons you want. Apple will not stop you, your real danger comes from FBI agents posing as drug/arms sellers.

      The technical mishaps of web “protocols” (markup actually, the protocols weren’t really the problem) came first from companies adding proprietary, incompatible extensions to the still primitive HTML 1/2 and later when the W3C tried to force HTML 3 into a more formal and strict description with HTML 3.2, compared to the flexible mess that everybody was actually using, causing XHTML to be mostly scrapped and HTML5 being driven by the WHATWG industry group, which has solved most of the compatibility issues we had in the past.

      So companies can successfully come together and create interoperability standards in an iterative process, enabling the WWW or a metaverse, but there will always be a human factor that will set limits to what you are allowed to do with the results, at least in public. There will never be an “anything goes”, because there is way too much “anything” that we want to avoid.

      • The Werewolf

        Totally agree about FBX not being a workable standard. As you note, it’s undocumented, proprietary and you have to go through Autodesk’s libraries to use it.

    • Keopsys

      You obviously talk about stuff you have very low knowledge about.

      FBX is only an interchange file format primarily dedicated to animation workflows and used in game industry only, it belongs to Autodesk and will never be a standard.
      USD is a full data management + agnostic + extremely efficient cache system for game and animation/vfx industries, and it’s open source.

      If one CG format will rule them all, it’s very likely USD.

  • Till Eulenspiegel

    USD won’t be of much use in the metaverse due to inflation.

    • mcnbns

      That’s right. We need to be moving toward the BTC standard.

  • Ben Outram

    USD is a 3D file format for the new internet – it’s super flexible and allows components to be kept in folders outside of the file itself, so they can be reused. It will become the standard.

  • The Werewolf

    Uhm… VRML? It’s been around since forever.

    Also, why is a protocol bound to a language? The implementation might be, but if we’re talking implementation, then it should be language independent. Remember, a LOT of Unity programming is done in C# and MacOS devs will need Objective-C or Swift support.

    • Andrew Jakobs

      Why does MacOS need Objective-C or Swift if Unity has C#? I think C# is an excellent choice for multiplatform development, just like Razor is getting more popular for web development.

      • Christian Schildwaechter

        Many of the major programming languages over time have adapted features from each other, leading to basically feature parity. So you could replace Swift with C# on MacOS or C# with Swift on .NET. It just wouldn’t make a lot of sense to rip out the foundation just to replace it with something very similar.

        For a long time Unity supported not only C#, but also UnityScript and Boo. UnityScript was a variant of ECMAScript/JavaScript, Boo a variant of Python, both adding strong type support to work with the .NET foundation Unity is built on. AFAIR those were kicked out with Unity 5, because they were too different for JavaScript or Python programmers to be directly usable, meaning only a few developers used them, while causing lots of trouble because features available in C# remained unusable due to having to keep everything compatible.

        So the choice of language is often mostly due to historic reasons. In 1984 Objective-C introduced the elegant object messaging from Smalltalk to C with a very thin wrapper, allowing existing C code bases to transition easily. Nothing even close to that existed back then, which is why Steve Jobs’ NeXT Computers picked it for their Mach/Unix-based NeXTSTEP OS in the late 80s. NeXTSTEP became the design template for the Java standard libraries in the mid 90s, with the Java platform itself becoming the template for the C# and .NET framework Microsoft introduced in 2000.

        Meanwhile Apple had acquired NeXT in a sort of reverse takeover, with Apple effectively paying NeXT USD 400mn for taking over Apple and turning NeXTSTEP into today’s MacOS (X), iOS and what now seems to be called realityOS. Which is why iPhones today and Apple XR/AR HMDs next year will still run on a (revolutionary) platform from the mid 80s, with Apple modernizing Objective-C with the mostly compatible Swift that removes some crud, similar to Kotlin updating Java for Android or Google’s recently announced Carbon that is compatible with the C++ dungheap it is supposed to replace.

        Swift, Kotlin and Carbon are similar enough that it is a valid question to ask why the hell we get three new language successors instead of finally one for all platforms. But they are all slightly different to work seamlessly with the existing codebase of their predecessors, and adapting all the existing code is simply unrealistic (expensive). If all we needed was a better language for OS development, Rust would have worked fine and nobody would need Objective-C, Swift, Java, Kotlin, C++, Carbon or C# anymore, with C# only escaping a revamp so far due to being the youngest at only 22 years.

        With our economies still running on lots of code written in the 1960s, long term maintainability and compatibility is a much more important factor for language choice, with radical changes only being made every few decades. And with declarative languages like HTML, VRML and USD (actually a framework) that isn’t really a problem, because you can still create bindings for whatever language you prefer, be it Swift or C#. This usually only becomes a problem when you try this with huge frameworks like .NET or Qt, often resulting in a very crude mix of syntax and procedures.

  • Did anyone watch “Ready Player One” and actually think that mismatch slamfest of random game elements was a good idea? Games are more than just shoot this and that. There are narratives that wouldn’t work well with some idiot showing up dressed as Mario or Master Chief. You don’t double jump and butt slam your way through Call of Duty.

    Are people just looking for a way to move between game worlds without leaving VR? Because if that’s their issue, I can’t help but notice I can operate all of my Quest titles in their VR interface just fine.

    Nothing about this “Metaverse” crap makes ANY sense! It’s just a buzzword, with as much relevance and usefulness as the word “Synergy”. If people are just looking for easier to use 3D formats, there’s plenty of those. Are they trying to load up game code over the internet? I can download full games using Steam, right now! We’ve had web-based games using Java and what-not for over 20 years now.

    What is really trying to be accomplished here? Player match making? Game downloads? More player character design choices?

    A lot of this sounds a lot like VRML, which has existed since the early 2000s, but never really found any great use. 3D worlds aren’t like 2D webpages. They’re harder to make and don’t translate well between devices. The number of 3D editors is much smaller than the number of Photoshop users, and rarer still are coders.

    This all smacks of nebulous fantasies by people who know nothing of games, 3D design, VR, or even computers in general.

  • Thanks for the explanation. It was unclear to me why they called USD the HTML of the metaverse.

  • david vincent

    An open-source and versatile protocol, that’s exactly what we need for the future metaverse. Zuck can go to hell.