As I’ve watched a friend play Skyrim over the last few days, I’ve been blown away by the immense world that Bethesda has created. Not only is it huge, but it’s rather beautiful as well. “Slap on an HMD,” I thought, “and this would be a wonderfully immersive virtual reality world.” But as I continued to watch my friend play, I noticed how the interactions between the player and the non-player characters seemed to lag years behind the graphics. They were stale and scripted, unlike the sandbox world that contained them.
Much of the dialogue in the game has a non-player character (NPC) talking to the player about the world. The player gets to interact by selecting from a list of canned responses. Sometimes the questions you want to ask or the things you want to say just aren’t in that list, and this truly detracts from the immersion. You don’t feel like you are in control because you’re limited to just a few choices. Your character doesn’t even speak the lines aloud; you just pick an option and the NPC starts responding. It also makes the NPCs feel less real because they seem like robots reading a script (probably because that’s what they are).
A world like Skyrim would be far closer to immersive virtual reality if the NPCs were able to not only hear your voice (through a microphone) but understand it and respond appropriately. Not only would you be able to ask the questions you really wanted to, but you’d have to be more immersed in the lore of the game to even know what to ask.
Voice recognition difficulty aside, having an NPC respond naturally to nearly unpredictable input is definitely a huge challenge, but it’s certainly possible. Games will become far more real when we achieve a level of AI where this is possible, and we aren’t as far away as you might think.
If I say “virtual reality,” people often picture someone hooked up to a head-mounted display (HMD). Head tracking, lifelike 3D graphics, surround sound: all of this comes to mind when thinking about that image. Artificial intelligence is one piece of the puzzle that doesn’t immediately come to mind, but it is just as important to creating an immersive experience as all of those other components.
The use of the term “AI” in today’s world is quite misleading. I’d argue that there is still a lot of work to be done in achieving artificial intelligence, but we’re on our way.
Alan Turing, in 1950, proposed what is now referred to as the ‘Turing Test’, which seeks to determine whether or not a computer’s intelligence can be considered true artificial intelligence. The basis of the test is to have one human act as a judge, then communicate with another human and a computer through typed communication. The judge is separated from the computer and the other human, and doesn’t know which is which when communication is received. According to the Turing Test, the computer system can be considered artificial intelligence if the judge is unable to reliably determine which of the responses are from the computer and which are from the other human.
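To make the setup concrete, here is a rough Python sketch of the test as described above. Everything in it is my own illustration, not part of Turing’s proposal: the function names (`run_session`, `identification_rate`), the toy human, machine, and judge, and the idea of measuring “reliably” as how far the judge’s hit rate sits above a 50% coin flip.

```python
import random

def run_session(judge, human, machine, questions, rng):
    """Run one session; return True if the judge picks out the machine."""
    # Hide both respondents behind anonymous labels in a random order,
    # so the judge cannot know in advance which one is the computer.
    pair = [human, machine]
    rng.shuffle(pair)
    respondents = {"A": pair[0], "B": pair[1]}

    # All communication is typed: the judge only ever sees text answers.
    transcript = []
    for question in questions:
        answers = {label: reply(question) for label, reply in respondents.items()}
        transcript.append((question, answers))

    guess = judge(transcript)  # the judge names the label it thinks is the machine
    return respondents[guess] is machine

def identification_rate(judge, human, machine, questions, trials=1000, seed=0):
    """Fraction of sessions in which the judge correctly spots the machine."""
    rng = random.Random(seed)
    hits = sum(run_session(judge, human, machine, questions, rng) for _ in range(trials))
    return hits / trials

# Hypothetical participants, for illustration only.
def human(question):
    return "Good question. Let me think about: " + question

def scripted_machine(question):
    return "I do not understand."

def convincing_machine(question):
    # Stand-in for a machine whose answers look just like the human's.
    return "Good question. Let me think about: " + question

def keyword_judge(transcript):
    # Flags whichever respondent gives the tell-tale canned reply.
    for label in ("A", "B"):
        if any("I do not understand" in answers[label] for _, answers in transcript):
            return label
    return "A"  # no tell found, so the judge is reduced to guessing

questions = ["Where are you from?", "What did you have for breakfast?"]
print(identification_rate(keyword_judge, human, scripted_machine, questions))
print(identification_rate(keyword_judge, human, convincing_machine, questions))
```

With the scripted machine the judge is right every time, so it clearly fails the test. With the convincing one the judge has nothing to go on, and its hit rate should hover around one half, which is the situation the test counts as a pass.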
We simply aren’t there yet. A lot of time and research is going into creating a true AI system.
Funded by the Defense Advanced Research Projects Agency (DARPA), the CALO project ran from 2003 to 2008. CALO (Cognitive Assistant that Learns and Organizes) sought to create a machine entity that would be useful to human users. A spin-off from the CALO project is the now-famous Siri.
Undoubtedly, Siri is the closest system to true AI to be made available to the mainstream. Siri is a sort of digital assistant that debuted as a built-in feature of the iPhone 4S and can do simple tasks like sending text messages and emails, checking your calendar, and telling you the weather. While other devices have had voice-command features for years, Siri makes strides in understanding natural human speech in context.
While no one could mistake Siri’s output for a human’s because of the computerized nature of the voice, with a proper voice some might be fooled. Siri is a big stride toward a Turing-capable system thanks to its ability to accept a broad range of contextual and natural human input. As similar systems are developed and deployed, they will undoubtedly find their way into our games and fill that vital missing link of immersive virtual reality.
As of March 2011, Microsoft’s Kinect has sold 10 million units. Every Kinect has a microphone inside, and the Xbox 360 easily has the processing power to handle a system like Siri. Although less widespread, the PlayStation Eye, a camera which connects to the PS3, also has a microphone. Microsoft is bringing basic voice control to the Xbox 360 through the Kinect this holiday season.
If Microsoft and Sony know what they’re doing, the next generation of consoles will feature some sort of pseudo-AI, and this will open up the door for integration of that AI into games.