Chinese developer and publisher Nuverse is using AI for all NPC speech and animation in its games. For its upcoming science fiction survival title Earth: Revival, the development team used a system that generates speech and computes full-body animations for NPCs without any manual intervention. The company claims that the majority of players could not tell that the voices were synthesized.
Dao Si, AI Team Leader at Nuverse, is enthusiastic about the technology. He says it frees up artists' time and helps them produce content more efficiently. “Currently, our system is designed to address highly repetitive and mechanical content production problems, such as matching lip movements. That can be very tedious and repetitive work.”
Players will also benefit from this system, he thinks. “Because our game has a large number of NPCs, we believe that if these ordinary NPCs can express emotions through better audio capabilities, it gives players a more immersive and interactive experience. Of course, we will still choose to use voice actors for our protagonists.”
Impressive numbers
To give an indication of how much time a dev team saves on the creation of NPCs, Dao Si shows some impressive numbers. “The system being used by Nuverse takes less than 800ms to generate speech and compute full-body animation trajectories for an utterance which lasts about 10 to 20 seconds. Therefore, for a dialogue plot with 20 hours of content, we only need about 48 minutes to complete the generation when it comes to consumption.”
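The 48-minute figure can be checked with a quick back-of-the-envelope calculation. The sketch below is only an illustration, not Nuverse's tooling: it assumes every utterance costs the full 800 ms quoted above, that utterances are generated one after another, and that the 20 hours of dialogue consist of utterances of uniform length. The function name and its parameters are hypothetical.

```python
def estimate_generation_minutes(content_hours, utterance_seconds, latency_seconds=0.8):
    """Estimate sequential generation time, in minutes, for a block of dialogue.

    Assumptions (not confirmed by Nuverse): fixed per-utterance latency,
    uniform utterance length, no parallel generation.
    """
    content_seconds = content_hours * 3600
    # Number of utterances needed to fill the requested amount of dialogue.
    utterance_count = content_seconds / utterance_seconds
    # Each utterance costs one generation pass at the quoted latency.
    total_seconds = utterance_count * latency_seconds
    return total_seconds / 60


# Compare both ends of the quoted 10 to 20 second utterance range.
for length in (10, 20):
    minutes = estimate_generation_minutes(20, length)
    print(f"{length}-second utterances: about {minutes:.0f} minutes")
```

Under these assumptions, the quoted 48 minutes corresponds to utterances of about 20 seconds each; with 10-second utterances the same 20 hours would take roughly 96 minutes.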
Nuverse is building on the GPT-3 network to create natural-sounding dialogue. The model even helps write the lines. “With the emergence of ChatGPT, we are trying to introduce this capability”, says Dao Si. “The advanced natural language processing capabilities of GPT-3 enable it to produce dialogue that is virtually indistinguishable from human-generated sentences. At the same time, we are making certain adaptations so that our NPCs can better conform to our game’s worldview and provide players with a better virtual world experience.”
Synthesis system
“Building upon our speech synthesis technology from AI-lab, we have incorporated fine-grained analysis of text based on GPT-3, thereby adding emotion to our synthesis system, resulting in incredibly realistic emotional effects in the generated audio”, he adds.
Nuverse is constantly fine-tuning the system so that it generates better, more natural performances over time. Earth: Revival is the first real test run of the AI-driven speech and animation system, and Dao Si hopes that unaware players will not notice any difference from an entirely human-made effort.
Beyond our imagination
“The intelligence level of language-based models is beyond what we can imagine”, he says. “By utilizing these systems effectively, we can definitely present high-quality game content to players. Throughout the exploration of these projects, collecting key data and efficient iteration based on artist feedback are crucial.”
Earth: Revival will be released in 2023 on PC and mobile.