Nvidia explores the what-if of training a model to draw new worlds
Here is a phrase you can ponder: AI interactive graphics. Nvidia is the main act. The GPU brainiacs are fully behind “AI” as the next chapter for the graphics industry. Awesome things happen when you bring a neural network to the table and let it build virtual worlds from real video footage.
Creating a lifelike digital scene normally requires lots (and lots) of patience. “Now we can just offload the work to an AI algorithm,” said Will Knight in his MIT Technology Review look at what Nvidia is working on.
Nvidia announced what it called the first interactive AI-rendered virtual world, billing it as an “AI breakthrough.” The team of researchers was led by Bryan Catanzaro, vice president of applied deep learning. “We’re actually teaching the model how to draw based on real video,” he told MIT Technology Review.
The frames are rendered by AI technology, said research scientist Ting-Chun Wang. In other words, the team trained a neural network to render 3D environments after learning from existing videos.
Breakthroughs start with somebody’s question. At Nvidia, the what-if question was: what if we could train an AI model to draw new worlds based only on video from the real world? That what-if is now a working demonstration of such image generation.
Targeted users are developers and artists. The idea is for them to create interactive 3D virtual worlds for automotive, gaming or virtual reality.
To render urban environments, the team selected videos of cities on which to train the neural network.
“We were given driving sequences of different cities,” said Ting-Chun Wang. Then they used a separate segmentation network to extract the high-level semantics from those sequences, he said.
The UE4 game engine helped generate the colorized layouts, in which different objects are given different colors. The network in turn converts that representation into images.
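The colorized-layout step can be pictured as a simple mapping from object classes to colors. Here is a minimal toy sketch of that idea; the class names, IDs, and palette below are invented for illustration and are not Nvidia's actual labels:

```python
import numpy as np

# Hypothetical palette: class ID -> RGB color. The real classes and
# colors used in Nvidia's pipeline are not specified in the article.
PALETTE = {
    0: (128, 64, 128),  # road
    1: (70, 70, 70),    # building
    2: (0, 0, 142),     # car
}

def colorize_layout(label_map: np.ndarray) -> np.ndarray:
    """Turn an HxW map of class IDs into an HxWx3 color image,
    giving each object class its own color."""
    h, w = label_map.shape
    out = np.zeros((h, w, 3), dtype=np.uint8)
    for class_id, color in PALETTE.items():
        out[label_map == class_id] = color
    return out

# A tiny 2x2 "scene": road on top, building and car below.
labels = np.array([[0, 0],
                   [1, 2]])
image = colorize_layout(labels)
print(image[0, 0])  # road pixel -> [128  64 128]
```

In the actual system, a layout like this is the input to a neural network that renders it as a photorealistic image; here it only shows what “different objects were given different colors” means in practice.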
Nvidia showed its progress at the NeurIPS conference in Montreal, Canada, a gathering focused on AI research.
Yes, but, other than the gee-whiz factor, what’s the point? Content creation is expensive, and this AI route saves both money and time. New Atlas made the observation that “To make virtual worlds feel more immersive, artists need to fill them with buildings, rocks, trees, and other objects. Creating and placing all those virtual objects quickly adds up to quite a high development time and cost.”
For game development and other applications, Nvidia explored a method that would allow developers to create content at a lower cost, using AI that learns from the real world. The news release pointed out that since the output is synthetically generated, “a scene can be easily edited to remove, modify, or add objects.”
James Vincent in The Verge walked readers through Nvidia’s research effort:
“The problem is, if the deep learning algorithms are generating the graphics for the world at a rate of 25 frames per second, how do they keep objects looking the same? Catanzaro says this problem meant the initial results of the system were ‘painful to look at’ as colors and textures ‘changed every frame.’ The team gave the system short-term memory, and so it could compare a new frame with what’s gone before. As such, it creates new frames consistent with what is on screen.”
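One simple way to picture that short-term memory is to blend each newly generated frame with the previous output, so colors and textures cannot jump arbitrarily from one frame to the next. The exponential-moving-average sketch below is only a toy stand-in; Nvidia's actual system feeds past frames back into the neural network rather than averaging pixels:

```python
import numpy as np

def stabilize(frames, memory_weight=0.6):
    """Blend each raw generated frame with the previous stabilized
    frame, a toy illustration of short-term memory reducing flicker."""
    stabilized = []
    prev = None
    for frame in frames:
        if prev is None:
            out = frame.astype(np.float64)
        else:
            # Keep most of what is already on screen, mix in the new frame.
            out = memory_weight * prev + (1 - memory_weight) * frame
        stabilized.append(out)
        prev = out
    return stabilized

# Two 1-pixel "frames" whose brightness flips wildly without memory.
raw = [np.array([[0.0]]), np.array([[1.0]])]
smooth = stabilize(raw)
print(smooth[1])  # [[0.4]] -- a far smaller jump than the raw 0.0 -> 1.0
```

The `memory_weight` parameter is an invented knob for this sketch: higher values mean stronger consistency with what is already on screen, at the cost of slower response to genuinely new content.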
The Nvidia press release said “this research is early-stage.”