Intel researchers explain how they used AI to give GTA V a photorealistic revamp

Nearly eight years after Grand Theft Auto V was released, modders are still enriching, and ruining, the game’s celebrated visuals.

Last week, researchers from Intel Labs unveiled an AI-powered revamp of the Rockstar classic that brings the graphics close to photorealism.

TNW spoke to Intel research scientist Stephan Richter to find out more about the team’s technique and the potential to productionize the method.

The cornerstone of the Intel Labs method is a convolutional network, a deep learning architecture that’s commonly used for image processing.

The team trained their convolutional networks on real-life images to translate GTA V’s graphics to a model of reality.
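For readers unfamiliar with the term, the core operation inside a convolutional network is a small filter sliding across an image. The sketch below is only an illustration of that single operation, not the Intel Labs code: the function name `conv2d`, the toy 4x4 patch, and the hand-picked edge filter are all assumptions made for this example. In a real network, the filter values are learned from training images rather than written by hand.

```python
def conv2d(image, kernel):
    """Slide `kernel` over `image` (both lists of lists of floats).

    No padding, stride 1. Like most deep learning libraries, this does not
    flip the kernel, so strictly speaking it is a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            total = 0.0
            # Weighted sum of the pixels currently under the filter.
            for i in range(kh):
                for j in range(kw):
                    total += image[y + i][x + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Toy grayscale patch with a vertical edge: dark on the left, bright on the right.
patch = [
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]

# Hand-picked filter (an assumption for illustration) that responds when
# brightness increases from left to right.
edge_kernel = [
    [-1.0, 0.0, 1.0],
    [-1.0, 0.0, 1.0],
    [-1.0, 0.0, 1.0],
]

response = conv2d(patch, edge_kernel)
print(response)
```

The filter produces a strong positive response wherever its window straddles the dark-to-bright edge. A convolutional network stacks many such filters in layers and adjusts their weights during training.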

Richter said convolutional networks are well-suited to learning this type of task:

“For games, simulations, and films, there is a tremendous amount of work that needs to go into modeling objects, materials, etc. to make them look realistic. Set up the right way, convolutional networks can just learn these things directly from real-world photo collections automatically.”

The resulting output is strikingly realistic: reflections are added to the windows of cars, roads are paved with smoother asphalt, and the surrounding vegetation gains a lusher texture.

Perhaps the most interesting aspect of the revamped graphics is the influence different training datasets have on the output.

In one application of their method, the researchers trained the convolutional network on the Cityscapes dataset, a collection of images recorded primarily in Germany. As a result, GTA V’s parched hills were reforested to mimic the German climate, while San Andreas acquired a grey hue that’s more reminiscent of Bavaria than of Southern California.

When the network was trained on the more diverse Mapillary Vistas dataset, however, the visual style of the output was brighter and more vibrant.

Some of these changes are a reflection of the location where the training images were recorded. But other differences are due to the cameras that captured the pictures.

The changes to the vegetation, for instance, arose because Cityscapes mostly depicts German cities. But Richter said the revamped color palette was due to the recording equipment:

“Cityscapes was recorded with an automotive-grade camera, which has this characteristic green tint. Consequently, images enhanced to look like Cityscapes also get this green tint. Vistas was recorded with a diverse set of cameras, including, for example, smartphone cameras. Images from Vistas are much more vibrant and you can see this in the results by our method.”

Credit: Richter, Abu AlHaija, and Koltun
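The camera effect Richter describes can be pictured with a much cruder stand-in than a neural network: measure the average color of a set of training images, then nudge a game frame toward it. The sketch below does only that. It is not the researchers' method, which learns such characteristics implicitly, and the helper names (`channel_means`, `shift_toward`) and all pixel values are made up for illustration.

```python
def channel_means(images):
    """Average red, green, and blue over every pixel of every image.

    Each image is a flat list of (r, g, b) tuples with values in 0..1.
    """
    totals = [0.0, 0.0, 0.0]
    count = 0
    for image in images:
        for pixel in image:
            for c in range(3):
                totals[c] += pixel[c]
            count += 1
    return [t / count for t in totals]


def shift_toward(image, target_means):
    """Offset each channel so the image's average color matches `target_means`."""
    source_means = channel_means([image])
    shifted = []
    for pixel in image:
        shifted.append(tuple(
            # Clamp to the valid 0..1 range after applying the offset.
            min(1.0, max(0.0, pixel[c] + target_means[c] - source_means[c]))
            for c in range(3)
        ))
    return shifted


# Hypothetical "dataset" of one tiny image whose camera skews green.
green_tinted_dataset = [[(0.4, 0.6, 0.4), (0.2, 0.4, 0.2)]]

# Hypothetical neutral-gray game frame.
game_frame = [(0.5, 0.5, 0.5), (0.7, 0.7, 0.7)]

tinted_frame = shift_toward(game_frame, channel_means(green_tinted_dataset))
print(tinted_frame)
```

After the shift, the frame's green channel averages higher than its red and blue channels, which mirrors how output trained on Cityscapes inherits that camera's tint.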