September 2020 – Arram Sabeti

AI can already create photorealistic faces, objects, and landscapes. Video isn’t far behind. We can already recreate any voice. GPT-3 can already write dialogue and movie plots almost indistinguishable from ones written by humans. Even generated music is making fast progress.

It’s only a matter of time until we’re generating entire movies and shows. It’s startling to realize that Hollywood movies that cost $300M to produce today might be generated for a few cents within our lifetimes.

When the cost of something drops by a factor of a billion, we should expect to see qualitatively different uses. It’d be a different medium. Not only would there be endless content; there’d be endless content for any specific movie or TV show you could want. Generally speaking, AI will do for production what the internet did for distribution.

One of the amazing things about AI is how you can work at a high level of abstraction. Imagine adjusting sliders on your favorite movie to make it grittier or funnier. Or asking the AI to show you what Star Wars would look like as directed by Christopher Nolan.¹ The same way you can use GPT-3 to write a Taylor Swift song about Harry Potter, you could make a new Harry Potter movie starring Taylor Swift.

Language models already create new kinds of interactivity. One of the surprisingly fun things I’ve done with GPT-3 is to insert out-of-character behavior in the middle of a scene and watch the characters try to make sense of what happened.

“When you insert out of character behavior and GPT just has to go with it. 😆” pic.twitter.com/OrYpympLXX (Arram Sabeti, @arram, August 4, 2020)

Imagine a video form of this with your favorite characters.

Once we can generate high-definition video and audio in real time, these generative movies will become a different medium again – one you can inhabit.

There are already companies working on replacing 3D game engines like Unity with neural nets. In the limit, movies and games could merge into a single medium we might call Generative Reality.²

I don’t mean to say people will stop consuming passive entertainment. Just that the same software will create both, and that at any time you’ll not only be able to change anything about what you’re watching but also insert yourself into the story.

What we’ll probably see first are virtual people that are indistinguishable from real people – at least for short conversations. The technology is almost there, and for some people, it is there. People are getting real emotional support from chatbots, and they’ll be a lot more appealing when talking to them is indistinguishable from doing a video call with a friend (or with Abraham Lincoln or Professor Dumbledore). GPT-3 is already good at impersonating people. I expect licensing fictional and celebrity chatbots like Tony Stark or Lady Gaga will be a big business.³

Before we get AI that can autonomously generate a full movie, we’ll have increasingly powerful tools where humans guide the AI by selecting and combining the best outputs. This is the stage we’re at now with GPT-3. It’s generally coherent for only a paragraph or two, but if you have a human generating multiple paragraphs and picking the best one, you can end up with surprisingly good results.

Ira Glass talks about how when you get started as a maker, your taste exceeds your grasp because you lack technical proficiency. As AI tools become more powerful, the technical skill required will decrease until good taste is all you need to make great art. Eventually, though, AI will create content that won’t be improved by human intervention.⁴

What timeline should we expect? The first tools for making generative movies (as well as real-time virtual people) seem to be only a matter of improving existing technologies and putting them all together in a usable way.

One researcher I talked to told me I was underselling the whole thing and that fully generative movies could be here in under five years. That seems very fast to me, but what’s clear is that this is all closer than most people realize. Progress in AI has been frighteningly rapid in the last five years. When I first started sharing samples of text generated by GPT-3, the main reaction I got was disbelief. Most people thought it was a hoax and that there was no way an AI could have written that.

Combine photorealism with convincing virtual people and endless personalized stories and worlds to explore, and it’s hard to imagine a more compelling form of entertainment. We’d have the kind of simulated reality that philosophers have long built thought experiments around.⁵

This will be addictive. Thousands of people were depressed after watching Avatar because their real lives weren’t as appealing. It’s simultaneously exciting and worrying to imagine what would happen if people could live inside the movie.

It’ll be especially addictive if we use AI to discover what we want. By watching us interact with content, it could learn what we want better than we understand it ourselves.

I hesitate to predict this is the end of civilization, since that sort of prediction famously tends to be wrong, but it’s worrying to imagine alternate realities so compelling that people no longer engage with the real one.⁶ Most animals, including humans, are susceptible to superstimuli. When Google tried using AI to develop the best chocolate chip cookie recipe, the solution it converged on was to make the cookie out of solid chocolate.

What I’m describing isn’t just a single superstimulus, but an entire reality made of superstimuli. What will reality look like when it’s made out of solid chocolate? We might find out.

Thanks to Ben Mann, Amanda Askell, Nick Cammarata, Emmett Shear, and Ian Thompson for reading drafts of this essay.

Notes

1. I expect one of the most popular uses will be creating more content for fandoms people already love. There are 800,000 Harry Potter stories on fanfiction.net.

2. Game engines today have a lot of practical uses beyond games which would also be impacted by Generative Reality.

3. You could argue that the need for canned NPC dialogue is one of the biggest things holding games back as an art form compared to books or movies. GPT-3 obviates it.

4. Generative Reality is trivially possible if we create superintelligence, but the thing I want to point at here is that even without superintelligence it appears to be not only possible but not very far away. (For what it’s worth, I also suspect that Artificial General Intelligence isn’t much further off than Generative Reality.)

5. Like all guesses at the future, this essay looks at a handful of trends and ignores all the others. While Generative Reality seems possible, a few of the basic assumptions it holds are that humans will continue to exist and that they’ll remain relatively unchanged.

6. We’d still be missing important technologies for fully immersive virtual reality, such as haptics, but once VR becomes very popular there will be a tremendous economic incentive to improve it.

The Generative Age