“Seeking to conquer a larger liberty, man but expands the empire of necessity.” — Herman Melville, “The Bell-Tower”
This essay was published on a now-defunct blog in 2019, but remains relevant as it predicted many of the now-urgent issues we see in AI today.
A funny thing about science fiction is that it never really predicts the future. The details are always off, and it is the details that make all the difference. Now that real AI is actually here driving cars, writing poetry, making music, and filtering your spam, the detail that has made all the difference is that these machines don’t think like us. In fact, these “intelligences” probably don’t think at all. What they definitely do, however, is copy.
All serious AI now takes the form of artificial neural networks, which act like individual brain modules lacking self and motivation. They are pattern completers, not pattern inventors. Their cognitive mode is mimetic. Trained on large data sets, often by unknowing humans as they fill out CAPTCHAs or query Google, each neural network lives in a universe bounded by us. None show any awakening to consciousness, not even when performing tasks we once deemed consciousness necessary for. Their mimetic nature is the missing detail, the reason our future isn’t already on the shelves of bookstores. Instead the future is trending toward a state that can only be called uniquely weird.
There has already been significant outcry over how the abilities of these isolated brainlets can be used for disinformation. A recent paper by researchers at Durham University showed that it only took 13 hours to train an AI to produce realistic fake UN speeches in the style of a chosen world leader. Such “deep fakes” might put us in a situation where convincing forgeries of articles, videos, and audio are more common than the real thing. We may even have to go back to giving prominent social figures the benefit of the doubt.
While the concern over deep fakes is a legitimate one, it misses the true long-term consequence of this new technology. For there is a more insidious effect that bears directly on our very notion of meaning, creation, and intent. Artistically, I mean.
In April of 2019 the OpenAI project released a preview of MuseNet, a deep neural network that generates musical compositions. Its mimetic abilities mean the project can start with just a few given notes and extrapolate out an entire song, and do so in styles ranging from Chopin to Lady Gaga. The results are staggeringly realistic. A neural network can now, or within the next few years, be trained in the style of Philip Larkin, or The Beatles, and produce new works in the style of those artists, works that are nigh indistinguishable from the originals. It’s now just a matter of time until our world becomes a Jurassic Park filled with newly issued work by long-dead creators. Perhaps this means future creators will become judicious about how much work they publish, and therefore how much data they provide, to prevent such “style cloning.” Or perhaps creators will embrace this strange sort of immortality. Maybe they will eagerly train neural networks on their own work; every artist becoming the master of an atelier formed from themselves. Perhaps the future of the creative arts is an assembly line of style clones.
The limits of these new technologies are unknown. In November of 2019 another OpenAI project called GPT-2 was released, doing for writing what MuseNet did for music. For a while the OpenAI team even debated not releasing it, out of fear its abilities would be misused. After it was eventually made available, The Economist used the AI to answer an essay question: “What fundamental economic and political change, if any, is needed for an effective response to climate change?” The AI’s first paragraph in response was this:
“Do we want to go through the same process we have been through for decades with no changes? Is there a way to build a sustainable energy system that is both affordable and environmentally responsible? The basic premise behind this article is that we need to transform the economics of our energy system by investing in the necessary infrastructure so that it will be affordable for everyone.”
For a lark the response is chillingly passing. A blind judge for The Economist reviewed the AI-written article as “strongly worded and backs up claims with evidence, but the idea is not incredibly original.” That is, it could be any high school student’s answer to an SAT question. And it’s not like there’s a known upper bound on how good these things can get. Just as with games like chess, Go, or StarCraft, arts and media differ in terms of how resistant they are to the learning techniques of AIs. Will the art forms that are easily produced by these new technologies become diluted, maybe even abandoned? What if that includes article writing, or at minimum, generic form stuff like press releases? With works available at the click of a button, some creative forms are poised to drown in abundance. And while we’ve accepted that humans will never beat a trained neural network at chess, what happens when neural networks create musical compositions more original, catchy, and beautiful than those by even the best human composers?
There are some areas in which this has already occurred. Indeed, professional chess is a perfect example, for it is still a viable subculture and activity decades after computer domination. Yet it has always struck me that Magnus Carlsen is one of the more tragic living figures. Magnus, now 29, is the best chess player there has ever been. He would thrash Bobby Fischer. He is good at every part of the game, a savant. Ironically, his abilities have been described as inhuman. Yet my iPhone can beat him at his chosen profession. He’s never shown that this bothers him in any way I’m aware of, but it would bother me, that low existential itch, which every zoo animal must feel, no matter how comfortable in their cages: the implicit knowledge of being a novelty, or rather, in a darker light, a joke.
Lee Se-dol, the South Korean Go world champion, must have felt this. He announced this year he would stop playing professional Go because of the AI program AlphaGo, which trounced him in a series of games. “Even if I become number one,” the once-champion said, “there is an entity that cannot be defeated.”
Humans will still play Go. Of course they will. But games occur on a restricted playing field in order to see which person (each abiding by the same arbitrary constraints) wins—it’s what makes them games. In terms of artistic creation, we don’t read poems or watch TV shows to witness the drama of competition behind their creation. We actually consume the product itself. Art is meant to be imbibed. How much will it really matter if there is a “certified human” sticker on a script or a song or a painting?
And so, with little knowledge or fanfare, and all in just the last five years, the creative world has been split in two. On one side are the types of artistic production that can be automated by these neural networks. On the other side are those that cannot be, or will resist it for decades or centuries. Improvisational jazz and classical music are easy to mimic, but the vibrato intonations of the human voice are surprisingly difficult to emulate. Neural networks can write beautiful poetry or short articles, but their novels collapse into nonsensical heaps. So far, AIs cannot write books or longer structured stories because they cannot “fake it” for that long. Unable to understand causation or time, or even object permanence, they produce characters that come and go like ghosts, looping back on themselves. It certainly may be just a matter of time until these remaining issues are solved, or slowly dissolve away, as through sheer statistical enumeration AIs appear to learn these concepts. The technology can already produce convincing works for shorter pieces where long-term coherence isn’t a factor (think poetry, painting, music, or paragraphs and shorter essays).
There is an eerie scene in the director Alex Garland’s Annihilation where an alien being, just a silvery human-like form, begins to mirror Natalie Portman’s movements, and that simple act of copying becomes a dance of absolute terror. I cannot think of a better metaphor for a neural network. Perhaps in the future each of us will have such a replica living on after us, trained on the digital effluvia we’ve left behind. In a perversion of Philip Larkin, “what will survive of us is data.” Maybe we shouldn’t be surprised. Once our phones started autocompleting our sentences it was inevitable that they would continue on to people.
This mimetic ability takes what were previously merely philosophical thought experiments and thrusts them into everyday life. For such creative cloning is all light but no heat. All syntax but no semantics. To choose an example: when I listen to the work of Hildegard von Bingen, the Benedictine abbess who composed liturgies a thousand years ago, her music comes across those centuries with a clarity of meaning so great one aches upon hearing it. The meaning is one of longing. She saw visions all her life, and surely that’s what this longing is for—a world seen in a vision.
Yet to her body of work there is a higher dimension, just like those in the novel Flatland, where the Square can only see a Sphere as a circle that changes its circumference as it passes through the 2D plane. Behind all the actual realized liturgies of Hildegard there is a higher dimension filled with all the liturgies she never wrote, but could have, in some close possible world. A neural network doing its autocomplete magic is exploring this higher dimension. All creators have around them the blur of what they almost created, derivable from what they did create, and the only constraint to pulling all those works that were never written or never painted or never produced is the amount of available data. In a neural network this is called the “latent space” of the task the network has learned. Latent as in things that could have existed but never did.
I recently saw a talk by a woman who had trained a neural network on the Metropolitan Museum of Art’s entire collection so that it could tell piece from piece. That network possessed, after training, an intrinsic latent space. Fueled by the hundreds of thousands of pieces in the collection, just by dragging a cursor the woman could change the pieces of the Met into versions of things that never existed, ancient Egyptian pottery stylistically mixing with Grecian urns in a kind of drunkard’s walk of cultural never-weres.
For an individual artist, the more things you create in the dimension of the real the more the lurking unreal higher dimension, defined entirely by the patterns in your existing work, can be brought into focus on our plane of existence by some neural network. It is the ultimate triumph of statistics over ontology. It makes it literally true that an oeuvre lives on long after the author is dead.
So then what, exactly, is the semantic content of an AI-produced work of Hildegard von Bingen? It is a “deep fake” of meaning. Such a work points to nothing, signifies nothing, embodies no spiritual longing. It is pure syntax. For art this is the semantic apocalypse. It’s when meaning itself is drained away by the mimetic powers we’ve unleashed.
Consider the self-published poetry collection put out just months ago by Kane Hsieh, Transformer Poetry. Except Kane isn’t a poet. Instead, he fed the initial lines of famous poems (in bold) to GPT-2, and the AI autocompleted the poems at its own discretion, even deciding typesetting. Here is an example result, starting with the opening lines from William Shakespeare’s Sonnet 18:
Shall I compare thee to a summer’s day?
Thou art more lovely and more temperate:
Rough winds do shake the darling buds of May,
And summer’s lease hath all too short a date;
And the grass is green and the trees are full of fruit’.
And, when the sky is blue and the heat is blazing,
And the rain is falling upon the fields and the trees,
But the sun is downcast as when it is rising,
And the sky is covered with clouds and the land is shrouded by fog,
And the sky is not bright to behold yet:
Thou hast not a thousand days to tell me thou art beautiful.
This shakes me to my core. Not because it is such a perfect facsimile of Shakespeare. The poem is not even that good, overall. The AI clearly doesn’t understand it needs to mimic iambic pentameter for this to be a convincing sonnet, since this was never part of its training, and there’s at least one grammatical mistake. Still. It is not a bad poem either. A human very much could have written it. There are some good lines. Even some artful lines. Especially that last one.
Thou hast not a thousand days to tell me thou art beautiful.
It’s affecting. It signifies a loss, like the loss of summer itself, or at least a fragility, that fragility that exists between lovers. And yet that line was written by a thing that never loved, saw, nor heard. It was never told, and it never told. It never experienced a day in its life. It has no idea what these words point to, what they refer to. All it knows is that some words are associated with other words, and the whole web of word associations has its internally coherent patterns, and these patterns can be extrapolated in ways to create new sentences. It doesn’t even know the patterns it’s extrapolating are sentences made of words. To call this a magic trick is too denigrating: this thing really can write poetry, and it really can use language, there’s just no intentionality or reference behind its language use.
Does anyone think the above is the apogee of this newfound artificial art? A mere few years after its introduction? Let me make my concern as clear as possible. Imagine a future website where every time you click refresh a new and perfect Shakespeare sonnet is generated on the page in front of you. And you click again and again and again and again. Imagine then, your dread.
Apocalyptic concerns about how technologies impact our sense of meaning are not new, of course. From Marshall McLuhan’s famous “the medium is the message” to contemporary concerns about social media, there have been continuous calls for the examination of technology’s effect on culture; such calls are especially important now that the internet, streaming services, and gaming form a new supersensorium that allows omnipresent entertainment.
Yet much cultural doomsday prophesying occurred before the AI revolution really began, and focused more on habits of consumption. Deep learning only kicked off after researchers began training artificial neural networks on GPUs in the late 2000s. For example, in 2010 David Shields published a book called Reality Hunger about the ennui that can come from consuming too much fiction, whether from novels or TV shows, and the resultant hunger for connections to reality. Yet it seems to me that all fiction now, in this first decade after the advent of what no one can deny is genuinely artificial intelligence, must involve some sort of tie to reality simply by necessity, just to differentiate itself from what these new technologies can produce.
This is because real-world events still have an indexical significance that “deep fakes” of meaning cannot have. I was recently reading Joan Didion’s Slouching Towards Bethlehem, from 1968. The same Joan Didion who, almost forty years later, wrote about her husband’s death from a heart attack in The Year of Magical Thinking, and then about the closely following death of her daughter, from a fall, in Blue Nights. But all the way back in the ’60s, in the essay “Slouching Towards Bethlehem” itself, she investigates the San Francisco hippie movement and at a commune has an interaction that struck me:
“Someone works out the numerology of my name and the name of the photographer I’m with. The photographer’s is all white and the sea (“If I were to make you some beads, see, I’d do it mainly in white,” he is told), but mine has a double death symbol.”
Such a “double death” prophecy, so chilling in our own world, derives its impact via its connection to reality. One can make up a fake history with however much detail one wishes but it will never share this indexical quality.
So I don’t find it a coincidence that literary fiction has moved toward auto-fiction, almost as if to preemptively protect itself. Authors like Karl Ove Knausgaard command the literary scene because speaking from experience not only avoids the social justice criticism of appropriating others as tools to tell stories with, but also is an implicit acknowledgement that, in this day and age, the connection to the real, the referential quality of literature, is the only way to ground it with meaning. This is what the contemporary reader expects and demands of literature. It makes the reading experience a kind of ritual, a sacred rite of passage for the individual, and this process becomes the crux of the relationship between the reader and author.
How small a room our imaginations now have. For any work of art that contains a pattern, there are now statistical tools to complete it, tools that bear no reference to human minds. They can never begin, but they can always finish.
Perhaps the most terrifying thing is that these neural networks aren’t doing anything different than what we’ve been doing all along. Maybe we’ve been fooling ourselves for millennia that our pretty patterns are original rather than purely mimetic. As Mark Twain wrote in his essay “Corn-pone Opinions” from 1901, “We are creatures of outside influences; as a rule we do not think, we only imitate.”
So perhaps there’s not much of a difference, and words never had meaning in the way we thought they did. After all, do you think we poor humans even wrote this essay in its entirety?
We didn’t.
This makes me kind of depressed. What is the future of art going to look like? What will happen to future artists?
There's a new "Nirvana" song on YouTube called "Drowned in the Sun" that was created by A.I. Honestly, it sounds more like Bush than Nirvana, but here's the thing -- it's pretty damn good. It's been playing repeatedly in my head for days.
Honestly, as a guy approaching middle age who lived his first 40 or so years in a world that didn't have any decent A.I.-generated art, the future is looking kind of...cold. And that freaks me out more than a little.
> And yet that line was written by a thing that never loved, saw, nor heard. It was never told, and it never told. It never experienced a day in its life.
In some sense, the human aggregate (call it "god" or "devil" if you are mad) is talking, and it captures the pain and love of all the people that have existed, into a single encapsulated recording. Much like the Bible and other canonical works, this shall become revolutionary, and an epicenter of future meaning. The pure consumer shall end, and the creatives, through assistive creativity, shall prosper and be immortalized.