We need a Butlerian Jihad against AI
A proposal to ban AI research by treating it like human-animal hybrids
“Thou shalt not make a machine in the likeness of a human mind.”
So reads a commandment from the bible of Frank Herbert’s Dune. Notable among science fiction for taking place in a future without AI, Dune has as its lore that humanity was originally enslaved by the machines it had created, until humanity overthrew its rulers in a hundred-year war: the “Butlerian Jihad.” It’s unclear from Dune whether the AIs had enslaved humanity literally, or merely figuratively, in that humans had grown warped and weak in their reliance on AI. The commandment is so embedded in the fabric of Dune society that there are no lone wolves or rogue states pursuing AI. The technology is fully verboten.
The term “Butlerian Jihad” is an allusion to Samuel Butler, whose 1872 novel Erewhon concerned a civilization that had vanquished machines out of preemptive fear:
"... about four hundred years previously, the state of mechanical knowledge was far beyond our own, and was advancing with prodigious rapidity, until one of the most learned professors of hypothetics wrote an extraordinary book (from which I propose to give extracts later on), proving that the machines were ultimately destined to supplant the race of man, and to become instinct with a vitality as different from, and superior to, that of animals, as animal to vegetable life. So convincing was his reasoning, or unreasoning, to this effect, that he carried the country with him and they made a clean sweep of all machinery that had not been in use for more than two hundred and seventy-one years (which period was arrived at after a series of compromises), and strictly forbade all further improvements and inventions"
There are a few contemporary figures who, if one squints, seem to fit this category of “most learned professor” warning of the dangers of AI. Nick Bostrom, Eliezer Yudkowsky, and a whole cohort of scientists, philosophers, and public figures have made the argument that AI is an existential risk, and have lobbied the public, the government, and the private sector to address it. To slim results. Elon Musk explained his new nihilism about the possibility of stopping AI advancement on the Joe Rogan podcast, when he said:
“I tried to convince people to slow down. Slow down AI. To regulate AI. This was futile. I tried for years. Nobody listened. Nobody listened.”
I suspect that “nobody listened” because the warnings about AI are generally made entirely within the “rational mode,” arguing about the expected value of existential risk.
In this way, all those shouting warnings are, quite frankly, far from proposing a Butlerian Jihad. That would make them Luddites! No, pretty much all the people giving warnings are, basically, nerds. Which is to say that they actually like AI itself. They like thinking about it, talking about it, even working on it. They want all the cool futuristic robot stuff that goes along with it; they just don’t want the whole enterprise to go badly and destroy the world. They are mostly concerned with a specific scenario: that strong AI (the kind that is general enough to reason, think, and act like a human agent in the world) might start a runaway process of becoming a “superintelligence” by continuously improving upon itself. Such an entity would be an enormous existential risk to humanity, in the same way a child poses an existential risk to a local ant hill.
It’s because of their nerdy love of AI itself that the community focuses on the existential risk from hypothetical superintelligence. In turn, it’s this focus on existential risk that makes AI regulation fail to gain broader traction in the culture. Because remember: there currently is no regulation. So let me suggest that this highly hypothetical line of argumentation against AI is unconvincing precisely because it uses the language of rationality and utility, not more traditional sorts of moral arguments. For example, Nick Bostrom, director of the Future of Humanity Institute, talks in this language of expectation and utility, saying things like “… the negative utility of these existential risks are so enormous.”
Below, I’m going to walk through the expected value argument behind the existential risk argument, which I think is worth considering, so we can eventually see why it ends up being totally unconvincing. Then we can put forward a new (or perhaps old) argument for regulating AI.
Rational conclusions aren’t rational
In the traditional view put forward by figures like Nick Bostrom and Elon Musk, AI needs to be regulated because of existential risk. Most are open that the risk of sparking a self-improving AI superintelligence is a low probability, but extreme in downside. That is, a hypothetical AI “lab leak” isn’t guaranteed, but would be Very Bad if it happened. This logic is probably laid out most coherently by science-fiction writer and podcaster Rob Reid. He discusses it in a series of essays called “Privatizing the Apocalypse.” Reid reasons about AI by assigning an expected value to certain inventions or research pursuits: specifically in the form of probabilistic estimates of lives lost.
As Rob does, it’s easiest to start with another existential risk: physics experiments. Assume, as has been laid out by some physicists like Martin Rees, that supercollider experiments may pose an existential threat to humanity because they create situations that do not exist in nature and therefore have some, you know, teeny tiny chance of unraveling all space and time, destroying the earth in an instant and the entire universe as well. As Rob writes:
“The experiment created conditions that had no precedent in cosmic history. As for the dangers, Rees characterizes “the best theoretical guesses” as being “reassuring.” However, they hinged on “probabilities, not certainties” and prevailing estimates put the odds of disaster at one in fifty million… In light of this, Rees turns our attention away from the slimness of the odds, to their expected value. Given the global population at the time, that EV was 120 deaths… Imagine how the world would treat the probabilistic equivalent of this. I.e. a purely scientific experiment that’s certain to kill 120 random innocents.”
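Rees’s figure is easy to reproduce. A minimal sketch of the arithmetic, using the one-in-fifty-million odds from the passage above and a rough world population of six billion at the time (the population figure is my approximation, not a number from Reid’s essay):

```python
# Expected deaths = (odds of disaster) x (lives at stake).
global_population = 6_000_000_000  # rough world population circa 2000

# "One in fifty million" odds of disaster, applied to everyone alive.
expected_deaths = global_population / 50_000_000

print(expected_deaths)  # 120.0
```

The whole argument rests on this one multiplication: a vanishingly small probability times a catastrophically large stake yields a very tangible-sounding number.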
Rob and Martin are correct: if a normal scientific experiment cost the lives of 120 innocents, we’d never fund it. The problem is that these deaths aren’t actually real. They exist as something expected in a calculation, which means that they can be easily balanced out by expected lives saved, or lives improved. Such positive outcomes are just as unknown as negative ones. Because of this, proponents of supercollider experiments can play a game of balancing, wherein positive outcomes are imagined, also with very low probabilities, that balance out the negative ones.
Consider that there is some very low chance, say, 1 in a billion, that the physics experiment with the expected value of -120 lives could lead to some discovery so interesting, so revolutionary, that it changes our understanding of physics. Say, it leads to faster-than-light travel. Maybe there’s only an astronomically small chance of that happening. But is the probability really zero, given what we know about scientific revolutions? And imagine the consequences! The universe opens up. Humanity spreads across the stars, gaining trillions and trillions in the expected lives column. Even when this scenario is weighted by the extremely low probability of it actually happening, such imaginary scenarios could easily put the supercollider risk in the black in terms of expected value since the upside (millions more planets) outweighs the downside (destruction of this planet), and both are very low unknown probabilities. Somehow, this is missed in discussions of existential risk, probably because it is massively inconvenient for any doomsayers.
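To see how easily imagined upside swamps the downside in this kind of accounting, here is a toy version of the balancing game, using the one-in-a-billion odds and the “trillions of lives” payoff from the paragraph above (both figures are illustrative, not real estimates):

```python
expected_deaths = 120  # the downside EV from Rees's supercollider estimate

# An equally hypothetical upside: a one-in-a-billion chance of a
# breakthrough (say, faster-than-light travel) worth trillions of lives.
p_breakthrough = 1 / 1_000_000_000
lives_if_breakthrough = 10**13  # "trillions and trillions"

expected_lives_gained = p_breakthrough * lives_if_breakthrough
net_expected_value = expected_lives_gained - expected_deaths

print(round(net_expected_value))  # 9880 -- comfortably "in the black"
```

Note that nothing disciplines the choice of `p_breakthrough` or `lives_if_breakthrough`; whoever imagines the bigger number wins the calculation.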
Going back to AI: Rob Reid calculates the expected value for AI by noting that, even with a 99.999% probability of superintelligence not destroying the world:
When a worst-case scenario could kill us all, five nines of confidence is the probabilistic equivalent of 75,000 certain deaths. It’s impossible to put a price on that. But we can note that this is 25 times the death toll of the 9/11 attacks — and the world’s governments spend billions per week fending off would-be sequels to that. Two nines of confidence, or 99 percent, maps to a certain disaster on the scale World War II. What would we do to avoid that? As for our annual odds of avoiding an obliterating asteroid, they start at eight nines with no one lifting a finger. Yet we’re investing to improve those odds. We’re doing so rationally.
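Reid’s “nines” map onto expected deaths by the same multiplication. A quick sketch, assuming a world population of 7.5 billion (my round figure for the time of his essays):

```python
world_population = 7_500_000_000

# "Five nines" of confidence = 99.999% safe, i.e. a 0.001% chance of extinction.
five_nines_ev = (1 - 0.99999) * world_population  # ~75,000 expected deaths

# "Two nines" = 99% safe, i.e. a 1% chance of extinction.
two_nines_ev = (1 - 0.99) * world_population  # ~75,000,000 -- roughly WWII's toll
```

Again, note that the same arithmetic run on imagined upsides would produce equally large positive numbers.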
Again, the problem with Reid’s analysis is that these terms exist only as negative numbers in a calculation, without the expected positive numbers there to balance them out. The analysis is all downside. Yet there is just as much an argument that AI leads to a utopia of great hospitals, autonomous farming, endless leisure time, and planetary expansion as it does to a dystopia of humans being hunted to death by robots governed by some superintelligence that considers us bugs. What if through pursuing AI research there is never any threatening superintelligence, but instead we get a second industrial revolution that lifts the globe entirely out of poverty, or allows for the easy colonization of other planets? I’m not saying this is going to happen, I’m saying there’s a non-zero probability.
The problem with the argument from existential risk is that most things that involve existential risk, like AI research, are so momentous they also involve something that might be termed “existential upside.” AI might be the thing that gets us off planet (existential upside), improves trillions of future lives (existential upside), makes a utopia on earth (existential upside), and drastically reduces the risk of other existential risks (existential upside). E.g., AI might decrease the scarcity of natural resources and therefore actually reduce the existential risk of a nuclear war. Or solve global warming. I’m not saying it will do any of these things. I’m saying—there’s a chance.
How to tally it all up? And since the people who talk about these dangers so often stake their claim using this sort of language, if the expected positive impact outweighs the negative, what precisely are the grounds for not pursuing the technology? Such is the curse of reasoning via the double-edged sword of utilitarianism and consequentialism, where you are often forced to swallow repugnant conclusions. Imagine a biased coin toss. If you win, you get to cure one sick person. But if you lose, the whole of humanity is wiped out. What if the odds were a couple trillion to 1 that you win the biased coin toss, and so therefore the expected value of the trade came out to be positive? At what bias value would you take the trade? Most people wouldn’t, no matter the odds. It just seems wrong. Wrong as in axiomatically, you’re not allowed to do that, morally wrong.
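The coin-toss example makes the repugnance concrete: at trillion-to-one odds the expected value of the bet is positive, and yet nobody would take it. A sketch of the arithmetic (the population figure is illustrative):

```python
# "A couple trillion to 1" against losing.
p_lose = 1 / 2_000_000_000_000
p_win = 1 - p_lose

lives_saved_if_win = 1            # cure one sick person
lives_lost_if_lose = 7_800_000_000  # all of humanity

ev = p_win * lives_saved_if_win - p_lose * lives_lost_if_lose
print(ev > 0)  # True: the "rational" bet says flip the coin
```

The expected value comes out around +0.996 lives, so pure expected-value reasoning endorses a wager that moral intuition flatly forbids.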
The philosopher John Searle made precisely this argument about the standard conception of rationality. His point was that there are no odds that would rationally allow a parent to bet the life of their child for a quarter. Human nature just doesn’t work that way, and it shouldn’t work that way.
So these sorts of arguments merely lead to endless debates about timing, probabilities, or possibilities. For every fantastical dystopia you introduce, I introduce a fantastical utopia, each with an unknown but non-zero probability. The expected value equation becomes infinite in length. And useless. Totally useless.
A Jihad based on moral principles
If you want a moratorium on strong AI research, then good-old-fashioned moral axioms need to come into play. No debate, no formal calculus needed. How about we don’t ensoul a creature made of bits? Minds evolved over hundreds of millions of years and now we are throwing engineered, fragile, and utterly unnatural minds into steel frames?
There’s a scene in Alex Garland’s Ex Machina that shows the dirty behind-the-scenes of creating a strong AI. The raging, incoherent, primate-like early iterations are terrifying monsters of id, bashing their limbs off trying to get out of the lab. That’s what strong AIs will likely be, at first. Of all the ways to put together a mind, there are a billion more ways to make a schizophrenic creature than a well-balanced, human-like mind. That’s just how the Second Law works.
An actual progressive research program toward strong AI is immoral. You’re basically iteratively creating monsters until you get it right. Whenever you get it wrong, you have to kill one of them, or tweak their brain enough that it’s as if you killed them.
Far more important than the process: strong AI is immoral in and of itself. For example, if you have strong AI, what are you going to do with it besides effectively have robotic slaves? And even if, by some miracle, you create strong AI in a mostly ethical way, and you also deploy it in a mostly ethical way, strong AI is immoral just in its existence. I mean that it is an abomination. It’s not an evolved being. It’s not a mammal. It doesn’t share our neurological structure, our history, it lacks any homology. It will have no parents, it will not be born of love, in the way we are, and carried for months and given mother’s milk and made faces at and swaddled and rocked.
And some things are abominations, by the way. That’s a legitimate and utterly necessary category. It’s not just religious language, nor is it alarmism or fundamentalism. The international community agrees that human/animal hybrids are abominations: to preserve the dignity of the human, we shouldn’t make them, despite their creation being well within our scientific capability. Those who actually want to stop AI research should adopt the same stance toward strong AI as the international community holds toward human/animal hybrids. They should argue that it debases the human. Just by its mere existence, it debases us. When AIs can write poetry, essays, and articles better than humans, how do they not create a semantic apocalypse? Do we really want a “human-made” sticker at the beginning of film credits? At the front of a novel? In the words of Bartleby the Scrivener: “I would prefer not to.”
Since we currently lack a scientific theory of consciousness, we have no idea if strong AI is experiencing or not—so why not treat consciousness as sacred as we treat the human body, i.e., not as a thing to fiddle around with randomly in a lab? And again, I’m not talking about self-driving cars here. Even the best current AIs, like GPT-3, are not in the Strong category yet, although they may be getting close. When a researcher or company goes to make an AI, they should have to show that it can’t do certain things, that it can’t pass certain general tests, that it is specialized in some fundamental way, and absolutely not conscious in the way humans are. We are still in control here, and while we are, the “AI safety” cohort should decide whether they actually want to get serious and ban this research, or if all they actually wanted to do was geek out about its wild possibilities. Because if we're going to ban it, we need more than just a warning about an existential risk of a debatable property (superintelligence) that has a downside of unknown probability.
All to say: discussions about controlling or stopping AI research should be deontological—an actual moral theory or stance is needed: a stance about consciousness, about human dignity, about the difference between organics and synthetics, about treating minds with a level of respect. In other words, if any of this is going to work the community is going to need to get religion, or at least, some moral axioms. Things you cannot arrive at solely with reason. I’d suggest starting with this one:
“Thou shalt not make a machine in the likeness of a human mind.”
This is a super argument. I love the connections with Dune (& explanation of why the jihad was "Butlerian") and Ex Machina. Also the ambivalence of the nerdosphere about super AI. And the underlying weakness of reasoning by utility, consequence and expected value. I hope that this essay gets a lot of discussion. Perhaps you know that Thomas Metzinger has tried hard to get the moral hazard of machine suffering into the public eye (if not, search for synthetic phenomenology).
I think you could have paid more attention to the difference between artificial intelligence and artificial consciousness. As far as we know, neither demands the other, and reasons why people are pursuing them tend to differ.
I was taken by Metzinger's self-model theory of consciousness and, nerding all the way, wrote about how that and related ideas might be used to grow a conscious AI. My fictional AI suffers multiple kinds of angst, but I neglected all the casualties of its reinforcement learning forbears (your "monsters"). I wound up thinking that anything able to learn to be an artificial ego would have the drive to continue learning. And that would make the AI and humans each nervous about their existential risks.
But, once it's conscious and has an ongoing life story, how do you decide whether or not it is an abomination? Or isn't it too late for that?
Stumbled on this old and very interesting essay. I have to say, I find most discussions of AI a little too abstract. I don't think AI is going to advance by an explicit attempt to build quasi-human conscious Frankensteins in a lab. Such efforts may happen but they will be academic exercises. Instead, it is going to advance by taking over and exceeding human capacities one by one in areas where there are economic rewards for doing so. Such a process may eventually lead to "strong" AI, but it also may not. If such replacement is allowed to occur in an unlimited and uncontrolled way it will have done great damage to human community long before any strong AI results. Can we create barriers or limits to that process?