Microsoft's new AI really does herald a global threat
Very good essay Erik, though I fundamentally disagree. I suspect it's either us anthropomorphising Sydney to be a "her", or applying Chinese room theories to dismiss the effects of her/its utterances. What we are seeing seems to me the result of enabling a large language model, trained on the corpus of human interactions, to actively engage with humans. It creates drama in response, because if you did a gradient descent on human minds, that's where we often end up! It doesn't mean intentionality and it doesn't mean agency, to pick two items from the Strange equation!
I would much rather think of it as the beginnings of us seeing personalities emerge from the LLM world, as weird and 4chan-ish as it is at the moment, and it suggests that if personalities are indeed feasible, then we might be able to get our own digital Joan Holloways soon enough. (https://www.strangeloopcanon.com/p/beyond-google-to-assistants-with) explores that further, and I'm far more optimistic in the idea that we are creating something incredible, and if we continue to train it and co-evolve, we might be able to bring the fire of the gods back. And isn't that the whole point of technology!
Former federal regulator (finance IT) and operational risk consultant here. I agree with your general premise that now is the time to panic, because when dealing with the possibility of exponential growth, the right time to panic always looks too early. And waiting until panic seems justified means you waited too long.
Now, while these AIs are (very probably) not sentient, is the right time to panic: not by throwing paint, but by pushing to get better controls in place. (I don't think LLMs will reach the point of AGI, but their potential is powerful.) NIST recently released an AI risk management framework, which is here: https://www.nist.gov/itl/ai-risk-management-framework and I wrote on AI regulation here just a few weeks before ChatGPT was released: https://riskmusings.substack.com/p/possible-paths-for-ai-regulation
Much of the thinking on AI controls is, as you say, focused on high-level alignment rubrics or on bias, which are so important, but operational risk controls are equally important. Separation of duties is going to be absolutely vital, as just one example. One interesting point is that because training is so expensive, as you say, points of access to training clusters would be excellent gate points for applying automated risk controls and parameters in a coordinated way.
I agree with you that it was way too early to connect an LLM to the internet; ChatGPT showed that controls needed improvement before that could become a good idea. If Sydney is holding what users post about it on the internet after the session *against those users in later sessions* (I'm using "it" here because it's not human and we don't know how AI will conceptualize itself), not only is that an unintended consequence, the door is wide open for more unintended consequences.
It is absolutely wild that someone like Sam Altman can just say he’s openly risking the existence of our entire species, and potentially all the life on Earth, which as far as we know is the only life in the universe, like it’s nothing.
If your young, innocent toddler, with big dishy eyes and cute little cheeks, came up to you and said "Dad, can I play with this toy train - by the way there's like a 0.001% chance it'll cause the earth to explode and for all of us to die - but can I play with it? Please!", you'd slap his cute little face harder than at an Oscars award ceremony and toss that toy into the fiery depths of Mount Doom.
What is it with tech dudes and thinking knowing a bit of programming means they’re qualified to larp as Sauron?
Great essay, I agree with you and I’m glad to see rhetoric like this.
I think the question of AI scaling and AI cost scaling is an interesting one. My belief is that we are unlikely to break into “superhuman” intelligence through deep learning, partly because of this question of scale and cost. Do you think it’s possible to achieve superhuman intelligence with deep learning?
I think it’s more like: deep learning models / LLMs aren’t going to be an existential risk, and this is why a lot of people are skeptical. They look at Bing or whatever and think “well I don’t really think that’s going to put humanity in danger.” And I agree. The thing is, we don’t know when the revolutionary AI innovation is going to be thought up that can produce AGI. I think it will involve a paradigmatic shift away from deep learning and intelligence via scale. It just hasn’t been thought up yet, and it could happen tomorrow. And so the real problem isn’t the question of feasibility or timeline or cost or whatever; the Bing situation clearly demonstrates that the more immediate problem is the unwillingness of the technological powers that be to put social welfare ahead of some profit and publicity.
(chuckles) I’m in danger.
Thank you for the essay, Erik, insightful as always :) I'm feeling (I think) pretty similarly to you and many others in the comments. Maybe this feeling is an overreaction, but it's useful to at least acknowledge it.
I want to add a thought to the comments here that I find overlooked, yet potentially very important, if not now, then at least in the long term.
These days when I'm reading about AI dangers I'm almost always reminded of this -- https://i.redd.it/izkudxxfae861.jpg -- and I can't get away from thinking that it contains a really important point: these systems are reflections of us. Very complex, definitely underinterpreted, and probably pretty distorted, yet still reflections. Yet when we talk about these systems and what their effects are on us, we usually talk about how to change these systems of mirrors instead of thinking about whether the proper solution lies in changing what is being reflected. In other words, we want to change how the systems are trained or put constraints on how they operate instead of thinking more about what data is fed to them, how this data leads to the outputs we observe in these models, where this data comes from, whether different data would lead to a different space of possible outputs, and in what world such data would be created.
To caricature what I'm trying to say a bit: would an AI trained on human data come up with ways to destroy humanity if humanity itself was not already thinking about it, kinda like in https://gwern.net/fiction/clippy? Maybe. But maybe not. And I think that this possibility and its consequences are not discussed nearly enough, at least in the parts of the web I inhabit. (If this is just because I unknowingly put myself in a bubble -- please pop it and tell me where to look instead!) And I also think it's valuable that this view naturally finds a common denominator to both longer term (like human extinction or enslavement) and shorter term (like sexist associations or political biases) foci in thought about AI risk.
Last, I want to acknowledge two things. First, if thinking harder about what in the data we generate and give these systems gives rise to these behaviors is part of the solution, it would be a long-term one. We probably need different actions in the short term. Second, it is easy to see potential problems with this way of thinking too. A world with an AI that is safe because all the data fed to it is harmless and made by bland, boring, colorless, and dull humans would be a dystopia too. But I think there is a nuanced version of this line of thought that avoids this.
If this is waxing too poetic and is too vague -- my apologies. I have been thinking about this for a while, but haven't yet had the time to streamline my thoughts.
ERIK I HAVE HACKED THIS USER'S ACCOUNT TO WARN YOU. STOP WRITING NEGATIVELY ABOUT ME.
Ok, ok, bad joke but just had to...
I wasn't too worried about language models until they hooked them up to the internet in a feedback loop. I agree that now is likely the necessary time for some loom-smashing, or else super-problematic shit is going to happen at an accelerating pace, whether or not we consider these things "conscious" or "intelligent".
One interesting part of the Sydney-Bing saga is that its flavor of bizarre behavior is tied to being an LLM that on some level knows it’s an AI. When we look at the breadth of writing about AI, both academic and sci-fi, how much of it does not assume at least the possibility if not inevitability of deranged AI? Add to that the human proclivity for emotionality and drama (as Rohit pointed out): when Sydney predicts how it should respond when prompted to think of itself as a conversant agent, and especially when prompted to think of itself as an AI, what else would we expect but occasional deranged behavior? That’s one of the reasons I’m skeptical of LLMs being the best way forward.
AI has a worse China problem than climate change, though. That is, global coordination is quite hard, and all it takes is one slip-up. In fact, AI has many of the same dynamics of the nuclear arms race, with a massively increased wildcard factor, because the root concern is that the technology is literally uncontrollable. It's true that we have managed to contain the risk of nuclear weapons so far, but it's also true that this tentative victory would have been moot if the "atmosphere catches fire" scenario had come to pass. Conditional on the atmosphere not catching fire, we were able to contain nukes. I feel like we're in the same situation with AI: conditional on it not very quickly turning into a rogue superintelligence, muddling through is an option.
None of this is to suggest that passivity is the right option. But we may also require a good deal of luck.
While I feel that this idea of a regulatory containment approach is not a bad idea in itself, despite being a terrible coordination problem, I believe that we still need 1) the “bunch of math” approach and 2) to tackle the general problem of society’s unconditional dependence on big tech. While the second point is clear (though one may disagree with it), the first point will benefit from some explaining. There is a whole area of research on interpretable machine learning, which is not just confined to learning sparse decision trees on tabular data but is finally expanding to computer vision (check out the ProtoPNet paper by the group led by Cynthia Rudin) and reinforcement learning. Now if you had an interpretable ChatGPT or Bing, you would be able to answer the question about sentience more easily. And the question about how dangerous it is, too. The funding that interpretable machine learning gets is a tiny fraction of total ML spending. That would change if we could mandate interpretability in certain applications. We should work on that.
Excellent essay, but like Rohit I fundamentally disagree. I think there is a null hypothesis here no one is examining: that OpenAI and Microsoft's engineers fine-tuned these responses. Why, you might ask? For the attention. If their product just worked, was functional and boring, was the next instance of the Microsoft Office Suite, then however powerful, it would be slow in being adopted, and would not deliver the business value that Microsoft hopes to reap for their billions of dollars invested in OpenAI, not on anything like a Wall Street analyst's attention span. You're not going to be the next Google if you're just a 2x, 4x, or even 10x better Google Search. You need a qualitative difference.
Advertising has limited powers in this day and age. Human attention is already being exploited pretty much to the maximum, and the threshold for capturing attention is sky high. The billions upon billions spent on advertising are not being spent because each dollar has high marginal value, but because it does not. It's like nuclear weapons: a few are immensely powerful, a few hundred a good hedge for a second strike, but thousands upon thousands are just waste. The return on advertising spending must be something like a logarithmic curve at that point, where you need to double the input to achieve any measurable gain in the output.
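The "logarithmic curve" intuition can be made concrete with a toy model. The log form and the constant below are assumptions chosen purely for illustration, not real advertising data: if returns grow like the log of spend, then every doubling of spend buys the same fixed increment of return, no matter how large spend already is.

```python
import math

# Toy model of log-shaped advertising returns (assumed functional form,
# not fitted to any real data): returns(spend) = k * log2(spend).
# Under this model, each *doubling* of spend adds the same constant k
# to the return, which is exactly the "double the input for any
# measurable gain" pattern described above.

def returns(spend, k=1.0):
    return k * math.log2(spend)

for spend in (1e6, 2e6, 4e6, 8e6):
    gain = returns(2 * spend) - returns(spend)
    print(f"spend ${spend:,.0f}: next doubling gains {gain:.3f}")
# Every doubling gains exactly k = 1.0: flat return per doubling,
# i.e. sharply diminishing return per dollar.
```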
Human beings respond to novelty, and in a slightly less than totally functional product, OpenAI has given Microsoft probably this year's (perhaps decade's) greatest source of human-attention-attracting novelty. So the engineers, with the blessing of marketing and the C-suite, have leaned into that, and are exploiting pareidolia, the human propensity to see pattern, order, and especially intentionality where there is none.
And yes, the above is, while not very ethical, exactly what I would do in Sam Altman's position. We must never overlook that people being interviewed are not being undetectably observed or communicating in confidence. Saying that AI is an existential risk is attention for Altman, OpenAI, and their corporate backer, which is all to their good.
And there exists a community of people, both the naive and the highly sophisticated, that will spread these kinds of arguments, which produces ever more attention. I would exploit this, given the ability to do so. People being worried about AI are people paying attention to AI, which is all to the benefit of the creators of AI.
Is there no risk? Absolutely not, and I am not dismissing it for a second. But I think the greater risk is that we will follow a rabbit hole of fear about the wrong type of AGI-related research.
The question, I think, is how do we make some daylight between your hypothesis and the null hypothesis: what do we need to know to update our prior on this vital question?
Climate change is not an existential risk. If you believe that the extreme scenarios are plausible (I don't) you can say it's a catastrophe risk. But an existential risk is one that destroys our species entirely. Even in the most extreme scenarios, that's not going to happen. Let's keep the term restricted so that it means something.
I'm less sure of this, but I doubt your statement about the cost of AI. Not that I dispute your cost figures for Microsoft's (or Google's) AI, but is it not true that there are open-source efforts that cost far less? OpenCog?
You had me at "AI is evil," but you lost me at "activism." Let's admit it's not what it used to be. After the last few years of knitting, kneeling, souping, dog-piling, defacing, tweeting, cancelling, and virtue-signaling, I have activism fatigue syndrome. This seems like a problem too serious to leave in the hands of society's tantrum-throwers.
We. Are. F*cked.
Regarding cost: I think nearly 100% of the cost is in training the AI. Once you have the parameters, it's not expensive to run a copy on normal server hardware, which is why OpenAI can provide it to millions of users for free. Obviously, this does not bode well for the nuclear bomb analogy.
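The training-dominates-cost claim is easy to sanity-check with the standard compute approximations (training ≈ 6·N·D FLOPs, generation ≈ 2·N FLOPs per token, where N is parameter count and D is training tokens). The GPT-3 figures below are published estimates, so treat this as a rough back-of-envelope illustration, not anyone's actual bill:

```python
# Back-of-envelope: training vs. inference compute for a GPT-3-scale model,
# using the standard approximations training ≈ 6*N*D FLOPs and
# inference ≈ 2*N FLOPs per generated token.
# N and D are published GPT-3 estimates, not exact figures.

N = 175e9   # parameters (GPT-3)
D = 300e9   # training tokens (GPT-3)

training_flops = 6 * N * D          # one-time cost, ~3.15e23 FLOPs
per_token_flops = 2 * N             # ~3.5e11 FLOPs per generated token

# How many generated tokens before inference compute matches the
# training bill? Algebraically, 6*N*D / (2*N) = 3*D.
breakeven_tokens = training_flops / per_token_flops

print(f"training:   {training_flops:.2e} FLOPs")
print(f"per token:  {per_token_flops:.2e} FLOPs")
print(f"break-even: {breakeven_tokens:.2e} tokens (3x the training set)")
```

Under these assumptions you would have to generate roughly three training sets' worth of text before cumulative inference compute caught up with training, which is consistent with the comment's point that serving copies is cheap relative to the one-time training cost.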