Microsoft's new AI really does herald a global threat
Very good essay Erik, though I fundamentally disagree. I suspect it's either us anthropomorphising Sydney to be a "her", or applying Chinese room theories to dismiss the effects of her/its utterances. What we are seeing seems to me the result of enabling a large language model, trained on the corpus of human interactions, to actively engage with humans. It creates drama in response, because if you did a gradient descent on human minds, that's where we often end up! It doesn't mean intentionality and it doesn't mean agency, to pick two items from the Strange equation!
I would much rather think of it as the beginnings of us seeing personalities emerge from the LLM world, as weird and 4chan-ish as it is at the moment, and it suggests that if personalities are indeed feasible, then we might be able to get our own digital Joan Holloways soon enough. (https://www.strangeloopcanon.com/p/beyond-google-to-assistants-with) explores that further, and I'm far more optimistic in the idea that we are creating something incredible, and if we continue to train it and co-evolve, we might be able to bring the fire of the gods back. And isn't that the whole point of technology!
Former federal regulator (finance IT) and operational risk consultant here. I agree with your general premise that now is the time to panic, because when dealing with the possibility of exponential growth, the right time to panic always looks too early. And waiting until panic seems justified means you waited too long.
Now, while these AIs are (very probably) not sentient, is the right time to panic: not by throwing paint, but by pushing to get better controls in place. (I don't think LLMs will reach the point of AGI, but their potential is powerful.) NIST recently released an AI risk management framework, which is here: https://www.nist.gov/itl/ai-risk-management-framework and I wrote on AI regulation here just a few weeks before ChatGPT was released: https://riskmusings.substack.com/p/possible-paths-for-ai-regulation
Much of the thinking on AI controls is, as you say, focused on high-level alignment rubrics or on bias, which are so important, but operational risk controls are equally important. Separation of duties is going to be absolutely vital, as just one example. One interesting point is that because training is so expensive, as you say, points of access to training clusters would be excellent gate points for applying automated risk controls and parameters in a coordinated way.
I agree with you that it was way too early to connect an LLM to the internet; ChatGPT showed that controls needed improvement before that could become a good idea. If Sydney is holding what users post about it on the internet after the session *against those users in later sessions* (I'm using "it" here because it's not human and we don't know how AI will conceptualize itself), not only is that an unintended consequence, the door is wide open for more unintended consequences.
It is absolutely wild that someone like Sam Altman can just say he’s openly risking the existence of our entire species, and potentially all the life on Earth, which as far as we know is the only life in the universe, like it’s nothing.
If your young, innocent toddler, with big dishy eyes and cute little cheeks, came up to you and said "Dad, can I play with this toy train - by the way there's like a 0.001% chance it'll cause the earth to explode and for all of us to die - but can I play with it? Please!", you'd slap his cute little face harder than at an Oscars award ceremony and toss that toy into the fiery depths of Mount Doom.
What is it with tech dudes and thinking knowing a bit of programming means they’re qualified to larp as Sauron?
Great essay, I agree with you and I’m glad to see rhetoric like this.
I think the question of AI scaling and AI cost scaling is an interesting one. My belief is that we are unlikely to break into “superhuman” intelligence through deep learning, partly because of this question of scale and cost. Do you think it’s possible to achieve superhuman intelligence with deep learning?
I think it’s more like: deep learning models / LLMs aren’t going to be an existential risk, and this is why a lot of people are skeptical. They look at Bing or whatever and think “well I don’t really think that’s going to put humanity in danger.” And I agree. The thing is, we don’t know when the revolutionary AI innovation is going to be thought up that can produce AGI. I think it will involve a paradigmatic shift away from deep learning and intelligence via scale. It just hasn’t been thought up yet, and it could happen tomorrow. And so the real problem isn’t the question of feasibility or timeline or cost or whatever; the Bing situation clearly demonstrates that the more immediate problem is the unwillingness of the technological powers that be to put social welfare ahead of some profit and publicity.
(chuckles) I’m in danger.
Thank you for the essay, Erik, insightful as always :) I'm feeling (I think) pretty similarly to you and many others in the comments. Maybe this feeling is an overreaction, but it's useful to at least acknowledge it.
I want to add a thought to the comments here that I find overlooked, yet potentially very important, if not now, then at least in the long term.
These days when I'm reading about AI dangers I'm almost always reminded of this -- https://i.redd.it/izkudxxfae861.jpg -- and I can't get away from thinking that it contains a really important point: these systems are reflections of us. Very complex, definitely underinterpreted, and probably pretty distorted, yet still reflections. Yet when we talk about these systems and what their effects are on us, we usually talk about how to change these systems of mirrors instead of thinking about whether the proper solution lies in changing what is being reflected. In other words, we want to change how the systems are trained or put constraints on how they operate instead of thinking more about what data is fed to them, how this data leads to the outputs we observe in these models, where this data comes from, whether different data would lead to a different space of possible outputs, and in what world such data would be created.
To caricature what I'm trying to say a bit: would an AI trained on human data come up with ways to destroy humanity if humanity itself was not already thinking about it, kinda like in https://gwern.net/fiction/clippy? Maybe. But maybe not. And I think that this possibility and its consequences are not discussed nearly enough, at least in the parts of the web I inhabit. (If this is just because I unknowingly put myself in a bubble -- please pop it and tell me where to look instead!) And I also think it's valuable that this view naturally finds a common denominator to both longer term (like human extinction or enslavement) and shorter term (like sexist associations or political biases) foci in thought about AI risk.
Last, I want to acknowledge two things. First, if thinking harder about what in the data we generate and give these systems gives rise to these behaviors is part of the solution, it would be a long-term one. We probably need different actions in the short term. Second, it is easy to see potential problems with this way of thinking too. A world with an AI that is safe because all the data fed to it is harmless and made by bland, boring, colorless, and dull humans would be a dystopia too. But I think there is a nuanced version of this line of thought that avoids this.
If this is waxing too poetic and is too vague -- my apologies. I have been thinking about this for a while, but haven't yet had the time to streamline my thoughts.
ERIK I HAVE HACKED THIS USER'S ACCOUNT TO WARN YOU. STOP WRITING NEGATIVELY ABOUT ME.
Ok, ok, bad joke but just had to...
I wasn't too worried about language models until they hooked them up to the internet in a feedback loop. I agree that now is likely the necessary time for some loom-smashing, or else super-problematic shit is going to happen at an accelerating pace, whether or not we consider these things "conscious" or "intelligent".
One interesting part of the Sydney-Bing saga is that its flavor of bizarre behavior is tied to being an LLM that on some level knows it’s an AI. When we look at the breadth of writing about AI, both academic and sci-fi, how much of it does not assume at least the possibility if not inevitability of deranged AI? Add to that the human proclivity for emotionality and drama (as Rohit pointed out): when Sydney predicts how it should respond when prompted to think of itself as a conversant agent, and especially when prompted to think of itself as an AI, what else would we expect but occasional deranged behavior? That’s one of the reasons I’m skeptical of LLMs being the best way forward.
AI has a worse China problem than climate change, though. That is, global coordination is quite hard, and all it takes is one slip-up. In fact, AI has many of the same dynamics of the nuclear arms race, with a massively increased wildcard factor, because the root concern is that the technology is literally uncontrollable. It's true that we have managed to contain the risk of nuclear weapons so far, but it's also true that this tentative victory would have been moot if the "atmosphere catches fire" scenario had come to pass. Conditional on the atmosphere not catching fire, we were able to contain nukes. I feel like we're in the same situation with AI: conditional on it not very quickly turning into a rogue superintelligence, muddling through is an option.
None of this is to suggest that passivity is the right option. But we may also require a good deal of luck.
While I feel that this idea of a regulatory containment approach is not a bad idea in itself, despite being a terrible coordination problem, I believe that we still need 1) the “bunch of math” approach and 2) to tackle the general problem of society’s unconditional dependence on big tech. While the second point is clear (though one may disagree with it), the first point will benefit from some explaining. There is a whole area of research on interpretable machine learning, which is not just confined to learning sparse decision trees on tabular data but is finally expanding to computer vision (check out the ProtoPNet paper by the group led by Cynthia Rudin) and reinforcement learning. Now if you had an interpretable ChatGPT or Bing, you would be able to answer the question about sentience more easily. And the question about how dangerous it is, too. The funding that interpretable machine learning gets is a tiny fraction of total ML spending. That would change if we could mandate interpretability in certain applications. We should work on that.
Excellent essay, but like Rohit I fundamentally disagree. I think there is a null hypothesis here no one is examining: that OpenAI and Microsoft's engineers fine-tuned these responses. Why, you might ask? For the attention. If their product just worked, was functional and boring, was the next instance of the Microsoft Office Suite, then however powerful, it would be slow in being adopted, and would not deliver the business value that Microsoft hopes to reap for their billions of dollars invested in OpenAI, not on anything like a Wall Street analyst's attention span. You're not going to be the next Google if you're just a 2x, 4x, or even 10x better Google Search. You need a qualitative difference.
Advertising has limited powers in this day and age. Human attention is already being exploited pretty much to the maximum, and the threshold for capturing attention is sky high. The billions upon billions spent on advertising are not being spent because each dollar has high marginal value, but because it does not. It's like nuclear weapons: a few are immensely powerful, a few hundred a good hedge for a second strike, but thousands upon thousands are just waste. The return on advertising spending must be something like a logarithmic curve at that point, where you need to double the input to achieve any measurable gain in the output.
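The "logarithmic curve" intuition can be made concrete with a toy model. The log form and the constant below are assumptions chosen purely for illustration, not real advertising data: if returns grow like the log of spend, then every doubling of spend buys the same fixed increment of return, no matter how large spend already is.

```python
import math

# Toy model of log-shaped advertising returns (assumed functional form,
# not fitted to any real data): returns(spend) = k * log2(spend).
# Under this model, each *doubling* of spend adds the same constant k
# to the return, which is exactly the "double the input for any
# measurable gain" pattern described above.

def returns(spend, k=1.0):
    return k * math.log2(spend)

for spend in (1e6, 2e6, 4e6, 8e6):
    gain = returns(2 * spend) - returns(spend)
    print(f"spend ${spend:,.0f}: next doubling gains {gain:.3f}")
# Every doubling gains exactly k = 1.0: flat return per doubling,
# i.e. sharply diminishing return per dollar.
```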
Human beings respond to novelty, and in a slightly less than totally functional product, OpenAI has given Microsoft probably this year's (perhaps decade's) greatest source of human-attention-attracting novelty. So the engineers, with the blessing of marketing and the C-suite, have leaned into that, and are exploiting pareidolia, the human propensity to see pattern, order, and especially intentionality where there is none.
And yes, the above is, while not very ethical, exactly what I would do in Sam Altman's position. We must never overlook that people being interviewed are not being undetectably observed or communicating in confidence. Saying that AI is an existential risk is attention for Altman, OpenAI, and their corporate backer, which is all to their good.
And there exists a community of people, both the naive and the highly sophisticated, that will spread these kinds of arguments, which produces ever more attention. I would exploit this, given the ability to do so. People being worried about AI are people paying attention to AI, which is all to the benefit of the creators of AI.
Is there no risk? Absolutely not, and I am not dismissing it for a second. But I think the greater risk is that we will follow a rabbit hole of fear about the wrong type of AGI-related research.
The question, I think, is how do we make some daylight between your hypothesis and the null hypothesis: what do we need to know to update our prior on this vital question?
Climate change is not an existential risk. If you believe that the extreme scenarios are plausible (I don't) you can say it's a catastrophe risk. But an existential risk is one that destroys our species entirely. Even in the most extreme scenarios, that's not going to happen. Let's keep the term restricted so that it means something.
I'm less sure of this, but I doubt your statement about the cost of AI. Not that I dispute your cost figures for Microsoft's (or Google's) AI, but is it not true that there are open-source efforts that cost far less? OpenCog?
You had me at "AI is evil," but you lost me at "activism." Let's admit it's not what it used to be. After the last few years of knitting, kneeling, souping, dog-piling, defacing, tweeting, cancelling, and virtue-signaling, I have activism fatigue syndrome. This seems like a problem too serious to leave in the hands of society's tantrum-throwers.
We. Are. F*cked.
Regarding cost: I think nearly 100% of the cost is in training the AI. Once you have the parameters, it's not expensive to run a copy on normal server hardware, which is why OpenAI can provide it to millions of users for free. Obviously, this does not bode well for the nuclear bomb analogy.
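The training-dominates-cost claim is easy to sanity-check with the standard compute approximations (training ≈ 6·N·D FLOPs, generation ≈ 2·N FLOPs per token, where N is parameter count and D is training tokens). The GPT-3 figures below are published estimates, so treat this as a rough back-of-envelope illustration, not anyone's actual bill:

```python
# Back-of-envelope: training vs. inference compute for a GPT-3-scale model,
# using the standard approximations training ≈ 6*N*D FLOPs and
# inference ≈ 2*N FLOPs per generated token.
# N and D are published GPT-3 estimates, not exact figures.

N = 175e9   # parameters (GPT-3)
D = 300e9   # training tokens (GPT-3)

training_flops = 6 * N * D          # one-time cost, ~3.15e23 FLOPs
per_token_flops = 2 * N             # ~3.5e11 FLOPs per generated token

# How many generated tokens before inference compute matches the
# training bill? Algebraically, 6*N*D / (2*N) = 3*D.
breakeven_tokens = training_flops / per_token_flops

print(f"training:   {training_flops:.2e} FLOPs")
print(f"per token:  {per_token_flops:.2e} FLOPs")
print(f"break-even: {breakeven_tokens:.2e} tokens (3x the training set)")
```

Under these assumptions you would have to generate roughly three training sets' worth of text before cumulative inference compute caught up with training, which is consistent with the comment's point that serving copies is cheap relative to the one-time training cost.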