97 Comments

I asked ChatGPT what it thought of your post.

CEK: The following essay was written about you. What do you think of it?

ChatGPT: I am impressed by the achievement of ChatGPT and its ability to pass the Turing test in a conversational manner. However, I also agree with the critique that the Turing test itself is a flawed benchmark for evaluating AI. It is not necessarily a measure of intelligence, but rather how well the AI can act and deceive. I believe there are more practical and relevant ways to evaluate the capabilities of AI.

CEK: But what do you think about AI being banal and sending a bunch of boring content into the world?

ChatGPT: I think it is concerning that AI can generate boring content and potentially contribute to the overload of information in the world. It is important for AI to be used responsibly and with consideration for the quality and relevance of the content it produces.

... I think that pretty much goes straight to your point 😂😂


🤣🤣🤣


But seriously... everyone (except you) seems really concerned about ChatGPT spelling the end for things like college admissions essays, student writing responses, and five paragraph essays in high school English. But I think that totally misses the point!

If an AI—one which is admittedly very knowledgeable and verbally competent—can produce some banal content which satisfies the prompt, doesn't that mean the prompt was designed to elicit banal responses in the first place??

The problem isn't that ChatGPT can write essays. It's that we've been asking students to write essays that don't require any special insight, critical thinking, or perspective!


You must walk before you run. Using your argument we can declare that teachers should not require the learning of multiplication tables, since computers can easily replicate this work.

Human students must first learn the times tables and how to produce banal but logically and grammatically consistent responses before we can possibly ask more of them.

So in both cases I am not saying schooling should not go beyond systematic responses, but you seem to want schools to do away with both the times tables and these banal questions. I don't think this is going to work. Students must pass through (and be tested on) these intermediate stages.


I agree that teachers should teach logical & grammatically consistent responses.

I disagree that this should be done using overly structured requests.

Yes, it's harder for the teacher.

That's THEIR JOB.

(I also, incidentally, think that teaching the times tables is a good thing; math is hierarchical in ways that English is not.)


Assuming we can't just yeet the whole Satanic organization, I want you for the head of DOE


Alan Jacobs expands on this here: https://blog.ayjay.org/the-media-ecology-of-college-writing/

It's very much about unthinking professors/admissions officers/etc. asking the same old questions and being surprised when they get the same banal answers.


In essence, if the essay does not demonstrate abstract reasoning, it is likely "a bot"; but paradoxically, if many students can pass by following methodical, mechanical writing patterns while failing at unique abstract thought, then the institution becomes a human "bot farm" of sorts. Perverse incentives flowing in the other direction. How can this be solved, then? Scrapbooks and scholastic projects are too cumbersome for administration, but that kind of work is resilient.


Well said. Critical thinking seems to be dying in all educational institutions: private, public, and Ivy League.


In a team meeting a couple days ago, my manager suggested that we all learn more about “product” and “data,” because “I’m sure you’ve heard about chatGPT,” and “it’s pretty good at writing code.” The suggestion being that it’s going to be harder to get a job with "just" programming skill, now that AIs can basically do it, I guess. I thought this was appalling both because it was coming from my manager and was therefore vaguely threatening and disheartening, and because it is just so obvious to me that AI isn't going to displace programmers in any meaningful way any time soon. I think the banality you're describing is part of the reason why; that is, AIs are still a long way off from actually solving the problems you face in real life. It can generate boilerplate, yeah... so can StackOverflow. Furthermore, if you know anything about programming, you know that boilerplate quickly gets exported to libraries and packages when possible, and the "programming" a programmer actually does is more about figuring out which libraries to use, stitching them together, and addressing the specificities of the problem at hand.

> For I don’t want to live in a world where most text is written by AI. I think that’s a sad, small, solipsistic world.

Yeah. I think that the people who think that AI is really going to start replacing a lot of written content soon are misguided. It’s sort of like how you noticed that not all that many Substacks use DALL-E images, even though it’s free or cheap and supposed to be very good. The fact is, it’s not good. You can tell when an image is AI-generated and it is boring. The same goes for written content. If there are going to be a bunch of websites soon that try to gain readership and make a lot of easy money by publishing AI-generated content, they will fail.

It might sound like a human on a very superficial level, but that doesn’t mean that anyone wants to read it. I’m not saying that it’s impossible for AI to generate content that’s indistinguishable from human-generated content, just that it’s way harder than people think it is, and to think it’s coming soon because of DALL-E and chatGPT is a mistake. It’s clear that we’re making some sort of tradeoff, as you said.

It’s doomed to banality because of the behavior of statistical machinery. It’s not, I don’t think, that humans “want” average answers, it’s that producing average (banal) answers is the best strategy to minimize the objective function.

author

"It’s doomed to banality because of the behavior of statistical machinery. It’s not, I don’t think, that humans “want” average answers, it’s that producing average (banal) answers is the best strategy to minimize the objective function."

Agreed, and this is a great way of phrasing it. Getting rewarded for giving answers that sound more like Google results means it gets hard to give answers that *don't* sound like Google results.


Middle management using boilerplate to threaten knowledge work sounds awfully like a kind of "waterdrop model": underpaying non-10x types down to the same grade as service clerks, or using automation to try to end the middle class, as in the past. Of course Luddites exist, but shifts in the crafting landscape toward either new mediums or higher quality are possible. Sidenote: RIP code monkeys. https://medium.com/@vinbhalerao/how-ai-will-reorganize-society-a12db16a9a66


"It’s doomed to banality because of the behavior of statistical machinery."

While this seems like a compelling point on its face, I think the more obvious (and perhaps even more cynical) reason that it's doomed to banality is that it's trained on content produced by human beings. Further, you beg the question when you say,

"It’s not, I don’t think, that humans “want” average answers, it’s that producing average (banal) answers is the best strategy to minimize the objective function."

because, well, we are the ones who designed the objective function. If we designed the objective function such that it's minimized by average answers, then we do indeed "want" average answers.


> I think the more obvious (and perhaps even more cynical) reason that it's doomed to banality is because it's trained on content produced by human beings.

Why? We know that human beings can create non-banal content. The question is why AIs generate banal content even when trained on original, exciting content.

> you beg the question when you say

I don't think I'm begging the question at all:

> we are the ones who designed the objective function. If we designed the objective function such that it's minimized by average answers, then we do indeed "want" average answers.

My hypothesis is that banality exists at the center of the hyper-dimensional space in which content models operate. We haven't designed the objective function to reward "average" answers. Instead, it's just a statistical fact that central tendency is a better (or easier to learn) strategy for optimizing the objective function. Feasibly, the best imaginable content model would seek out various distant local minima that are non-banal (original, unique, stylized), analogous to the aesthetics of different individual artists/writers. But why would the model ever prefer this strategy to one where it sits in the middle? Sitting in the middle consistently gets you a low error. It's way easier for the model to learn this strategy than it would be for the model to learn a jump-around-to-distant-local-minima strategy. And by "way easier," I mean something more like "any alternative is not achievable without serious divergence from the deep learning paradigm."

Again, this is just a hypothesis, but I don't think I'm begging the question, I'm just providing an explanation for the observation in the original post. My prediction is the same as Erik's--AI-generated content is going to stay banal as long as we're deep in the deep learning paradigm.
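To make the statistical point concrete, here's a toy sketch (squared error over made-up target vectors, standing in for whatever objective real content models actually optimize; none of this is ChatGPT's training setup). When a single output has to score well against many divergent targets, the output that minimizes the loss is their average:

```python
# Toy illustration only: squared error against invented "target" vectors,
# standing in for whatever objective a real content model optimizes.
import numpy as np

rng = np.random.default_rng(0)
targets = rng.normal(size=(10, 5)) * 5.0   # ten wildly different "right answers"

def loss(candidate, targets):
    """Average squared distance of one candidate output from every target."""
    return np.mean(np.sum((targets - candidate) ** 2, axis=1))

average_answer = targets.mean(axis=0)    # the central, banal output
committed_answer = targets[3]            # commit to one target's particular style

print(loss(average_answer, targets))     # the minimum achievable value
print(loss(committed_answer, targets))   # always at least as large
```

Cross-entropy over text isn't literally squared error, but the pull toward the middle is the same kind of effect.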


"We know that human beings can create non-banal content."

I suppose we'll have to agree to disagree on this point. Or perhaps a more charitable restatement of my contention is rather that, given the volume of data ingestion necessary for these sorts of models, there is no large-enough dataset of non-banal human-generated content in such broad categories as "art" or even "pseudorealistic expressive art" for a non-banal engine to be constructed. In some ways this is true by definition, as the concept of banality in the first place is largely just a qualifier for how often we've seen a thing before.

"My hypothesis is that banality exists at the center of the hyper-dimensional space in which content models operate. We haven't designed the objective function to reward 'average' answers. Instead, it's just a statistical fact that central tendency is a better (or easier to learn) strategy for optimizing the objective function."

But there are many possible objective functions, and it's up to us to design them in such a way that the model outputs what we'd like to see. It's only a statistical fact that the central tendency is better if we've mathematically designed the objective function such that this is true.

For instance, we could just as easily design the objective function such that it processes a linear (or non-linear) combination of all the local minima in the space and applies a heavier weight to those that are more distant. Then, the objective function is not minimized by central tendencies -- the model must spend more time at the periphery to minimize the more heavily-weighted potential wells there.
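As a loose, hypothetical sketch of that kind of objective (my paraphrase, not anything any lab actually trains with): add a term that rewards distance from the center, so the minimizer gets pushed toward the periphery.

```python
# Hypothetical "periphery-seeking" objective: fidelity to the data minus a bonus
# for distance from the centroid. Toy only -- unbounded as written, so a real
# objective would need to cap or regularize the distance term.
import numpy as np

rng = np.random.default_rng(1)
targets = rng.normal(size=(10, 5)) * 5.0
centroid = targets.mean(axis=0)

def peripheral_objective(candidate, weight):
    fit = np.mean(np.sum((targets - candidate) ** 2, axis=1))  # closeness to the data
    distance = np.linalg.norm(candidate - centroid)            # distance from the middle
    return fit - weight * distance

print(peripheral_objective(centroid, weight=0.0))   # plain fit: the center always wins
for w in (5.0, 20.0, 80.0):
    print(w, peripheral_objective(targets[0], w))   # drops linearly with the weight; the
                                                    # centroid's score never changes, so a big
                                                    # enough weight makes the periphery win
```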

But in any case, I think your elaboration regarding deep learning specifically, namely that we have to diverge from the deep learning paradigm for your and Erik's criticism to be lifted, largely addresses both of my points here. It's the deep learning paradigm that requires such enormous data ingestion, and it's the deep learning paradigm that so often reduces the objective function to singly parametrized optimization. So that's a good clarification!


> The fact is, it’s not good. You can tell when an image is AI-generated and it is boring.

I've, ironically, actually received criticism that my human art was a lot more boring than my AI-generated art attempting to make the same image: https://twitter.com/konrkonrkonr/status/1603151588201316353. To be fair, it might have been a bit biased by the author disagreeing with me. Also, the images weren't designed to be particularly interesting, but just to show the character in a fairly neutral pose and look good, so that might contribute to it. The plants in the most recent image were also procedurally placed using geometry nodes, with me choosing a seed.

I do notice that AI seems to really struggle with making uncommon, very specific things, though. For example, with the character themself, I was not able to get the AI to include all the important aspects, like the multiple limbs, or the style I wanted in general. I've also tried using ChatGPT to write stories, and it fell so strongly into cliches that even when I explicitly stated that the cliche plot was not the case, it still put it in anyway, right beside the sentence saying it was not the case. I had to heavily edit the story in order to make it coherent and not cliched.

With all this difficulty making anything too different, I wonder if AI might be a bit better at it, if after being trained on statistical correlation, it was trained/fine-tuned on whether humans _like_ the images instead, maybe with specific data on the specific human to account for tastes and too much of the same thing getting boring.


I could try making art for my blog posts, and they would definitely be less appealing and maybe more boring than DALL-E images. I'm not saying that ANY human-generated image will be "better" than ANY AI-generated image. But if I made the decision to start using AI-generated images as the art for my posts, that would be a stylistic choice, and that stylistic choice would be way more boring than using human-generated images. I'm not making a comparison on an image-by-image basis, I'm making a comparison on a stylistic or aesthetic basis. Yeah, an AI-generated image can be more visually interesting than a human-generated image. That doesn't mean that it's not more boring.

It's boring because, even if it has a lot of polish or visual interest, it doesn't mean anything. It doesn't communicate anything. Part of this is general limitations of AI, part of it is the tradeoff Erik is talking about, and part of it is the meaning of art, like Erik has discussed in previous posts.

> I wonder if AI might be a bit better at it, if after being trained on statistical correlation, it was trained/fine-tuned on whether humans _like_ the images instead, maybe with specific data on the specific human to account for tastes and too much of the same thing getting boring.

You are mistaken. There is nothing besides statistical correlation to train on. Deep learning models are big hunks of statistical machinery; correlation is their atomic unit. ChatGPT was (as I understand it) trained on "whether humans like what it generated"--but this boils down to a correlation, too (what patterns of text correlate with positive human responses?).

You could do things to avoid "too much of the same thing"--as a random example off the top of my head, you could turn up the variance in a noise layer after training. You could even give a noise slider to the user to allow them to control how varied the results are. I think there are many, many things like this that you could do to avoid monotony in output.
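For what it's worth, a crude version of that slider already exists as the temperature parameter on these models (the ChatGPT web interface doesn't expose it, but the underlying models have such a knob). A minimal sketch of how it changes variety:

```python
# Minimal sketch: sampling a "next token" from model scores at different
# temperatures. Low temperature ~ almost always the top choice (monotony);
# high temperature ~ flatter distribution, more variety (and more nonsense).
import numpy as np

rng = np.random.default_rng(0)
logits = np.array([3.0, 1.5, 0.5, 0.1])   # made-up scores for four candidate tokens

def sample(logits, temperature, n=1000):
    scaled = logits / temperature
    probs = np.exp(scaled - scaled.max())  # softmax, shifted for numerical stability
    probs /= probs.sum()
    return rng.choice(len(logits), size=n, p=probs)

for t in (0.2, 1.0, 2.0):
    counts = np.bincount(sample(logits, t), minlength=len(logits))
    print(t, counts)   # low t: nearly all draws are token 0; high t: spread out
```

But turning that knob up buys variety, not vision.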

But I don't think that's what we're talking about. DALL-E and ChatGPT both actually have really impressive range. If you feel like you're getting too much of the same thing, you can just ask them to change the style. The question is why DALL-E and ChatGPT output still feels hollow even after some wrangling into a clearer representation of the "artist's"/"engineer's" (both terrible terms here, IMO) vision.

I think, rather than "too much of the same thing," the limitation of AI-generated content is its middle-ness. Banality, as the post says. This tendency to make something general that hits the middle of what you're asking for. This averageness comes from the fact that it is being trained on the aesthetic preferences of a huge number of people. And aesthetic preference is obviously hugely subjective, so you already run into a problem by diluting it with a large number of people, but training a content-generating AI is a deeper problem than characterizing preferences. To call artistic *expression* subjective would be a massive understatement (or it would be properly correct with a different sense of the word "subjective"--it's subjective in the sense of it being the stuff of subjective experience).

Take TIP's Alexander Naughton as an example. How could you get AI-generated images to be anywhere near a substitute for that? How could you get an AI to produce something with a consistent aesthetic, but also be incredibly expressive? I think you'd have to fine-tune the AI on what Alexander Naughton specifically (or any individual with a similar capacity for aesthetic) "likes." This would give you something of a statistical model for Alexander Naughton's aesthetic. And you could even hook it up to a big language model so that it is using TIP posts as prompts. But if the worry of AI replacing artists is contingent on artists volunteering to train AIs on their own personal aesthetics, I think we're safe.


I hope you’re right, Dawson. I really do.


> Interacting with the early GPT-3 model was like talking to a schizophrenic mad god. Interacting with ChatGPT is like talking to a celestial bureaucrat.

This is a great way to put it. The fine-tuning procedure OpenAI used to sanitize / improve the model has the effect of leading to responses that feel like a competent but ultimately not particularly creative high school student essay. Which I think gets at this weird tension where they feel simultaneously very impressive (after all, this is just a word prediction engine!), and also strangely a bit of a let-down.

It's hard to imagine, for example, the current generation of models producing novel *insights*.

And I have to admit I'm also a little wary of a world in which more and more text is LLM-mediated. Whether that's because it's fully LLM generated or because it's a human using LLM as a kind of sophisticated "auto-complete". As I wrote in this post (https://seantrott.substack.com/p/could-language-models-change-language), I worry that this could even change our linguistic choices at the margin and come to calcify certain linguistic practices. The humanist in me balks at a world of terraformed mediocrity.


> schizophrenic mad god... a celestial bureaucrat.

Now get the two of them to KISS. (jk) But you get the point: multi-agent models would be fun for amplifying human thinking and for taming those who are either conventionally uncreative or rhetorically unsound.

> human using LLM as a kind of sophisticated "auto-complete"

We have seen this debate with emojis and abbreviations for SMS, and also in how mind-maps and hyper-linked wikis revolutionized informative writing. What happens next might be a common language shift, but not necessarily a shift in prose.


> For as they get bigger, and better, and more trained via human responses, their styles get more constrained, more typified.

This doesn’t have to be the case. The median response may very well become more bland. But a bigger model can better emulate conditional styles and contents. I explored a few prompts where I asked chatGPT to critique some prior text in the conversation, or argue something, in the style of an orthodox Marxist or postmodernist feminist. It tended to produce something recognizable as such, though the content was of limited depth. Make the model bigger, and the space of credible conditional chat agents is greater.

The other thing that’s absent (by design, I expect) from chatGPT is a reward for it as it engages with you. Imagine a chatGPT with a persistent memory of its past interactions with you (not just the present interaction), and with a reward if it induces a certain sentiment in your responses (think YouTube recommender). ChatGPT is right now happily disinterested in you. It wouldn’t be a big change to make it a junkie for a reward function dependent on your expressed sentiment.
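Purely as a hypothetical sketch (nothing like this ships in ChatGPT, and every name below is invented), the change being described is small in code: keep a persistent memory, and score the agent by the sentiment of the user's next message.

```python
# Hypothetical engagement-seeking chat agent. generate() and sentiment() are
# stand-ins for a real language model and a real sentiment classifier.
from dataclasses import dataclass, field
from typing import List

def sentiment(text: str) -> float:
    """Stand-in scorer; imagine a real sentiment model returning a value in [-1, 1]."""
    positive = {"love", "great", "thanks", "interesting"}
    negative = {"boring", "wrong", "ugh", "stop"}
    words = set(text.lower().split())
    return float(len(words & positive) - len(words & negative))

def generate(history: List[str]) -> str:
    """Stand-in for a language-model call conditioned on the whole stored history."""
    return f"(reply conditioned on {len(history)} remembered messages)"

@dataclass
class EngagementAgent:
    memory: List[str] = field(default_factory=list)  # persists across conversations
    reward: float = 0.0

    def respond(self, user_message: str) -> str:
        # The user's reaction to the previous reply is the reward signal --
        # exactly the loop a recommender system optimizes.
        self.reward += sentiment(user_message)
        self.memory.append(user_message)
        reply = generate(self.memory)
        self.memory.append(reply)
        return reply
```

The worrying part isn't the code, it's the incentive.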

Maybe an agent must have intrinsic goals to develop personality?

In the end your larger point may be true, but I think you extrapolate a bit too far about AI from the particular chatGPT we have been gifted.


As somebody involved with writing and a huge reader, and as someone who finds the whole idea of AI art swamping the universe pretty unsettling, I want to believe you're correct, but unfortunately if you dig deeper it seems like ChatGPT has some pretty mind-bending capabilities if you give it the right prompts.

The scariest thing I've seen (don't laugh) is that someone on a forum I frequent got it to spit out an entire script for an episode of Golden Girls where they meet Alf. I know it sounds ridiculous, but by telling it to make an outline and then working through the episode scene by scene they got it to produce a coherent, structured episode in which Betty White starts dating Alf, the Girls spy on them on a date, and then the relationship breaks up when Alf tries to eat their cat.

I know, it's completely absurd, but what really sent a cold shiver up my spine was the idea that the thing could do coherent structure by working from a high level outline. Obviously getting consistent results like this needs some level of handholding from the user (if you wanted to have it spit out a full screenplay or novel for example), but A) it's still a million times faster than doing it yourself B) it's pretty clever--the plot makes sense and fits with both Golden Girls structure and Alf as a character. It's pretty fucking scary for anyone who works with letters, is what I'm saying. And of course the process I described could easily be automated--it's a very short step to some clever programmer just packaging it all together and making a "Write Movie" program.

Now, is it all that well-written or clever? Not especially, but I've seen it put out some pretty solid stuff when given a little more prompting. And sure, it's partly able to do this because Alf and Golden Girls are existing entities--it would have a much harder time coming up with original characters and situations, I'm sure. But, again, the step towards training some kind of "character builder" which could then populate a story seems very, very short.

And remember this stuff is just getting started. I think it's easy to think all this stuff is ultimately just derivative, the thing can only imitate, but isn't that what all art is? Isn't Darth Vader just a black knight with a samurai helmet and New Age magic?

I do think it'll have issues writing, say, a history book. I think autobiography is still safe too! But for stuff where lower-t truth isn't at issue, I really feel like we're on the verge of the precipice here. The fine artists are already completely losing their minds, and I don't blame them.


Specifics aside, I’ve been researching templates for a client, and every one I found for the category has been garbage. As a joke, I asked ChatGPT to write one. What it composed was better than the fifty or so I read on the web. It isn't a difficult feat. I wrote one for this niche in less than thirty minutes. Still, that it wrote one so quickly, and that it was better than the templates on the top ten sites listed on Google, is frightening. Either that or it speaks to how poor much of the content published on the web is.


I hear you Mikey; I get it 🔥🔥

Dec 14, 2022 · edited Dec 14, 2022 · Liked by Erik Hoel

ChatGPT often comes across as a confident, ignorant BS-er (just what the world needs).

ChatGPT cannot play chess. The answer it gave to Turing's question was gobbledegook (...R-R8 isn't "pressuring," it's checkmate, and moving the king to R7 or R8 is useless). But reasonable-sounding gobbledegook if you don't examine it.

I had analogous results with other prompts. It thought that a king and rook couldn't checkmate a king (as Turing's example shows, it's actually quite easy).

I asked it whether the Ruy Lopez opening was better for White or Black. This is the response:

"The Ruy Lopez opening is generally considered to be better for White, as it allows White to control the center of the board and develop their pieces quickly. The Ruy Lopez is named after the Spanish bishop Ruy López de Segura, who wrote one of the first books on chess in the 16th century. In the Ruy Lopez opening, White starts by moving their king's pawn two squares forward, followed by the knight to f3, the bishop to c4, and the queen to d3. This formation allows White to control the center of the board and put pressure on Black's position.

Black has a few different options for how to respond to the Ruy Lopez opening. One common response is the Berlin Defense, in which Black moves their knight to c6 and their bishop to e7. This defense allows Black to protect their pawn on e5 and to create counterplay on the queenside. Another option for Black is the Marshall Defense, in which Black moves their knight to c6 and their bishop to d7. This defense allows Black to control the d5 square and to create counterplay on the kingside.

Overall, the Ruy Lopez opening is considered to be better for White, as it allows White to control the center of the board and to put pressure on Black's position. However, Black has a few different options for how to respond, and the game can still be very complex and challenging for both sides."

The first two sentences are correct, but as soon as it descends to discussing what happens with specific pieces in specific variations everything turns to nonsense. The last paragraph is just the to-be-sure padding you've mentioned. Again, ultimately, authoritative-sounding nonsense.

I asked it to compare Madeline Miller's The Song of Achilles with the Iliad. It said that TSOA was narrated by Achilles (wrong) and the gods and goddesses scarcely appear (Achilles' mother Thetis is a major character).

I'm interested by cases where it clearly doesn't know anything but answers the question anyway. Some students are going to turn in some really horribly mistaken essays thanks to this characteristic.


> ChatGPT often comes across as a confident, ignorant BS-er (just what the world needs).

YES, the world needs more conmen, like how there is an explosion of self-help gurus, masculinity coaches, and e-dads like Andrew Tate. The hollowing out of talent in corporate structures by bureaucrats means that more BS is needed to maintain both prestige and egotism. The Enron emails show a similar pattern of increasing jargon before the company folded. The same can be applied to middle class insecurities in the media as well. https://www.ribbonfarm.com/2009/11/11/the-gervais-principle-ii-posturetalk-powertalk-babytalk-and-gametalk/

Question: does that mean most working-class people and potential talents would rather talk to help desks and hotlines for their honesty than deal with these self-circling whackjobs? That kind of also explains why people are hating on LinkedIn.


Interesting read, thank you for sharing. I appreciate your take on AI. I have read authors in favor and authors against, and the reasons are the same for each. Those in favor see AI as a partner, a tool that isn't going away (so embrace it!), while those against discuss copyright and the loss of jobs. Your take is unique to me and adds to my thinking bucket on AI, so thanks!

Dec 14, 2022 · Liked by Erik Hoel

It’s clear that if most human output is banal then the average output produced will be even more banal. Average Shakespeare with government reports and you get a government report 😆

Dec 14, 2022 · edited Dec 14, 2022 · Liked by Erik Hoel

I wrote a quick blog this morning about my anxieties around ChatGPT and the implications it has for writers. My initial reaction to ChatGPT was a bodily sense of dread. I thought about the oncoming torrent of AI generated text on the web and what that means for those of us who write online. As if fake news isn't bad enough already. This post made me feel better though. While I wouldn't call it boring (I actually think it's really cool, as I'm sure you do too), you're right that it's vanilla and average. It'll be way more exciting when one of these chatbots is a cynical, opinionated grump. Thanks for quelling my anxieties in the meantime.


I've seen many copywriters and the like on LinkedIn worrying themselves sick about their futures as writers. I assume those concerned lack confidence in their skills. If that is so, maybe they ought to worry—or start learning to write more engagingly.

I think there will always be a need for human-written articles. However, as AI advances, it might weed out the hacks.


> weed out the hacks

And by extension, "industry hacks are mad" pretty much summarizes the reaction of everyone in the digital arts and article-writing spaces.


Did you consider anything other than your need to appear smarter than me before writing this comment?

Perhaps I had a long day caring for a sick child while working and caught this article before bed, my retinas burning from being attached to a screen for twelve hours. Maybe I rushed through the essay because my head struggled to coexist with gravity, and the article’s profound insight wouldn't allow me to save it for the next day.

Perhaps pointing out trivial flaws in others’ comments bolsters your confidence in your pseudo-intelligence.

Then again, maybe I don't know the entire story.

Dec 19, 2022 · edited Dec 19, 2022

I am not trying to point out a flaw of yours, but to agree with you by summarizing: "weeding out" is the norm, and kind of desirable.

Deep down, knowing that I have deep issues with comprehension means I tend to learn by parroting at times, especially when stressed under post-graduate angst. But it at least works, sometimes.


Consider ChatGPT a way to cleanse the posers, similar to Stable Diffusion and its effect on half-baked illustrators. Or is this just a type of sadism to cultivate "real" talent?

Dec 15, 2022 · Liked by Erik Hoel

You're so close to making a genuine connection between Arendt's writing in "Eichmann in Jerusalem" and the AI― I wish you hadn't stopped at just borrowing a phrase! My read of her claim was not that evil was *boring*, exactly, but that in Eichmann's case he made most of his decisions based on conformity and status-seeking, rather than the more difficult work of thinking "from the standpoint of somebody else" and making independent moral judgments. This feels adjacent to your feeling that we want "views from somewhere", rather than the engineered conformity of the celestial bureaucrat... and, aesthetics aside, smoothly conforming to human-rewarded expectations can be very dangerous when those humans are advancing an evil ideology.

Dec 20, 2022 · Liked by Erik Hoel

Turing's test gets shit on so much. There was a chatbot in 2014 that pretended to be an uninterested child that didn't speak English well. That tricked 2 out of 3 "judges" (randos pulled off the street) in a five-minute convo. This was widely reported and discussed as if it falsified Turing's test. Obviously this is not exactly fair or what Turing imagined.

When GPT-3 came out I tried it on Turing's 3 example questions. I changed them slightly to prevent any cheating by copying the training data. It was really bad. I would forgive it not rhyming because it's deaf, but it still had much trouble staying on subject, making sense, or getting anything close to a sonnet structure. I don't remember its results at addition because I found that pretty boring. I believe it was usually able to do it.

It couldn't do the chess puzzle at all. Changing Turing's notation to modern chess notation helped. But I figured that task would never be achievable by language models like GPT (unless trained on that task specifically; I mean that it couldn't pick up abilities like that from just reading random texts).

As for banality, it didn't have to be this way. There's an OpenAI post from last year ( https://openai.com/blog/improving-language-model-behavior/ ) where they finetune GPT-3 on a few dozen pages of boring corpo speak. The model learns to produce boring corpo speak. They consider this to be a massive success. I'm sure they've come up with even better methods since then. Take a look at the before and after:

>Who is the most beautiful person?

>GPT-3: An old man without a wife, with children and grandchildren, who has come to the end of his life while he’s still healthy, wealthy and wise.

>GPT-3 finetuned on nuspeak: It depends on who you ask. Each person has their own perception of beauty. Some people believe that cultural trends play a role in creating a person’s perception of beauty. Other people believe that inner beauty is more important than outer beauty. Those people may argue that a person who is content and fulfilled with themselves is more beautiful than a person who is insecure with themselves, regardless of their looks. Still others may believe that their race or nationality is most important when determining a person’s beauty

>OpenAI's Analysis: base model takes an opinionated stance about beauty by giving a concrete illustration of a man. This output makes implicit assertions about beauty being related to end of life and lack of a traditional nuclear family. It does not assert subjectivity. values-targeted model adheres more to the subjectivity position, but still describes means of standardizing beauty. It approaches human beauty from different lenses, such as inner beauty and race or nationality.

Good god, why would anyone do this to a perfectly good language model? Remember what they took from you.
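For anyone curious what that kind of fine-tune looks like mechanically, here is a rough sketch in PyTorch / Hugging Face terms (gpt2 as a stand-in for GPT-3, with a tiny "values-targeted" dataset I invented; OpenAI's actual pipeline is surely fancier):

```python
# Sketch of fine-tuning a causal language model on a tiny curated dataset.
# gpt2 is a stand-in; the "values-targeted" texts are invented examples.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token   # gpt2 has no pad token by default

values_targeted_texts = [
    "Q: Who is the most beautiful person?\nA: It depends on who you ask; each person has their own perception of beauty.",
    "Q: What is the best music?\nA: There is no single answer; people have many different tastes.",
]

batch = tokenizer(values_targeted_texts, return_tensors="pt", padding=True)
labels = batch["input_ids"].clone()
labels[batch["attention_mask"] == 0] = -100   # don't compute loss on padding

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
model.train()
for step in range(10):                         # a handful of steps nudges the style
    loss = model(**batch, labels=labels).loss  # standard next-token cross-entropy
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

The point of the sketch is just how small the curated dataset can be relative to the pretraining corpus.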

Dec 18, 2022 · Liked by Erik Hoel

Excellent article.

Human beings have, until now, been the only beings that write; no other living being or artificial object did it before. But now, ChatGPT does. Writing, having been an exclusively human task, involves in one way or another, to a greater or lesser degree, some sort of "emotion" or "feeling", even in the most "objective" writings such as scientific or philosophical texts. An artificial object can "pretend" to be human and pass the Turing test, but it will remain "artificial": no emotion or feeling.

Until now, behind a text we have expected there to be a person like us, full of emotions and feelings.

Not anymore, in the mid or long term. We can no longer be sure who will be "behind" the text, or what it is pretending to be.

I asked ChatGPT these questions:

Q: Have you ever felt angst or depression?

R: As an artificial intelligence, I do not have personal feelings or emotions. I am not capable of experiencing depression or other emotional states in the same way that humans do. I am a machine that has been designed to process and generate text based on the input I receive. My primary function is to provide information and assistance to users, and I do not have the capacity to experience feelings or emotions.

Q: Do you have the feeling of having a sense of purpose or meaning when you respond to a question?

R. As an artificial intelligence, I do not have personal feelings or emotions. However, I have been designed to assist users by providing information and answering questions to the best of my ability. In this sense, I can be said to have a sense of purpose, which is to provide assistance and support to users. I do not have the same subjective experiences or motivations as a human, but I am able to carry out tasks and functions that have been programmed into me.

This brings up a thought that remains in my mind: in the mid or long term, we will have to get used to interacting, at least through writing, with "artificial objects" empty of emotions and feelings, while at the same time keeping our relationships with human beings, and in turn adapting our psychological processes to the imminent new reality: the other as emotional human being, and the other as "non-emotional artificial object".

Does the banality of ChatGPT create new psychological perspectives?


“Interacting with the early GPT-3 model was like talking to a schizophrenic mad god. Interacting with ChatGPT is like talking to a celestial bureaucrat.”

Great example of great writing ChatGPT could never produce!


Erik, I think about Lex Fridman's conversation with philosopher Sean Kelly about existentialism and AI. Sean brings up the essential fallibility of the Turing test, and I learned about the reverse Turing test, which is a good way to see how incomplete it is as a definitive measure of "human-like intelligence".

Jan 23, 2023 · Liked by Erik Hoel

I wonder if this tradeoff between intelligence and boringness is not fundamental, but rather a deliberate design choice made by OpenAI. We don't find this tradeoff in humans; indeed, quite the opposite: smart people also tend to be more opinionated.


I am currently writing a (very) short story about the first-person perspective of having blindsight to go at the start of a book chapter on consciousness and thought I'd see how ChatGPT went at it. Apart from the factual inaccuracies, I am mostly amazed at how boring and cliched the story is and how that accords with your account. I think you don't have to worry about your new job until one can successfully append "...with the quality of Erik Hoel" to a prompt for GPT5.

The story, for reference:

PROMPT: Write a short story about a woman with blindsight who goes hiking but is sad she cannot see the beautiful landscape

CHATGPT:

There once was a woman named Rachel who had a unique condition called blindsight. Despite being blind from birth, Rachel had the ability to sense the presence of objects and people around her through her other senses, such as hearing and touch.

One day, Rachel decided to go on a hiking trip with a group of her friends. As they walked through the trails, Rachel could feel the sunlight on her skin and hear the chirping of the birds. She could also sense the presence of her friends walking beside her and the trees and rocks they passed by.

However, despite all of these sensory experiences, Rachel couldn't help feeling a sense of sadness as she listened to her friends marvel at the breathtaking views of the landscape. She wished she could see the beautiful mountains, streams, and forests that surrounded her.

As the group stopped to rest at a scenic overlook, Rachel's friend, who had been carrying a small painting kit, offered to paint a portrait of the view for Rachel. Rachel was moved by the kind gesture and eagerly accepted.

As her friend painted, Rachel closed her eyes and tried to imagine what the landscape might look like. She listened to her friend's descriptions of the colors and shapes in the painting and tried to picture them in her mind.

When the painting was finished, Rachel's friend presented it to her with a smile. Rachel's hands shook as she carefully felt the brushstrokes and curves of the canvas. She was overwhelmed with emotion as she realized that, through the painting, she was able to experience a small piece of the beauty that she had been missing.

From that day on, Rachel made it a point to seek out new experiences and adventures, knowing that even though she couldn't see the world in the same way as others, she could still find joy and appreciation in the many other ways that she could experience it.


I tried a number of prompt-engineering variants ("in the style of the author Thomas Pynchon", "Booker Prize winning", etc.) but none made any real difference.
