79 Comments
Marcus Seldon's avatar

This isn’t a perfect analogy, but I’m curious what you think of it as it came to mind while reading your piece.

LLMs are to white collar labor as factories are to blue collar craftsmanship.

Both can produce ok but fairly generic products very easily and cheaply, and often that is good enough. But in many cases it’s also not good enough. A skilled craftsman still makes better furniture than IKEA. And even when we don’t need amazing work, there are many cases where you need a result that isn’t generic.

Another way the analogy works is how we’re flooded with low quality consumer products, and many people in rich countries now struggle with clutter in their homes.

Erik Hoel's avatar

I think it's quite a good analogy. But notably, when it comes to intellectual output, where it differs from physical goods is that only the artisanal stuff matters. Which is why great science is still produced by small teams, and great art is often still produced by one person. These are markets where there's not a lot of benefit to "mass production" at the top.

Greg Billock's avatar

Often, yes. But arguably the most widely consumed art (think movies, music, shows) is produced by very large teams, and it therefore has all the normal procedural features of every human effort that requires large-group coordination.

Samuel R Holladay's avatar

Music is still produced by fairly small teams (bands of fewer than 6 or 7 people, or groups of 5 or fewer singers plus maybe 5 producers).

Film and TV are the same medium, production-wise. The creative above-the-line talent is still fairly small (writers' rooms usually have fewer than 15 people, plus a small group of directors, plus the department heads). The main Hollywood trend is not to reduce costs with AI but to reduce crew costs by shooting in cheap countries that give big tax breaks: fly your talent out to the UK or Hungary and shoot there. You can produce cheap Netflix content on an industrial scale overseas without investing in AI.

Dmitrii Zelenskii's avatar

Well, no. Just no. There are lots and lots of text tasks where being artisanal is optional. Hell, just look at Character.AI: one would _think_ you need a human for their tasks, but actually, despite the occasional repetitiveness, they simply win by accessibility.

(Ironically, that's why I would inertially predict a grim future for all those gem-based Character.AI clones that try to limit generations: the ability to respond on demand, over and over, is Character.AI's strength, and they lose it. Unless, of course, they just count on whales who pay up for as many generations as they need, which is, for better or worse, quite a plausible story.)

Brandon Fishback's avatar

Only the artisanal stuff matters, but there is still demand for things that aren't very artistic. Many Netflix shows are lacking in soul and are still watched.

Culture is definitely worse than it was in 2016, but the big inflection point wasn't AI; it was COVID and its effects on society. Everything has been worse since then.

Liam Riley's avatar

I think the factory analogy is not the best choice, as industrial mass production in factories *benefits* from standardisation, whereas it's kind of the opposite we're after with white collar work and LLMs. Mass production is great for quality control at scale, LLMs not so much.

In this way, a better analogy for LLMs than industrial factories is clothing sweatshops in Asia, in that their value is speed, ease of access, and a low price point, not quality or standardisation. Most factory-made items are of a better standard than LLM output; they actually meet our needs consistently.

Gregory Forché's avatar

I have been saying for a while that we should not be calling them "data centers" but "answer factories," in keeping with your point.

Let’s keep the change to our world away from empty abstractions.

Dmitrii Zelenskii's avatar

I think this is a good analogy, and I think that it predicts where it goes. Nobody in their sane mind goes to a skilled craftsman rather than IKEA if what they need is a chair/table/... rather than a work of art.

Carlos's avatar

Let's run with this. Having non-IKEA furniture or handmade shoes is primarily a status symbol. Similarly, using lawyers who do not use AI (or are good at hiding it) can be a status symbol.

Avi (Firecrystal Scribe)'s avatar

I really appreciate this article. I've noticed that the people I respect the most talking about AI have now moved to reluctantly acknowledging that there is a difference between training AI in skills that are automatically checkable, like passing a unit test or proving a mathematical theorem in a formal language, and skills that are not easy to check, like writing good prose or making a high-quality informal academic argument. Our current methods for training LLMs don't continue to scale for the hard-to-check skills.

The idea that we will soon have automated software engineers and researchers, much less children's book authors, appears ridiculous when you consider this. Until we have another paradigm shift in machine learning, this isn't going to change: no matter how much money the labs pour into bespoke reinforcement learning frameworks and agentic scaffolds, they aren't going to be able to endow the models with human-level expertise in these hard-to-verify capabilities.

Kala Hansen's avatar

I agree with the distinction between skills that are easily verifiable and those that aren’t. In my own work with AI (including tools like ChatGPT), I’ve run into an interesting boundary that touches on this exact problem.

LLMs are very strong at organizing information and reasoning within established logical structures. But in investigative or exploratory thinking, humans often work with possibility spaces first: patterns, intuitions, or hypotheses that aren’t yet provable. When I present those kinds of ideas to AI systems, they sometimes push back because they don’t meet a strict logical threshold.

But that gray zone is often where human insight begins. Writers, investigators, and researchers frequently move through uncertainty, intuition, and pattern recognition before something becomes formally provable. You see this in thinkers like Jung, who argued that a lot of cognition operates through symbolic or subconscious pattern detection before it becomes explicit reasoning.

The real paradigm shift may be in teaching AI how to interact with that exploratory phase of thinking (the messy stage), where ideas are still forming and logic hasn’t fully crystallized yet; it’s a space I’ve found myself thinking about a lot in my own investigations into how these systems interact with human reasoning.

8Lee's avatar

OMG, yes:

> The worst users flood the zone.

As a software engineer this is so apparent, because AI "slop" is so easy to identify: none of it actually works in production. A simple, binary audit that reveals as much as I need but way more than I wanted.

I think this has become the case in many other industries where writing is the core modality. Wait, that might be every industry... what am I talking about. You get a rock, you get a rock, EVERYONE GETS A ROCK.

I can't wait for the midwits to move on to the next iteration of OpenClaw.

Anthony's avatar

Thanks for pointing out this paper on reliability, which is really useful! I've been harping to people for a while that the barrier to economic adoption of AI is reliability, and that AI development should focus on reliable tools rather than unreliable (and uncontrollable to boot) agents.

However, I worry you are being too optimistic (by my metric) in this piece about AI remaining a tool (well, along with a person-simulator). Reliability is increasing, and there is a lot of low-hanging fruit in improving it, because it is not really being measured and trained against yet. And in mathematics and programming (and a bit of physics), where you can generate verifiable training challenges as synthetic data, things are really taking off. Writing is not really one of these things.

Companies are working really, really hard to make agents, pouring more $ and IQ into it than any past human endeavor. We should not bet on their failure, but advocate for a different goal: not AI agents to replace us but trustworthy AI tools to complement and empower us.

Wabi Sabi's avatar

Loved this. Sam Kriss has a very good essay that analyses the typical LLM writing style. He points out that, because LLMs don't operate in a physical environment the way organisms do (btw in my view that means they can't possibly have emotions, and very probably can't experience consciousness either), their metaphors often fall flat, or if you think about them for half a second, don't even make sense. The analogies aren't genuine because they don't really refer to physical reality, they're just words referring to other words. The machines talk, but fundamentally don't know what they're talking *about*.

That's what I think of when I read a passage like 'You don't shine by being the biggest. You shine by being yourself' - it doesn't actually mean anything. Firstly, the first sentence should be 'You don't shine *brightest* by being the biggest'. And secondly, what's the connection between a star "being itself" and literally emitting more light, even in the world of children's book logic? Maybe reading the rest would give me a satisfying answer to that question, but I doubt it.

Admittedly, the average person's writing is as bad as this, if not worse (though they make up for it by, you know, having conscious experiences, feeling things and living lives). But I don't think thinkers and writers that are genuinely creative, original or inventive have anything to fear from the current crop of LLMs. If anything, they stand out more than before.

Dennis Nehrenheim M.Sc.'s avatar

My favorite example of ChatGPT not understanding language is currently this: "I want to wash my car. The car wash is only 300 meters away. Should I walk there or drive by car?" It will tell you and rationalize, with absolute certainty, that you should walk there. LLMs can't and won't ever be able to "read the room" - they can't extract human subtext, calculate reliably, or, as you said, think in 3D.

Brandon's avatar

I just asked Gemini that exact question. Here is its answer:

"Since your goal is to wash the car, driving it there is definitely the way to go.

While a 300-meter walk is a great way to hit your step goal, it makes scrubbing the chassis a bit difficult if the vehicle is still sitting in your driveway. Unless you’ve discovered a way to make your car follow you like a well-trained retriever, you'll need to be behind the wheel.

A few quick tips for such a short trip:

Engine Health: Since the car won't have time to reach its optimal operating temperature in just 300 meters, try to avoid "short-tripping" too often, as it can lead to moisture buildup in the oil.

The "Dry Off" Drive: After the wash, you might want to take a slightly longer route home (maybe 5–10 minutes) to help air-dry the brakes and blow water out of the crevices to prevent spotting.

Would you like me to check the weather forecast for Palmerston North to make sure there's no rain scheduled right after you finish?"

Dennis Nehrenheim M.Sc.'s avatar

That's a good answer. Claude also did better than ChatGPT for me, but ChatGPT got it consistently wrong. LLMs are not deterministic, though: with a slightly different phrasing, Gemini might also get it wrong, and Gemini might get things wrong that ChatGPT gets right.

Rebecca's avatar

Also, I want to point out: a lot of people (normies?) can't tell AI writing from real writing whatsoever.

Rebecca's avatar

When polished, efficient writing becomes cheap and ubiquitous, people may begin to value the signals of real authorship: things like idiosyncratic voice, imperfect structure, even mistakes.

I sometimes think messy writing might become a kind of proof of work (to borrow from crypto) for human artistry and thinking, and I'm wondering if I should just keep my mistakes in my blog going forward. Then I think: but wait, at the same time, the total volume of content today means efficiency and messaging will matter more than ever for mass communication. So we may see a split: highly optimized, AI-assisted writing dominating the mass market, while more distinctive, human writing becomes a marker of taste and authenticity. But at the end of the day, our tastes are going to change: Proust and Dostoevsky, versus the evolution to Stephen King in the '90s, versus tomorrow's writers. We will always return to the classics of the 1800s when we want something good. And as for kids' books, I'm sure Dr. Seuss still rules the day (despite the woke controversy).

John Van Gundy's avatar

“And now here is my secret, a very simple secret: It is only with the heart that one can see rightly; what is essential is invisible to the eye.” — Antoine de Saint-Exupéry, The Little Prince

Unirt's avatar

Very interesting! I wanted to give you an example of an actual AI-related writing improvement that is very important but may be invisible to you. Most people around the world are not native English speakers, and many of them are actually Pretty-Bad-English speakers (me included). Before LLMs, motivation letters from PhD position candidates were often in Bad English; now they are always in Perfect Officialese English. Research papers don't need "native speaker editing" anymore. Of course, it's all horrible to read, but remember this: a person writing in Bad English sounds a bit mentally challenged, while the one writing in Officialese sounds like one with normal intelligence but no sense of humor. It's better. This is a staggering change for the vast majority of people!

Secondly, I'm still hopeful that AI might help us with the actually important stuff, like curing diseases and growing food plants more efficiently. These tasks are not waiting for unusual genius insights; the insights are already there. The problems are in the horribly complex systems, such as human physiology, which are beyond our brains' ability to grasp naturally, whereas AIs may have exactly the necessary abilities, either now or in the future.

Aman Karunakaran's avatar

I liked the piece and agree with most of it. I'm curious, regarding this part:

>In this conservative view, by 2030, the top 100 math papers of the year won’t look spectacularly different from the top 100 math papers of 2020. The top 1,000 papers?

What do you make of Terence Tao saying that AI is already impacting the way he is currently doing math research? I wouldn't have thought to pay much mind to AI's impact on math research were it not for someone like Terence Tao making this statement. I imagine he would likely fall under the top-100 bucket rather than 101-1000, and it feels interesting to note that there are already gains there.

Furthermore, it does seem noteworthy that in the last 6 months, even high level software engineers (say those at big tech companies, AI labs, and trading firms) are all experiencing a boost in productivity (anecdotally from folks I know in such places) from AI tools.

I’m prepared to accept the thesis that math and software engineering are just “different” because 1. they’re verifiable domains which lend themselves more nicely to RL post-training approaches and 2. these are the specific domains AI labs care most to improve and are spending the most money trying to improve. However, I’m curious if you have any different perspective on this

Erik Hoel's avatar

> I’m prepared to accept the thesis that math and software engineering are just “different” because 1. they’re verifiable domains which lend themselves more nicely to RL post-training approaches and 2. these are the specific domains AI labs care most to improve and are spending the most money trying to improve. However, I’m curious if you have any different perspective on this

It's a great question and the answer is I think it's no different from writing. They are just finally experiencing it. We writers were just ahead. We were ahead by years. And we've seen the effects (on books, on blogs, etc). Meanwhile, finally the LLMs can understand what a top mathematician like Tao is even talking about. And so he confronts the same thing the rest of us mere mortals confronted more than half a decade ago. He's shook! Things are changing! Etc. And that's all correct... but it's much more about the Long March finally reaching these outposts than those outposts being actually more affected by it.

Thomas Thorp's avatar

The point of writing is not to produce good books. The point of writing is to produce good writers. We are not Homo faber due to tool use; rather, we are Homo faber because in remaking our world we make ourselves. Sophocles had this pretty much figured out. The choral ode we call the "Ode to Man" (Ant. 370-410) starts off with a list of "wonders" that are man's inventions: sailing, agriculture, domestication of animals, but then interrupts the list of "contrivances" with this (Grene's translation): "And speech and windswift thought / and the tempers that go with city living / he has taught himself . . ." Again, we don't make tools in order to make more tools (we do, but that's missing the point); we fabricate in order to remake ourselves and our world. We write well in order to become good.

Who?'s avatar

You always write thought-provoking pieces, Erik.

I do want to hear, though, in more detail how your thinking on AI risk has changed since 2021 and “We Need a Butlerian Jihad”. Your essays in that period were, in my lights, absolute barnstormers, and your work certainly moved the needle for me on the subject. It felt to me you were exactly right.

Yet in 2026, especially, it’s become apparent that your position has (perhaps) softened substantially. You don’t *appear* to be very worried about the dangers of frontier models, or gradual disempowerment, let alone existential risk anymore. At least, it doesn’t seem to be a focus of your recent writing. Even with this essay and your last Desiderata, your mental updates on the subject seem like they could profit from a whole essay or two.

Personally, I’m not… persuaded that there’s no “there” there. Quite apart from the powerful arguments that have not been settled, the Pentagon-Anthropic spat is staring at us right in the face.

I would be remiss if I didn’t mention one more thing, with tact and respect.

From Bicameral Labs:

“Phase 5: Consciousness Tech

An unknown world awaits: hyper-data-efficient learning, creating (or preventing) artificial consciousness, experience engineering, and much more.”

I have to confess that the idea of unlocking a research path to the creation of artificial consciousness sent a shiver down my spine. I don’t know what to make of it.

You’re a writer I hold in considerable esteem, Erik. Please tell us what’s going on.

Erik Hoel's avatar

Heard.

I will absolutely do a large “How I’ve changed my mind about AI” update.

In brief: I think existential risk is much lower. Disempowerment is also lower. Why? Because we have way more experience with these systems than we did when GPT-4 came out, which is when I most worried about it.

However, that doesn’t mean AI = always good. I think many of the original problems I (and a few others) pointed to on the effect on culture have become glaringly obvious. I also think empowerment in the hands of the few (rather than no humans in the loop) is a major problem. But our pointing out those cultural problems early has had a better track record of explaining the situation than say, looking at existential risk.

And regarding artificial consciousness: it doesn’t mean I want to go full steam ahead on building it! I think rather we need to understand the dividing line. And then we can make actually informed choices to, say, not make conscious AI. I think it’s really important to develop that line because otherwise we’re totally in the dark about both AI welfare but also worry about humans anthropomorphizing.

Susan Knopfelmacher's avatar

What about the rapidly escalating use of LLM chat platforms by adolescents to produce essays, and writing tasks in general, largely without significant intervention by schools to prevent or ameliorate it? At this critical phase in the development of mental schema and the capacity for reasoned thought, this decognition will surely have a cascading effect on the already poor quality of thought and writing in decades to come.

Erik Hoel's avatar

I think this is a much more realistic problem than concerns about existential risk

Susan Knopfelmacher's avatar

Excuse fat finger errors/premature posting!

Steve Place's avatar

This makes me feel SO much better. I get to keep my job!

Becoming Human's avatar

Awesome.

Though I would quibble with your summary of “The Giving Tree.”

It is a parable of codependency and narcissism. ;)

Matt Duffy's avatar

> a prompt is an injection of human intelligence, and a scaffold too is an injection of human intelligence (but the advantage of scaffolds is that you can put a ton of domain-specific knowledge and tips and tricks and guides in the scaffolds...

This makes the writing case quite clear. There is no efficiency gain from building the necessary scaffolding, so nobody does it. If I were to build writing infrastructure, I'd need to make ongoing, unpredictable adjustments. Sufficiently improving the tool is approximately equivalent to doing the work itself.

Kevin McLeod's avatar

Every technology eventually fades then fails. Welcome to the end of language, at least as an arbitrary signal.

"We are not in a glut of good writing. We are in a dearth of it. This is surprising and counterintuitive, because for an LLM, words are its womb, its mother, its literal atoms"