You have a clear mission and the Times will allow you to reach it. No fault in that.
The NYT has a much, much wider audience, and for all we know it might have a more persistent archive into the future. I loved the piece. I especially treasure the phrase "cognitive micronutrient." It's an example of what an A.I. could never come up with: a tiny linguistic seed that takes root in human minds to yield who knows what kind of fruit (some of which will hopefully be shared on Substack).
Yoke 😜
hahaha and THAT is why a copy-editor is sometimes useful. And to be fair, I usually have more than a morning to write a post; they pulled the trigger last night and it was all a rush... another issue with writing for pubs.
Congrats on the essay. Today's episode of Pedantry with Pete: the English word "subjugate" comes from the Latin, sub iugum mittere, or "to send under the yoke." A defeated army had to pass between two facing rows of soldiers with their spears extended in an arch. This was deeply humiliating.
Interesting - what's funny is that line is a throwback to an older piece, when an editor said to me (when I was younger) that "everyone, from Noam Chomsky to Sheldon Glashow, submits to the yoke" because they were mad I threw a fuss about how they had butchered the piece. https://www.theintrinsicperspective.com/p/writing-for-outlets-isnt-worth-it
Came here to mention that "editorial yolk" is an excellent typo.
Also, come on, man, ease up on the self-flagellation. Ask the questions, "Is there one platform that is better than all other platforms for disseminating ideas? Is there one publishing format that is better than all other publishing formats?" They are self-answering. Congrats on the piece!
Just for your information: I read your article in the NYT today, had never heard of you before, but found the article very relevant. I then found you on Substack and signed up as a subscriber.
Always great to hear things like that, thank you
Same here
And by the way CONGRATULATIONS!
And after reading it, THANK YOU
You were supposed to destroy the NYT, not join them!
I'll probably heartily enjoy your article in a few hours once I'm freed for the weekend. That said, in the short term I remain pessimistic about watermarks. Open source models will likely offer the ability to either strip or obfuscate watermarks for deception or plausible deniability, respectively, and that will mean company-controlled models will be the only "trustable" ones; yet who watches the watchers, and how will we know if these companies abuse our trust in their watermarking technologies? Today the word fascism is drenched in the flavor of the right wing but it was originally based on a melding of state and corporate power, a modern command economy that could be wielded by either political persuasion if desired. I dread a future where the only two options are technofeudalism or complete chaos.
Erik
I wouldn’t have discovered you or Substack if not for your NYT article. I have read lots of articles on AI and on trying to use it in behavioral health, but yours was the first one that scared me. Great work!
I agree with the concern behind your article and I'm glad you raised it in the New York Times. AI pollution of the commons is a real problem and needs to be discussed by a wide audience. However, the solution described in your piece is all based on an assumption that strong watermarking is possible. I believe that assumption is false.
You said that "major A.I. companies are refusing to pursue advanced ways to identify A.I.’s handiwork." This is untrue. For example, OpenAI hired Scott Aaronson for a year to work on this. As he says in his blog post (https://scottaaronson.blog/?p=6823), his "main project so far has been a tool for statistically watermarking the outputs of a text model like GPT." Other labs have pursued similar efforts as well. It's not that they haven't tried; it's that no option is particularly good. I suggest to you that no technique both works well and can't be easily broken. You can see this in some papers that have come out recently (https://arxiv.org/html/2310.07726v2, https://arxiv.org/abs/2305.03807), but you don't need to scour the literature to know that this is true.
Some simple techniques would indeed work against the least sophisticated actors. But it's not going to take an expert to break these watermarks; a motivated college essay writer with access to Google will be able to.
And then we get to something you didn't mention: the cost of an inaccurate method. Whether it suffers from false positives or false negatives, in many ways it could be worse than no method at all.
I challenge you to come up with a watermarking system I can't break. I mean this challenge honestly, and I would very, very much like for you to succeed. Since this system would have to be deployed widely, you would have to let me test against it repeatedly (e.g. via API), and I would still have to be unable to break it. I don't think you can. It's not about me or you or Scott Aaronson; it's that the task is impossible.
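For anyone wondering what "statistically watermarking the outputs of a text model" actually looks like, here is a minimal sketch of the detection side of a green-list style scheme in the spirit of the watermarking papers linked above; the key, hash, and numbers are illustrative assumptions, not OpenAI's actual method.

```python
# Sketch of a "green list" watermark detector (illustrative only).
# At generation time the model would be nudged to prefer "green" tokens,
# chosen by a keyed hash of the preceding token; detection just recounts them.
import hashlib
import math

def is_green(prev_token: str, token: str, key: str = "example-key", frac: float = 0.5) -> bool:
    """True if a keyed hash of (previous token, current token) lands in the green fraction."""
    h = hashlib.sha256(f"{key}|{prev_token}|{token}".encode()).digest()
    return int.from_bytes(h[:8], "big") / 2**64 < frac

def detect(tokens: list[str], key: str = "example-key", frac: float = 0.5) -> float:
    """Z-score for 'more green tokens than chance'.
    Unwatermarked text hovers near 0; watermarked text grows with length."""
    hits = sum(is_green(prev, tok, key, frac) for prev, tok in zip(tokens, tokens[1:]))
    n = len(tokens) - 1
    return (hits - frac * n) / math.sqrt(frac * (1 - frac) * n)

# Ordinary (unbiased) text should score near chance:
print(detect("the quick brown fox jumps over the lazy dog".split()))
```

The break-it challenge above then amounts to perturbing the text enough to push that score back toward chance without wrecking the writing.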
I appreciate this, since this is actually the substance of the debate. Broadly I think two things.
First, I think watermarking techniques do clearly work. The papers you cite contain some examples of these methods. But they require interfering with the output side, and my guess is not that the companies can't implement them but that they don't want to. They mostly want, in other words, not to bake detectability into outputs - but if you don't do that, then I agree it's necessarily impossible. Scott himself says "we actually have a working prototype of the watermarking scheme" with respect to OpenAI.
Second, I admit that very advanced countermeasures could be used. I am mostly unconcerned about them. This is because (a) I don't think the frame of "the adversarial side has all the advantages" is true. It looks a lot more like a standard defense/offense arms race, and there are all sorts of defensive moves, like rotating ciphers, that will likely work extremely well. And also (b) even if there always exist some advanced offensive techniques, these, as they get more advanced, will be used less and less.
Let me give one example of this. The easiest method to avoid detection in text would be to ask another AI for a rephrasing. So you might ask Claude to rephrase ChatGPT (this is what Scott brings up). But if both are baking in detection methods, then all you've done is swap the ChatGPT marker for Claude's! So you are really relying on finding obviously sketchy open source countermeasures... and I think people will mostly not do that. If such open source offensive techniques are illegal (and there are various arguments for that; it's pretty sketchy ground to just remove watermarks from company products and repackage them as your own), that tiny minority becomes vanishingly small.
Let's use your example of a smart motivated college essay writer. Right now they use ChatGPT to write essays. They care a lot about their career. They hear that now there is a watermark. They can't go ask another big company AI, because they're all doing it. Maybe they can find www dot removewatermarks dot com maintained by a sketchy Russian website, but will they bet their grade, in fact their entire future success, on that? Probably not.
This impassioned comment just boils down to the assumption, which the first commenter already stated, that it’s impossible to add watermarks without totally nerfing function. Given that Claude can write a paragraph about any subject under the sun where the first letters of each sentence spell out AI GENERATED, I don’t think this is the case. As for using open source for jailbreaking - that assumes the models will be just as advanced and just as used. Two more assumptions I strongly disagree with, on both counts. History doesn’t really repeat, and saying that I’m on the side of the people who drove Swartz to suicide is just overly emotional purple prose.
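(To make the acrostic example concrete: that kind of crude marker is trivial to verify on the reading side. A toy check follows, with a sample paragraph invented purely for illustration.)

```python
# Toy check of an acrostic marker: do the first letters of each sentence
# spell out "AIGENERATED"? The sample text below is invented for illustration.
import re

def acrostic(text: str) -> str:
    sentences = [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]
    return "".join(s[0].upper() for s in sentences)

sample = ("Apples ripen in autumn. Indeed, orchards are busiest then. "
          "Growers sort the fruit by hand. Every crate is weighed. "
          "Nothing is wasted. Even bruised apples become cider. "
          "Ripe fruit ships first. And the rest is stored cold. "
          "Trucks leave before dawn. Each route is planned. "
          "Demand peaks near the holidays.")

print(acrostic(sample))  # -> AIGENERATED
```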
Well, that NYT piece at least got you one new subscriber: me!
Even though it may feel like turning back to the dark side, I actually think it's a huge step forward to have writers who predominantly post on Substack write articles for the major news corporations. Although those corporations have lost tons of viewership and credibility, they will inevitably stay around and continue to be read by many people, even if out of spite. Having credible and critical authors with a new perspective write for them, outside of their primary quorum of bandits, opens a lot of doors back to non-propagandized media.
Yes
Was just thinking of you as Sam Harris and Robert Sapolsky went on and on in Sam’s podcast with “downward control” tropes about causal supervenience, which seemed to call for a reading of your take. Pressure doesn’t need to “control” the billiard-ball atoms to be the right level of description, is maybe how I internalize it, but I too could stand another reading.
I feel torn writing this, since I support you and it sounds like I'm being nitpicky for no good reason, but 'yolk' is for eggs. What you struggle under is their editorial 'yoke'. This article's topic is necessary and important and I agree with you, and yet a publication with editors and proofreaders would not have this in the opening sentence. I almost want to cry.
It's an endemic problem of a lack of seriousness on Substack. As a passionate supporter and reader of Substack authors, who I am proud to support with my subscriptions, I struggle every day with how much it bothers me that all of you desperately need to hire a proofreader. I can't count how many I've unsubscribed from because they cannot provide this modicum of respect to paying readers.
I commented on Freddie DeBoer's blog asking him to hire a proofreader when he wrote a ruinously confusing passage, one that could have been fixed with a simple grammar correction, about an issue on which he and I agree and which was dearly important to me. He accused me of being a jealous writer with a failed writing career. I'm not. I'm just a reader and a paying customer who cares about attention to detail in thinking.
That's all fine, but this was already addressed in a previous comment. It's a reference to a previous piece about editorial yokes, but somehow it got transformed into 'yolk' in the rushed writing of this one. Most of the time my pieces go through a proofreading process. This one didn't, because the Times decided to go out Friday morning after saying Saturday, and it involved a massive behind-the-scenes rush (I had to skip my own kid's bedtime, which I am still sore about). I am sorry that the terminology was too egg-like for your liking.
Keep in mind, regular publications are not immune to this. When the Times piece was about to go out, I found three different mistakes in their own edits that were similarly embarrassing and had them make last-minute changes. So it's not really a Substack-specific problem; it shows up everywhere. You'd be shocked at the number of stealth edits publications do to correct such problems. It's very easy to read over word swaps like that, or missing words, even for experienced editors.
I, too, have noticed this from many writers. Erik has been one of the lesser offenders, for sure. And to be fair, I am not immune to grammatical or spelling errors myself, so I try to extend some grace to those who do not have the luxury (financial or time) of hiring someone to proofread for them.
With that said, is there a way to read Erik’s article without rendering a hard-earned dollar to the organization that ran off my journalistic crush, Bari Weiss?
I heard from a little bird that the internet archive usually scoops up Times pieces and the Wayback Machine is searchable by link.
Erik, what can I say?! I/we readers have become--how should I put it--'anesthetized'?--to your work. Almost boringly brilliant every single time. To step out of your lane of focus, consciousness, into the realm of NYT audiences, with 120 comments and counting, only attests further to your unending creativity. My only remark on The Intrinsic Perspective: more of your great charts & graphs, please! I come from the same habit, too many words when a picture says more--and often better....
Congratulations, Erik! I’m a fan of your writing and I’m glad to see you get in front of more people.
I work in AI, and I’ve written a couple of books (and a substack!) about it. While the idea of text watermarks makes sense, the unfortunate truth is that it’s trivial to overcome them: even if all the big companies added watermarks to their output, a user can just paste that output into a tool that subtly reworks the piece to remove them. Additionally, watermarks don’t work well for shorter textual content, like the peer reviews and social media posts you mention in the Times article. Further, “AI detection” models have a huge false positive rate.
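(A rough sketch of why length matters for any statistical watermark, assuming a green-list scheme with a 50% green fraction and a generator that manages 70% green tokens; both numbers are illustrative assumptions, not any company's real parameters.)

```python
# Back-of-the-envelope: expected detection z-score versus text length,
# under assumed (illustrative) watermark parameters.
import math

def expected_z(n_tokens: int, green_frac: float = 0.5, biased_rate: float = 0.7) -> float:
    """Z-score if the generator emitted `biased_rate` green tokens instead of the chance rate `green_frac`."""
    excess = (biased_rate - green_frac) * n_tokens
    return excess / math.sqrt(green_frac * (1 - green_frac) * n_tokens)

for n in (20, 50, 200, 1000):
    print(n, round(expected_z(n), 1))
# ~1.8 at 20 tokens, ~2.8 at 50, ~5.7 at 200, ~12.6 at 1000.
# If "confident" means z >= 4, a tweet-length post never clears the bar,
# while a long essay does easily -- unless a paraphrase pass scrambles the tokens.
```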
As with most technological problems, there is no easy fix that involves piling on yet more tech. The only real solution involves societal hard work. We need to build a solid system of standards and trust that help us distinguish between garbage content and the good stuff. Ironically, I see there being a big role here for traditional publishing brands (hello NYT) and self-publishing networks that host personal recommendations (hi Substack!), along with the scientific journals that are clearly not doing their (only, and overcompensated) jobs.
Professional standards and social proof can help us—and the search engines, plus the people who train models—distinguish between high and low quality content. Technological solutions won’t work.
AI devices and programs cannot police themselves. However, if a piece is AI-generated, that could and should be indicated at the very beginning. Most companies will not want to do so. Credibility comes from belief in authenticity. "Erik Hoel" essays will be generated by AI. Will every author need to read everything "penned" in his or her name to uncover non-human-generated material? Can a sophisticated AI program be used to detect AI-generated essays? How ironic. Misalignment seems to be inherent in neural net entities. Can Substack prevent such non-authored material from appearing?