This is technically true but misleading. LLMs were trained on human-written text, but they were post-trained to generate text in a particular style. And that style does have some common patterns.
Examples of LLM-style text: short, punchy sentences; negative parallelism ("not just X, it's Y"); bullet points, especially with emojis and bolded text; and overuse of em-dashes.
It's one thing to observe "LLM-generated writing all looks the same". Whether the LLMs were all post-trained the same way is a different question.
I don't agree that "everyone says everything is AI". Do you have examples where a consensus of people accuses something of being AI-generated when it lacks those indicators?
I consistently get accused of AI writing, but I’m not really sure why. I use spellcheck, that’s about it. Although I am a fan of an appropriately used em-dash, I don’t really follow the other patterns. I also find that people say that as a form of character assassination though, literally denying your humanity lol.
Just reading through posts on here about various blogs/posts/opinion pieces, there always seems to be a handful of people who jump to "this is AI". And maybe it is! But the driving force behind this seems not to be to identify that something is AI, but because they spite it so (AI writing), to quickly rule out caring about the material if identified as AI slop.
The problem I see this leading to is plenty of legitimate writing getting thrown away because somebody's online exposure bubble doesn't include a lot of Medium or Tumblr or a certain Discord or whatever bubble where _huge_ groups of people actually are writing in whatever $STYLE the reader and commenter is identifying as AI. Their post then also gets other people to not even look.
> But the driving force behind this seems not to be to identify that something is AI, but because they spite it so (AI writing), to quickly rule out caring about the material.
Your expressed concern is "people don't like AI; this dislike motivates people to dismiss the material".
I think it's misguided to assume motivation.
For myself, I dislike the writing style because it's insincere and inauthentic. If the author isn't motivated enough to write out something in their own words, what's there to motivate a reader?
> The problem I see this leading to is plenty of legitimate written things getting thrown away because somebodys online exposure bubbles don't end up
Do you have any actual examples where legitimate writing was dismissed as written by AI? If not, I'd suggest your concern is hypothetical.
If the people who develop and release these models were all optimizing for the same goals, they could converge on the same strategies or behaviors without coordinating.
I'm one of the unlucky ones who has coincidentally trained myself over the past fifteen years to write in the style that is now largely recognized to be the ChatGPT style— bolded lists, clear section breakdowns with intro and concluding sentences, correct and liberal use of semicolons and em-dashes. The only parts of it I don't do are litter my text with random emojis or directly address the reader with simpering praise.
I mean, that has always been my intention with it— particularly in the context of something like a ticket or design doc where it's critical that other busy people be able to quickly get a high level overview and then scan for the bits that are most relevant or of interest to themselves.
It's just ironic that I've now been asked if I was using AI to write (or punch up) content that I've produced in this style when I most certainly was not.
This is very much our internal newsletter at work, which is actually still written by human hand (and we know it is, she can't stand "using those things").
Please don't post snarky comments attacking other users like this on HN, no matter what you're replying to. It's not what this site is for, and destroys what it is for.
Point taken, but I'd like to raise some serious questions - is the 18 millionth post of someone whining about having to read text written by an LLM that much more of a substantive contribution?
Is it "thoughtful criticism" to have the same pedantic complaint made everywhere?
Is offering zero feedback to OP other than whining about the presence of LLM-written text in a README not a "shallow dismissal"?
What about "Please don't complain about tangential annoyances—e.g. article or website formats, name collisions, or back-button breakage. They're too common to be interesting."?
Or is snark the only rule that matters enough to warrant reminders about rules?
Sorry for inappropriately handling the frustration I get with this kind of repetitive, shallow, pedantic, no-value-add whining, which clogs up HN any time LLM-generated text accompanies any part of a featured article or link, and which never gets the same kind of warnings or moderation.
The people who make these kinds of complaints need to accept that LLM-generated text is a fact of life now - even in (or perhaps especially in) interesting technical projects. We all heard their complaints the first time, and the fiftieth time, and the five thousandth time - those complaints added no value to the relevant discussions then and they add no value to the discussion here. It's just bullies taking advantage of the latest snobby, belittling way to shit on other people's work over what amounts to little more than subjective cosmetic preferences.
A tiny CPU-only TTS model is awesome. Why is it appropriate to derail the discussion about the actual technical innovation here with a low-effort complaint that's so common it has become a trope?
We have frequently asked users not to make public accusations of posting LLM-generated content, so much so that several of the most engaged HN users routinely flag these kinds of comments and post their own reminders not to do it, and email us to let us know about it.
That is the right way to deal with undesirable activity on HN: flagging, emailing us so we can take action, and if replies are to be posted, expressing them with respect and kindness as the guidelines ask of us.
The problem with a reply like yours – one that is much worse than an already-bad comment – is that it becomes the highest-priority comment for us to respond to, and makes it harder for us to deal with the original comment with our normal approaches.
Indeed, the blurb is absurd and very off-putting. It's not a big deal that "It clocks in at under 25MB with just 15 million parameters", because text-to-speech is a long-solved problem; in fact, the Texas Instruments Speak & Spell from 1978 (half a century ago FFS) solved it, probably with a good deal less than 25MB.
I've re-upped that thread to the same position the previous discussion (this one) was at.
Here is the link to our repo: https://github.com/KittenML/KittenTTS
This is an ouroboros that will continue.
(Not saying this is or isn't, simply that these claims are rampant on a huge number of posts and seem to be growing.)
Because, well, there's a huge number of models. Are they all, as they say, "in cahoots"? (working together, clandestinely)
This is a good list: https://en.wikipedia.org/wiki/Wikipedia:Signs_of_AI_writing
It’s not slop — it’s inspiration!
It seems like a disaster, frankly.
And yes, I'm not writing a research paper, I'm posting a comment. Full disclaimer for those not paying attention: this is an opinion.
One: https://news.ycombinator.com/item?id=44807103
Two: https://news.ycombinator.com/item?id=44807541
No human comments on meta formatting like that outside the deepest trenches of Apple/FB corporate stuff.
Is that tested and proven or just gut feeling?
If you wouldn't mind reviewing https://news.ycombinator.com/newsguidelines.html and taking the intended spirit of the site more to heart, we'd be grateful.