OpenAI just admitted it can't identify AI-generated text. That's bad for the internet and it could be really bad for AI models.

L4sBot@lemmy.world · 3 years ago

OpenAI just admitted it can't identify AI-generated text. That's bad for the internet and it could be really bad for AI models.

cerevant@lemmy.world · 3 years ago

Here’s the thing though: the probabilities for word choice come from the data the model was trained on. Someone who uses a substantially different writing style or word choice from the LLM could easily be identified as not being the LLM, but someone with a similar writing style might be indistinguishable from it.

Or, to oversimplify: given that Reddit was a large portion of the input data for ChatGPT, all you need to do is write like a Redditor to sound like ChatGPT.
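
To make the point concrete, here is a minimal sketch of a probability-based detector. It assumes a toy unigram model stands in for the LLM; real detectors and real models are far more complex, and this is not how OpenAI's classifier works. The corpus, the threshold, and the names `train_unigram`, `avg_log_prob`, and `looks_generated` are all made up for illustration.

```python
import math
from collections import Counter


def train_unigram(corpus: str) -> dict[str, float]:
    """Estimate word probabilities from a corpus with add-one smoothing."""
    words = corpus.lower().split()
    counts = Counter(words)
    vocab_size = len(counts) + 1  # +1 reserves a slot for unseen words
    denom = len(words) + vocab_size
    probs = {w: (c + 1) / denom for w, c in counts.items()}
    probs["<unk>"] = 1 / denom  # probability given to any unseen word
    return probs


def avg_log_prob(text: str, probs: dict[str, float]) -> float:
    """Average log-probability per word; higher means 'more like the model'."""
    words = text.lower().split()
    return sum(math.log(probs.get(w, probs["<unk>"])) for w in words) / len(words)


def looks_generated(text: str, probs: dict[str, float], threshold: float) -> bool:
    """Flag text whose word choices the model finds likely (hypothetical rule)."""
    return avg_log_prob(text, probs) > threshold


# Toy stand-in for the training data (an assumption, not real data).
TRAINING = (
    "the model is trained on the data and the data is "
    "the thing the model learns from"
)
probs = train_unigram(TRAINING)

# Two human-written sentences: one in the training data's style, one not.
similar_human = "the data is the thing the model learns from"
different_human = "verily my quill doth scratch peculiar sentences tonight"

THRESHOLD = -3.0  # arbitrary cutoff chosen for this toy example
print(looks_generated(similar_human, probs, THRESHOLD))
print(looks_generated(different_human, probs, THRESHOLD))
```

The detector only sees how likely the word choices are under the model, so the human who happens to write like the training data gets flagged, while the human with an unusual style does not.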