OpenAI just admitted it can’t identify AI-generated text. That’s bad for the internet and it could be really bad for AI models.::In January, OpenAI launched a system for identifying AI-generated text. This month, the company scrapped it.

  • cerevant@lemmy.world
    link
    fedilink
    English
    arrow-up
    1
    ·
    3 years ago

    Here’s the thing though - the probabilities for word choice come from the data the model was trained on. While someone that uses a substantially different writing style / word choice than the LLM could easily be identified as being not from the LLM, someone with a similar writing style might be indistinguishable from the LLM.

    Or, to oversimplify: given that Reddit was a large portion of the input data for ChatGPT, all you need to do is write like a Redditor to sound like ChatGPT.