Joanne Jacobs

Catching ChatGPT cheaters is tough: Is it bursty?

OpenAI's tool for catching chatbot cheaters is very unreliable, write Armin Alimardani and Emma A. Jane in The Conversation.


Photo: Startup Stock Photos/Pexels

The company admits that its classifier for indicating AI-written text "accurately identifies only 26% of AI-generated text (true positive) while incorrectly labelling human prose as AI-generated 9% of the time (false positive)," they write.


Edward Tian, a Princeton computer science major, came up with a more promising option called GPTZero over his winter break. His app analyzes "perplexity" (how predictable the text is to a language model) and "burstiness" (how much that predictability varies from sentence to sentence) to identify AI authorship. Bots score lower on both than humans.
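
To make those two signals concrete, here is a minimal sketch in Python, assuming GPT-2 via the Hugging Face transformers library as the scoring model; GPTZero's actual model, sentence handling and thresholds aren't public, so the numbers are illustrative only.

import math
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text):
    # Exponentiated average token loss: lower means the model finds the text more predictable.
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean cross-entropy over the tokens
    return math.exp(loss.item())

def burstiness(text):
    # Spread of per-sentence perplexity: human prose tends to swing more from sentence to sentence.
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    scores = [perplexity(s) for s in sentences]
    mean = sum(scores) / len(scores)
    return (sum((x - mean) ** 2 for x in scores) / len(scores)) ** 0.5

essay = "Justice is a cornerstone of the rule of law. Criminal justice involves the fair enforcement of laws."
print(perplexity(essay), burstiness(essay))

Low perplexity and low burstiness together are what push a passage toward an "AI-generated" verdict.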


A would-be cheater can go online to find tools that try to mislead AI classifiers by replacing words with synonyms, Alimardani and Jane write. But the synonyms can be “tortured.” For example, it's a red flag when "big data" becomes "colossal information."


The two professors asked ChatGPT to write an essay on justice, then copied it into GPT-Minus1, which offered to “scramble” ChatGPT text with synonyms. It changed 14 percent of the words, turning the essay into gibberish.

ChatGPT: "Justice is a cornerstone of the rule of law and is fundamental to the preservation of social order and the protection of individual rights . . . Criminal justice involves the fair and impartial enforcement of laws . . . "
GPT-Minus: "Justness is a cornerstone of the rule of practice of law and is first harmonic to the saving of social order and the tribute of soul rights . . . Outlaw justice involves the funfair and colorblind undefined of Torah . . . "
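
For a sense of how crude the scrambling trick can be, here is a toy version in Python, assuming NLTK's WordNet as the synonym source; GPT-Minus1's actual word list and selection rule aren't published, and the 14 percent swap rate below simply mirrors the figure above.

import random
import nltk
from nltk.corpus import wordnet

nltk.download("wordnet", quiet=True)

def scramble(text, rate=0.14, seed=0):
    # Swap roughly `rate` of the words for a WordNet synonym, with no regard for context.
    random.seed(seed)
    out = []
    for word in text.split():
        synonyms = {lemma.name().replace("_", " ")
                    for synset in wordnet.synsets(word.lower())
                    for lemma in synset.lemmas()} - {word.lower()}
        if synonyms and random.random() < rate:
            out.append(random.choice(sorted(synonyms)))  # often a "tortured" choice
        else:
            out.append(word)
    return " ".join(out)

print(scramble("Justice is a cornerstone of the rule of law"))

Picking synonyms word by word, without context, is exactly how "big data" ends up as "colossal information."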

I remember when I got my first copy of Roget's Thesaurus. I went a little crazy (berserk, bonkers, demented, delirious) too, but then I calmed down.


When they copied the GPT-Minus1 version of the justice essay back into GPTZero, it concluded the "text is most likely human written but there are some sentences with low perplexities."


Another proposal is for AI-written text to carry a “watermark” that detection software can pick up. This works by restricting the words the AI is allowed to choose; human writers will inevitably use some of the off-limits words, which signals that the text wasn't generated by AI. This is imperfect too, write Alimardani and Jane.
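
Here is a toy illustration of the detection side of that idea, with a hard-coded "allowed" word list made up for the example; real watermarking schemes derive the allowed list pseudorandomly from the preceding tokens rather than fixing it in advance.

ALLOWED = {"justice", "is", "a", "of", "the", "law", "and", "to", "rights",
           "fair", "order", "social", "protection", "individual", "cornerstone"}

def banned_fraction(text):
    # Share of words outside the allowed list; a value near zero suggests watermarked AI text.
    words = [w.strip(".,").lower() for w in text.split()]
    return sum(w not in ALLOWED for w in words) / len(words)

ai_like = "Justice is a cornerstone of the law and the protection of individual rights."
human_like = "Honestly, courts mostly just muddle through and hope nobody notices."
print(banned_fraction(ai_like), banned_fraction(human_like))

A human writer almost inevitably reaches outside the restricted vocabulary, which is what gives the text away as human.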


"AI-generated text detectors will become increasingly sophisticated," they write. "Anti-plagiarism service TurnItIn recently announced a forthcoming AI writing detector with a claimed 97% accuracy."


But the text generators are improving too. "As this arms race continues, we may see the rise of 'contract paraphrasing.' Rather than paying someone to write your assignment, you pay someone to rework your AI-generated assignment to get it past the detectors."


Or learn to write? I guess not.


Chatbot cheating is soaring, writes Vishwam Sankaran of The Independent. In a Study.com survey of 203 teachers, 26 percent said they had caught a student cheating with ChatGPT. The chatbot has only been out for a few months.


7 Comments


Guest
Feb 28, 2023

I have a very simple method of preventing ChatGPT from writing the student essays: require all essays to follow "prompts" that ChatGPT refuses to write.


So you can write an essay about the horrors of letting men into women's private spaces / the trans agenda, but you can't write anything in favor of it.


You can write an article praising President Trump, but not one praising Biden, Obama, or Prince Harry.


So long as the makers of "AI" programs insist on forcing their personal political biases on the rest of us, the proper response is to make that forcing expensive.


Guest
Feb 28, 2023

I use ChatGPT for assistance in writing computer code. I find that it is correct about 50% of the time. So I've taken the conservative approach and reduced my expectations. I simply treat ChatGPT's output as a shortcut. I'll accept it but then do my due diligence and seek verification. I like to see strong agreement with users of Stack Overflow. However, if I see low confidence, that makes me wary of ChatGPT's solution. In fact, that is something I wish ChatGPT provided: confidence levels.


Guest
Feb 27, 2023

Do as I do. On the day papers are due, the class has to sit down and handwrite, from memory, an abstract of the paper they just handed in.


All the students are subject to a personal call to my office to describe their paper and its contents from memory.


Since I have been doing this, the number of papers withdrawn has gone through the roof, and the students have decided that they will take a zero rather than be interrogated on the contents of their submissions.


Put the student on the spot.

Guest
Feb 28, 2023

That's brilliant, because any student who actually engaged with the paper will be eager, or at least willing, to discuss it.

My kids do National History Day, and their favorite part is the judging because they get to sit and discuss their topic with experts in the field and get new ideas.


Guest
Feb 26, 2023

It is telling that all the effort is going into ferreting out the use of AI to write papers. It is a clear demonstration that the incentives of schooling are to "get good grades," not real learning. For one thing, if the students were there for the learning, then "cheating" would not be in the student's interest. On the other hand, teachers struggle to preserve the value of the grades they give by stopping cheating, regardless of whether the students actually know anything after the class is over.


A real value of ChatGPT has been shown as related below. The motivated student is no longer dependent on the stochastic feedback of the instructor and how they might feel when they are reading an essay. Instea…


Guest
Feb 27, 2023

By changing the incentive one can get learning. I hold oral boards and presentations. The only way to get the grade is to demonstrate that one has learned.


I do oral boards and presentations twice a quarter. The first demonstration of learning is usually when I get the highest number of withdrawals from a class.


Once one has stood in front of a class and demonstrated that one knows little to nothing about the paper "they" have written, I tend to never see that student again.


I am not popular with students, but my Letters of Recommendation within the university are very very popular and sought after.


Guest
Feb 26, 2023

Intuitively, algorithms that try to determine whether a given text is AI-generated have to use some way of deciding whether the writing is "mechanical"; that's the basic concept behind the notions of "burstiness" and "perplexity." But in the real world, many (perhaps most) people are mechanical writers. Sure, the algorithm can't produce anything as distinctive as a Joni Mitchell song (not yet), but she's exceptional, and a lot of students and adults write prose that isn't particularly good. For instance, I've seen a lot of people with advanced degrees who use a simple Subject-Verb-Object structure for virtually every sentence. I wonder how the AI detection algorithms will deal with that.


One feature of many students' writing is that it is often…
