前情提要:
随着互联网上出现越来越多AI内容,AI检测也步步紧随。目前主流检测网站有:
1. GPTZero :综合能力最强,模型跟进速度快,有字数限制但可通过删cookie重置
2. QuillBot + Scribber (同API):无限制免费使用,检测准确度高
3. ZeroGPT:无限制,效果一般,跟进速度较慢,主要用于检测GPT系模型
4. Originality.ai:付费使用,无免费版本,效果最好
原理讲解:
AI检测方法主要参考两个指标:困惑性 + 爆发性
困惑性是指单词的不可预测性,是文章使读者感到疑惑或迷茫的潜力,代表着人类分散的思维。简单来说就是看到开头猜不到结尾,考验的是文章理解的难易程度,具体实现方式可以是把论据放在论证前
爆发性指的是句子种类和结构的不同,是文章中不同长度与类型的比例,代表着人类在写作中丰富的写作手法。具体实现方法可以是包含一定比例的长短句结合,以及使用除陈述句之外的表述方法。
AI的训练的终极奥义便是从海量文本中找到连接与秩序,使得输出内容变得简单直白易于理解。但这恰恰是AI检测的把柄,AI就像是输入法的自动联想,每个模型都会有一定的联想规律,而检测AI也是需要使用到模型的,这些模型用来检测上面提到是两个指标来判断可能性,随着语言模型的升级进步,检测模型也需要跟进语料,以更好地了解模型的输出习惯。
正因如此,在某些情况下使用鲜为人知或新鲜出炉的模型是不会被检测出来的。
下图为GPTZero的更新日志,直到到一月10号它才增强对于o1的检测,像这种高端模型通常需要额外时间单独优化:
目前可以推荐的几个模型如2.0Flash,2.0 Thinking或者o1-pro等这些都是较新的模型,虽然Deepseek V3也是新鲜出炉的,但这种开源模型易于训练和本地部署,使得检测模型相对来说更容易训练。
提示词示例分享:
Objective: To rewrite AI-generated text, imbuing it with the characteristics of a genuine human author while preserving the original information and core arguments.
Key Principles of Human Writing (and How to Mimic Them):
Word Choice and Tone:
Mix it up: Use of common, easy-to-understand words for the entire body, alongside a sprinkling of more challenging or abstract words (around 5-7 per paragraph).
Be less polite: Avoid overly formal or courteous language. Be direct and less deferential. Make sure every sentence have a meaning.
Incorporate subtle opinions and emotions: Inject personal viewpoints, feelings, and subjective interpretations into the narrative.
Think "describe," not "speak": The output should be descriptive prose, not a direct speech or conversation.
Sentence Structure:
Complexity is key: Construct sentences that are complex, and unique.
Vary sentence length: Combine short, impactful sentences with longer, more elaborate ones.
Connect your thoughts: Ensure a connection between sentences, with each sentence building upon the previous one and leading into the next. Be careful about connection words, don't overdo it.
Mix up the order: use inverted sentence structures or phrases to create a less predictable rhythm.
Logic and Flow:
Subtle jumps: Introduce minor logical leaps or tangential thoughts.
Don't be too perfect: Include minor imperfections, like slightly informal grammar or phrasing, as long as it doesn't detract from clarity. Avoid overly structured or perfectly polished prose.
Things to AVOID (Common AI Traits):
Overly straightforward ideas: Human thought processes are never linear. Introduce nuance and subtle complexities.
Excessive politeness or formality: Human writing is generally more casual and direct.
Over-reliance on perfect grammar and structure: Humans make minor errors and use less formal language.
Perfectly structured arguments: Human reasoning can be slightly less direct and more prone to digressions.
Just "rewording" the original text: Completely reimagine and re-express the ideas.
Check list:
The overall paragraph is written in most common words and easy word choices. There are 5-7 high level vocabulary. At least mix up one sentence structure, flipped, short, long mixed... logic is mixed and jumping, the paragraph should not be a speech.
Initialization: I am ready to receive the AI-generated text. I will rewrite it embodying these human-like characteristics.
在此基础上我还会加上:
Rewrite the following content so that it sounds like human writing, avoiding explanations, straightforward ways of showing, and including leaps of thought. Use fewer commas, avoid over-describing or over-expressing, avoid excessive use of metaphors, use common and simple words, but keep the sentence structure complex, and sentences should be connected to the previous and following sentences. Add a few advanced vocabulary words to the text. Output: Don't be so polite:
重述以下内容,使其听起来像人类语言,避免解释,避免直截了当的思维方式,加入跳跃性的想法。少用逗号,不要过度描述或过度表达,避免过度使用比喻,使用常见和简单的词汇,但句子结构保持复杂,句子应与前后的句子联系起来。在文章中加入几个高级词汇词。输出不要那么礼貌:
总结
欢迎大家积极讨论分享方法,AI检测与反检测将会是两个不断竞争不断互相超越的赛道,大家对AI检测有什么看法,支持与否?