A Brief Look at How AI-Text Detection Works (with Prompts for Passing Detectors)

Background:

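As more and more AI-generated content appears online, AI detection has followed close behind. The mainstream detection sites right now are:
1. GPTZero: the strongest all-rounder and quick to add support for new models. It has a word limit, but deleting cookies resets it.
2. QuillBot + Scribbr (same API): free with no usage limit, and high detection accuracy.
3. ZeroGPT: no usage limit, mediocre results, and slow to add support for new models. Mainly useful for detecting GPT-family models.
4. Originality.ai: paid only, no free tier, best results.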

How it works:

AI detectors mainly look at two metrics: perplexity and burstiness.

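Perplexity measures how unpredictable the words are: how hard it is to guess the next word from the words before it. Human writing tends to score higher because human thinking wanders. Put simply, you cannot guess the ending from the opening, so it also reflects how much effort the text takes to follow. One way to raise it is to present the evidence before the claim it supports.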
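To make the idea of perplexity concrete, here is a minimal sketch. It is an assumption for illustration only: real detectors score text with large neural language models, while this toy swaps in an add-one-smoothed unigram model fitted on a small reference text. The function name `unigram_perplexity` and both sample strings are hypothetical.

```python
import math
from collections import Counter


def unigram_perplexity(text: str, reference: str) -> float:
    """Perplexity of `text` under an add-one-smoothed unigram model.

    Toy stand-in for a real language model: word probabilities are
    estimated from `reference` by simple counting.
    """
    ref_tokens = reference.lower().split()
    tokens = text.lower().split()
    if not tokens:
        raise ValueError("text must contain at least one token")

    counts = Counter(ref_tokens)
    # Simplification: the vocabulary includes the words of `text` itself,
    # so unseen words get a small but non-zero probability.
    vocab = set(ref_tokens) | set(tokens)
    total = len(ref_tokens) + len(vocab)  # add-one smoothing denominator

    log_prob = sum(math.log((counts[t] + 1) / total) for t in tokens)
    # Perplexity is the exponential of the average negative log-probability.
    return math.exp(-log_prob / len(tokens))


reference = "the cat sat on the mat the dog sat on the rug"
predictable = unigram_perplexity("the cat sat on the mat", reference)
surprising = unigram_perplexity("quantum zebras juggle luminous paradoxes", reference)
print(predictable < surprising)
```

Text built from words the model has already seen scores lower than text full of words it has never seen, which is the "predictable" signal a detector would look for.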

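Burstiness is the variation in sentence type and structure: the mix of different sentence lengths and kinds in a text. It reflects the range of techniques a human writer draws on. In practice, that means mixing long and short sentences in some proportion and using forms other than plain declarative sentences.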
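A rough way to put a number on burstiness is the spread of sentence lengths. The sketch below is an assumption, not any detector's actual formula: it uses the coefficient of variation (standard deviation divided by mean) of sentence lengths in whitespace-separated words, and it ignores sentence type. The names `sentence_lengths` and `burstiness` are hypothetical.

```python
import re
import statistics


def sentence_lengths(text: str) -> list[int]:
    """Split on ., ! and ? and return each sentence's length in words."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]


def burstiness(text: str) -> float:
    """Coefficient of variation of sentence length (0.0 means uniform)."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0  # not enough sentences to measure variation
    return statistics.pstdev(lengths) / statistics.mean(lengths)


uniform = "The cat sat down. The dog ran off. The bird flew away."
varied = "Stop. The old dog that lived next door ran off into the quiet night. Why?"
print(burstiness(uniform), burstiness(varied))
```

Three sentences of equal length give a score of zero, while a one-word sentence next to a long one pushes the score up.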

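The whole point of training an AI is to find connections and order in massive amounts of text, which makes its output simple, direct, and easy to understand. But that is exactly what gives it away to detectors. An AI is like the autocomplete in an input method: every model has its own prediction patterns. Detectors rely on models too, and those models measure the two metrics above to estimate how likely a text is to be AI-generated. As language models are upgraded, detection models have to keep up with new training data to better learn each model's output habits.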

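For that reason, output from little-known or newly released models will sometimes go undetected.
The screenshot below is GPTZero's changelog. It was not until January 10 that it strengthened detection of o1; high-end models like this usually need extra time to be tuned for separately: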

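Models I can recommend for now include 2.0 Flash, 2.0 Thinking, and o1-pro, all of which are relatively new. DeepSeek V3 is also fresh out of the oven, but open-source models like it are easy to train and deploy locally, so detectors for them are comparatively easy to train.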

Example prompt:

Objective: To rewrite AI-generated text, imbuing it with the characteristics of a genuine human author while preserving the original information and core arguments.

Key Principles of Human Writing (and How to Mimic Them):

Word Choice and Tone:

Mix it up: Use common, easy-to-understand words for the entire body, alongside a sprinkling of more challenging or abstract words (around 5-7 per paragraph).

Be less polite: Avoid overly formal or courteous language. Be direct and less deferential. Make sure every sentence has a meaning.

Incorporate subtle opinions and emotions: Inject personal viewpoints, feelings, and subjective interpretations into the narrative.

Think "describe," not "speak": The output should be descriptive prose, not a direct speech or conversation.

Sentence Structure:

Complexity is key: Construct sentences that are complex, and unique.

Vary sentence length: Combine short, impactful sentences with longer, more elaborate ones.

Connect your thoughts: Ensure a connection between sentences, with each sentence building upon the previous one and leading into the next. Be careful about connection words, don't overdo it.

Mix up the order: use inverted sentence structures or phrases to create a less predictable rhythm.

Logic and Flow:

Subtle jumps: Introduce minor logical leaps or tangential thoughts.

Don't be too perfect: Include minor imperfections, like slightly informal grammar or phrasing, as long as it doesn't detract from clarity. Avoid overly structured or perfectly polished prose.

Things to AVOID (Common AI Traits):

Overly straightforward ideas: Human thought processes are never linear. Introduce nuance and subtle complexities.

Excessive politeness or formality: Human writing is generally more casual and direct.

Over-reliance on perfect grammar and structure: Humans make minor errors and use less formal language.

Perfectly structured arguments: Human reasoning can be slightly less direct and more prone to digressions.

Just "rewording" the original text: Completely reimagine and re-express the ideas.

Checklist:
The overall paragraph is written in the most common words and easy word choices. There are 5-7 high-level vocabulary words. At least one sentence structure is mixed up (flipped, short, long, mixed...). The logic is mixed and jumps around. The paragraph should not be a speech.

Initialization: I am ready to receive the AI-generated text. I will rewrite it embodying these human-like characteristics.

On top of that, I also add:

Rewrite the following content so that it sounds like human writing, avoiding explanations, straightforward ways of showing, and including leaps of thought. Use fewer commas, avoid over-describing or over-expressing, avoid excessive use of metaphors, use common and simple words, but keep the sentence structure complex, and sentences should be connected to the previous and following sentences. Add a few advanced vocabulary words to the text. Output: Don't be so polite:

Chinese version of the same prompt:

重述以下内容,使其听起来像人类语言,避免解释,避免直截了当的思维方式,加入跳跃性的想法。少用逗号,不要过度描述或过度表达,避免过度使用比喻,使用常见和简单的词汇,但句子结构保持复杂,句子应与前后的句子联系起来。在文章中加入几个高级词汇。输出不要那么礼貌:

Summary

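Everyone is welcome to join the discussion and share methods. AI detection and detector evasion will be two tracks that keep competing with and overtaking each other. What do you think of AI detection? Are you for it or against it?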


I've tried one or two of these. The text scored 0 on GPTZero but 100 on Zhuque (朱雀). It makes the detectors feel a bit unreliable.

I've heard that the Monday mode in ChatGPT can lower the AI-detection score on Zhuque. Not sure how reliable that is.

There are also blind watermarks. Getting around those with a prompt is impossible.