AI detectors do not check a database. They measure perplexity, burstiness, and signature phrasing. Here is what those metrics actually mean and how to move them.
What detectors actually do
There is a common misunderstanding: that AI detectors check submitted text against a database of known AI output, the way plagiarism detectors check for matches against published sources. That is not how they work. Modern AI detectors are supervised classifiers trained on millions of examples labelled human or AI. They learn the statistical fingerprint of machine-generated text, then they score new submissions against that fingerprint.
The classifier does not need to have seen your specific text before. It only needs to recognize the pattern. Two numbers do most of the work.
Perplexity
Perplexity comes from information theory. Given a sentence, you ask a language model how surprised it was by each word. Text whose next word the model finds obvious has low perplexity. Text whose next word the model finds unexpected has high perplexity. AI writing is low-perplexity by construction, because a model generates text by favoring high-probability words. Human writing scores higher.
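As a rough illustration of the definition, perplexity can be computed as the exponential of the average negative log-probability a model assigns to each token. The sketch below assumes you already have per-token probabilities from some language model. The numbers in `predictable` and `surprising` are invented for illustration and do not come from any real model or detector.

```python
import math


def perplexity(token_probs):
    """Return exp(mean negative log-probability) over the given tokens."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)


# Hypothetical probabilities a model assigned to each successive word.
predictable = [0.9, 0.8, 0.85, 0.9]   # the model found every word obvious
surprising = [0.2, 0.05, 0.3, 0.1]    # the model was often caught off guard

print("predictable:", round(perplexity(predictable), 2))
print("surprising:", round(perplexity(surprising), 2))
```

The predictable sequence scores close to 1, the lowest possible value, while the surprising one scores several times higher. Real detectors obtain these probabilities from a language model rather than a hand-written list.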
Burstiness
Burstiness measures the variance in your writing rhythm. Specifically: how much do sentence lengths and structures change from one sentence to the next? Human writing is bursty. We write a four-word sentence then a forty-word one. We toss in a fragment. We sometimes start a sentence with "And".
AI writing has low burstiness. Sentences cluster around the same length. Paragraphs tend to be three or four sentences each. Lists have parallel construction. The rhythm is metronomic.
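One simple way to put a number on this rhythm is the spread of sentence lengths relative to their mean. The sketch below is an assumption for illustration, not the formula any particular detector uses: `burstiness` here is the coefficient of variation of words per sentence, and the sentence splitter is a deliberately naive regular expression.

```python
import re
import statistics


def sentence_lengths(text):
    """Split on ., ! and ? and return the word count of each sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]


def burstiness(text):
    """Standard deviation of sentence length divided by the mean length."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0  # one sentence has no rhythm to measure
    return statistics.pstdev(lengths) / statistics.mean(lengths)


uniform = ("The model writes clean text. The output reads very well. "
           "The rhythm stays the same.")
varied = ("It failed. We spent most of the afternoon tracing the bug through "
          "three services before anyone thought to check the config. Typical.")

print(sentence_lengths(uniform), burstiness(uniform))
print(sentence_lengths(varied), round(burstiness(varied), 2))
```

The uniform sample has three five-word sentences and scores zero. The varied sample mixes a two-word sentence, a nineteen-word sentence, and a one-word fragment, so its score is well above zero.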
Signature phrasing
Beyond the two big statistical signals, classifiers also pick up on tokens that disproportionately show up in model output. These signature words are not unique to AI but appear at rates several times higher than in equivalent human writing.
Punctuation matters too. Modern OpenAI models love em dashes. They use them where a human writer would pick a comma or a period. Heavy use of em dashes in a paragraph is a strong AI signal in 2026. See our ChatGPT humanizer page for a fuller list of GPT-4o signature patterns.
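A minimal sketch of how signature phrasing could be measured: count watch-list words and em dashes per 100 words. The watch list below is hypothetical and contains only the three words this article's FAQ attributes to ChatGPT. A real classifier learns such weights from training data rather than from a hand-written set.

```python
import re

# Hypothetical watch list for illustration; not any detector's actual list.
SIGNATURE_WORDS = {"delve", "embark", "navigate"}
EM_DASH = "\u2014"  # the em dash character, written as an escape


def signature_stats(text):
    """Return watch-list hits and em dashes per 100 words of text."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    if total == 0:
        return {"signature_per_100_words": 0.0, "em_dashes_per_100_words": 0.0}
    hits = sum(1 for w in words if w in SIGNATURE_WORDS)
    return {
        "signature_per_100_words": 100 * hits / total,
        "em_dashes_per_100_words": 100 * text.count(EM_DASH) / total,
    }


sample = "Let us delve into the data \u2014 then navigate the results."
print(signature_stats(sample))
```

Matching is exact, so inflected forms such as "delves" are not counted. A fuller version would stem or lemmatize words first.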
How the major detectors compare
| Detector | Tuned for | Primary signal | False-positive rate |
|---|---|---|---|
| Turnitin AI | Student essays, academic prose | Document-level classifier on academic corpus | <1% (claimed); higher for non-native English writers |
| GPTZero | General writing, journalism | Perplexity + burstiness | 1-3% in independent tests |
| Originality.ai | SEO and content marketing | Transformer + heuristics for web copy | 2-5% in independent tests |
| Copyleaks | Enterprise + multilingual | Deep model + structural analyzer | varies by language, 1-9% |
Why false positives happen
AI detectors flag human writing as AI more often than they should. Several things drive this. Non-native English speakers tend to write with simpler vocabulary and more uniform sentence structure, which raises the false-positive rate sharply. Formal academic writing is supposed to be uniform and well-structured. Technical writing has a narrow vocabulary by necessity. Short passages of 200 words or fewer give the classifier too little signal.
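To see what a false-positive rate means in practice, the arithmetic can be sketched as follows. The batch size, the share of AI-written documents, and the detection rate are all assumptions chosen for illustration. Only the false-positive figures echo numbers quoted in this article (a 1% vendor claim and the roughly 60% reported for non-native writer samples).

```python
def expected_false_flags(num_human_docs, false_positive_rate):
    """Expected number of human-written documents wrongly flagged as AI."""
    return num_human_docs * false_positive_rate


def flag_precision(ai_share, true_positive_rate, false_positive_rate):
    """Share of flagged documents that really are AI-written (Bayes' rule)."""
    flagged_ai = ai_share * true_positive_rate
    flagged_human = (1 - ai_share) * false_positive_rate
    total_flagged = flagged_ai + flagged_human
    if total_flagged == 0:
        raise ValueError("nothing is flagged under these rates")
    return flagged_ai / total_flagged


# Assumed scenario: 1,000 human essays, 10% of all submissions AI-written,
# and a detector that catches 90% of AI text.
print(expected_false_flags(1000, 0.01))           # about 10 wrongly flagged
print(round(flag_precision(0.10, 0.90, 0.01), 3))  # most flags are correct
print(round(flag_precision(0.10, 0.90, 0.60), 3))  # most flags are wrong
```

Under these assumed inputs, a 1% false-positive rate still flags about ten human writers per thousand, and at a 60% rate the large majority of flags land on human writing.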
What humanization actually changes
To move text from AI to human in a classifier's view, you are moving four numbers. A real humanizer does all four steps in one pass:
What this means for your work
The right mental model: AI detection is a probabilistic measurement of statistical signatures, not a search of a database. Substitution-only humanizers (the kind that swap synonyms one word at a time) leave perplexity and burstiness almost untouched and are easy to detect.
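The burstiness half of that claim is easy to demonstrate: swapping words one for one cannot change how many words each sentence contains. The sketch below uses a tiny hypothetical synonym table and a naive sentence splitter. It says nothing about perplexity, which would need a language model to measure.

```python
import re

# Hypothetical one-for-one synonym table, for illustration only.
SYNONYMS = {"use": "employ", "big": "large", "show": "demonstrate",
            "help": "assist"}


def swap_synonyms(text):
    """Replace each listed word with its synonym (capitalization is lost)."""
    def repl(match):
        word = match.group(0)
        return SYNONYMS.get(word.lower(), word)
    return re.sub(r"[A-Za-z]+", repl, text)


def sentence_lengths(text):
    """Split on ., ! and ? and return the word count of each sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]


original = "We use big models. They show results. They help."
swapped = swap_synonyms(original)

print(swapped)
print(sentence_lengths(original), sentence_lengths(swapped))
```

The wording changes but the sentence-length profile is identical, so any rhythm-based signal computed from it is unchanged.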
A humanizer that actually works restructures sentences, varies length, and removes signature phrasing. That is what we built into the free Humanize AI tool. For specific guidance by detector, see our walkthroughs for Turnitin, GPTZero, Originality.ai, and Copyleaks. By source model, see ChatGPT, Claude, and Gemini.
Frequently asked questions
Do AI detectors check your text against a database?
No. Modern AI detectors are statistical classifiers. They estimate whether the patterns of word choice and sentence structure match what a language model would produce.
What is perplexity in AI detection?
Perplexity measures how surprising the next word is given the previous words, scored by a language model. AI text scores low. Human text scores higher.
What is burstiness?
Burstiness is the variance in sentence length and complexity. Humans write in bursts. AI tends toward uniform sentence length, which produces low burstiness.
Can a detector tell which model produced the text?
Sometimes, but not reliably. Each model has signature words and habits. ChatGPT favors "delve", "embark", and "navigate". Claude favors longer flowing prose. Detectors usually answer only the binary AI-or-human question.
Why does humanization work?
Humanization rewrites text to raise its perplexity and burstiness. It replaces high-probability words with less expected ones, varies sentence length, and removes signature AI phrasing.
Sources and further reading
The technical claims in this article draw on the following primary sources. We link them directly so readers can verify and dig deeper.
- Tian, Edward. "How GPTZero Works". Original methodology post explaining perplexity and burstiness scoring.
- Turnitin AI writing detection documentation. Vendor-published methodology, scoring, and stated false-positive rates.
- Originality.ai detection accuracy report. Vendor-published self-evaluation across content types.
- Liang, W., Yuksekgonul, M., Mao, Y., Wu, E., Zou, J. "GPT detectors are biased against non-native English writers" (Patterns, 2023). Peer-reviewed evidence of false-positive rates above 60% for non-native writer samples.
- Sadasivan, V., Kumar, A., Balasubramanian, S., Wang, W., Feizi, S. "Can AI-Generated Text be Reliably Detected?" (2023). University of Maryland paper on theoretical limits of detection.
- Jelinek, F., Mercer, R., Bahl, L., Baker, J. Perplexity. The information-theoretic foundation, originally formalized for speech recognition in the 1970s and adapted for language model evaluation.
- OpenAI GPT-4o model card. Includes notes on the model's writing style biases and the company's own deprecated AI-text-detection classifier.
Try the free humanizer
Paste your AI-generated text. Get back something that reads naturally and moves the perplexity and burstiness numbers in the right direction. No signup, no word limit.