Token

Updated June 10, 2026

The unit AI models actually read and write — fragments of words, roughly four characters of English each.

Definition

Models don't process whole words; they process tokens: common chunks of text learned from data. "Writing" might be one token; "humanizing" might split into "human" + "izing". In English, a token averages about four characters, or roughly ¾ of a word, so 1,000 words ≈ 1,300–1,500 tokens depending on the tokenizer and the text. Every model limit and cost you've met (context windows, output caps, API pricing) is counted in tokens.
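The rules of thumb above can be turned into a quick estimator. This is a minimal sketch, not a real tokenizer: the function names and default ratios are illustrative assumptions taken from the averages quoted here, and actual counts vary by model and by text.

```python
def estimate_tokens_from_chars(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate, assuming ~4 characters of English per token."""
    return round(len(text) / chars_per_token)


def estimate_tokens_from_words(word_count: int, words_per_token: float = 0.75) -> int:
    """Rough token estimate, assuming one token is ~3/4 of an English word."""
    return round(word_count / words_per_token)


if __name__ == "__main__":
    # 1,000 words at 3/4 of a word per token
    print(estimate_tokens_from_words(1000))  # 1333
```

An estimate like this is good enough for budgeting a draft against a context window; for billing or hard limits, count with the tokenizer of the model you are actually using.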

Why the concept matters for writing tools

Generation happens one token at a time: the model computes a probability for every possible next token and samples one (temperature sets how adventurous that sampling is). That token-by-token sampling is what leaves the statistical trail detectors read; perplexity is literally measured over tokens. Character limits in writing tools (Humanize Studio Pro handles 50,000 characters a run) exist because processing cost scales with token count.
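To make the sampling step and the perplexity measurement concrete, here is a toy sketch in plain Python. The vocabulary, the scores (logits), and the function names are hypothetical, and a real model works over tens of thousands of tokens, but the arithmetic is the standard one: a temperature-scaled softmax for sampling, and the exponential of the average negative log-probability for perplexity.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_next_token(vocab, logits, temperature=1.0, rng=None):
    """Sample one token from the temperature-adjusted distribution."""
    rng = rng or random
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]


def perplexity(token_probs):
    """exp of the average negative log-probability the model gave each token."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)


if __name__ == "__main__":
    vocab = ["the", "a", "this"]   # hypothetical three-token vocabulary
    logits = [2.0, 1.0, 0.0]       # hypothetical model scores
    print(softmax_with_temperature(logits, temperature=1.0))
    print(softmax_with_temperature(logits, temperature=0.5))  # more peaked
    print(sample_next_token(vocab, logits, temperature=0.7))
    # A text whose every token had probability 0.25 scores a perplexity of 4.
    print(perplexity([0.25, 0.25, 0.25, 0.25]))
```

Text in which every token was a high-probability pick scores a low perplexity, which is the kind of statistical trail a detector can measure.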
