Large language model (LLM)
Updated June 10, 2026
The neural networks behind ChatGPT, Claude and Gemini — trained on vast text corpora to predict the next word, and from that one skill, to write.
Definition
A large language model is a neural network, typically a transformer, trained on enormous amounts of text to do one thing: predict the next token (a word or word fragment). Scale that single skill up (billions of parameters, trillions of training tokens) and broader capabilities emerge: answering questions, summarizing, translating, drafting. ChatGPT (OpenAI), Claude (Anthropic), Gemini (Google), Llama (Meta) and DeepSeek are all LLMs or products built on them.
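As a rough illustration of next-token prediction, the sketch below turns a set of scores into probabilities and samples one token. The vocabulary, the scores and the helper names (`softmax`, `sample_next_token`) are hypothetical stand-ins; a real model computes scores over tens of thousands of tokens with a neural network.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into a probability distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next_token(vocab, logits, temperature=1.0, rng=random):
    """Pick one token at random, weighted by its probability."""
    probs = softmax(logits, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]

# Hypothetical scores a model might assign after "The cat sat on the"
vocab = ["mat", "sofa", "roof", "moon"]
logits = [3.2, 2.1, 1.0, -1.5]

probs = softmax(logits)
most_likely = vocab[probs.index(max(probs))]  # the single most probable token
sampled = sample_next_token(vocab, logits, temperature=0.8, rng=random.Random(0))
```

Generating a whole passage is this step in a loop: append the chosen token to the text, score the vocabulary again, pick again. Lower temperatures concentrate the choice on the top-scoring tokens; higher ones spread it out.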
Why LLM text is detectable
Because an LLM generates text by repeatedly sampling a likely next token, its output carries statistical regularities that detectors measure: low perplexity (the text is highly predictable to a language model), low burstiness (little variation from one sentence to the next) and shared phrasing habits. Different models have house flavors (see our by-model guides) but share the skeleton; that's why detectors can generalize to models they weren't specifically trained on.
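To make perplexity and burstiness concrete, here is a minimal sketch of how each could be computed. It rests on two assumptions: the per-token probabilities are hypothetical numbers standing in for a scoring model's output, and burstiness is approximated as the variation in sentence length, which is only one of several proxies in use. It is not a description of how any particular detector works.

```python
import math
import re
import statistics

def perplexity(token_probs):
    """exp of the average negative log-probability per token."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

def burstiness(text):
    """Coefficient of variation of sentence lengths, in words (an assumed proxy)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    return statistics.pstdev(lengths) / statistics.mean(lengths)

# Hypothetical probabilities a scoring model assigned to each token
predictable = [0.5, 0.5, 0.5, 0.5]    # every token was an easy guess
surprising = [0.5, 0.05, 0.5, 0.05]   # some tokens were unexpected

uniform_text = "The cat sat down. The dog ran off. The bird flew away."
varied_text = "Stop. The dog ran across the wide field behind the house. Why?"

low_ppl = perplexity(predictable)
high_ppl = perplexity(surprising)
flat = burstiness(uniform_text)
bursty = burstiness(varied_text)
```

Here `low_ppl` comes out below `high_ppl`, and `flat` below `bursty`. A detector built on these signals would treat the low-perplexity, low-burstiness combination as evidence of machine generation, usually alongside other features.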
Why LLMs hallucinate
An LLM predicts plausible text, not true text. When the most statistically likely continuation is a citation, it produces one whether or not the paper exists. See hallucination. This is also the deeper limit of AI writing: fluency without experience. No amount of rewriting fixes that; only an author can add what only an author knows.