AI & MACHINE LEARNING

Hallucination

Generation, by a language model, of fluent and plausible content that is factually incorrect or unsupported by the source. The concept is organized along two axes: intrinsic versus extrinsic, and factuality versus faithfulness to the material provided.

Extended definition

Hallucination, in language models, is the generation of fluent and plausible content that is factually incorrect or unsupported by the source. The term is a metaphor imported from perception and now carries a technical definition. Ji and colleagues (2023), in the reference review on natural language generation, fixed two distinctions that organize the concept. The first separates intrinsic hallucination, where the output contradicts the input text itself, from extrinsic hallucination, where the output cannot be verified against the input: the source neither supports nor contradicts it. The second separates factuality, the correctness of the statement with respect to the world, from faithfulness, adherence to the material provided. Huang and colleagues (2025), reviewing the phenomenon in the LLM era, extended the taxonomy and argued that hallucination is not an occasional defect but a structural property of probabilistic models trained to predict the next word: the model optimizes linguistic plausibility, not truth, and therefore produces convincing text whether or not any factual grounding exists.

When it applies

The concept applies whenever the reliability of a generative output is assessed, especially where factual correctness matters: medicine, law, literature synthesis, journalism. It applies when an error is diagnosed along the two axes: a response can be faithful to the source and still false, if the source is wrong, or unfaithful and still true by coincidence. It also applies to detection. Farquhar and colleagues (2024) proposed an estimator based on semantic entropy, which measures uncertainty at the level of meaning rather than of the word sequence and identifies one subset of hallucinations, confabulations, without prior knowledge of the task. The concept is central to the responsible use of retrieval-augmented generation (RAG) and of assisted review: knowing where the model tends to hallucinate defines where human verification is mandatory.
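The idea behind semantic entropy can be illustrated with a minimal sketch. It assumes several answers have already been sampled from the model for the same question, groups them into clusters of equivalent meaning, and computes the entropy over those clusters. Farquhar and colleagues decide equivalence by bidirectional entailment; the `toy_same_meaning` function below is only a hypothetical stand-in for that step, and estimating cluster probabilities from sample frequencies is a simplification, not the authors' implementation.

```python
import math
from typing import Callable, List


def cluster_by_meaning(
    answers: List[str], same_meaning: Callable[[str, str], bool]
) -> List[List[str]]:
    """Greedily group answers: each joins the first cluster whose
    representative (first member) has the same meaning."""
    clusters: List[List[str]] = []
    for answer in answers:
        for cluster in clusters:
            if same_meaning(cluster[0], answer):
                cluster.append(answer)
                break
        else:
            clusters.append([answer])
    return clusters


def semantic_entropy(
    answers: List[str], same_meaning: Callable[[str, str], bool]
) -> float:
    """Entropy (in nats) over meaning clusters, with each cluster's
    probability estimated by its share of the sampled answers."""
    clusters = cluster_by_meaning(answers, same_meaning)
    total = len(answers)
    return sum(
        (len(c) / total) * math.log(total / len(c)) for c in clusters
    )


def toy_same_meaning(a: str, b: str) -> bool:
    """Hypothetical stand-in for an entailment check: compares
    answers after lowercasing and stripping edge punctuation."""
    def normalize(s: str) -> str:
        return s.lower().strip(" .!?")
    return normalize(a) == normalize(b)


# Consistent samples collapse into one cluster: low entropy.
consistent = ["Paris", "paris.", "Paris"]
# Scattered samples form many clusters: high entropy.
scattered = ["Paris", "Lyon", "Nice", "Rome"]

low = semantic_entropy(consistent, toy_same_meaning)
high = semantic_entropy(scattered, toy_same_meaning)
```

With differently worded answers that mean the same thing, the entropy stays low; when the samples disagree in meaning, it rises, which is the signal used to flag a likely confabulation. Any threshold for flagging would have to be calibrated per application.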

When it does not apply

The label does not apply to every model error. An incorrect answer due to outdated training data is a knowledge error, not hallucination in the strict sense of plausible fabrication. Nor is the label a synonym for opinion or for an unwanted answer: diverging from the user’s expectation is not hallucinating. Hallucination should also not be framed as a fully eliminable problem. Huang and colleagues (2025) and the subsequent literature hold that, in sufficiently general models, hallucination is theoretically inevitable, so promising a model free of it is misleading; the realistic path is to detect, mitigate, and supervise. Finally, the concept is not limited to factuality: in summarization tasks, what matters is faithfulness to the source, and a summary that is factually true but inserts information absent from the original is still a faithfulness hallucination.
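The summarization case can be made concrete with a deliberately crude sketch: it flags numbers that appear in a summary but not in its source. This is an illustrative heuristic under strong assumptions (only numeric tokens are compared, and the function name `unsupported_numbers` is hypothetical); it is not a faithfulness metric from the cited literature, and a real check would need to compare meaning, not tokens.

```python
import re
from typing import List

NUMBER_PATTERN = r"\d+(?:\.\d+)?"


def unsupported_numbers(source: str, summary: str) -> List[str]:
    """Return numeric tokens that occur in the summary but nowhere
    in the source, in order of appearance in the summary."""
    source_numbers = set(re.findall(NUMBER_PATTERN, source))
    return [
        token
        for token in re.findall(NUMBER_PATTERN, summary)
        if token not in source_numbers
    ]


source = "The trial enrolled 120 patients and ran for 6 months."
summary = "The 2021 trial enrolled 120 patients over 6 months."

# The year may well be true, but the source never states it.
flagged = unsupported_numbers(source, summary)
```

Here the year is flagged even if it happens to be correct, which is the point of the distinction: the check concerns adherence to the source, not truth in the world.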

Applications by field

  • Health and medicine: the highest risk; a plausible fabrication in a dose, diagnosis, or clinical reference can cause direct harm, which makes detection mandatory.
  • Law: fabrication of case law and of nonexistent citations is the most documented manifestation and has already led to sanctions against professionals.
  • Literature synthesis and science: invented references, with plausible yet false authors and DOIs, require verifying every citation before use.
  • Code generation: hallucination of nonexistent functions, libraries, or APIs, detectable by execution and by checking against the real documentation.
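For the code generation case, checking against the real library can be partly automated. The sketch below, a minimal illustration with a hypothetical helper name `symbol_exists`, imports a module and walks a dotted attribute path to see whether the symbol the model referenced exists. It assumes the module is installed locally and is safe to import; importing runs the module's code, so untrusted packages belong in a sandbox.

```python
import importlib


def symbol_exists(module_name: str, attr_path: str) -> bool:
    """Return True if `module_name` can be imported and the dotted
    `attr_path` resolves on it, False otherwise."""
    try:
        obj = importlib.import_module(module_name)
    except ImportError:
        # Covers ModuleNotFoundError: the module itself is missing.
        return False
    for part in attr_path.split("."):
        if not hasattr(obj, part):
            return False
        obj = getattr(obj, part)
    return True


# A real function, a plausible but nonexistent one, and a missing module.
real = symbol_exists("math", "sqrt")
invented = symbol_exists("math", "fast_sqrt")
missing_module = symbol_exists("no_such_module_xyz", "anything")
```

An existence check of this kind catches invented names but not invented behavior: a function can exist and still take different arguments than the model assumed, which is why execution and the documentation remain necessary.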

Common pitfalls

The first pitfall is conflating fluency with truth: the model’s well-formed text lends the error an appearance of authority, and a hurried review, or one that only counts citations, does not catch it. The second is treating factuality and faithfulness as the same thing, which hides summarization hallucinations that seem faithful but insert absent content. The third is trusting the model’s apparent confidence: a hallucination is usually expressed with the same assurance as a correct answer. The fourth is assuming RAG solves the problem; retrieving sources reduces hallucination but does not eliminate it, and noisy or outdated sources reintroduce the error. The fifth is failing to define, in the workflow, the point where human verification is mandatory, which lets the model’s output proceed unaudited precisely where the cost of error is highest.
