AIR-OP-004

Hallucination and Inaccurate Outputs

Summary

LLM hallucinations occur when a model generates confident but incorrect or fabricated information due to its reliance on statistical patterns rather than factual understanding. Techniques like Retrieval-Augmented Generation can reduce hallucinations by providing factual context, but they cannot fully prevent the model from introducing errors or mixing in inaccurate internal knowledge. As there is no guaranteed way to constrain outputs to verified facts, hallucinations remain a persistent and unresolved challenge in LLM applications.

Description

LLM hallucinations are instances in which a Large Language Model (LLM) generates incorrect or nonsensical information that seems plausible but is not grounded in factual data or reality. These “hallucinations” occur because the model generates text based on statistical patterns in its training data rather than on true understanding or access to current, verified information.

The likelihood of hallucination can be reduced by techniques such as Retrieval-Augmented Generation (RAG), which supplies the LLM with relevant facts directly via the prompt. However, the model’s response is a synthesis of the information in the input prompt and information retained within the model’s weights. There is no reliable way to ensure the response is restricted to the facts provided via the prompt, and as such, RAG-based applications still hallucinate.
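
As an illustration of the RAG pattern described above, the sketch below places retrieved facts into the prompt with an instruction to answer only from those facts. The retrieve and call_llm functions are hypothetical placeholders (not any specific vendor API), and the hard-coded passages are toy data; this is a sketch of the technique under those assumptions, not a reference implementation.

```python
"""Minimal sketch of prompt-level grounding (RAG).

`retrieve` and `call_llm` are hypothetical stand-ins for a real document
retriever and LLM client; the passages below are toy data.
"""


def retrieve(question: str) -> list[str]:
    # Placeholder: a real system would query a vector store or search index.
    return [
        "The onboarding guide was last updated in March 2024.",
        "New joiners must complete security training within 30 days.",
    ]


def build_prompt(question: str, passages: list[str]) -> str:
    # Put the retrieved facts in the prompt and instruct the model to use only them.
    facts = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer the question using ONLY the facts listed below. "
        "If the facts are insufficient, say that you do not know.\n\n"
        f"Facts:\n{facts}\n\nQuestion: {question}\nAnswer:"
    )


def call_llm(prompt: str) -> str:
    # Placeholder for a real model call (e.g. an HTTP request to an LLM API).
    return "New joiners must complete security training within 30 days."


question = "How quickly must new joiners finish security training?"
print(call_llm(build_prompt(question, retrieve(question))))

# Caveat (the point made above): the instruction reduces, but does not
# guarantee, grounding; the model can still blend in knowledge from its weights.
```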

There is currently no reliable method for eliminating hallucinations; this remains an active area of research.

Contributing Factors

Several factors increase the risk of hallucination:

  • Lack of Ground Truth: The model cannot distinguish between accurate and inaccurate data in its training corpus.
  • Ambiguous or Incomplete Prompts: When input prompts lack clarity or precision, the model is more likely to fabricate plausible-sounding but incorrect details.
  • Confidence Mismatch: LLMs often present hallucinated information with high fluency and syntactic confidence, making it difficult for users to recognize inaccuracies (illustrated in the sketch after this list).
  • Fine-Tuning or Prompt Bias: Instructions or training intended to improve helpfulness or creativity can inadvertently increase the tendency to generate unsupported statements.
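
The confidence mismatch is worth making concrete: even when a model exposes per-token probabilities, a fluent hallucination can score as confidently as an accurate statement. The sketch below is purely illustrative, with hard-coded log-probability values standing in for real model output rather than any actual API response.

```python
"""Illustrative only: mean token log-probability is a weak hallucination signal.

The numbers below are hard-coded toy values, not output from any real model.
"""
from dataclasses import dataclass


@dataclass
class Completion:
    text: str
    token_logprobs: list[float]  # log-probability assigned to each generated token

    def mean_logprob(self) -> float:
        return sum(self.token_logprobs) / len(self.token_logprobs)


accurate = Completion(
    "The Eiffel Tower was completed in 1889.",
    [-0.2, -0.1, -0.3, -0.2, -0.1, -0.2, -0.3],
)
hallucinated = Completion(
    "The Eiffel Tower was completed in 1921.",  # fluent, confident, and wrong
    [-0.2, -0.1, -0.3, -0.2, -0.1, -0.2, -0.4],
)

for completion in (accurate, hallucinated):
    print(f"{completion.text!r}: mean logprob = {completion.mean_logprob():.2f}")

# Both completions score almost identically, so fluency or model confidence
# alone does not help a user spot the factual error.
```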

Example Hallucinations

Below are a few illustrative cases of LLM hallucination.

  1. Cited Sources That Don’t Exist: An LLM asked to summarize academic work may invent references, complete with plausible authors, titles, and journal names, that are entirely fictional.

  2. Fabricated Legal or Medical Advice: When prompted for legal precedents or medical diagnoses, LLMs may provide entirely fabricated cases or treatments that sound convincing but have no basis in reality.

  3. Incorrect Product or API Descriptions: Given prompts about software tools or APIs, the model may hallucinate methods, parameters, or features that are not part of the actual documentation (see the sketch after this list).

  4. False Historical or Scientific Claims: LLMs have been known to invent historical facts (e.g., attributing events to the wrong year or country) or scientific findings (e.g., claiming a drug is approved for a condition it is not).

  5. Contradictory Reasoning: In some cases, LLMs produce internally inconsistent outputs, for example simultaneously asserting and denying the same fact in the same answer, or offering logically incompatible reasoning steps.
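
As a hedged illustration of how hallucinated API members (example 3) can be caught, the sketch below statically checks LLM-generated Python code for attributes that do not exist on the real module. The helper names and the generated snippet are assumptions made for this example, not part of any existing tool.

```python
"""Sketch: flag attributes in LLM-generated code that the real module lacks.

The function names and the `generated` snippet are hypothetical examples.
"""
import ast
import importlib


def referenced_attributes(source: str, module_alias: str) -> set[str]:
    # Collect attribute names accessed on `module_alias` (e.g. json.dumps -> "dumps").
    attrs = set()
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Attribute)
            and isinstance(node.value, ast.Name)
            and node.value.id == module_alias
        ):
            attrs.add(node.attr)
    return attrs


def missing_attributes(source: str, module_name: str, module_alias: str) -> set[str]:
    # Return attributes the generated code uses that the imported module does not define.
    module = importlib.import_module(module_name)
    return {a for a in referenced_attributes(source, module_alias) if not hasattr(module, a)}


# An LLM-generated snippet that invents a `json.to_yaml` function.
generated = (
    "import json\n"
    "print(json.dumps({'a': 1}))\n"
    "print(json.to_yaml({'a': 1}))\n"
)
print(missing_attributes(generated, "json", "json"))  # -> {'to_yaml'}
```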