Hallucinations in Legal AI: How to Spot and Control Them

Why large language models hallucinate in legal contexts, how to detect problems before they reach clients or the court, and practical guardrails you can put in place.

“Hallucinations” are not a curiosity; they are the core risk of using large language models in legal work.

A hallucination is an output that sounds confident but is factually or legally wrong. In practice, that might be:

  • a case that does not exist;
  • a misquoted statute;
  • a fictional email that was never sent; or
  • a summary of a judgment that quietly reverses who won.

This article explains, from a UK solicitor’s perspective:

  • why hallucinations happen even in good models;
  • where they are most dangerous in day-to-day practice; and
  • the practical guardrails you can put in place in a small or mid-sized firm.

Why legal hallucinations happen

Modern language models are pattern machines, not databases.

They generate the most likely next token based on their training data and the prompt you give them. Unless they are hooked up to a live research system, they do not:

  • check a neutral citation database;
  • verify that a quoted section really appears in the case; or
  • maintain an internal list of “true” and “false” statements.

Legal prompts are especially risky because:

  • case citations have recognisable patterns, so the model can plausibly invent them;
  • common legal phrases (“the court held that…”) look similar whether the holding is real or made up; and
  • many prompts implicitly reward confident answers (“give me five strong authorities supporting X”).

The result: the model happily fills gaps with plausible legal-sounding fiction.

Where hallucinations hurt most in practice

In a typical firm, hallucinations show up in a few predictable places.

1. AI-only legal research

A fee-earner asks:

“List recent Court of Appeal cases on unfair prejudice petitions that were struck out as abusive. Give neutral citations and short summaries.”

A model that is not grounded in real case law may:

  • generate party names that have never existed;
  • invent neutral citations in the right format; and
  • summarise imaginary holdings that “feel” consistent with trends.

Without verification, those cases can slip into notes, advices or even skeleton arguments.

2. Over-enthusiastic drafting

Asked to “strengthen” a letter before claim or a defence, a model might:

  • assert that the other side “has repeatedly failed to comply with CPR 31”; or
  • claim that “the court has on several occasions criticised your conduct”.

Both statements might be untrue. If they are sent unchecked, they risk:

  • breaching your duty not to mislead; and
  • unnecessarily inflaming a dispute.

3. Subtle mis-summaries of authority

Even when a case is real, models can:

  • mis-state the ratio;
  • ignore important limitations or exceptions; or
  • quietly upgrade “one factor among many” into a hard rule.

These are harder to spot than completely invented cases because they often pass a quick plausibility test.

Guardrails that actually work

You do not need a research lab to reduce hallucination risk. What you do need is clear, boring process.

1. Verification rules

Adopt simple firm-wide rules such as:

  • “No case goes into a court document unless a human has checked it on an authoritative service (ICLR, BAILII, Lexis, Westlaw, PLC, etc.).”
  • “No statute or rule reference is relied on unless someone has checked the text in an official or subscription source.”
  • “If you cannot quickly find a cited authority, treat it as non-existent.”

Build these checks into:

  • your research notes;
  • skeleton argument templates; and
  • internal guidance for trainees and paralegals.
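
To take some of the memory work out of the first rule, a firm could automate the step of pulling citation-like references out of a draft and turning them into a checklist for human verification. The sketch below is purely illustrative – the pattern covers only a handful of common courts and the function names are made up – but it shows the shape of the idea.

import re

# Rough pattern for UK neutral citations, e.g. "[2022] EWCA Civ 456"
# or "[2023] EWHC 789 (Ch)". Illustrative only – not an exhaustive list
# of courts or formats.
NEUTRAL_CITATION = re.compile(
    r"\[(?:19|20)\d{2}\]\s+"                            # year, e.g. [2023]
    r"(?:UKSC|UKPC|EWCA|EWHC|EWFC|UKUT|UKFTT)"          # court or tribunal
    r"(?:\s+(?:Civ|Crim))?"                             # EWCA division, if any
    r"\s+\d+"                                           # case number
    r"(?:\s*\((?:Ch|KB|QB|Comm|TCC|Admin|Fam|Pat)\))?"  # EWHC list, if any
)

def citation_checklist(draft_text: str) -> list[str]:
    """Return a de-duplicated list of citation-like strings found in a draft.

    Every entry still needs a human to confirm it on an authoritative
    service before it is relied on.
    """
    found = [match.group(0) for match in NEUTRAL_CITATION.finditer(draft_text)]
    return list(dict.fromkeys(found))  # drop duplicates, keep order

if __name__ == "__main__":
    draft = (
        "The court in Smith v Jones [2022] EWCA Civ 456 held that... "
        "See also [2023] EWHC 789 (Ch) and, again, [2022] EWCA Civ 456."
    )
    for citation in citation_checklist(draft):
        print(f"[ ] Verify: {citation}")

The output is a tick-list, not an answer: the point is simply that nothing citation-shaped gets past a human unnoticed.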

2. Prompt patterns that reduce hallucinations

How you ask the question matters. Compare:

“Give me five authorities on X.”

with:

“You are assisting with research for a UK solicitor. You may use your training data to suggest search terms and possible lines of authority, but you must not invent case names or citations. If you are not sure whether a case exists, say so and explain how a human researcher could check.”

Or:

“Summarise the following case law without adding new cases. If you draw any conclusion that does not appear in the text, flag it as your own inference.”

You are not eliminating hallucinations, but you are at least not rewarding the model for bluffing.
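
If the safer wording works for your firm, it is worth standardising it so that fee-earners do not have to retype (or half-remember) it each time. The Python sketch below, with hypothetical names, wraps a raw question or a block of source material in the firm’s anti-bluffing instructions; the same idea works just as well as a saved template in whatever tool you actually use.

GUARDED_RESEARCH_PREFIX = (
    "You are assisting with research for a UK solicitor. "
    "You may suggest search terms and possible lines of authority, "
    "but you must not invent case names or citations. "
    "If you are not sure whether a case exists, say so and explain "
    "how a human researcher could check."
)

GUARDED_SUMMARY_PREFIX = (
    "Summarise the following case law without adding new cases. "
    "If you draw any conclusion that does not appear in the text, "
    "flag it as your own inference."
)

def guarded_research_prompt(question: str) -> str:
    """Wrap a raw research question in the firm's standard instructions."""
    return f"{GUARDED_RESEARCH_PREFIX}\n\nQuestion: {question}"

def guarded_summary_prompt(source_text: str) -> str:
    """Wrap source material in instructions that forbid adding authorities."""
    return f"{GUARDED_SUMMARY_PREFIX}\n\n{source_text}"

if __name__ == "__main__":
    print(guarded_research_prompt(
        "Recent Court of Appeal cases on unfair prejudice petitions "
        "struck out as abusive."
    ))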

3. Use grounded tools where possible

Prefer tools that:

  • use retrieval-augmented generation (RAG) to pull in specific passages from known sources; and
  • show you exactly which documents were used to generate the answer.

Even then, apply human judgment. A system that cites the right judgment can still misunderstand it – but at least you have somewhere to look.
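
To make the grounding idea concrete, here is a toy Python sketch of the retrieval step behind such tools. Real products use proper search indexes and embeddings rather than the naive keyword-overlap scorer below, and the passage labels are invented, but the shape is the same: the model only sees passages pulled from known sources, and the prompt requires it to cite them.

def score(query: str, passage: str) -> int:
    """Count how many query words appear in the passage (crude relevance)."""
    return len(set(query.lower().split()) & set(passage.lower().split()))

def retrieve(query: str, passages: dict[str, str], top_k: int = 2) -> dict[str, str]:
    """Return the top_k passages most relevant to the query, keyed by source label."""
    ranked = sorted(passages.items(), key=lambda item: score(query, item[1]), reverse=True)
    return dict(ranked[:top_k])

def grounded_prompt(question: str, sources: dict[str, str]) -> str:
    """Build a prompt that confines the model to the retrieved passages."""
    context = "\n\n".join(f"[{label}]\n{text}" for label, text in sources.items())
    return (
        "Answer using ONLY the passages below. Cite the source label for every "
        "proposition. If the passages do not answer the question, say so.\n\n"
        f"{context}\n\nQuestion: {question}"
    )

if __name__ == "__main__":
    passages = {
        "Judgment, para 41": "The petition was struck out as an abuse of process ...",
        "Judgment, para 12": "The petitioner alleged unfair prejudice under s.994 ...",
    }
    question = "Why was the petition struck out?"
    print(grounded_prompt(question, retrieve(question, passages)))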

4. Make hallucinations visible in supervision

Partners and supervisors can help by asking questions such as:

  • “Where did this case come from originally?”
  • “Which sources did you check before including it?”
  • “Is this a direct quote or an AI-generated paraphrase?”

If AI was involved, ask to see:

  • the original prompts; and
  • the model outputs alongside the verified authorities.

That makes it clear to juniors that the firm takes hallucination risk seriously.

Handling incidents

If a hallucination slips through into advice or submissions:

  • Correct it promptly – update the client or the court as appropriate.
  • Record what happened – which tool was used, what went wrong, and how you found out.
  • Adjust your processes – update prompts, checklists or training so the same pattern is less likely next time.

Treat it like any other near-miss or complaint: a chance to improve, not a reason to ban technology entirely.

Where OrdoLux fits

OrdoLux is being designed on the assumption that AI will sometimes get things wrong – and that firms need good plumbing to manage that risk:

  • prompts and outputs are saved against the relevant matter;
  • research notes and authorities can be linked to the AI suggestions they came from; and
  • supervisors can see, in one place, how AI was used on a file.

The aim is not to remove hallucinations altogether (no system can), but to make them easier to detect, explain and prevent.
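
As a purely illustrative sketch – not OrdoLux’s actual data model – the kind of audit record that makes this plumbing possible might look something like the Python below. The field names and matter reference format are invented for the example.

from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class AIInteraction:
    """One prompt/output pair recorded against a matter (illustrative only)."""
    matter_ref: str        # e.g. "LIT-2024-0123" – hypothetical format
    fee_earner: str
    prompt: str
    model_output: str
    timestamp: datetime = field(default_factory=datetime.now)
    verified_authorities: list[str] = field(default_factory=list)  # citations a human has checked
    notes: str = ""        # e.g. "citation could not be found; treated as non-existent"

def supervision_view(interactions: list[AIInteraction], matter_ref: str) -> list[AIInteraction]:
    """Everything a supervisor would review for one file, oldest first."""
    relevant = [i for i in interactions if i.matter_ref == matter_ref]
    return sorted(relevant, key=lambda i: i.timestamp)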

This article is general information for practitioners — not legal advice.

