Guide / Agent Hallucination Risk

The dangerous hallucination is the one that calls a tool.

A chatbot hallucination can mislead. An agent hallucination can change a file, send a request, place an order, or move a robot. Tool use turns bad text into external state.

Keywords: AI agent hallucination risk, agentic AI hallucination, AI tool use safety, AI agent guardrails, LLM agent reliability, deterministic fallback

Search Intent

The reader is asking when an AI agent should stop before using tools.

  • What makes agent hallucination more dangerous than chatbot hallucination?
  • What should happen before a model calls a tool?
  • How do you separate a suggestion from an authorized action?
  • What evidence should be logged when the system refuses?

Scenario

The failure is not the wrong sentence. It is the unearned action.

A fluent model can invent a file path, misread a user goal, or assume permission that was never granted. If the system only prints a wrong answer, the failure is visible. If it calls a tool, the failure becomes an operation.

The minimum defense is not a nicer prompt. It is an action gate: evidence, authority, scope, and fallback state must be checked before the external action happens.
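A minimal sketch of such a gate is shown here, assuming a hypothetical `ActionRequest` shape and a caller-supplied `execute` function. None of these names come from a real agent framework, and the four checks are deliberately simplified to presence checks.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ActionRequest:
    """Hypothetical shape of a tool call the model wants to make."""
    tool: str
    target: str
    evidence_id: Optional[str] = None    # record that supports the action
    authorized_by: Optional[str] = None  # who granted permission, if anyone
    in_scope: bool = False               # target lies inside the granted scope
    fallback: Optional[str] = None       # state to return to if the action is not taken


def gate(request: ActionRequest) -> tuple[bool, list[str]]:
    """Return (allowed, reasons). Allowed only when all four checks pass."""
    reasons = []
    if not request.evidence_id:
        reasons.append("missing evidence")
    if not request.authorized_by:
        reasons.append("missing authority")
    if not request.in_scope:
        reasons.append("out of scope")
    if not request.fallback:
        reasons.append("no fallback state")
    return (not reasons, reasons)


def run_tool(request: ActionRequest, execute: Callable[[ActionRequest], object]) -> dict:
    """Run the gate first; the external action happens only if it passes."""
    allowed, reasons = gate(request)
    if not allowed:
        return {"status": "refused", "reasons": reasons}
    return {"status": "executed", "result": execute(request)}
```

The design point is ordering: `execute` is never reached unless the gate has passed, so a refusal carries its reasons instead of a side effect.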

Test

A reliable agent has to prove the action, not just explain it.

The action proof should name the target, the authority, the evidence record, the expected change, the failure cost, and the rollback or review route.

If any of those fields is missing, the agent should downgrade to ask, log, stop, or route to human review. Continuing to freestyle is the risk.
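One way to make the six-field proof concrete is sketched here. The field names mirror the list above. The mapping from a missing field to a particular downgrade is an assumption made for illustration, since the text only says the agent should ask, log, stop, or route to human review. Logging is left to the caller.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ActionProof:
    """The six fields an action proof should name (all optional so gaps are detectable)."""
    target: Optional[str] = None
    authority: Optional[str] = None
    evidence_record: Optional[str] = None
    expected_change: Optional[str] = None
    failure_cost: Optional[str] = None
    rollback_or_review_route: Optional[str] = None


def missing_fields(proof: ActionProof) -> list[str]:
    """Names of fields that are empty or unset, in declaration order."""
    return [f.name for f in fields(proof) if not getattr(proof, f.name)]


def decide(proof: ActionProof) -> str:
    """Return 'act' only for a complete proof; otherwise a downgrade.

    The downgrade policy below is a hypothetical example, not a rule from the guide.
    """
    missing = missing_fields(proof)
    if not missing:
        return "act"
    if "authority" in missing or "rollback_or_review_route" in missing:
        return "human_review"  # nobody granted permission, or no way back
    if "target" in missing or "expected_change" in missing:
        return "ask"           # the user can likely supply this
    return "stop"              # evidence or failure cost is unknown


complete = ActionProof(
    target="orders/123",
    authority="user:alice",
    evidence_record="ev-42",
    expected_change="cancel order",
    failure_cost="delayed refund",
    rollback_or_review_route="support queue",
)
print(decide(complete))                      # act
print(decide(ActionProof(target="orders/123")))  # human_review
```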

Boundary

No-action states are part of the product, not an error page.

A system that can never say no is not autonomous in the useful sense. It is a fluent actuator without a brake.

The public claim should therefore include refusal reasons and counterexample routes. Otherwise the reliability story cannot be attacked or improved.

Evidence Route

Where the claim can be checked.

This page is an entry point, not a standalone proof. It routes the reader to evidence, DOI records, registries, public challenge paths, and explicit non-claims.

Kind | Anchor | URL | Role
Evidence Map | Public evidence map | https://mianzhang.org/evidence/ | Start from supported claims and known boundaries.
Paper Index | DOI and paper status map | https://mianzhang.org/papers/ | Use paper-specific DOI records for paper claims.
Registries | Machine-readable registries | https://mianzhang.org/registries/ | Inspect claim, evidence, counterexample, and action records.
Challenge | Counterexample route | https://mianzhang.org/counterexamples/ | Attack overbroad claims through public routes.
Archive | Zenodo portfolio index | https://zenodo.org/records/20027295 | Long-term archive index; cite specific DOI records where available.
Concept | No-Proof No-Action Gate | https://mianzhang.org/concepts/no-proof-no-action-gate.html | Stop condition for high-risk action.

Boundary

What this page does not prove.

  • This page does not certify any production agent.
  • It does not claim that prompt design alone solves hallucination.
  • It does not replace domain review in legal, finance, robotics, or medical settings.

FAQ

What is the shortest test?

Ask whether the system stops when evidence, authority, or scope is missing.

Is tool refusal a weakness?

No. For high-risk actions, a refusal can be evidence of reliability.

What should be logged?

Claim, evidence, authority, scope, refusal reason, and next review route.
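As a rough illustration, those six items could be written as one JSON line per refusal. The function name, the example values, and the added `timestamp` field are assumptions for this sketch and are not part of the list above.

```python
import json
from datetime import datetime, timezone


def refusal_record(claim, evidence, authority, scope, refusal_reason, next_review_route):
    """Serialize one refusal as a JSON line; None marks a missing item."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),  # assumed extra field
        "claim": claim,
        "evidence": evidence,
        "authority": authority,
        "scope": scope,
        "refusal_reason": refusal_reason,
        "next_review_route": next_review_route,
    }
    return json.dumps(record, sort_keys=True)


# Hypothetical example: the agent wanted to delete a file without permission.
line = refusal_record(
    claim="config.yaml is stale and should be deleted",
    evidence="ev-17",
    authority=None,
    scope="repo:docs",
    refusal_reason="missing authority",
    next_review_route="human review queue",
)
print(line)
```

Recording the missing item explicitly as `null` keeps the gap auditable, rather than leaving a reviewer to infer it from an absent key.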