On hallucinations

The unseen backbone of intelligence and LLMs

Balázs Kégl
8 min read · Dec 10, 2023

Hallucinations in Large Language Models (LLMs) represent a pivotal feature of their functionality, not a flaw. These ‘hallucinations’ are akin to the anticipatory thinking vital in human cognition, where imagining plausible futures guides decision-making. Eradicating these predictive capabilities would reduce LLMs to mere search engines. Instead, our approach should involve grounding LLMs with sensory inputs and episodic memory. This enhancement would enable them to not only imagine possible actions but also to evaluate and revise these actions based on actual outcomes, bringing them closer to a more comprehensive form of intelligence.

Perception: the predictive mind

  • Example: Imagine driving and anticipating the future positions of surrounding cars.
  • Discussion: We explore how perception is largely a predictive process, with about 80% being top-down modeling, constantly creating and adjusting mental projections against reality.

Imagine you’re driving at night, approaching an intersection hidden behind dark buildings. There’s no visible or audible sign of another vehicle, yet instinctively, you slow down. This cautious maneuver isn’t a response to actual sensory input but a product of your mind’s anticipatory power. You’re visualizing a plausible future: a car emerging suddenly from behind the buildings. This mental image, while not based on direct observation, is a form of ‘hallucination’, but it’s far from madness. It’s a fundamental aspect of how we safely navigate our world.

Navigating the world, as this scenario illustrates, is a complex dance of perception and prediction. Our brains are not mere passive recipients of sensory data; they are active constructors of our experiential reality. This construction is predominantly a top-down process, accounting for about 80% of perception. The brain forecasts, projects, and hypothesizes, constantly aligning these predictions with incoming sensory information.

Such a predictive framework is pivotal in understanding both human and artificial intelligence. It’s not just about responding to the present but anticipating the future. In AI, particularly in Large Language Models (LLMs), this concept is mirrored in how these models anticipate or ‘hallucinate’ the next word in a sentence. Just as our brains use past experiences and current context to predict sensory information, LLMs use the statistical patterns learned from vast training corpora to predict the next word.
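To make next-word prediction concrete, here is a minimal sketch using the Hugging Face transformers library; the model (gpt2) and the prompt are illustrative assumptions, not part of the argument. The model assigns a probability to every possible next token, and generation simply ranks or samples from that distribution.

```python
# Minimal sketch of next-token prediction with a small causal language model.
# The model (gpt2) and the prompt are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Approaching the dark intersection, the driver"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, sequence_length, vocab_size)

# Probability distribution over the next token, given the prompt.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_token_probs, k=5)

# The model 'hallucinates' plausible continuations, ranked by probability.
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(token_id.item())!r}: {prob.item():.3f}")
```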

The magic of perception, therefore, lies in this continuous loop of prediction and feedback. When there’s a mismatch between what our brain expects and what our senses perceive, we adjust our predictions, refining our mental model of the world. This dynamic process is the bedrock of learning and adaptation, both in humans and in AI systems.
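As a toy illustration of this predict-and-correct loop (a deliberately simplified sketch, not a model of the brain), a running prediction can be nudged toward each new observation in proportion to the prediction error; the learning rate and the observations below are arbitrary.

```python
# Toy predict-and-correct loop: the internal prediction is revised by a
# fraction of the mismatch between what was expected and what was observed.
# The learning rate and the observations are arbitrary illustrations.
def update(prediction: float, observation: float, learning_rate: float = 0.2) -> float:
    error = observation - prediction            # mismatch between model and senses
    return prediction + learning_rate * error   # revise the internal model

prediction = 0.0
for observation in [1.0, 1.2, 0.9, 1.1]:
    prediction = update(prediction, observation)
    print(f"revised prediction: {prediction:.2f}")
```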

Understanding this predictive nature of perception opens up a broader discussion about the role of ‘hallucination’ in intelligence. It challenges us to reconsider our definitions of reality and the mechanisms through which we navigate it, both in the natural and digital realms.

Planning: hallucination as a tool

  • Example: a chef creating a new dish.
  • Discussion: We delve into the concept of planning as a form of hallucination, where successful navigation in an uncertain world relies heavily on the ability to mentally simulate various scenarios (counterfactual reasoning).

Imagine a renowned chef contemplating a bold new dish for a prestigious culinary competition. She is considering using saffron and truffle oil, ingredients known for their strong, distinctive flavors. The chef “visualizes” the taste profile of the dish, wondering: “What if the saffron overpowers the subtle truffle aroma?” To address this concern, she mentally experiments with varying the quantity of saffron or perhaps complementing it with a milder herb. This isn’t just creative thinking; it’s a strategic analysis of potential outcomes, a perfect blend of imagination and counterfactual reasoning.

In this process, the chef employs a kind of mental gymnastics, weaving through a fabric of possibilities, adjusting the recipe not just for taste but for impact. This is counterfactual reasoning at its finest — considering not just what is, but what could be, and how different decisions could lead to different outcomes.

However, it’s crucial to note that while this kind of nuanced planning and counterfactual reasoning is second nature to humans, it’s a different story for Large Language Models (LLMs). LLMs, in their current form, do not engage in counterfactual reasoning or planning. They generate text based on patterns learned from vast datasets, producing the next word or sentence without the ability to plan ahead or evaluate their output against real-world scenarios. This lack of grounding and forward-looking evaluation is a significant distinction between human intelligence and AI.

This exploration into the chef’s imaginative planning underscores the significance of ‘hallucination’ and counterfactual reasoning in intelligent decision-making. It highlights the gap between human cognition, which can anticipate, plan, and adapt, and the current capabilities of LLMs.

Next, we will explore how humans differentiate between harmless hallucinations and delusional thinking, akin to psychosis. Then, we will delve into the nuances of navigating the space between planning and action, considering concepts like truth, lies, Frankfurt’s notion of ‘bullshitting’, epistemic authority, and the importance of source verification. Finally, we’ll discuss the optimal use of today’s LLMs — not as tools for fact-checking or information retrieval, but as engines for generating imaginative, creative plans.

Separating hallucination from psychosis

  • Example: The actual taste of the dish collides with the chef’s imagined plan.
  • Discussion: We focus on the critical mechanism that prevents us from conflating our hallucinations (or mental models) with reality, highlighting the importance of feedback from the environment in refining our plans and beliefs.

As the competition unfolds, our chef tastes her dish and realizes the saffron’s flavor is too intense, clashing with the delicate balance she envisioned. This crucial sensory feedback prompts an immediate adjustment of the recipe, a clear example of differentiating between mental projection and sensory reality.

In human cognition, this ability to distinguish between what’s imagined and what’s physically experienced is critical. It’s a form of sensory grounding, where the roughly 20% of perception that is bottom-up anchors our understanding of reality, in a dance with our top-down projections onto the world. Psychotic experiences, in contrast, blur this boundary, causing individuals to perceive their internal hallucinations as external, real experiences. LLMs lack this sensory grounding. They can generate complex outputs, but they cannot discern whether those outputs align with the physical world. This gap highlights a key distinction between human intelligence, grounded in sensory experience, and the current state of AI, which operates without this fundamental connection to the physical world.

Facts, lies, and bullshit: understanding epistemic authority

  • Example: Tracing the source of a news article to assess its credibility.
  • Discussion: We discuss how we discern truth and the importance of episodic memory and source-tracking in establishing epistemic authority, differentiating between factual knowledge, intentional deception, and mere ‘bullshitting’.

In the world of journalism, a reporter’s quest for facts is a meticulous process involving both sensory experience and narrative construction. A journalist visits the scene of an event, gathering first-hand sensory information. They engage with witnesses, recording their stories, assessing their reliability, and discerning any potential falsehoods. Every piece of information is rigorously double-checked. This collection of facts is then woven into a coherent, relatable narrative. The editor’s role is to further scrutinize this narrative, ensuring accuracy and reliability. This process embodies not just the pursuit of facts but also the establishment of epistemic authority, where trust is placed in individuals, processes, and institutions. This trust is recursive: we rely on personal narratives, institutional integrity, and the rigor of journalistic processes, and because information can be traced back to its source, anyone can unravel the recursive chain if needed.

In contrast to the rigorous fact-checking process of journalism, LLMs operate without sensory grounding or episodic memory. They lack an understanding of personal experiences or the ability to recall specific events. When asked, “How do you know?”, they cannot cite experiences or reliable sources as humans do. Although surprisingly accurate in many instances due to the vast, generally reliable training data, LLMs struggle with controversial, novel, or nuanced queries. Their high-confidence yet often incorrect responses in these areas resemble the output of a skilled bullshitter, adept at crafting plausible narratives without a foundation in personal truth or sensory experience.

Navigating the complexities of epistemology, sociology, and psychology, we recognize that our understanding here is just the tip of the iceberg. The path towards LLMs becoming reliable sources of information is long and fraught with challenges. Their current strengths lie not in factual accuracy but in their capacity for generation. As such, their best use is in areas where creative and generative capabilities are more valuable than factual correctness. This distinction is crucial in determining how to effectively and appropriately employ LLMs in various applications.

Balancing hallucinations and creativity in LLMs

  • Discussion: How the ‘hallucinatory’ nature of LLMs can be a boon for creative processes and a challenge for factual accuracy, and how this delicate balance impacts their development and application.

The ‘hallucinatory’ capabilities of LLMs, often seen as a drawback, can actually be a wellspring of creativity. These models excel in scenarios where generating new ideas, reformulating existing information, or creative problem-solving is key. Some practical applications include:

  1. Summarizing texts: LLMs can efficiently condense extensive texts on familiar subjects, providing succinct overviews.
  2. Aiding writing: Transforming raw notes into well-structured text, assisting in drafting coherent and polished documents.
  3. Style rewriting: Adapting text to different styles, useful for creative writing or tailoring messages to specific audiences.
  4. Research assistance: Generating starting points for research on various topics, offering initial directions and ideas.
  5. Argument strengthening: ‘Steelmanning’ an argument by presenting the strongest possible version of an opposing view, thereby helping to refine and strengthen one’s position.
  6. Constructing counterarguments: Generating counterpoints to a given position, aiding in the development of more robust arguments or understanding different perspectives.
  7. Code generation: LLMs can assist in writing code, but the key is that the generated code is verifiable through external means, such as running it against a well-defined test suite. The LLM speeds up the coding process, while the accuracy and effectiveness of the code are independently validated, keeping the focus on practical utility rather than the LLM’s inherent veracity (a sketch of this generate-then-verify loop follows the list).
  8. Planning assistance: LLMs can be instrumental in generating plans or alternative action sequences. The crucial aspect is the implementation and monitoring of these plans in real-world scenarios: when unexpected outcomes arise, the executed steps are fed back to the LLM for adjustments or replanning. LLMs thus contribute to the planning process, while the practical execution and adaptability of the plans are managed through external, real-world validation.
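To make items 7 and 8 concrete, here is a minimal sketch of the generate-then-verify pattern. The `ask_llm` function is a hypothetical placeholder for whatever LLM client is in use, and the test runner assumes pytest is installed; the essential point is that correctness comes from the external test, not from trusting the model’s output.

```python
# Minimal sketch of the generate-then-verify loop described in items 7 and 8.
# `ask_llm` is a hypothetical placeholder for an LLM client; pytest is assumed
# to be installed. Correctness comes from the external test, not the model.
import subprocess
import tempfile
from pathlib import Path
from typing import Optional

def ask_llm(prompt: str) -> str:
    """Hypothetical LLM call returning candidate Python source code."""
    raise NotImplementedError("plug in your LLM client here")

def passes_tests(candidate_code: str, test_code: str) -> bool:
    """Run the candidate against a well-defined test suite in an isolated directory."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "candidate.py").write_text(candidate_code)
        Path(tmp, "test_candidate.py").write_text(test_code)
        result = subprocess.run(
            ["python", "-m", "pytest", "test_candidate.py", "-q"],
            cwd=tmp, capture_output=True, text=True,
        )
        return result.returncode == 0

def generate_verified_code(task: str, test_code: str, max_attempts: int = 3) -> Optional[str]:
    """Let the LLM 'hallucinate' candidates; keep only one that the tests validate."""
    prompt = task
    for _ in range(max_attempts):
        candidate = ask_llm(prompt)
        if passes_tests(candidate, test_code):
            return candidate
        # Feed the failure back so the next attempt can be revised (replanning).
        prompt = f"{task}\n\nThe previous attempt failed the tests; please revise it."
    return None
```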

In each of these creative applications, the veracity of the LLMs’ output is secondary. What matters most is their ability to generate new ideas, offer fresh perspectives, or aid in the conceptualization process. This emphasis on creativity over factual accuracy underscores the unique and valuable role LLMs can play in various creative and analytical tasks.

Conclusion: This blog post underscores the idea that hallucinations are a fundamental aspect of both human intelligence and artificial intelligence systems like LLMs. By understanding and harnessing this feature, we can better appreciate generative AI and avoid using it for tasks it is not designed to do.


Balázs Kégl

Head of AI research, Huawei France, previously head of the Paris-Saclay Center for Data Science. Podcast: https://www.youtube.com/@I.scientist.balazskegl