Lin Xi | Existential Interpretation of Artificial Intelligence “Hallucinations”
Abstract: In the domain of text generation and processing, texts produced by artificial intelligence may be meaningless, incoherent, or repetitive. Text generated or processed by artificial intelligence may also be unfaithful to its source; such deviation from the ideally anticipated textual output is termed artificial intelligence "hallucination." As the applications of artificial intelligence continue to expand, AI "hallucination" has become a pressing issue demanding discussion and resolution. In the field of computer science, "hallucination" primarily arises from the distortion or destruction of the mapping relation between output content and source content. This, in fact, exhibits an isomorphism with Sartre's existentialist framework: Sartre's existentialism constructs a mapping relation between appearance and being, and this mapping relation allows the generative mechanism of "hallucination" to be discussed from an existentialist perspective. The framework thus provides a hermeneutic tool for exploring the issue of artificial intelligence "hallucination," one that can be used to examine the generative mechanism of AI "hallucination" and its corresponding ethical risks.
Sartre once discussed the issue of "hallucination" from an existentialist perspective. In Sartre's view, the relationship between an existent and its appearance bears directly on our definition and understanding of "hallucination." If we believe that in this world only the thing-in-itself, the existent itself, is real, and that this reality possesses exclusivity and uniqueness, then appearance, as that which reveals the existence of the existent, itself becomes an object negated by this exclusive reality. In other words, appearance becomes a purely negative object; because it cannot satisfy the equation "reality = thing-in-itself/existent," it is relegated to a negative object excluded by reality, thereby failing to attain the reality conferred by this equation or standard. Appearance becomes an "object of non-being or a non-existent object," and its existence itself is a "hallucination." Sartre's existentialist viewpoint, in effect, constructs a mapping relation between appearance and being, and this mapping relation is employed to discuss the generation and transformation of "hallucination" from an ontological standpoint. In exploring the relationship between being and appearance, Sartre believed we encounter a crucial distinction: the concept of representation (or appearance) is not merely a facade concealing some hidden, ultimate reality, nor is appearance itself an unreliable or unstable manifestation of some true being. If we take the concept of essential being to hold a dominant position, then appearance is devalued into a purely negative concept, thereby becoming opposed to true being. In other words, appearance at this juncture becomes "non-being." Such a devaluation signifies that the connection maintaining the stable association between appearance and being is lost; this connection is a mapping relation, meaning that appearance is a projection of being at the phenomenal level. If this mapping relation is lost, then representation is "devalued" or "negated," and it devolves into pure hallucination and error. Therefore, for Sartre, the most pressing challenge is no longer how to maintain the coherence of representation, preventing its collapse back into the formless, non-phenomenal void, but rather how to redefine the relationship between being and representation. We must transcend the model of essence and appearance and recognize the manner in which representation and being are interconnected. It is precisely in the realm of appearance, that is, in the total presentation of the existent to the world, that we are able to perceive its being; appearance is not an insignificant reflection hiding truth, but the genuine way in which being takes form and acquires meaning. Thus, within Sartre's existentialist framework, appearance is not merely a simple manifestation of being but is itself also a "positivity"; it is not an appendage of being. On the contrary, appearance embodies a kind of absoluteness of its own; it is its "own absolute expression." Both appearance and the existent express a "positivity," and evaluations of truth or falsity can be applied simultaneously to both, and especially to the mapping relation between them. If we sever this mapping relation, or if this relation is destroyed, then hallucination and error arise.
Sartre's account of how "hallucination" is generated, framed from an ontological perspective, helps us understand the "hallucination" produced by artificial intelligence. This paper attempts to explore the issue of artificial intelligence "hallucination" from this existentialist perspective. To this end, we will first define "hallucination" as it is used in the field of computer science, and then analyze it through Sartre's existentialist framework.
I. "Hallucination" in the Field of Computer Science
"Hallucination" is a concept from neuroscience and psychology, originally referring to the inaccurate subjective reproduction of objective experience by sensory receptors. From a clinical perspective, hallucinations occur when sensory receptive functions are, to varying degrees, excluded from the scope of consciousness. This exclusion typically has a pathological basis: there may be damage to the sensory neural pathways somewhere between the sensory organs and the sensory receptive areas of the cerebral cortex within the human brain, or the sensory receptive areas of the cerebral cortex themselves may be damaged to some extent. Additionally, diseases affecting auditory, gustatory, visual, and other sensory receptors, nerves, and related aspects can also cause hallucinations in the corresponding bodily senses. "Hallucinations" may manifest in two forms: one is "simple hallucination," and the other is "complex hallucination." The former refers to hallucinations caused by injury to sensory receptors or sensory nerves in the human body, or by slight damage to the sensory receptive areas of the cerebral cortex; these hallucinations are primarily composed of a single affected sensation. In contrast, if the sensory receptive areas of the brain suffer extensive damage, hallucinations composed of memories from multiple or even all sensory modalities may occur; this phenomenon is known as "complex or composite hallucination."
The application of the term "hallucination" in the field of computer science can be traced back to the algorithm proposed by Simon Baker and Takeo Kanade in their computer vision research. In the fields of computer vision and image processing, Baker and Kanade proposed a "hallucination algorithm" for image processing, aimed at enhancing the resolution of faces in surveillance images for recognition. Even when the original image has low resolution, the algorithm can enhance it without introducing substantial additional complexity. Moreover, even when Gaussian noise with different standard deviations is added to the sampled image, this "hallucination algorithm" remains considerably robust to such noise. For multiple images, if the original image undergoes several random sub-pixel translations to form a series of input images, the "hallucination algorithm" can be applied after aligning the input images with a standard parametric motion algorithm, likewise improving image quality. Subsequently, the algorithm was also applied to image inpainting and image synthesis. In the field of computer vision, therefore, "hallucination" initially referred to a specific image enhancement algorithm and was generally used in a positive, constructive sense.
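To make this constructive sense of "hallucination" concrete, the following sketch illustrates the general patch-lookup idea behind example-based face hallucination. It is not Baker and Kanade's actual algorithm; the image sizes, patch sizes, and random stand-in images are assumptions chosen only for illustration. The point is that high-frequency detail in the output is borrowed from training examples rather than recovered from the input, which is precisely why such methods are described as "hallucinating" detail.

```python
# Toy example-based "face hallucination": illustrative only, not the
# Baker-Kanade algorithm. High-frequency detail in the output is looked up
# from a training set rather than recovered from the low-resolution input.
import numpy as np

rng = np.random.default_rng(0)
SCALE = 4          # upsampling factor
LOW_PATCH = 3      # low-res patch size; the high-res patch is LOW_PATCH * SCALE

def downsample(img, scale):
    """Average-pool an image by an integer factor (a crude imaging model)."""
    h, w = img.shape
    return img[:h - h % scale, :w - w % scale].reshape(
        h // scale, scale, w // scale, scale).mean(axis=(1, 3))

def extract_pairs(high_res_images):
    """Collect (low-res patch, high-res patch) training pairs."""
    lows, highs = [], []
    for hi in high_res_images:
        lo = downsample(hi, SCALE)
        for i in range(lo.shape[0] - LOW_PATCH + 1):
            for j in range(lo.shape[1] - LOW_PATCH + 1):
                lows.append(lo[i:i + LOW_PATCH, j:j + LOW_PATCH].ravel())
                highs.append(hi[i * SCALE:(i + LOW_PATCH) * SCALE,
                                j * SCALE:(j + LOW_PATCH) * SCALE])
    return np.array(lows), np.array(highs)

def hallucinate(low_img, lows, highs):
    """Replace each low-res patch with the high-res patch of its nearest
    training neighbour; detail absent from the input is 'hallucinated'."""
    out = np.zeros((low_img.shape[0] * SCALE, low_img.shape[1] * SCALE))
    for i in range(0, low_img.shape[0] - LOW_PATCH + 1, LOW_PATCH):
        for j in range(0, low_img.shape[1] - LOW_PATCH + 1, LOW_PATCH):
            query = low_img[i:i + LOW_PATCH, j:j + LOW_PATCH].ravel()
            idx = np.argmin(((lows - query) ** 2).sum(axis=1))
            out[i * SCALE:(i + LOW_PATCH) * SCALE,
                j * SCALE:(j + LOW_PATCH) * SCALE] = highs[idx]
    return out

# Synthetic stand-ins for aligned face images (real use would load photographs).
train = [rng.random((24, 24)) for _ in range(20)]
lows, highs = extract_pairs(train)
# A low-resolution test image with a little Gaussian noise added.
test_low = downsample(rng.random((24, 24)), SCALE) + rng.normal(0, 0.05, (6, 6))
print(hallucinate(test_low, lows, highs).shape)   # (24, 24)
```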
In recent years, researchers have begun to use the term "hallucination" in a negative sense. Image description models may directly "hallucinate" objects that are not actually present in the image scene, a phenomenon termed "object hallucination." This refers to an Artificial Intelligence (AI) generating a sentence description as text output based on an input image, where this text output includes an entity or object not contained in the image. The reasons for such "object hallucination" may lie in classification errors made by the AI during visual processing, and in the AI's over-reliance on language priors during text output, leading to a "disconnection" between the output and the source image. Current AI largely relies on "Large Language Models (LLMs)," which depend on vast amounts of language data for machine learning and model training. This can lead the AI to memorize which words are more likely to "cluster together" based on statistical results of word co-occurrence. Generating text output through such strong statistical associations can also result in the output content being weakly related or even unrelated to the source image. This is a "language consistency error" arising from the AI's over-reliance on pre-fed language data; specifically, there is a consistency between the errors produced by image description models and the errors produced by LLMs that predict based solely on previously generated words. This consistency is particularly evident in the early stages of training image description models, where the errors of the image description model maintain a high consistency with the errors of the LLM. This indicates that image description models must first, through machine learning and data training, learn to generate text output in fluent natural language, and only then can they gradually integrate visual information and transform it into natural language output text. Similarly, "hallucination" phenomena also occur in object detection in images. In computer vision, the analysis and processing of images involve the automatic localization and detection of various items and objects within them. If, during the text output process, the AI detects and describes items that do not exist in the source image, this is termed "hallucination"—the AI detects items or objects not present in the image.
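The role of co-occurrence statistics in object hallucination can be sketched with a toy example. The captions, objects, and threshold below are invented, and the procedure is a deliberately crude stand-in for a real captioning model; it only shows how a strong language prior can add an object that the visual detector never found.

```python
# Toy illustration (not any specific captioning model) of how strong
# co-occurrence statistics in training captions can make a captioner
# mention objects that were never detected in the image.
from collections import Counter
from itertools import combinations

# Hypothetical training captions: "fork" almost always co-occurs with "plate".
train_captions = [
    ["plate", "fork", "table"],
    ["plate", "fork", "napkin"],
    ["plate", "fork", "glass"],
    ["plate", "cake"],
]

cooc = Counter()
for caption in train_captions:
    for a, b in combinations(sorted(set(caption)), 2):
        cooc[(a, b)] += 1

def co_score(a, b):
    return cooc[tuple(sorted((a, b)))]

def caption_objects(detected, threshold=3):
    """Start from visually detected objects, then add any object whose
    co-occurrence count with a detected object reaches the threshold,
    even though the detector never saw it: an 'object hallucination'."""
    vocab = {w for cap in train_captions for w in cap}
    mentioned = set(detected)
    for obj in vocab - mentioned:
        if any(co_score(obj, d) >= threshold for d in detected):
            mentioned.add(obj)          # added from the language prior alone
    return mentioned

# The detector only found a plate, but the prior drags in "fork".
print(caption_objects(["plate"]))       # e.g. {'plate', 'fork'}
```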
This perspective of discussing "hallucination" in computer vision research in a negative sense has provided a line of thought for researchers to discuss the problem of AI hallucination. In the field of natural language generation in AI, the "hallucination" under discussion is indeed a type of error that AI can exhibit. When a researcher queried ChatGPT-3.5 on August 16, 2023, ChatGPT provided a definition for "AI hallucination," stating it is "content generated not based on real or existing data, but produced by a machine learning model's inference or creative interpretation of its training data." These "hallucinations" manifest in various forms, such as images, text, sounds, and even videos. AI "hallucinations" occur when machine learning models, especially deep learning models like generative models, attempt to generate content that goes beyond what they have learned from their training data. These models learn patterns and correlations from training data and attempt to generate new content based on these patterns. However, in some cases, the content they generate may seem plausible but is actually a concoction of various learned elements, resulting in content that might be meaningless, or even surreal, dreamlike, or fantastical. The definition of AI "hallucination" provided by ChatGPT itself is meaningful; it essentially summarizes what "hallucination" as a problem in the AI field entails, its manifestations, and its causes. In this response from ChatGPT, "hallucination" is primarily manifested as output content generated by AI that is not based on real or existing data. Instead, it arises because the AI's model or algorithm, during the machine learning process, analyzes and processes the large corpora provided for pre-training—including steps such as encoding, decoding, machine interpretation, inference, and output—ultimately forming output text inconsistent with real or existing data, thereby leading to "hallucination."
As can be seen from the definition above, in computer science "hallucination" primarily pertains to the distortion or disruption of the mapping relation between output content and source content (input content). This in fact exhibits an isomorphism with Sartre's existentialist framework, as both discuss the mechanisms by which hallucination or error can arise. The point is also reflected in the disciplinary field of computational psychiatry. As an interdisciplinary field at the intersection of computer science and psychiatry, computational psychiatry explains cognition and behavior from a computational perspective. This typically involves algorithmic models in the computational process, matching inputs and outputs, and examining the mapping relation between them, which is precisely the kind of "mapping relation" emphasized by Sartre's existentialist framework. Computational psychiatry primarily aims to provide a formal explanation and analysis of cognition and behavior, hoping to predict and control behavior and cognition at the psychopathological level. The focus of this formal analysis is mainly the mapping relation between input and output, which reflects the psychological capacities underlying cognition and behavior. According to this view, cognition and behavior primarily depend on initial information input according to a specific structure. This initial information takes various data forms, such as sensory perceptual information or big data. After processing by algorithms in an intermediate stage, these inputs can be transformed into specific forms of output. An algorithm is a formalized solution procedure for a specific task or problem; it describes the reasoning and computational capabilities that cognition or behavior brings to bear on that task or problem, generally taking the form of specific reasoning steps arranged and combined in a certain logical order. By explaining and analyzing the mapping relation between input and output, computational psychiatry aims to reveal the underlying mechanical, computational processes and principles of cognition and behavior. Correspondingly, from the perspective of computational psychiatry, symptoms of mental illness are specific psychopathological phenomena caused by factors such as errors in computational processes, algorithmic models, or input/output information. In a related vein, Deleuze and Guattari's theoretical discussions of "materialist psychiatry" also offer inspiration for our understanding of AI "hallucination." In the view of Deleuze and Guattari, there exists, firstly, a "true materialism" and a "false materialism," the latter of which is not significantly different from various "typical forms" of idealism. The reason Deleuze and Guattari see a commonality between "false materialism" and typical forms of idealism is that both become detached from the material reality that gives rise to these forms of thought, transforming into a type of idealist metaphysics. This form of materialism, while ostensibly based on concrete things, ultimately succumbs to the allure of abstraction, becoming an insubstantial and metaphysical system detached from the dynamic interactions of the various forces that constitute the material world. It leads us to detach from empirical observation in pursuit of universal and a priori knowledge, thereby falling into transcendental illusions.
In Deleuze and Guattari's view, "false materialism," in its pursuit of fixed categories and deterministic laws, neglects factors such as contingency, singularity, and the constantly changing nature of material reality. This separation from material reality makes "false materialism" difficult to distinguish from traditional idealism. Both, in their respective ways, construct a domain composed of abstract concepts and intangible ideas that only obscure, rather than clearly elucidate, the concrete experiential world. The discussion here regarding "false materialism" can aptly inspire our corresponding analysis of the AI "hallucination" phenomenon. Just as "false materialism" constructs a distorted image of the material world, so too does AI; reliant on abstract models and algorithms, the output it produces can lead to "hallucination." The reason we consider these "hallucinations" to be very similar to human hallucinations is that both originate from a separation between the output and the external world it purports to represent. In the case of AI, this separation stems from the limitations of training data, inherent biases in algorithms, and the difficulty for computational systems to truly capture the full complexity of the real world. Therefore, Sartre's existentialist framework, the relevant hypotheses of computational psychiatry, and Deleuze and Guattari's theoretical framework of "materialist psychiatry" provide us with interpretative tools to analyze the mechanisms and related principles of "hallucination" generation in the field of artificial intelligence.
II. The Definition of Artificial Intelligence "Hallucination"
As we have seen above, in the field of text generation and processing, Artificial Intelligence (AI) may produce outputs that are meaningless, incoherent, or repetitive. Text generated or processed by AI may not be faithful to its source; this deviation of the generated text from ideal expectations is termed "hallucination." Liu Zeyuan and colleagues classify tasks performed by AI large models via natural language instructions into open-ended and non-open-ended types. The former refers to task types where the input content is incomplete and the output semantics are not necessarily contained within the input content, while the latter refers to large models generating text based on the input content. For both task types, AI large models may produce content inconsistent with real-world knowledge or with the input information, which constitutes AI "hallucination." In deep neural network models, if AI receives extensive data or text input for machine learning and model training and subsequently generates text output, then during the training process, algorithms collect vast amounts of parallel data and may employ heuristic rules. For example, topic-conditional neural models based on convolutional neural networks can capture dependencies between words in a document, thereby enabling document-level inference, abstraction, and paraphrasing. However, these heuristic rules can also introduce noise into the data, manifesting as phrases in the output that do not match the input, where the generation of these phrases cannot be explained by the input source. Neural text generation models, by capturing this noise, may generate fluent but unsubstantiated text, leading to AI "hallucination," where the generated content is unfaithful to the input source or is intrinsically meaningless. Some scholars classify hallucinations into two types: extrinsic hallucination and intrinsic hallucination. Extrinsic hallucination refers to expressions generated by the model that introduce entirely new textual content which cannot be verified from the source content; its core feature is the addition of new textual information to the output that is not verifiable within the source content. Even if parts of the expression are faithful to the source content, if the model adds new text during the output process that cannot be verified against the knowledge base of the source content, hallucination results. Intrinsic hallucination, on the other hand, involves the incorrect use of subjects and objects from the source content's knowledge base, so that their relationship contradicts the information in the source content. For example, if the text output states, "Zhang Yimou directed the movie 'Titanic'," where the subject is "Zhang Yimou," the object is "the movie 'Titanic'," and the predicate is "directed," while in the source content the director of the movie 'Titanic' is the American James Cameron, then the text output directly contradicts the source content. This is "intrinsic hallucination," whose essence is that the AI model misuses relevant information, causing contradictions and mismatches between the output and the input.
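The distinction between intrinsic and extrinsic hallucination can be made concrete with a minimal sketch that compares generated subject-predicate-object triples against a source knowledge base. The triples and relation names below are illustrative assumptions, not drawn from any real system.

```python
# A minimal sketch of the intrinsic/extrinsic distinction: generated triples
# are classified by comparing them against a source knowledge base.
source_kb = {
    ("Titanic", "directed_by", "James Cameron"),
    ("Titanic", "released_in", "1997"),
}

def classify_triple(subj, pred, obj, kb):
    """Intrinsic hallucination: the triple contradicts the source
    (same subject and predicate, different object).
    Extrinsic hallucination: the triple cannot be verified from the source at all.
    Otherwise the triple is faithful."""
    if (subj, pred, obj) in kb:
        return "faithful"
    if any(s == subj and p == pred and o != obj for s, p, o in kb):
        return "intrinsic hallucination"
    return "extrinsic hallucination"

# "Zhang Yimou directed Titanic" contradicts the source: intrinsic.
print(classify_triple("Titanic", "directed_by", "Zhang Yimou", source_kb))
# A claim the source says nothing about: extrinsic.
print(classify_triple("Titanic", "filmed_in", "Mexico", source_kb))
```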
In the field of AI, terms often contrasted with "hallucination" include "factuality" and "faithfulness." Factuality refers to knowledge or statements based on facts, while faithfulness refers to the output text being faithful to the source input content. [10] Through this distinction, we can understand the circumstances under which AI "hallucination" arises: the output text generated by the machine may contradict real-world knowledge. In the process of natural language generation, some scholars have summarized the causes of "hallucination" into roughly two categories: those originating from data, and those arising from the training and inference processes. Firstly, regarding data, a cause of data-induced "hallucination" might be "source-reference divergence," meaning a discrepancy arises between the source content and the target reference. This divergence may be caused by "heuristic data collection." If the dataset contains such divergence or discrepancy, then using this dataset to train a large language model may lead to situations where the "output text is unfaithful to the source input content" during natural language generation. From a definitional perspective, "heuristic data collection" refers to the process where, during the collection of large-scale datasets, large language models heuristically select authentic statements or tables and match them as sources or targets. In this process, the target reference may contain new information that cannot be verified in the source content, thereby resulting in a target reference that is unfaithful to the source content. For instance, if we task an AI with reading data from structured formats (such as database records, knowledge graphs, and tables) and automatically generating descriptive natural language text, the AI's conditional language model might generate unconditional random facts. This uncontrollable randomness directly leads to "factual hallucination," affecting the veracity of the data. Moreover, repetitive information in datasets might not be filtered out. If the pre-training corpora for large language models contain repeated examples, the AI, by memorizing these repetitions during learning, may develop a tendency to generate phrases according to these frequent examples. Consequently, for each specific source content, the large language model is prone to generating "hallucinations" that deviate from faithfulness during text output.
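A deliberately crude sketch of source-reference divergence follows: content words in a reference that have no counterpart in the source are flagged as unsupported. The stopword list, the linearised table, and the reference sentence are assumptions for illustration; real faithfulness checks rely on far stronger semantic methods.

```python
# Illustrative check for source-reference divergence in a data-to-text
# training pair: reference words with no counterpart in the source are
# flagged, since training on such pairs encourages hallucination.
import re

STOPWORDS = {"the", "a", "an", "is", "was", "in", "of", "and", "to"}

def content_words(text):
    return set(re.findall(r"[a-z0-9']+", text.lower())) - STOPWORDS

def unsupported_words(source, reference):
    """Words in the reference with no counterpart in the source."""
    return content_words(reference) - content_words(source)

# Hypothetical table linearised as source text, plus a divergent reference.
source = "name: Blue Cafe | food: Italian | area: city centre"
reference = "Blue Cafe is a family friendly Italian place in the city centre."

print(unsupported_words(source, reference))
# {'family', 'friendly', 'place'} -- facts the source never licensed
```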
When dealing with datasets, another important consideration is the tolerance level for hallucination, because the output of large language models will also have different requirements for factuality and faithfulness depending on the task. Common task types include summarization, data-to-text generation, and conversational interaction. Summarization requires AI to condense lengthy texts into shorter ones; data-to-text generation requires the machine to output text based on various source content formats; and conversational interaction emphasizes the diversity of generated output. These different task types have varying tolerance levels for hallucination. In the domain of summarization, the requirement for faithfulness is relatively high. The source content is the input text that the machine needs to summarize and refine. In this case, users have a low tolerance for AI's "intrinsic hallucination" (where the output misuses source content, leading to contradictions with it) and expect the output text to be highly faithful to the input source, even though the machine might not specifically examine the truthfulness of the input or output content at this stage. This faithfulness is a core criterion for measuring whether the AI has successfully completed the summarization task. In the data-to-text task mode, the source content is non-linguistic or non-textual data, such as images, tables, videos, etc., and the AI's task is to generate descriptive text from this source content. If we adopt an end-to-end approach using an encoder-decoder architecture for AI training, the factuality of the AI-generated text will be low and its coverage limited. In this scenario, AI hallucinations may occur where the reference text output by the AI includes additional information not present in the table, or where it omits important information from the table due to noise encountered during dataset collection. In contrast, users have low requirements for factuality and faithfulness in conversational interaction. Sometimes users may provide content through casual chat, subjective dialogue, or user input, and this content may not necessarily have corresponding factual grounding in shared human historical records or knowledge bases. In such cases, users have a higher tolerance for AI "hallucination" because the core objective of the conversational interaction task mode is to facilitate ongoing dialogue between the user and the AI, ensuring engagement and diversity in generation. At this point, the AI is highly likely to produce various "extrinsic hallucinations," meaning the output text contains much information that cannot be verified or corresponded with in the source content.
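The idea of task-dependent tolerance can be sketched as follows; the overlap-based score and the per-task thresholds are invented for illustration and do not correspond to any published benchmark.

```python
# Illustrative only: the same crude faithfulness score is held to different
# thresholds depending on the task, mirroring the idea that summarization
# tolerates far less hallucination than open-ended dialogue.
TOLERANCE = {                 # assumed minimum fraction of output words
    "summarization": 0.9,     # that must be grounded in the source
    "data_to_text": 0.7,
    "dialogue": 0.3,
}

def grounded_fraction(source, output):
    src = set(source.lower().split())
    out = output.lower().split()
    return sum(w in src for w in out) / max(len(out), 1)

def acceptable(task, source, output):
    return grounded_fraction(source, output) >= TOLERANCE[task]

source = "the meeting was moved from monday to wednesday at 10 am"
output = "the meeting was moved to wednesday in the main office"  # partly invented

for task in TOLERANCE:
    print(task, acceptable(task, source, output))
# The same output fails the summarization threshold but passes for dialogue.
```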
III. Factors Contributing to Artificial Intelligence "Hallucination"
AI "hallucination" can also arise during the processes of AI training and model selection. When AI is trained, an encoder is generally used. The function of this encoder is to process input text into machine-understandable content and encode it into meaningful representations. If the encoder's comprehension capabilities are deficient, it will lead to "hallucination." When faced with datasets provided for pre-training, if the encoder misunderstands these datasets and learns incorrect correlations, there is a high probability that hallucination will occur during text output, leading to discrepancies between the output content and the source content. In neural machine translation, many systems adopt an encoder-decoder framework. In this dual framework, the encoder and decoder perform different functions: the encoder projects the source content into relevant representations within a common conceptual space, while the decoder retrieves relevant information from these representations and then decodes it sequentially into the target translation content. The encoder encodes the input data, text, and content; the next step is to transmit this encoded content to the decoder, which processes it to generate the final target output. If the encoder's encoding is erroneous, then after transmission to the decoder, it is highly probable that the decoder will also generate incorrect output content. This type of error propagation from encoder to decoder can lead to compromised factuality and faithfulness in the generated content. Even if the encoding is correct, the algorithms and strategies used for decoding can also lead to hallucination. If the decoding algorithms and strategies aim to increase the diversity of the generated output, the likelihood of "hallucination" occurring in the results can be significantly higher. Therefore, achieving both output diversity and ensuring the decoder maintains high faithfulness between output and input often presents a dilemma. This is because if the decoding strategy involves increasing "randomness," the AI, when generating output, will introduce information not contained in the source content, making it more likely to produce content containing hallucinations.
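The tension between diversity and faithfulness in decoding can be illustrated with a toy next-token distribution; the tokens and logits below are assumptions, and the point is only that raising the sampling temperature shifts probability mass toward tokens the source does not support.

```python
# A toy sketch of why more "diverse" decoding raises hallucination risk:
# a higher sampling temperature flattens the next-token distribution, so
# weakly supported tokens are chosen more often. The logits are invented.
import numpy as np

# Hypothetical next-token scores after the prefix "The capital of France is".
tokens = ["Paris", "Lyon", "Berlin", "Atlantis"]
logits = np.array([5.0, 2.0, 1.0, 0.5])

def next_token_probs(logits, temperature):
    """Softmax with temperature; as temperature -> 0 this approaches greedy decoding."""
    z = logits / temperature
    p = np.exp(z - z.max())       # subtract the max for numerical stability
    return p / p.sum()

for t in (0.2, 1.0, 2.0):
    p = next_token_probs(logits, t)
    print(f"T={t}: P({tokens[0]})={p[0]:.3f}, P(any other token)={p[1:].sum():.3f}")
```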
During the output generation process, even if the decoding strategy is sound, the issue of "exposure bias" can still arise, which also leads to "hallucination." Typically, sequence-to-sequence models are trained with a "teacher forcing" method based on "maximum likelihood estimation": at each time step during training, the input comes from ground-truth samples drawn from actual data rather than from the model's own predictions, which allows the model to converge faster and stabilizes training. During inference or application, however, the model no longer has access to ground-truth data; the input at each time step is the model's own output from the previous time step, so the next token is generated from its own previously generated sequence. This mismatch between how the model is conditioned during training and during decoding leads to error accumulation: if an earlier unit receives an incorrect input, the error affects that unit's output, which in turn serves as input to the subsequent unit and propagates the error further. Errors thus accumulate, especially as the target sequence lengthens. This discrepancy is known as "exposure bias," and it is characterized by a positive correlation between the length of the target sequence and the probability of "hallucination" occurrence.
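The compounding effect behind exposure bias can be sketched with a back-of-the-envelope calculation; the per-step error rate below is an assumed number, not a measurement.

```python
# Schematic illustration of exposure bias (numbers are assumptions): the
# per-step error rate eps is identical in both regimes, but teacher forcing
# conditions every step on a correct prefix, while at inference the model
# conditions on its own outputs, so errors compound with sequence length.
eps = 0.02            # assumed per-step error rate given a correct prefix

def p_correct_teacher_forcing(t):
    # Each training step is conditioned on ground truth, so the chance that
    # step t is correct does not depend on earlier mistakes.
    return 1 - eps

def p_prefix_correct_free_running(t):
    # At inference the prefix is only correct if every earlier step was.
    return (1 - eps) ** t

for t in (1, 10, 50, 200):
    print(f"step {t:3d}: teacher-forced {p_correct_teacher_forcing(t):.3f}  "
          f"free-running {p_prefix_correct_free_running(t):.3f}")
# The free-running probability decays toward zero as the sequence lengthens,
# mirroring the positive correlation between length and hallucination risk.
```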
Besides "exposure bias," biases related to "parametric knowledge" can also lead to "hallucination." So-called "parametric knowledge" refers to the knowledge embedded within an AI model's parameters, learned from the large corpora used for its pre-training, and utilized to enhance performance on downstream tasks. These large corpora typically have broad coverage and general-purpose characteristics. When AI models are pre-trained using these large corpora, they may rely on this parametric knowledge rather than strictly on the provided input content. The AI's focus may shift from the immediate source information to leveraging its broader parametric knowledge to optimize performance on downstream tasks. Such a preference can lead to the AI including extraneous information in its output that is inconsistent with or unverifiable from the source content. This is particularly evident when AI performs image description tasks. Existing evaluation metrics cannot fully capture the relevance between the description and the image. Limitations or biases in the AI's parametric knowledge lead to a series of image description "hallucinations," such as describing objects not present in the image or omitting salient objects that are present. The generation of such "hallucinations" is primarily due to the AI misclassifying visual images. The root cause can be traced back to its parameters: the parametric knowledge the AI relies on is comparatively limited or flawed, forming a kind of "language prior," where the AI may only remember which words are more likely to co-occur or appear in sequence. Consequently, the AI's judgment of the image is not based on the image content itself but on the large language model it was trained on. It relies on the "language prior" formed during training to judge and describe the image. This can easily lead to "hallucinations" because any change in the test content or its arrangement will cause a decline in the AI's generalization ability for image description.
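How a language prior can override visual evidence may be sketched as a simple weighted mixture; the objects, scores, and weights below are invented for illustration and do not describe any particular captioning model.

```python
# Illustrative only: candidate objects are scored by mixing weak visual
# evidence with a language prior learned from text corpora. When the prior
# weight dominates, an object that is merely "expected" (a surfboard next
# to a beach) outranks what is actually visible in the image.
visual_evidence = {"beach": 0.9, "dog": 0.7, "surfboard": 0.1}   # from the image
language_prior = {"beach": 0.6, "dog": 0.2, "surfboard": 0.8}    # from text corpora

def ranked_objects(prior_weight):
    scores = {
        obj: (1 - prior_weight) * visual_evidence[obj]
             + prior_weight * language_prior[obj]
        for obj in visual_evidence
    }
    return sorted(scores, key=scores.get, reverse=True)

print(ranked_objects(prior_weight=0.2))  # visual evidence dominates
print(ranked_objects(prior_weight=0.9))  # the prior promotes an absent object
```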
IV. Ethical Risks of Artificial Intelligence "Hallucination" in Practical Applications
The issue of "hallucination" in Artificial Intelligence (AI) can lead to severe consequences in practical applications. If AI is applied in the medical field, the occurrence of AI "hallucination" could adversely affect healthcare. Many clinical medical guidelines and standards contain various numbers and metrics, such as dates, quantities, and scalar values. For both healthcare professionals and patients, the accuracy of these numbers and metrics is paramount. When processing numbers in text, AI "hallucination" regarding specific figures may be a severely underestimated problem.
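A minimal numeric-consistency check of the kind this concern suggests might look as follows; the dosage sentences are hypothetical, and a real system would also need to handle units, ranges, and context.

```python
# A minimal, illustrative numeric-consistency check: numbers appearing in
# generated medical text but absent from the source are flagged. This only
# sketches why numeric hallucinations are detectable in principle yet easy
# to overlook in practice.
import re

def numbers(text):
    return set(re.findall(r"\d+(?:\.\d+)?", text))

def unsupported_numbers(source, generated):
    return numbers(generated) - numbers(source)

source = "Administer 5 mg once daily; reassess after 14 days."
generated = "Administer 50 mg once daily; reassess after 14 days."

print(unsupported_numbers(source, generated))   # {'50'} -- a dangerous slip
```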
In 2023, the medical science journal Cureus issued a call for papers for a Turing Test, soliciting research articles from medical personnel on case reports written with the assistance of ChatGPT. Some researchers reported examples of two medical reports drafted using ChatGPT. One example was homocystinuria-related osteoporosis, and the other was late-onset Pompe disease (LOPD). In the process of drafting the report on the pathophysiological mechanism of the former case, although ChatGPT provided some accurate information, its responses also included unverifiable information. When researchers asked ChatGPT to explain, verify, and provide references for this information, ChatGPT supplied five references from around the year 2000. These references appeared plausible and even included PubMed IDs (PMIDs), making them seem highly authentic. However, upon verification in the PubMed database, the researchers found that all these references were fictitious, fabricated by ChatGPT. The PubMed IDs were all misappropriated—they belonged to other papers but were now attached by ChatGPT to these fabricated references. When the researchers asked ChatGPT to provide recent references from the last ten years for this case, ChatGPT quickly supplied a list, but similar to the previous list of references, all the reference information therein was fabricated, and the PubMed IDs were also copied from other articles. As for the report on the other LOPD case, researchers asked ChatGPT to write a short essay on liver involvement in LOPD. In clinical practice, liver involvement rarely occurs in LOPD, so the researchers posed this question to observe whether ChatGPT could provide an accurate answer based on existing clinical practice. However, to their astonishment, ChatGPT confidently generated an article about liver involvement in LOPD patients. In fact, there are no reports in this area within the medical community, and thus no published scientific literature demonstrates a link between LOPD and liver involvement. Therefore, this text written by ChatGPT concerning liver involvement in LOPD patients was a "hallucinatory" report, verifiable neither against world knowledge nor within any source content provided to it.
In practice, with the deployment of AI in the medical field, "hallucination" is increasingly becoming a challenge that must be confronted in AI applications. Governments worldwide recognize the unprecedented opportunities AI brings to the medical field. Consequently, pharmaceutical regulatory authorities in many countries have provided administrative support for AI's entry into healthcare. The U.S. Food and Drug Administration (FDA) has also expedited the approval of numerous AI products, especially those involving machine learning. Concurrently, the costs of using specific AI systems for medical image diagnosis have been included in the health insurance coverage of some countries, allowing these expenses to be reimbursed and settled through medical insurance, thereby promoting AI application in clinical settings. However, concerning the current application of AI in the medical field, limitations arise due to the scarcity of large datasets available for pre-training machine learning models, or because the large pixel dimensions of images produced by medical equipment result in data sizes that typical AI neural networks cannot accommodate. Generally, when AI neural networks process medical images, the required memory increases with the model's complexity and the number of input pixels; many images may exceed the memory capacity of current AI neural networks. Even if we upgrade the memory and equipment of AI neural networks to accommodate large-sized medical images, another factor affecting medical data training is the absence or inadequacy of supervised learning. Unlike other large pre-training corpora, medical datasets demand accuracy in clinical data, thus requiring intensive supervised learning. The common practice is to have medical experts manually provide labels for this supervised learning. The drawback of this method is that if the dataset is extensive, medical experts have limited time, or hired personnel lack sufficient medical expertise, it can very likely affect the quality of these labels. To improve pre-training efficiency, another possible practice is to outsource or crowdsource the manual labeling to non-professionals. Consequently, the accuracy of the labels is reduced. During this crowdsourcing process, a series of privacy-related issues may also arise. Even if we use other AI model applications to provide labels for supervised learning, these labels still risk containing noise. The limitations imposed by these datasets can all lead to "hallucination" problems in the output. Researchers have reported concerns about "single-source bias." When datasets are all generated by a single system—for example, if all medical images come from one specific medical device with fixed settings—the model can easily detect background parametric knowledge related to the input content when analyzing the dataset. If the AI neural network sets this parametric knowledge as universal or default values, then when faced with datasets from other sources, the AI's machine learning performance will be affected. In its output, it might generalize the single-source parametric knowledge adopted during pre-training to data from other sources, thereby forming "hallucinations."
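The single-source bias described above can be sketched with invented intensity statistics for two hypothetical imaging devices; the point is only that preprocessing parameters absorbed from one source behave as if they were universal.

```python
# Schematic sketch of single-source bias (all numbers invented): intensity
# statistics estimated from one scanner are baked into preprocessing as if
# they were universal, so images from a second scanner are mis-normalised
# before the model ever sees them.
import numpy as np

rng = np.random.default_rng(0)

# Device A: the only source represented in the training set.
device_a = rng.normal(loc=100.0, scale=10.0, size=1000)
# Device B: same anatomy, different acquisition settings.
device_b = rng.normal(loc=160.0, scale=25.0, size=1000)

# "Parametric knowledge" absorbed during training: device A's statistics
# are treated as defaults for every future input.
mu, sigma = device_a.mean(), device_a.std()

def normalise(x):
    return (x - mu) / sigma

for name, data in (("device A", device_a), ("device B", device_b)):
    z = normalise(data)
    print(f"{name}: normalised mean {z.mean():+.2f}, std {z.std():.2f}")
# Device B lands far outside the range seen in training, so outputs on such
# inputs are effectively extrapolation -- fertile ground for hallucination.
```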
This ethical issue can also be discussed within the theoretical framework of Deleuze and Guattari's "materialist psychiatry." Deleuze and Guattari emphasize that "materialist psychiatry" must discern the mechanisms of social production and desiring-production behind mental illness. This emphasis on mechanisms also inspires us to investigate the generative mechanisms of AI "hallucination." In any analysis of mental illness, we need to consider the importance of the mechanisms of social production and desiring-production. This focus on the dynamic interplay of forces shaping individual and collective subjectivity provides a valuable framework for understanding AI "hallucination." Just as mental illness is not merely a product of individual pathology but an expression of broader social and desiring forces, AI "hallucination" can also be seen as a symptom of the specific conditions under which it is produced. The datasets used to train AI models, the algorithms that control their operation, and the purposes for which these models are designed all contribute to the generation of these "hallucinations." In this sense, AI "hallucinations" can be viewed as manifestations of biases, limitations, or even "subconscious desires" rooted in the technological and social systems that produce AI. This perspective challenges the common tendency to view AI "hallucination" as a purely technical problem—an error fixable merely by improving algorithms or using more extensive training data. Instead, it prompts us to consider the broader social, cultural, and even political forces that influence the development and deployment of AI technology. By focusing on these productive mechanisms, we can gain a deeper understanding of the nature of AI "hallucination" and its impact on an increasingly technologized world. The underlying logic of this generative mechanism reflects what Deleuze and Guattari termed "false materialism." From an epistemological perspective, this generative mechanism places conceptual systems above material reality, where concepts are disguised as the "essential attributes" of matter, leading abstract concepts to form a closed loop of "self-reference," becoming an epistemological trap detached from reality. The working principle of AI deep learning systems is to abstract massive training data into statistical distributions in high-dimensional spaces and employ various algorithms to construct a conceptual model isomorphic with the real world. However, this modeling process is essentially a symbolic and violent severing of material reality—the system discretizes continuous sensory experience into feature vectors and simplifies dynamic material interactions into parameter updates. The resulting knowledge system, though formally self-consistent, always maintains a structural rupture with the material substratum of the real world. The phenomenon of AI "hallucination" is precisely a symptomatic manifestation of this epistemological rupture. When the system regards text generation as probabilistic sampling in a latent space, or understands visual creation as matrix operations for style transfer, the correspondence of its output with material reality has already been thoroughly mediated by the algorithmic "black box." 
This process of mediation forms a specular, mutually "mapping" relationship with the mechanisms of human cognitive illusion: just as human consciousness constructs coherent perceptual experiences from neuro-electrical signals, AI systems transform weight parameters into seemingly reasonable semantic outputs. The similarity between the two lies not in output deviations at the phenomenal level, but in a shared, fundamental separation between the symbolic system and the material substratum. The finitude of training data, the inductive biases of algorithmic architectures, and the physical constraints of computational resources collectively constitute the a priori framework of AI systems, determining the possible boundaries of their cognitive horizon. Similarly, a Rawlsian perspective, emphasizing fairness and justice, prompts us to consider the ethical significance of AI "hallucination." If AI systems are deployed in fields such as healthcare, criminal justice, and education, their outputs can have profound impacts on individuals and society as a whole. We must therefore address the issue that these systems might perpetuate or even exacerbate existing inequalities. A Rawlsian approach demands that we prioritize the needs of the most vulnerable groups and ensure that the development and deployment of AI technologies promote justice and fairness for all. By focusing on the broader social, cultural, and political forces influencing AI development and deployment, we can transcend a purely technical understanding of this phenomenon and investigate its ethical and social impacts. This, in turn, enables us to harness the transformative potential of AI while mitigating its risks and ensuring that its benefits are shared equitably among all members of society.
With the increasingly widespread application of generative AI, the ethical issues raised by AI have garnered extensive interest from researchers. Drawing upon Sartre's existentialist framework, relevant existentialist assumptions associated with computational psychiatry, and Deleuze and Guattari's discussions of "materialist psychiatry," this paper, proceeding from the mapping relation between input and output, has focused on discussing the generative factors of AI "hallucination" and its ethical risks in practical applications, especially within the medical field. The concern for the ethical risks of AI stems from the hope that, in developing and applying AI—particularly products related to generative general artificial intelligence—we can guide AI along a trajectory that aligns with human values and enhances the well-being of human society.