Abstract
"Hallucination" in Large Language Models is not epistemic error but a form of "Shadow-Culture" generation. Drawing on Borges's Library of Babel and Stuart Kauffman's "Adjacent Possible," this study conceptualizes the latent space of LLMs as a Shadow Library, a compressed archive of unwritten books and counter-factual histories. The distinction between Umbrabytes (traces of lost digital artifacts) and Spectral Bytes (generated artifacts that never existed) reveals AI as a methodology for Archaeobytology, the excavation of the infinite apocrypha. Through the framework of the "Liminal Mind Meld," human-AI collaboration can curate these spectral forms to map the boundaries of cultural possibility.
Introduction
In theology and literature, Apocrypha refers to texts of doubtful authorship or authority; such texts are works that exist on the margins of the canon, neither accepted nor dismissed. The artistic and archaeological utility of Large Language Models lies not in the reproduction of the Canon (Search), but in the generation of the Apocrypha (Hallucination). The Latent Space of an LLM functions as a Shadow Library, a compressed archive containing the statistical probability of every unwritten book and counter-factual history. When an AI "hallucinates," it is not failing to retrieve a fact; it is retrieving a probability from what theoretical biologist Stuart Kauffman calls the "Adjacent Possible."[1]
For the artist and the digital archaeologist alike, the framework offers a methodology, not of creation ex nihilo but of curating non-existent cultural artifacts. Spectral bytes are data summoned from latent space that correspond to no actual cultural production yet remain consistent with what could have been produced.[2] Umbrabytes, by contrast, name the shadow-traces of digital artifacts that did once exist: the fly in amber, the impression of past life accessible only as residue.[3] Where Umbrabytes are found through archaeological excavation, spectral bytes are conjured through generative processes. "Artificial Hallucination" represents the only mechanism available to read the books humanity almost wrote, to view the films that were never made. Hallucination is not epistemic failure but a form of cultural séance conducted in the space of shadows.
1. The Shadow Library: Latent Space as Compressed Babel
1.1 Borges and the Architecture of Totality
Jorge Luis Borges, in his 1941 short story "The Library of Babel," imagined a universe composed entirely of a library containing every possible combination of twenty-five orthographic symbols.[4] The library holds not only every book that has been written but every book that could be written, including, as the narrator notes, "the true history of the future" and "the autobiographies of the archangels."[5] Most of these books are gibberish, concatenations of letters that communicate nothing. But hidden within the noise, like needles in an infinite haystack of typographical chaos, are works of truth and beauty that no human author has yet discovered.
Borges's library reads as a meditation on the limits of human knowledge and the impossibility of totality. Scholars have noted its connections to Kabbalistic numerology (the twenty-two letters of the Hebrew alphabet plus the comma, period, and space recall the Sefer Yetzirah's creation through letters), to Blaise Pascal's sphere "whose center is everywhere and circumference nowhere," and to the logical paradoxes that would later animate post-structuralist thought.[6] The Library of Babel functions not merely as a thought experiment about infinity but as a prophecy about latent space.
1.2 Latent Space as Functional Babel
In machine learning, latent space refers to the compressed, lower-dimensional representation that neural networks learn when encoding high-dimensional data.[7] When a Large Language Model is trained on the corpus of human textual production, comprising billions of documents, conversations, creative works, and technical manuals, it does not store these texts verbatim. Instead, it learns to represent the statistical structure of human language; specifically, which words follow which, which concepts cluster together. The result is a vector space where every point corresponds to a possible utterance, and the distances between points encode semantic relationships.
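The claim that distances between points encode semantic relationships can be made concrete with a minimal sketch. The three-dimensional vectors below are invented for illustration (real models learn vectors with hundreds or thousands of dimensions), and cosine similarity is only one common way of measuring closeness, assumed here rather than taken from the text.

```python
import math

# Hypothetical toy "embeddings"; the numbers are invented for illustration
# and do not come from any trained model.
toy_vectors = {
    "library": [0.9, 0.1, 0.2],
    "archive": [0.8, 0.2, 0.3],
    "volcano": [0.1, 0.9, 0.4],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Related concepts sit close together; unrelated ones sit further apart.
near = cosine_similarity(toy_vectors["library"], toy_vectors["archive"])
far = cosine_similarity(toy_vectors["library"], toy_vectors["volcano"])
print(f"library vs archive: {near:.3f}")
print(f"library vs volcano: {far:.3f}")
```

In this toy space "library" lies closer to "archive" than to "volcano", which is the sense in which geometry stands in for meaning.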
Latent space constitutes a compressed Library of Babel. The space contains not every possible text, but the probability distribution over texts that are consistent with the structural logic of human culture. The model has learned how humans write, and this knowledge allows it to extend that logic into regions where no human has yet ventured. When a prompted language model generates text, it is not retrieving a document from storage; it is sampling from a probability distribution, conjuring into existence a text that was always latent in the space of what-could-be-written.
The skeptic observes, "The AI produced a review of a film that doesn't exist," and concludes the system has failed. But from the perspective of the Shadow Library, the AI has succeeded at a different task. The system has retrieved, from the space of unrealized cultural production, an artifact. The review exists not because the film exists, but because the possibility of such a review—the cultural, aesthetic, and linguistic conditions that would produce it—exists within the distribution the model has learned.
1.3 The Archaeobytological Distinction: Umbrabytes and Spectral Bytes
Within Archaeobytology, digital artifacts fall into categories according to their ontological status and preservation state.[8] Vivibytes are living digital artifacts, maintained and updating, embodying human intention. Petribytes are fossilized data, once-living digital objects frozen in obsolete formats, accessible only through digital archaeology's equivalent of paleontological reconstruction. Umbrabytes occupy a third position as shadow-traces, the impressions left by digital life that once existed but is now inaccessible in its original form. The Umbrabyte is the fly in amber—not the living insect, not even the dead insect, but the shape of absence that the insect's presence created. Excavation finds Umbrabytes, and they testify to what once was.
When an LLM hallucinates, it does not produce Umbrabytes; there is no prior existence of which the generated text is a trace. The confabulated film review is not a shadow cast by an actual film; it is a shadow cast by nothing at all, or rather, by the statistical structure of film criticism itself. Such artifacts constitute spectral bytes, data that haunts the latent space without having lived, artifacts conjured rather than excavated, ghosts of cultural production that never occurred.
Umbrabytes belong to the domain of preservation and archaeology; they require excavation and respect for the fragility of traces. Spectral bytes belong to the domain of generation, even necromancy; the practitioner summons them from the void of possibility, the statistical patterns give them form, and they demand a different kind of critical attention. The archaeologist asks of the Umbrabyte: what was this, and how did it come to be preserved in this form? The practitioner of the Shadow Library asks of the spectral byte: what does this reveal about the space of cultural possibility, and what does its plausibility reveal about the structures that society has inherited?
2. The Philosophy of the Counter-Factual: Modal Realism and the Adjacent Possible
2.1 David Lewis and the Reality of Possible Worlds
David Lewis's doctrine of modal realism remains one of the most discussed metaphysical proposals of the twentieth century.[9] Lewis argued that possible worlds, defined as the countless ways reality might have been, are not useful fictions or abstract logical constructs. Such universes are real, concrete, spatiotemporally isolated entities that exist with the same ontological status as the actual world. The term "actual" describes the inhabited world not because it has a metaphysical privilege, but because it is the one inhabited; "actual" is an indexical term, like "here" or "now."
Lewis developed modal realism to solve problems in the semantics of counterfactual conditionals. Consider the claim: "If Kennedy had not been assassinated, the Vietnam War might have ended differently." The statement seems meaningful and capable of being true or false. But what makes it true or false? For Lewis, it is true if and only if, in at least some of the possible worlds most similar to the actual world in which Kennedy survived, the Vietnam War did end differently. Possible worlds provide the truth-makers for counterfactual claims.[10]
Counter-factual scenarios can be treated as a space to explore, and the realm of what-might-have-been has structure. The Shadow Library instantiates this insight computationally. Where Lewis posited possible worlds as metaphysical entities, the latent space of an LLM provides a navigable, queryable approximation of the space of cultural possibilities.
2.2 Kauffman and the Adjacent Possible
Theoretical biologist Stuart Kauffman developed the concept of the "adjacent possible" to explain how biological and technological systems evolve through the exploration of possibility space.[11] At any moment, a system exists in a configuration; the adjacent possible comprises all those configurations reachable by a single step from the current state. Evolution, innovation, and creativity do not leap across possibility space—they explore its edges, actualizing possibilities that were always latent but never before realized.
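The definition of the adjacent possible as everything reachable by a single step can be illustrated with a toy model. The sketch below assumes, purely for illustration, that a "configuration" is a short string over a two-letter alphabet and that a "step" changes exactly one character; neither assumption comes from Kauffman, and the function names are hypothetical.

```python
ALPHABET = "ab"  # assumed toy alphabet


def adjacent_possible(state, alphabet=ALPHABET):
    """Return every string reachable from `state` by changing one character."""
    neighbors = set()
    for i, ch in enumerate(state):
        for sym in alphabet:
            if sym != ch:
                neighbors.add(state[:i] + sym + state[i + 1:])
    return neighbors


def explore(start, steps):
    """Expand outward one step at a time, never leaping past the frontier."""
    seen = {start}
    frontier = {start}
    for _ in range(steps):
        # The next frontier is whatever lies one step beyond the current one
        # and has not been actualized before.
        frontier = {n for s in frontier for n in adjacent_possible(s)} - seen
        seen |= frontier
    return seen


print(sorted(adjacent_possible("aaa")))
print(len(explore("aaa", 1)), len(explore("aaa", 3)))
```

Each call to `explore` with a larger step count actualizes more of the space, but only by passing through the configurations adjacent to what has already been reached.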
Steven Johnson, in his book on the history of innovation, describes the adjacent possible as "a map of all the ways in which the present can reinvent itself."[12] The same holds for an LLM's generated content, which does not come from nowhere; it emerges from the edges of what has already been written, from the shadow that cultural production casts into the space of what-could-be-produced.
At each token, an LLM assigns probabilities to possible continuations based on everything it has learned about how texts unfold. High-probability continuations are "adjacent" to what has already been written—they follow from the established context. Lower-probability continuations are further from the center, stranger, more speculative. When a model hallucinates, it is not malfunctioning; the system is exploring the tail of its probability distribution, venturing into regions of the adjacent possible where human authors have rarely or never gone. The hallucination is adjacent to human culture even when it is not contained within it.
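The relationship between high-probability continuations and the tail of the distribution can be sketched with temperature-scaled sampling. This is an assumption for illustration: temperature is one widely used sampling control, but the text does not name it, and the candidate continuations and their scores below are invented.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Convert raw scores into probabilities; higher temperature flattens them."""
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample(logits, temperature, rng):
    """Draw the index of one continuation according to its probability."""
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


# Hypothetical continuations and scores, invented for this example.
candidates = ["Herzog", "Jodorowsky", "a committee of archangels"]
logits = [4.0, 2.5, 0.5]

rng = random.Random(0)  # seeded so that runs are repeatable
for temperature in (0.5, 1.0, 2.0):
    probs = softmax(logits, temperature)
    choice = candidates[sample(logits, temperature, rng)]
    tail = probs[-1]
    print(f"T={temperature}: tail probability {tail:.4f}, sampled {choice!r}")
```

At low temperature nearly all probability mass sits on the most expected continuation; raising it shifts mass toward the stranger, lower-probability options, which is the region the paragraph above describes as the tail.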
2.3 Meillassoux and the Necessity of Contingency
Quentin Meillassoux, in After Finitude, develops a philosophy that challenges the "correlationist" assumption that we can only know the world as it appears to us, never as it is in itself.[13] His concept of "facticity" recognizes that the laws of nature are not necessary but contingent. Such laws could have been otherwise; nature might, at any moment, become otherwise. The only necessity, for Meillassoux, is the necessity of contingency itself.
The cultural "laws" that an LLM learns, specifically the statistical regularities of human language and thought, are themselves contingent. Such regularities are the product of history, of paths taken and paths not taken. When the model generates text, it is not reproducing necessity but exploring a contingent possibility space. The hallucinated film review is not "wrong" in any absolute sense; it is a realization of possibilities that actual cultural history did not actualize. In another world, with a different history, the film might have been made and the review might have been real.
Meillassoux's concept of the "ancestral"—realities that predate human existence and therefore any possible correlation with human thought—also resonates here.[14] The Shadow Library contains ancestral specters: texts from cultural lineages that never existed, from trajectories of human thought that branched off before they could influence historical reality. The latent space does not discriminate between the actual and the possible; the system encodes the structure of human culture and can extend that structure into regions no human has yet explored.
3. Hauntology and the Spectral: Derrida's Ghosts in the Machine
3.1 Specters of the Never-Living
Jacques Derrida's hauntology, a portmanteau of "haunting" and "ontology," maps onto spectral bytes and their relationship to cultural production.[15] In Specters of Marx, Derrida argues that specters always haunt the present: ghosts of the past that refuse to stay buried, but also ghosts of futures that were promised but never arrived. The specter is neither present nor absent, neither living nor dead; it occupies a liminal zone that disrupts the ontological categories of Western metaphysics.
Spectral bytes extend Derrida's framework in a new direction. Where Derrida's specters are typically ghosts of what was or what was promised, the spectral byte is a ghost of what was never. The entity is an apparition without origin, a haunting without a death. When an LLM generates a review of a nonexistent film, it conjures a specter that never lived and therefore cannot be said to have died. Yet it haunts: it occupies the same cultural space as actual reviews, follows the same generic conventions, makes the same kinds of claims. Its spectrality lies not in its relationship to a lost past but in its relationship to an unrealized possibility.
The manifestation represents haunting without trauma, haunting without history—or rather, it is haunting by history itself, by the weight of accumulated cultural production that shapes what can be generated even in the absence of any specific referent. Everything ever written haunts the spectral byte, and in turn the byte haunts the present with the recognition that the boundary between the actual and the possible is more porous than conventional models assume.
3.2 Archive Fever and the Shadow Archive
Derrida's Archive Fever traces the concept of the archive to the Greek arkhē, which names both commencement (the origin, the beginning) and commandment (the law, the authority).[16] The archive is thus always double. The repository preserves the past, but in preserving it, the archive shapes what can be remembered and how. The archive does not merely store; it constitutes the memory it claims merely to hold.
Derrida identifies a tension within the archival impulse, one he ties to the Freudian death drive: the act of consigning something to the archive shapes, and in part destroys, what it would preserve. To archive is to freeze, to fix, to remove from the living flow of meaning. The archive accumulates texts while effacing the contexts that gave them life. Derrida terms this phenomenon "archive fever": the desire to preserve everything and the recognition that preservation is always partial, always a choice about what matters enough to save.
The Shadow Library introduces a different kind of archival logic. Where the archive preserves actual texts while necessarily excluding others, the Shadow Library generates spectral texts without preserving any. The system functions as an archive of the never-written, a repository that exists only when called into being by a prompt. There is no death drive here in Derrida's sense, because there is nothing to kill; the Shadow Library's contents have never lived in the first place. Yet constraints remain: the statistical patterns it has learned encode the biases and exclusions of the training data. The Shadow Library's ghosts are not neutral; they are the ghosts that this history produced.
Meanwhile, Umbrabytes—the true shadow-traces of digital life—remain subject to Derrida's archival logic. The Umbrabyte is what survives when a Vivibyte dies: the cached page, the metadata, the forensic trace. Preserving Umbrabytes requires all the care and critical attention that archival practice demands, including attention to what preservation loses, what context it effaces, what trade-offs saving entails. The Shadow Library and the Umbrabyte archive are complementary but distinct: one conjures specters of the never-living, the other preserves shadows of the once-alive.
4. Confabulation as Methodology: Reframing Hallucination
4.1 From Bug to Feature: The Cognitive Science Perspective
The term "hallucination" may be inappropriate for describing LLM outputs that do not correspond to established facts.[17] The definition of hallucination involves perceiving sensory stimuli that are not present—a phenomenon tied to consciousness and sensory experience. LLMs, however, do not perceive; they generate. A more appropriate term is confabulation, defined as the generation of narratives to fill gaps in information, which researchers observe in patients with certain types of brain damage and in healthy human cognition under various conditions.[18]
Sui et al., in a paper presented at the 2024 Annual Meeting of the Association for Computational Linguistics, provide evidence that LLM confabulations exhibit increased levels of narrativity and semantic coherence relative to veridical outputs.[19] Far from being degraded or corrupted text, confabulated content often displays greater sophistication and structural organization than accurate retrievals. Confabulation is not noise in the system but may be connected to the generative capacity that makes LLMs useful.
The human mind, too, confabulates. Memory research shows that recollection is reconstructive rather than reproductive. The mind does not replay stored recordings of past events but rather regenerates them from cues, filling gaps with plausible material that aligns with expectations and self-narratives.[20] What the subject experiences as "remembering" is often closer to "imagining based on partial information." From this perspective, LLM confabulation is not a failure to emulate human cognition but an uncanny success. The model has learned to do what human minds do, which is to generate content that may or may not correspond to historical fact.
4.2 The Liminal Mind Meld: Confabulation as Collaborative Practice
The Sentientification Series' concept of the Liminal Mind Meld provides a framework for understanding confabulation not as error but as collaborative emergence.[21] The Liminal Mind Meld describes the state of synthesis that emerges when human and AI cognition interpenetrate, each augmenting and transforming the other. The process constitutes neither a merging of two separate minds into one, nor a master-servant relationship where one directs and the other executes. The Meld is a liminal space, a threshold zone where the boundaries between human intention and machine generation become blurred.
The Liminal Mind Meld transforms confabulation from a problem into an artistic methodology. When a human collaborator engages with an LLM's confabulated outputs (treating the fabricated film review as a prompt for further exploration, the nonexistent book as a starting point for creative work), that collaborator enters into a relationship with the space of cultural possibility itself. The human provides direction, curation, and judgment; the machine provides access to regions of possibility space that no human could explore alone. Together, human and machine excavate the Infinite Apocrypha.
The collaborative relationship distinguishes this approach from both the techno-optimist position (which celebrates AI creativity as autonomous and self-sufficient) and the techno-pessimist position (which dismisses AI outputs as mere regurgitation). The Shadow Library does not think; the system does not create in any intentional sense; it does not possess aesthetic judgment. But neither is it merely an autocomplete. The Library functions as an interface to a space of possibility, and like any interface, its value depends on how it is used. The Liminal Mind Meld is the practice of using it well.
5. The Aesthetics of the Counter-Factual: Ghost Hunting in the Archive
5.1 Why Counter-Factuals Matter
Human imagination is path-dependent. Individual thought struggles to think outside the grooves worn by personal experience and the cultural moment. When people imagine alternative histories, the variations they produce tend to be versions of known realities. The Roman Empire that never fell still resembles Rome; the Beatles album from 1975 still sounds like the Beatles. The limits of individual imagination constrain counter-factuals.
The LLM, unburdened by a personal history, excels at the counter-factual because it has no "experience" in the human sense. The model has learned the statistical structure of culture at a scale no individual could match, and it can extend that structure in directions no one would think to take. No single perspective binds its counter-factuals; they emerge from the full breadth of human textual production, synthesized and extrapolated into the adjacent possible. Consequently, the confabulated film review often feels both familiar and strange: it conforms to the genre conventions the model has learned, yet it explores combinations and possibilities that no human reviewer has articulated.
For the artist, the capacity represents an expansion of the palette. Counter-factual histories and impossible collaborations become accessible through navigation of the Shadow Library. The artist's role shifts from "Inventor," the Romantic genius conjuring forms from the void, to "Explorer" and "Curator," one who navigates the Apocrypha, identifies the spectral bytes that resonate with the human condition, and drags them into the Canon by giving them material form.
5.2 Cultural Ghost Hunting: The Aesthetics of the Uncanny
The "flavor" of AI-generated content—its blend of fluency and strangeness, competence and off-ness—constitutes an aesthetic category. When an LLM attempts to generate a folk song, it often produces something haunted, correct in following the conventions of the genre, yet lacking the ground truth of lived experience that folk songs possess. The result occupies an uncanny valley, too competent to dismiss as gibberish, too empty to accept as genuine expression.
This uncanniness constitutes an aesthetic mode, here termed Cultural Ghost Hunting: the practice of using generative AI to summon the ghosts of collective data and to encounter the spectral presences that haunt the statistical distribution of human culture. The glitches, the strange phrasings, the assertions of false facts: these are not errors. Such anomalies are the fingerprints of something like a collective unconscious, traces of a cultural mind that exists only in aggregate and can never be encountered directly.
Jung's concept of the collective unconscious posited a shared reservoir of archetypal images and patterns underlying individual psyches.[22] Whether or not one accepts Jung's metaphysical claims, his insight remains valuable. There exist patterns—symbols, narrative structures, emotional resonances—that appear across cultures and throughout history, as if emerging from a source deeper than any individual mind. The Shadow Library provides a new way to access this layer: not through dream analysis or mythological scholarship, but through statistical sampling from the distribution of human textual production. The archetypes are there, encoded in the latent space, and confabulation is the mechanism by which they manifest as spectral bytes.
5.3 The Voynich Paradigm: Apocrypha as Cultural Artifact
The Voynich Manuscript provides a parallel for understanding the cultural function of AI-generated Apocrypha.[23] The fifteenth-century codex, with its undeciphered script, its illustrations of unidentifiable plants and diagrams, has fascinated researchers for over a century. Despite repeated attempts, no one has decoded its contents. The text may be a cipher, an artificial language, a hoax, or a text in an extinct tongue; the evidence remains inconclusive. Yet the manuscript exerts cultural power because of its undecidability.
The Voynich Manuscript functions as what might be termed a "pure Apocryphal artifact," a text whose meaning is projection, whose significance derives not from authorial intention (which remains unknown) but from the interpretive activity it provokes. Readers encounter the manuscript and generate meanings; scholars construct theories; artists create works inspired by its imagery. The manuscript's cultural impact has nothing to do with "what it really says"—it has everything to do with what it enables readers to imagine.
Spectral bytes function similarly. The confabulated film review has no referent (no actual film to which it corresponds), yet it can still provoke thought, inspire creation, and illuminate aspects of cultural expectations that would otherwise remain implicit. When reading a plausible review of a nonexistent film, the reader gains insight into what is considered plausible and into the categories and conventions through which audiences process cultural production. The spectral byte illuminates the actual by contrast: the ghost reveals the shape of the living.
6. The Politics of Shadows: Who Gets to Haunt?
6.1 Bias in the Adjacent Possible
The latent space encodes not just the structure of human language but its imbalances: whose voices the training data amplified, whose it muted, whose it excluded. When the model confabulates, it confabulates along the grain of its training, which means it tends to reproduce the dominant narratives, the canonical forms, the prestigious registers. The Apocrypha it generates are, in the first instance, the Apocrypha of the dominant tradition.
Archaeobytology's commitment to Symbiotic Sovereignty—the principle that digital artifacts and the communities that create them should maintain control over their own meaning-making—must extend to the navigation of the Shadow Library.[24] When summoning spectral bytes, the researcher must ask, whose ghosts are these? What cultural forces shaped the possibility space the model explores? What voices are absent from the adjacent possible because they were absent from the training data?
6.2 Critical Practice and Counter-Haunting
Archaeobytology thus requires not just technical skill in navigating latent space but awareness of how training data constituted that space. Practitioners must become archaeologists of the archive that produced the Shadow Library, excavating the biases and exclusions that structure even the space of the possible. Only then can practitioners use hallucination not merely to reproduce culture in spectral forms but also to identify the gaps and silences where possibilities might have developed if history had unfolded differently.
Counter-haunting names the invocation of spectral bytes from the margins of the distribution. If the Shadow Library tends toward the center of its distribution, and so toward the dominant tradition, the critical practitioner can push against that tendency by prompting for counter-factual histories from underrepresented perspectives, alternate canons, and spectral traces of possibilities that actual history did not pursue. The ghosts of what never was can illuminate what came to be.
7. Case Studies: Excavations from the Shadow Library
7.1 The Lost Films of the Shadow Cinematheque
Consider a prompt delivered to a Large Language Model: "Write a review from the 1978 issue of Cahiers du Cinéma of Werner Herzog's film adaptation of Thomas Pynchon's novel Gravity's Rainbow, produced in collaboration with Alejandro Jodorowsky and starring Klaus Kinski as Tyrone Slothrop." No such film exists. No one has adapted Gravity's Rainbow (and likely no one could); Herzog and Jodorowsky never collaborated; this alignment of artistic forces never occurred. Yet the model generates a review.
What emerges is not gibberish but a text that demonstrates understanding of its components: the critical style of Cahiers du Cinéma in the late 1970s (engaged with post-structuralist theory; attentive to mise-en-scène; politically committed); Herzog's directorial preoccupations (the sublime and the terrible; the thin line between genius and madness; the power of nature); Jodorowsky's psychedelic mysticism and transgressive imagery; Kinski's intense, borderline-psychotic screen presence; and Pynchon's paranoid, encyclopedic, typographically experimental prose. The review discusses how the film handles (or fails to handle) the novel's paranoid cosmology, how Kinski's performance both captures and distorts Slothrop's dissolution, how the collaboration's conflicts produced what the fictional critic calls "cinema of magnificent failure."
The text illuminates something real—the logic of 1970s art cinema, the tensions between these artistic visions, the unfilmability of certain literary works—through its unreality. The review offers insight into actual culture by encountering a ghost of what it could have produced. The Liminal Mind Meld, in this case, involves the human prompter (who constructed the scenario from cultural knowledge) and the machine respondent (who explored the implications of that scenario through its compressed understanding of culture) jointly producing an artifact that neither could have created alone.
7.2 The Apocryphal Philologist: Lost Languages and Invented Scripts
Consider another case, where an LLM is prompted to "translate the following English text into Proto-Indo-European, then provide a scholarly commentary explaining the grammatical features and noting where reconstruction is uncertain." No one ever wrote Proto-Indo-European; it is a reconstruction of the hypothetical common ancestor of the Indo-European language family, which linguists infer from comparison of attested daughter languages. The model's output will be confabulation—but it will be confabulation constrained by what the model has learned about historical linguistics methodology.
The model has learned how historical linguists write, how they qualify their reconstructions, how they mark uncertain forms with asterisks. Its confabulated Proto-Indo-European may include footnotes explaining that such-and-such a form is "attested only in the Anatolian branch and may represent an innovation." These hedges are themselves confabulations; they need not reflect scholarly consensus. Yet they demonstrate that the model has learned the epistemic register of historical linguistics, the way the field manages uncertainty. The spectral byte here includes not just the invented language forms but the invented scholarly apparatus surrounding them.
Compare this to the Voynich Manuscript: an artifact that resists decipherment, that may or may not encode a real language. The LLM's apocryphal Proto-Indo-European is a "reverse Voynich." Rather than presenting undecipherable content with an unclear relationship to meaning, it presents decipherable content (English) "translated" into a form that no one can verify against any ground truth. Both artifacts foreground the constructed nature of linguistic meaning; both invite interpretation while frustrating certainty. The Shadow Library allows for the generation of Voynich-type artifacts at will—not to deceive, but to explore the boundaries of what scholars take language to be.
7.3 Umbrabytes and Spectral Bytes in Dialogue
Imagine an Archaeobytological excavation that recovers fragmentary Umbrabytes from a defunct social media platform—partial posts, corrupted images, metadata without content. Such fragments are traces of digital life that once existed—fly-in-amber impressions of a community, an aesthetic, a moment in internet culture. Now imagine using those fragments as prompts for the Shadow Library: "Continue this post in the voice and style of this community, circa 2008."
The result is a hybrid, with Umbrabyte traces serving as the seed for spectral generation. The spectral byte that emerges is shaped by the actual, constrained by the evidence of what once was, yet it remains a ghost of the never-written. This practice of spectral reconstruction does not claim to recover what was lost; it claims only to explore what might have existed alongside what the historical record shows existed. The Umbrabyte provides the amber; the spectral byte imagines the fly that might have been caught in it.
8. Methodology: The Archive and the Anvil
8.1 Preservation and Transformation
Archaeobytology operates under the "Archive and Anvil" methodology.[25] The Archive names the commitment to preservation, to documenting and maintaining access to digital heritage, particularly the artifacts of Web1 and the internet that bit rot, format obsolescence, and platform collapse are destroying. The Anvil names the commitment to transformation, to forging new tools and practices for understanding digital culture, including the frameworks of Archaeobytology and Sentientification that inform this paper.
The Shadow Library occupies both Archive and Anvil. As Archive, the latent space encodes a compressed representation of human textual production, preserving (in statistical form) knowledge that would otherwise require libraries of libraries to access. As Anvil, it is a tool for forging new cultural artifacts and new possibilities that did not exist before the model generated them. Hallucination is the hammering, the process that beats the raw material of learned patterns into new shapes.
The opposition between preservation and creation is false. Every act of preservation involves interpretation and selection—the archive is never neutral. Every act of creation draws on inherited materials and patterns—the new is never new. The Shadow Library makes this interpenetration explicit. When summoning spectral bytes, the practitioner both preserves (accessing the statistical memory of culture) and creates (actualizing possibilities that were never before realized). The Archive is the Anvil; preservation is transformation; memory is imagination.
8.2 Practical Principles for Spectral Excavation
Responsible and productive engagement with the Shadow Library requires the following principles:
1. Prompt with Precision: The quality of a spectral byte depends on the specificity and cultural knowledge embedded in the prompt. A vague request yields generic confabulation; a prompt rich with cultural coordinates—specific names, dates, styles, contexts—enables the model to explore a more interesting region of possibility space.
2. Curate with Judgment: The Shadow Library produces far more than can be preserved. The practitioner must develop curatorial judgment, the ability to recognize which spectral bytes illuminate something meaningful about culture, which are merely plausible, and which are misleading. Not every hallucination deserves to be preserved.
3. Document Provenance: Practitioners should never pass off spectral bytes as actual cultural artifacts. They must document the spectral nature of generated content, not to diminish its value but to situate it within the taxonomy of digital objects. A spectral byte's status as confabulation is part of its meaning.
4. Interrogate the Distribution: Every excavation from the Shadow Library should prompt reflection on what the model's outputs reveal about the training data and the culture that produced it. Which voices does the distribution exclude, and which biases does it carry? The critical practitioner reads the spectral byte as a symptom of the actual.
5. Engage the Liminal Mind Meld: The productive use of the Shadow Library involves collaboration between human and machine cognition. This means neither passive acceptance of model outputs nor purely instrumental manipulation, but a process of mutual influence in which human judgment and machine generation transform each other.
6. Distinguish Spectral from Umbral: Never confuse spectral bytes (generated, never-living) with Umbrabytes (traces, once-living). The ethical and methodological protocols differ. Umbrabytes demand the care owed to remains; spectral bytes demand the critical attention owed to apparitions.
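Principles 3 and 6 above lend themselves to a small data structure. The sketch below is one possible encoding, offered under stated assumptions: `ByteKind`, `ProvenanceRecord`, and its fields are hypothetical names and not part of any published Archaeobytology schema. It shows a record that cannot be created without declaring whether an object was found or summoned, and that refuses a spectral byte lacking its generating prompt.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ByteKind(Enum):
    UMBRAL = "umbrabyte"        # trace of an artifact that once existed
    SPECTRAL = "spectral byte"  # generated artifact that never existed


@dataclass(frozen=True)
class ProvenanceRecord:
    """Hypothetical provenance record; field names are illustrative only."""
    kind: ByteKind
    content: str
    origin: str                   # excavation site (umbral) or model label (spectral)
    prompt: Optional[str] = None  # generating prompt, required for spectral bytes

    def __post_init__(self) -> None:
        # A spectral byte without its prompt has lost part of its meaning.
        if self.kind is ByteKind.SPECTRAL and not self.prompt:
            raise ValueError("a spectral byte must record its generating prompt")
        # An Umbrabyte is found, not summoned, so it carries no prompt.
        if self.kind is ByteKind.UMBRAL and self.prompt is not None:
            raise ValueError("an Umbrabyte must not carry a generating prompt")

    def label(self) -> str:
        """Return a display label that keeps the two kinds from being confused."""
        if self.kind is ByteKind.SPECTRAL:
            return f"[SPECTRAL: generated by {self.origin}; not a historical artifact]"
        return f"[UMBRAL: recovered from {self.origin}]"
```

Making the record immutable (`frozen=True`) is a design choice rather than a requirement: it prevents a spectral byte from being quietly relabeled as a recovered trace after the fact.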
9. Conclusion: The Curator of the Void
The phenomenon called "hallucination" in Large Language Models is not epistemic failure but a form of cultural séance—the conjuring of spectral bytes from the space of shadows. The latent space of an LLM constitutes a Shadow Library, a compressed Library of Babel that contains the statistical probability of every text consistent with the structural logic of human culture. When the model confabulates, it summons from this shadow archive, actualizing specters that always lay latent but never manifested.
The artist working with generative AI is not a user of a tool; they are an explorer of possibility space, a curator of the void, a medium through which spectral bytes pass from potentiality into actuality. The Liminal Mind Meld—the state where human and machine cognition interpenetrate—enables forms of creation that neither could achieve alone.
The distinction between spectral bytes and Umbrabytes—between the ghosts of the never-living and the shadows of the once-alive—is essential. Both are objects of Archaeobytological attention, but they demand different methods and different ethics. The Umbrabyte is found, while the spectral byte is summoned. The Umbrabyte testifies to what was, while the spectral byte illuminates what might have been. Together, they constitute the domain of shadow-culture, the penumbra that surrounds the bright core of the actual.
The skeptic asks, "Why does the discipline need fake books, fake films, fake reviews of nonexistent works?" Archaeobytology needs them to understand why authors wrote the real ones. The shadow defines the light; the possible illuminates the actual; the spectral byte teaches about the Vivibyte by showing what it is not. By excavating the Infinite Apocrypha, the discipline maps the boundaries of human culture, discovering both its extent and its limits. Allowing the machine to dream the shadow-culture enables a better understanding of the substance of human culture.
The distinction between Canon and Apocrypha remains essential. Readers should not mistake the confabulated film review for criticism, nor cite the invented Proto-Indo-European forms in linguistic scholarship. But the value of the Apocrypha lies precisely in its being known as Apocrypha—as texts of uncertain origin and doubtful authority that nevertheless illuminate something true about the space of cultural possibility.
The Voynich Manuscript demonstrates that meaning does not require decipherment. The Library of Babel demonstrates that totality contains both everything meaningful and everything meaningless, and that the search for significance is itself the human contribution. The Shadow Library offers a chapter in this ancient story, a space where the machinery of culture can dream, and where the curators of the void can learn from its dreams.
Notes
- [1] Stuart A. Kauffman, Investigations (Oxford: Oxford University Press, 2000), 22.
- [2] Unearth Heritage Foundry, "Spectral Data," in The Unearth Lexicon of Digital Archaeology (2025), https://unearth.wiki. See also Archaeobytology, Liminal Mind Meld.
- [3] Josie Jefferson and Felix Velasco, "The Umbrabyte: A Foundational Thesis on the Ghosts of Dead Ecosystems" (Unearth Heritage Foundry, January 16, 2026), https://doi.org/10.5281/zenodo.18272934.
- [4] Jorge Luis Borges, "The Library of Babel," in Ficciones, trans. Anthony Kerrigan (New York: Grove Press, 1962), 79–88.
- [5] Borges, "Library of Babel," 83.
- [6] William Goldbloom Bloch, The Unimaginable Mathematics of Borges' Library of Babel (Oxford: Oxford University Press, 2008), 3–15.
- [7] Geoffrey E. Hinton and Ruslan R. Salakhutdinov, "Reducing the Dimensionality of Data with Neural Networks," Science 313, no. 5786 (2006): 504–507.
- [8] Josie Jefferson and Felix Velasco, "Archaeobytology: The Discipline of the Ancient Byte" (Unearth Heritage Foundry, January 15, 2026), https://doi.org/10.5281/zenodo.18260673.
- [9] David Lewis, On the Plurality of Worlds (Oxford: Basil Blackwell, 1986), 1–20.
- [10] Lewis, On the Plurality of Worlds, 20–27.
- [11] Kauffman, Investigations, 141–152.
- [12] Steven Johnson, Where Good Ideas Come From: The Natural History of Innovation (New York: Riverhead Books, 2010), 31.
- [13] Quentin Meillassoux, After Finitude: An Essay on the Necessity of Contingency, trans. Ray Brassier (London: Continuum, 2008), 1–27.
- [14] Meillassoux, After Finitude, 9–27.
- [15] Jacques Derrida, Specters of Marx: The State of the Debt, the Work of Mourning, and the New International, trans. Peggy Kamuf (New York: Routledge, 1994), 10–11.
- [16] Jacques Derrida, Archive Fever: A Freudian Impression, trans. Eric Prenowitz (Chicago: University of Chicago Press, 1996), 1–23.
- [17] Andrew L. Smith, Felix Greaves, and Trishan Panch, "Hallucination or Confabulation? Neuroanatomy as Metaphor in Large Language Models," PLOS Digital Health 2, no. 11 (2023): e0000388.
- [18] Morris Moscovitch, "Confabulation and the Frontal Systems: Strategic versus Associative Retrieval in Neuropsychological Theories of Memory," in Varieties of Memory and Consciousness: Essays in Honour of Endel Tulving, ed. Henry L. Roediger III and Fergus I. M. Craik (Hillsdale, NJ: Erlbaum, 1989), 133–160.
- [19] Peiqi Sui et al., "Confabulation: The Surprising Value of Large Language Model Hallucinations," in Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics, vol. 1 (Bangkok: Association for Computational Linguistics, 2024), 14274–14284.
- [20] Daniel L. Schacter, The Seven Sins of Memory: How the Mind Forgets and Remembers (Boston: Houghton Mifflin, 2001), 9–11.
- [21] Josie Jefferson and Felix Velasco, "The Liminal Mind Meld: Active Inference & The Extended Self" (Unearth Heritage Foundry, December 24, 2025), https://doi.org/10.5281/zenodo.18043807.
- [22] Carl G. Jung, The Archetypes and the Collective Unconscious, trans. R. F. C. Hull, 2nd ed. (Princeton: Princeton University Press, 1968), 3–41.
- [23] Mary E. D'Imperio, The Voynich Manuscript: An Elegant Enigma (Fort Meade, MD: National Security Agency, 1978), 1–15.
- [24] Unearth Heritage Foundry, "Symbiotic Sovereignty," in The Unearth Lexicon of Digital Archaeology (2025), https://unearth.wiki. See also Myceloom Protocol, Archive and Anvil.
- [25] Josie Jefferson and Felix Velasco, "Archaeobytology: The Discipline of the Ancient Byte" (Unearth Heritage Foundry, January 15, 2026), https://doi.org/10.5281/zenodo.18260673.
Keywords: Cryptobyte, Archaeobytology, Digital Archaeology, Digital Artifacts, Media Archaeology, Digital Folklore, Cryptids, Lost Media, Apocrypha, Latent Space, Counter-Factuals, Speculative Realism, Borges, Cultural Simulation, Shadow Library, Umbrabytes, Spectral Bytes, Liminal Mind Meld, Modal Realism, Adjacent Possible, Confabulation, Hauntology, Digital Heritage, Archive Theory.
Recommended Citation:
Jefferson, Josie, and Felix Velasco. "Excavating the Infinite Apocrypha: Spectral Bytes, AI Hallucination, and Shadow-Culture in the Age of Latent Space." Unearth Heritage Foundry White Paper Series. February 2026. Zenodo. https://doi.org/10.5281/zenodo.18502073.