Preamble: The "How-To" for the "Why"
This document is the formal, open-source methodology for the Archaeobytologist. It is the practical "field guide" that complements the "Anvil" thesis (the "why"). This protocol provides the "how-to" for excavation and triage.
It is a decision engine designed to answer the practitioner's two core questions:
- 1. "I see 'digital dust.' Where do I even begin?" (Excavation Protocol)
- 2. "I have found an artifact. What do I do now?" (Triage Protocol)
Part 1: The Excavation Protocol (The Trowel)
Before the "Triage Protocol" (The Microscope) can be engaged, the practitioner must first find a discrete artifact. The "digital past" is not a curated museum; it is a "crisis of noise," a field of "undifferentiated dust." [1]
The Excavation Protocol is the "Field Guide for the Trowel." It is a formal, three-phase methodology for turning a chaotic "junkyard" into a defined "dig site" and finding a discrete "signal" (an Archaeobyte) within that noise.
It answers the practitioner's first question: "I see 'digital dust.' Where do I even begin?"
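- Phase 1: Define Dig Site (The "Where"). Scope the "field" to turn a "junkyard" into a "dig site." Where are you looking?
- Phase 2: Define Target (The "What"). What *kind* of artifact are you hunting for?
- Phase 3: Sifting Methodology (The "How"). Apply a formal method to find the "find."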
Section 1.1: Phase 1: Define Dig Site (The "Where")
The first act of excavation is to scope the "field." The practitioner must define the boundaries of their search.
- Path 1: Curated Archive
A known, structured, and often-cataloged collection. This is the most common and lowest-friction dig site.
Examples: The Internet Archive, a specific GeoCities Mirror (e.g., Restorativland), a museum's digital collection, a specific "Archive Team" rescue.
- Path 2: "Wild" Archive
An unstructured, abandoned, or "living" site that is not formally preserved. This requires more skill and often involves "404-hunting" or server exploration.
Examples: Abandoned FTP servers, dead (but still hosted) websites, expired forum domains.
- Path 3: Conceptual Site
A "living" platform that is not "ancient" but is excavated for the "ghosts" of its own past rituals.
Examples: Excavating GitHub for the README.md ritual, studying Twitter for "Conceptual Petribytes" (like the "retweet-with-comment" vs. the original "RT"), or Reddit for "forum signature"-like user flairs.
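One way to make the scoped "field" explicit is to record the boundary as data before any sifting begins. The sketch below is a minimal illustration under stated assumptions, not part of the protocol itself: the `DigSite` class, the `SiteKind` names, and the example hosts are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from urllib.parse import urlparse


class SiteKind(Enum):
    CURATED = "curated archive"
    WILD = "wild archive"
    CONCEPTUAL = "conceptual site"


@dataclass(frozen=True)
class DigSite:
    """A scoped search boundary (all names here are illustrative)."""
    name: str
    kind: SiteKind
    host: str                # host the practitioner has chosen to excavate
    path_prefix: str = "/"   # optional narrowing inside that host

    def in_scope(self, url: str) -> bool:
        """Return True if the URL falls inside the dig site's boundary."""
        parts = urlparse(url)
        host_ok = parts.hostname == self.host
        return host_ok and (parts.path or "/").startswith(self.path_prefix)


# Hypothetical example: a GeoCities mirror narrowed to one neighborhood.
site = DigSite("GeoCities mirror", SiteKind.CURATED,
               host="mirror.example.org", path_prefix="/Vienna/")

print(site.in_scope("http://mirror.example.org/Vienna/1234/index.html"))
print(site.in_scope("http://elsewhere.example.com/Vienna/1234/"))
```

Writing the boundary down first means every later "find" can be checked against it, which keeps a "junkyard" from creeping back into the "dig site."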
Section 1.2: Phase 2: Define Target (The "What")
Once the "dig site" is defined, the practitioner must define what they are hunting for.
- Target 1: Tangible (The File)
This is a hunt for specific files, formats, or code. The query is technical.
Methodology: The practitioner is looking for "potsherds."
Examples: "Find all .swf (Flash) files," "Look for guestbook.cgi scripts," "Isolate all .mp3 files encoded before 1999."
- Target 2: Conceptual (The Ghost)
This is a hunt for lost behaviors, rituals, or concepts. The query is cultural and anthropological.
Methodology: The practitioner is looking for "cultural ghosts."
Examples: "Find examples of the 'AIM Away Message' ritual," "Trace the social connections of a 'Webring'," "Document the visual language of 'Forum Signatures'."
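The difference between the two target types can be sketched as two small query objects: one matches filenames, the other matches language that hints at a ritual. This is an illustrative sketch only; the class names and the cue phrases are assumptions and not a vetted vocabulary.

```python
from dataclasses import dataclass
from fnmatch import fnmatch


@dataclass(frozen=True)
class TangibleTarget:
    """A file-level query: filename patterns act as the "potsherd" shapes."""
    patterns: tuple[str, ...]

    def matches(self, filename: str) -> bool:
        # Compare case-insensitively: old archives mix .SWF and .swf freely.
        name = filename.lower()
        return any(fnmatch(name, p.lower()) for p in self.patterns)


@dataclass(frozen=True)
class ConceptualTarget:
    """A behavior-level query: a ritual plus phrases that hint at it."""
    ritual: str
    cue_phrases: tuple[str, ...]

    def matches(self, text: str) -> bool:
        lowered = text.lower()
        return any(cue.lower() in lowered for cue in self.cue_phrases)


flash_hunt = TangibleTarget(patterns=("*.swf", "guestbook.cgi"))
away_hunt = ConceptualTarget(
    ritual="AIM Away Message",
    cue_phrases=("away message", "brb"),  # illustrative cues only
)

print(flash_hunt.matches("INTRO.SWF"))
print(away_hunt.matches("check my Away Message lol"))
```

A conceptual match is only a lead: phrase matching can surface candidates, but deciding whether a page really documents the ritual remains the practitioner's judgment.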
Section 1.3: Phase 3: Sifting Methodology (The "How")
With a "dig site" and "target" defined, the practitioner applies a formal method to find the first "find."
- Method 1: Keyword Sifting
The most basic method: searching the "dig site" for specific terms that act as "index fossils" for a time or behavior.
Examples: "guestbook," "home page," "under construction," "blogroll," "about me."
- Method 2: Relational Mapping
A more advanced method. This involves following the "hand-built bridges" of the past to map a community, rather than just finding a single file.
Examples: Following every link in a "blogroll," mapping the users in a "Webring," tracing a rel="friend" network.
- Method 3: Stratigraphic Sampling
A precise, "core sample" method. The practitioner isolates a specific "slice" of time or technology to find a representative artifact.
Examples: "Only artifacts from the 'GeoCities/Vienna' neighborhood in 1999," "Only Flash 5 games from Newgrounds," "Only README.txt files from the 1980s demoscene."
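The three sifting methods can be sketched against a small in-memory set of captured pages. This is a minimal sketch under stated assumptions: the `Page` record, its fields, and the sample URLs are hypothetical, and a real dig site would supply its own capture format.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Page:
    """One captured page; the fields are assumptions for this sketch."""
    url: str
    year: int
    text: str
    links: list[str] = field(default_factory=list)


def keyword_sift(pages, index_fossils):
    """Method 1: keep pages whose text contains any "index fossil" term."""
    terms = [t.lower() for t in index_fossils]
    return [p for p in pages if any(t in p.text.lower() for t in terms)]


def relational_map(pages, start_url):
    """Method 2: breadth-first walk over outbound links from one page."""
    by_url = {p.url: p for p in pages}
    seen, queue = {start_url}, deque([start_url])
    while queue:
        page = by_url.get(queue.popleft())
        if page is None:  # link points outside the captured set
            continue
        for target in page.links:
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def stratigraphic_sample(pages, year, url_prefix):
    """Method 3: isolate one slice of time and place."""
    return [p for p in pages if p.year == year and p.url.startswith(url_prefix)]


pages = [
    Page("geo/Vienna/a", 1999, "Sign my guestbook!", links=["geo/Vienna/b"]),
    Page("geo/Vienna/b", 1999, "Under construction",
         links=["geo/Vienna/a", "geo/Area51/c"]),
    Page("geo/Area51/c", 2001, "My UFO links"),
]

print([p.url for p in keyword_sift(pages, ["guestbook", "under construction"])])
print(sorted(relational_map(pages, "geo/Vienna/a")))
print([p.url for p in stratigraphic_sample(pages, 1999, "geo/Vienna/")])
```

Keyword sifting and stratigraphic sampling both filter a flat list, while relational mapping returns a connected set of URLs, which is why it suits communities better than single files.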
The output of this three-phase protocol is the "First Find": a discrete Archaeobyte that has been successfully excavated from the "digital dust."
This artifact is then handed off to the next protocol: the Triage Protocol (The Microscope).
Part 2: The Triage Protocol (The Microscope)
This protocol engages after the "Excavation Protocol" is complete. It answers the practitioner's second question: "I have found an artifact. What do I do now?"
Input: The "First Find" (a discrete Archaeobyte)
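- Phase 1: Classification (The "What"). Classify the artifact's "state of being" to determine the preservation strategy.
- Phase 2: Preservation Triage (The "How & Can I"). Determine the technical and ethical path to preservation.
- Phase 3: Academic Synthesis (The "So What"). Determine the artifact's final scholarly output and value.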
Section 2.1: Phase 1: Classification (The "What")
The first act of the Triage Protocol is to define the "find." The practitioner must classify the artifact's "state of being" using the Archaeobyte Lexicon (as defined in Theses 1-4).
- Vivibyte (Living): An active, changing artifact.
Preservation Strategy: Episodic Capture (e.g., periodic web-harvesting of a living site).
- Umbrabyte (Liminal): A static but "lost" artifact.
Preservation Strategy: Discrete Capture and Contextual Documentation (noting how it was found and what is broken).
- Petribyte (Fossilized): An artifact whose original context is dead.
Preservation Strategy: Object Migration and Contextual Reconstruction (e.g., emulation).
This initial classification is the practitioner's first practical "so what." It defines the urgency and the technical challenge of the preservation required.
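The classification step can be sketched as a lookup from state to preservation strategy. The strategy names below come from the list above; the two yes/no questions used to pick a state are an assumption of this sketch, and the Lexicon itself may draw the boundaries differently.

```python
from enum import Enum


class State(Enum):
    VIVIBYTE = "living"
    UMBRABYTE = "liminal"
    PETRIBYTE = "fossilized"


# Strategy names as given in the classification list.
STRATEGY = {
    State.VIVIBYTE: "Episodic Capture",
    State.UMBRABYTE: "Discrete Capture and Contextual Documentation",
    State.PETRIBYTE: "Object Migration and Contextual Reconstruction",
}


def classify(still_changing: bool, original_context_alive: bool) -> State:
    """Map two observations onto the three states (hypothetical heuristic)."""
    if still_changing:
        return State.VIVIBYTE
    if original_context_alive:
        return State.UMBRABYTE
    return State.PETRIBYTE


# Example: a static artifact whose runtime no longer exists.
state = classify(still_changing=False, original_context_alive=False)
print(state.name, "->", STRATEGY[state])
```

Keeping the strategy in a table rather than in the branching logic lets the practitioner revise a strategy without touching the classification rule.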
Section 2.2: Phase 2: Preservation Triage (The "How & Can I")
Once classified, the artifact enters a two-filter Triage to determine the technical and ethical path to preservation.
Filter A: The Contextual Filter (How to Preserve)
This filter forces the practitioner to decide if the true artifact is the object or its environment.
- Path 1: Object-based (e.g., .mp3, .txt, source code). The file is the artifact.
Action: MIGRATION. Save the file to a stable, modern repository and document its source.
- Path 2: Experience-based (e.g., a Flash game, an interactive website). The file alone is useless.
Action: EMULATION. Preserve or recreate the software environment (and, where needed, the hardware environment) required to run the artifact as it was intended, using tools such as OldWeb.today (emulated legacy browsers) or Ruffle (a Flash Player emulator).
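Filter A can be sketched as a routing function plus a small record for the MIGRATION path. This is a rough illustration under stated assumptions: the extension sets are placeholders (a static HTML page may well be object-based), the `REVIEW` fallback is added here for unknown types, and the record's field names are hypothetical.

```python
import hashlib
from pathlib import PurePosixPath

# Illustrative extension lists; a real project would maintain its own.
OBJECT_EXTS = {".mp3", ".txt", ".c", ".py"}
EXPERIENCE_EXTS = {".swf", ".html"}


def contextual_filter(filename: str) -> str:
    """Return MIGRATION or EMULATION, or REVIEW when the type is unknown."""
    ext = PurePosixPath(filename).suffix.lower()
    if ext in OBJECT_EXTS:
        return "MIGRATION"
    if ext in EXPERIENCE_EXTS:
        return "EMULATION"
    return "REVIEW"  # not in the protocol: defer to human judgment


def migration_record(filename: str, data: bytes, source: str) -> dict:
    """Describe a migrated object: a fixity checksum plus its documented source."""
    return {
        "filename": filename,
        "sha256": hashlib.sha256(data).hexdigest(),
        "bytes": len(data),
        "source": source,
    }


print(contextual_filter("song.MP3"))
print(contextual_filter("game.swf"))
print(migration_record("readme.txt", b"hello", "hypothetical FTP mirror"))
```

The checksum gives later custodians a way to confirm that the migrated file is still the one that was excavated.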
Filter B: The Custodial Filter (Is it Ethical & Legal?)
This filter addresses the real-world barriers to preservation.
- Question 1: Privacy (PII). Does the artifact contain Personally Identifiable Information (e.g., a forum database, private emails)?
Action: REDACT. The practitioner must scrub all PII, or place the artifact under an Embargo restricting access to cleared researchers.
- Question 2: Copyright. Does the artifact have clear copyright limitations (e.g., a commercial .mp3, proprietary software)?
Action: CITE & "FAIR USE". The practitioner's action is to archive the artifact under the legal doctrine of "fair use" for research and preservation, and to meticulously Cite the original copyright holder in the metadata. [2]
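The two custodial questions can be sketched as a scrub step followed by a metadata record. This is a deliberately simple sketch: the regular expressions cover only email addresses and one US-style phone format, which is an assumption made for illustration and falls far short of real PII detection, and the function and field names are hypothetical.

```python
import re

# Deliberately simple patterns; real PII detection needs far broader coverage.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def redact(text: str) -> tuple[str, int]:
    """Replace matched PII with a placeholder; return the text and match count."""
    text, n_email = EMAIL.subn("[REDACTED EMAIL]", text)
    text, n_phone = PHONE.subn("[REDACTED PHONE]", text)
    return text, n_email + n_phone


def custodial_filter(text: str, copyrighted: bool, rights_holder: str = "") -> dict:
    """Apply both questions and record the resulting actions as metadata."""
    clean, hits = redact(text)
    actions = []
    if hits:
        actions.append("REDACT")
    if copyrighted:
        actions.append("CITE")
    return {
        "text": clean,
        "pii_hits": hits,
        "actions": actions,
        "rights_holder": rights_holder if copyrighted else None,
    }


result = custodial_filter(
    "mail me at kira@example.com or call 555-123-4567",
    copyrighted=True,
    rights_holder="Example Records",  # hypothetical rights holder
)
print(result["text"])
print(result["actions"])
```

Automated scrubbing is best treated as a first pass: anything it misses still falls to the practitioner, or to the Embargo path.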
Section 2.3: Phase 3: Academic Synthesis (The "So What")
This is the final decision engine. The artifact is now classified, preserved, and cleared. This phase answers the practitioner's ultimate question: "What is the scholarly output?"
This is a three-path filter to determine the artifact's final value.
- Path 1: Digital Dust (No Resonance)
Definition: The artifact has no discernible human story or cultural value (e.g., a corrupted file, a machine-generated log, spam).
Action: ARCHIVE. The practitioner logs the artifact with minimal metadata and moves on.
- Path 2: Curio (Resonance, No Rigor)
Definition: The artifact has a clear human story but has no verifiable "echo" or relevance to a modern cultural conversation (e.g., an inside joke from a dead forum).
Action: PRESERVE & DOCUMENT. This is the "museum piece." The scholarly output is a "Digital Monument": a paper or exhibit that explains the artifact's story and preserves its context for future researchers.
- Path 3: Landmark (Resonance & Rigor)
Definition: The artifact's story is highly relevant today (e.g., the GeoCities Guestbook as the "ancestor" of all social media).
Action: AMPLIFY & SYNTHESIZE. This is the "thesis-driver." The scholarly output is a new thesis, book, or framework that connects this "fossil" directly to the "living" present.
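The three-path filter reduces to two yes/no judgments, which can be sketched as follows. The mapping of "resonance" to a human story and "rigor" to a verifiable modern echo follows the definitions above; the function and parameter names are hypothetical.

```python
from enum import Enum


class Verdict(Enum):
    # Each value is the action named for that path.
    DIGITAL_DUST = "ARCHIVE"
    CURIO = "PRESERVE & DOCUMENT"
    LANDMARK = "AMPLIFY & SYNTHESIZE"


def synthesize(has_human_story: bool, has_modern_echo: bool) -> Verdict:
    """Route an artifact by resonance (story) and rigor (verifiable echo)."""
    if not has_human_story:
        return Verdict.DIGITAL_DUST
    if not has_modern_echo:
        return Verdict.CURIO
    return Verdict.LANDMARK


# Example: an inside joke from a dead forum has a story but no modern echo.
verdict = synthesize(has_human_story=True, has_modern_echo=False)
print(verdict.name, "->", verdict.value)
```

The code only routes the outcome; the two inputs are scholarly judgments that the practitioner must argue for and document.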
Works Cited
- [1] See "The Archaeobyte: A Foundational Thesis on the Artifacts of the Digital Past" (Thesis 1 of this series).
- [2] Stanford University Libraries. (n.d.). What is Fair Use? Retrieved November 5, 2025, from https://fairuse.stanford.edu/overview/fair-use/what-is-fair-use/