Foundations Series / Vol 01 Est. 2025

Appendix A: Glossary of Terms


Core Concepts

Archaeobytology The study and practice of excavating, preserving, interpreting, and building with digital artifacts—particularly those murdered by platform shutdowns or rendered obsolete by technological change. Combines retrospective preservation (the Archive) with prospective creation (the Anvil).

Archaeobyte A digital artifact that was once alive (accessible, functional), died through platform shutdown or obsolescence, and has been preserved in some form. Exists in liminal state between death and potential resurrection. Example: GeoCities pages saved by Archive Team.

Vivibyte A digital artifact that is currently alive (accessible, functional) but exists on vulnerable infrastructure facing existential threats. The "living endangered species" of digital culture. Example: Content on Twitter/X during ownership instability.

Umbrabyte A digital artifact that is technically dead (inaccessible, non-functional) but has not been properly preserved. Exists in fragmentary or corrupted form, haunting the present through memory and partial remnants. Example: MySpace music uploaded between 2003 and 2015, lost in a server migration.

Petribyte A digital artifact so old that its original context is historical, has been durably preserved by institutions, and is treated as cultural heritage. Has achieved monumental stability. Example: ARPANET documentation preserved by Computer History Museum.


The Three Pillars of Digital Sovereignty

Declaration (I Am) The principle that you should be able to declare your identity and existence without permission from platforms or intermediaries. Includes self-owned identity (username@yourdomain.com), persistent presence, and uncensorable voice.

Connection (Instant Message) The principle that you should be able to communicate directly with others without platform mediation, monitoring, or monetization. Includes peer-to-peer communication, portable relationships, and intentional discovery.

Ground (Digital Real Estate) The principle that you should own the infrastructure your digital life is built on, not rent it from landlords who can evict you. Includes data ownership, infrastructure control, and persistence independent of platform survival.

Digital Sovereignty The ability to exist, communicate, and build in digital space without corporate gatekeeping. Achieved through embodying all Three Pillars. Not absolute freedom (legal and social accountability remain), but freedom from arbitrary platform power.


The Archive and the Anvil

The Archive The retrospective practice of Archaeobytology: excavating endangered artifacts, preserving them with technical and cultural fidelity, curating collections, interpreting for future generations, and providing access. Looks backward to save what's endangered.

The Anvil The prospective practice of Archaeobytology: forging tools, protocols, and institutions that embody digital sovereignty and resist the forces that murdered previous platforms. Looks forward to build alternatives. Named for the blacksmith's anvil where new things are forged.

Dual Soul The integration of Archive and Anvil as complementary practices. Neither is sufficient alone: Archives without alternatives accept defeat; building without remembering repeats mistakes. The complete Archaeobytologist embodies both.


Preservation and Triage

Triage The methodology for deciding what to preserve when you cannot save everything. Borrowed from emergency medicine. Requires making difficult choices about cultural significance, technical fragility, rescue feasibility, redundancy, and ethics.

The Custodial Filter Five-question ethical framework for triage decisions: (1) Cultural Significance—does this represent something that would otherwise be lost? (2) Technical Fragility—how close to disappearance? (3) Rescue Difficulty—how hard to preserve? (4) Existing Redundancy—is someone else saving this? (5) Consent and Ethics—should we preserve this?
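The five questions can be recorded as a simple checklist. The following is only a minimal sketch: the class name FilterAnswers, the function recommend, the yes/no simplification, and the order in which the answers are weighed are all assumptions made for illustration, not a procedure defined by the framework. Real triage calls for judgment that booleans cannot capture.

```python
from dataclasses import dataclass


@dataclass
class FilterAnswers:
    """Hypothetical yes/no answers to the five Custodial Filter questions."""
    culturally_significant: bool  # (1) would this otherwise be lost?
    near_disappearance: bool      # (2) technical fragility
    rescue_feasible: bool         # (3) rescue difficulty
    already_preserved: bool       # (4) existing redundancy
    ethically_acceptable: bool    # (5) consent and ethics


def recommend(answers):
    """Return a suggested next step; the ordering of checks is an assumption."""
    if not answers.ethically_acceptable:
        return "do not preserve"
    if answers.already_preserved:
        return "defer to existing effort"
    if (answers.culturally_significant
            and answers.near_disappearance
            and answers.rescue_feasible):
        return "rescue now"
    return "review further"


decision = recommend(FilterAnswers(True, True, True, False, True))
```

Placing the ethics check first reflects one possible reading, in which a failed consent question overrides every other consideration.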

Custodial Responsibility The ethical burden of preservation: by choosing what to save, you decide what future generations can know about the past. Every preservation decision is also a decision to let something else die. Carries weight of gatekeeping historical memory.

Platform Murder Deliberate erasure of digital artifacts by platforms through shutdown, terms of service purges, or acquisition-and-closure. Distinguished from passive obsolescence (technological decay) or neglect (link rot). Active corporate choice to kill content.


Technical Terms

Web Scraping Automated extraction of data from websites using tools like wget, HTTrack, or custom scripts. Can range from simple HTML downloads to complex JavaScript rendering. Often operates in legal gray area when done without platform permission.
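As a rough illustration of the "custom scripts" end of that range, the sketch below pulls link and resource URLs out of a page so a crawler knows what to fetch next. It uses only Python's standard library; the names LinkCollector and extract_links and the sample page are hypothetical. Fetching, rate limiting, and JavaScript rendering are deliberately left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects absolute URLs found in href and src attributes."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Resolve relative paths against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical page content standing in for a downloaded homepage.
sample = '<a href="/about.html">About</a><img src="cat.gif">'
links = extract_links(sample, "http://example.com/home/")
```

Here links holds the two resolved URLs, http://example.com/about.html and http://example.com/home/cat.gif, which a scraper would queue for download.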

API Harvesting Using a platform's Application Programming Interface to bulk-download content. More reliable than scraping when available, but platforms control API access and can revoke it. Example: Using Twitter's API to download tweets before shutdown.
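Bulk download through an API usually means following pagination until the platform reports no more results. The sketch below assumes a cursor-style API. The callable fetch_page and the FAKE_PAGES data are hypothetical stand-ins, not any real platform's interface.

```python
def harvest(fetch_page):
    """Collect every item from a cursor-paginated source.

    fetch_page(cursor) is a hypothetical callable returning
    (items, next_cursor); next_cursor is None on the last page.
    """
    items, cursor = [], None
    while True:
        page, cursor = fetch_page(cursor)
        items.extend(page)
        if cursor is None:
            return items


# Stand-in for a real API: three pages keyed by cursor.
FAKE_PAGES = {
    None: (["post-1", "post-2"], "c1"),
    "c1": (["post-3"], "c2"),
    "c2": (["post-4"], None),
}

archive = harvest(lambda cursor: FAKE_PAGES[cursor])
```

A real harvester would also need authentication, rate-limit handling, and checkpointing so an interrupted run can resume, all of which depend on the platform.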

Emulation Running old software or systems in a simulated environment. Allows obsolete programs (Flash games, DOS applications) to function on modern hardware. Preserves not just files but user experience. Example: Ruffle emulator for Flash content.

Format Migration Converting files from obsolete formats to current standards to ensure long-term accessibility. Risk: May lose fidelity or functionality in translation. Example: Converting WordPerfect documents to modern Word format.

Bit Rot Gradual degradation of digital storage media over time. Hard drives fail, CDs deteriorate, flash memory loses charge. Requires active preservation through redundant copies and periodic data migration.
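One common defense is a fixity check: record a checksum when an artifact is ingested, then recompute it periodically and compare. The sketch below uses SHA-256 from Python's standard library. The function names, the manifest layout (path mapped to expected digest), and the temporary demo file are assumptions for illustration.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large artifacts need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(manifest):
    """Return the paths whose current checksum differs from the recorded one."""
    return [path for path, expected in manifest.items()
            if sha256_of(path) != expected]


# Demonstration on a temporary file.
with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "page.html")
    with open(path, "wb") as handle:
        handle.write(b"<html>hello</html>")
    manifest = {path: sha256_of(path)}
    clean = verify(manifest)        # nothing has changed yet
    with open(path, "wb") as handle:
        handle.write(b"<html>hellp</html>")  # simulate one corrupted byte
    damaged = verify(manifest)      # the altered file is now reported
```

A checksum only detects damage. Repair still depends on having a redundant copy to restore from.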

WARC (Web ARChive format) ISO standard format (ISO 28500) for archiving web content. Stores HTTP headers, request/response data, and metadata. Used by Internet Archive's Wayback Machine. Preserves not just content but context of how it was accessed.
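To make the "content plus context" idea concrete, the sketch below wraps a captured HTTP response in a single WARC response record using only the standard library. The function name warc_response_record and the sample capture are hypothetical, and the record is deliberately minimal: it omits digests, compression, and the request and warcinfo records a real crawl would include. Dedicated tools such as warcio are the better choice for actual archiving.

```python
import uuid
from datetime import datetime, timezone


def warc_response_record(target_uri, http_bytes):
    """Wrap a raw HTTP response (status line, headers, body) in one WARC record."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    header_lines = [
        "WARC/1.0",
        "WARC-Type: response",
        "WARC-Record-ID: <urn:uuid:%s>" % uuid.uuid4(),
        "WARC-Date: %s" % stamp,
        "WARC-Target-URI: %s" % target_uri,
        "Content-Type: application/http; msgtype=response",
        "Content-Length: %d" % len(http_bytes),
    ]
    # A blank line separates the WARC header from the block;
    # two CRLF pairs terminate the record.
    head = "\r\n".join(header_lines).encode("utf-8") + b"\r\n\r\n"
    return head + http_bytes + b"\r\n\r\n"


# Hypothetical capture of a small page.
captured = (b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n"
            b"<html>Welcome to my homepage</html>")
record = warc_response_record("http://example.com/", captured)
```

The original HTTP headers travel inside the record unchanged, which is what lets a replay tool reproduce how the page was served as well as what it contained.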

LOCKSS (Lots of Copies Keep Stuff Safe) Distributed digital preservation system and philosophy. Multiple institutions maintain copies of collections; if one fails, others survive. Embodies redundancy principle. Example: University library consortia preserving journal archives.
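The redundancy principle can be sketched as a majority vote: compare the copies held by several institutions and repair any copy that disagrees with the rest. This is a simplified illustration only, not the actual LOCKSS polling protocol. The function name repair_by_majority and the sample holders are hypothetical.

```python
import hashlib
from collections import Counter


def repair_by_majority(replicas):
    """Overwrite any replica that disagrees with the most common content hash.

    replicas maps a holder name to the bytes that holder stores.
    Returns (repaired_replicas, sorted_names_of_repaired_holders).
    """
    digests = {name: hashlib.sha256(data).hexdigest()
               for name, data in replicas.items()}
    winner, votes = Counter(digests.values()).most_common(1)[0]
    if votes <= len(replicas) / 2:
        raise ValueError("no majority; manual review needed")
    good = next(data for name, data in replicas.items()
                if digests[name] == winner)
    repaired = sorted(name for name in replicas if digests[name] != winner)
    return {name: good for name in replicas}, repaired


replicas = {
    "library-a": b"journal issue 12",
    "library-b": b"journal issue 12",
    "library-c": b"journal issue 1\x00",  # one damaged copy
}
fixed, repaired = repair_by_majority(replicas)
```

Refusing to act without a strict majority is a design choice in this sketch: when copies disagree evenly, no copy can be trusted automatically.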

Metadata "Data about data"—information describing an artifact's context, provenance, technical characteristics, and relationships. Essential for making preserved artifacts discoverable and interpretable. Example: Title, creator, date, file format, original URL.
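A metadata record covering the example fields might look like the sketch below. The field names, the placeholder values, and the missing_fields helper are assumptions for illustration. Established schemas such as Dublin Core define their own element names.

```python
import json

# Field names follow the glossary's example list; values are placeholders.
record = {
    "title": "Example Homepage",
    "creator": "unknown",
    "date_captured": "2009-10-01",
    "file_format": "text/html",
    "original_url": "http://example.com/home.html",
}

REQUIRED = ("title", "creator", "date_captured", "file_format", "original_url")


def missing_fields(rec):
    """List required fields that are absent or empty."""
    return [field for field in REQUIRED if not rec.get(field)]


# Serialize alongside the artifact so the description travels with it.
serialized = json.dumps(record, indent=2, sort_keys=True)
```

Checking for missing fields at ingest time is cheaper than trying to reconstruct provenance years later.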


Institutional and Economic Terms

The Archive Business Model Organizational design for sustainable preservation. Includes funding sources (grants, donations, subscriptions, services), governance structure (non-profit, cooperative, hybrid), and technical infrastructure. Must survive 50+ years to succeed.

The Anvil Business Model Organizational design for profitable sovereignty tools that don't become extractive platforms. Includes revenue models that avoid surveillance capitalism (subscriptions, freemium, open-core, cooperatives, grants). Must embody Three Pillars in business design itself.

Federated Architecture System design where multiple independent servers (instances) interoperate using open protocols. No central authority controls the network. Examples: Mastodon (ActivityPub), email (SMTP). Enables sovereignty through distribution.
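Federation depends on every server being able to locate accounts on every other server without a central directory. Mastodon and similar software use WebFinger lookups for this, a detail not covered in this glossary. The sketch below builds the lookup URL for a user@host address. The function name webfinger_url and the sample address are hypothetical, and no network request is made.

```python
from urllib.parse import urlencode


def webfinger_url(address):
    """Build the WebFinger lookup URL for an address like 'user@example.social'."""
    user, sep, host = address.lstrip("@").partition("@")
    if not (user and sep and host):
        raise ValueError("not a user@host address: %r" % address)
    query = urlencode({"resource": "acct:%s@%s" % (user, host)})
    # The account's own host answers the query; no central registry is involved.
    return "https://%s/.well-known/webfinger?%s" % (host, query)


url = webfinger_url("@alice@example.social")
```

Because the host portion of the address names the server to ask, identity resolution stays distributed across instances.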

Platform Capitalism Economic system where digital platforms (Facebook, Google, Amazon) extract value by controlling access to networks, users, and data. Creates walled gardens, lock-in effects, and surveillance business models. Archaeobytology seeks alternatives.

Surveillance Capitalism Shoshana Zuboff's term for business model based on extracting behavioral data as raw material for prediction products sold to advertisers. Platforms surveil users to monetize attention and influence. Incompatible with digital sovereignty.

Commons Governance Elinor Ostrom's framework for collectively managing shared resources without privatization or state control. Applied to digital preservation: distributed networks (Seed Banks) governed by communities following design principles (boundaries, proportionality, collective choice, monitoring, sanctions, conflict resolution, recognition, nested governance).


Right to Be Forgotten Legal concept (especially in EU's GDPR) that individuals can request deletion of personal data. Creates tension with preservation: historians want to save everything, but privacy advocates prioritize consent and erasure.

Fair Use / Fair Dealing Legal doctrine allowing limited use of copyrighted material without permission for purposes like criticism, education, research, and preservation. Archivists often rely on fair use to preserve copyrighted platform content. Courts have not definitively ruled on platform scraping.

Context Collapse When content created for one audience (small community, friends) becomes visible to different audience (public, searchers, future generations). Common in archives when private/semi-private content is preserved and made accessible.

Informed Consent Ethical principle that people should understand and agree to how their data/content is used. Complicated in preservation: platform users often didn't expect permanent archiving, can't retroactively consent, and may be unreachable. Triage must navigate consent ambiguity.

Custodial Ethics Framework for responsible stewardship of preserved artifacts. Includes: minimizing harm, respecting explicit deletion, restricting access when appropriate, consulting affected communities, and transparent decision-making.


Movement and Discipline Terms

Discipline Formation Process by which scattered practices become recognized academic/professional field. Requires: intellectual coherence (shared questions/methods), institutional infrastructure (departments, journals, conferences), professional pathways (jobs), boundary work (defining what field IS/ISN'T), canonical texts, and external recognition (funding, legitimacy).

Boundary Work Defining a discipline by exclusion—stating what it is NOT. Archaeobytology is not (just) digital history, computer science, library science, media archaeology, or activism—though it draws from all. Clarifies distinct identity.

Knowledge Infrastructure Journals, conferences, textbooks, handbooks, online platforms, archives, and datasets that standardize and disseminate field's knowledge. Essential for discipline legitimacy. Example: Journal of Archaeobytology, annual conference, shared tool repositories.

Institutional Anchors Universities, centers, institutes, labs, and programs that provide stable homes for discipline. Include: degree programs (certificates → master's → PhDs), research centers with funding/space, and professional schools.

Professional Pathways Clear career routes for people trained in discipline. For Archaeobytology: academic (tenure-track), practitioner (archivist, curator), industry (tech companies), non-profit (EFF, Internet Archive), consulting, and certified credentials.

Movement Building Strategic work to grow discipline from scattered practice to recognized field. Five dimensions: (1) Knowledge infrastructure, (2) Institutional anchors, (3) Professional pathways, (4) Public visibility, (5) Policy advocacy. Timeline: 10-20 years to maturity.


Historical Platforms and Projects

GeoCities Web hosting service (1994-2009) that gave millions of people free homepages. Yahoo announced the closure in April 2009 and shut the service down that October, murdering ~30 million sites. Archive Team scraped roughly 650GB before shutdown. Canonical example of platform murder and urgent preservation.

Vine Short-form video platform (2012-2017) owned by Twitter. 6-second looping videos created distinctive aesthetic and meme culture. ~200 million videos existed at shutdown. Internet Archive saved millions; many lost. Example of cultural significance vs. preservation difficulty.

Flash Player Multimedia platform (1996-2020), developed by Macromedia and owned by Adobe from 2005, that enabled interactive web content, games, and animations. Discontinued due to security issues and HTML5 replacement. Millions of Flash artifacts became Umbrabytes. Flashpoint Project preserves 500,000+ games via emulation.

Internet Archive Non-profit digital library (1996-present) founded by Brewster Kahle. Runs Wayback Machine (800+ billion archived web pages), preserves books/movies/software, and advocates for digital rights. Gold standard for institutional preservation.

Archive Team Guerrilla digital archiving collective (2009-present) that mobilizes volunteers to rescue dying platforms. Fast, agile, operates in legal gray areas. Saved GeoCities, Vine, Google Reader, and hundreds of smaller platforms. Complementary to Internet Archive's institutional approach.

Mastodon Federated social network (2016-present) using ActivityPub protocol. Anyone can run instance; instances interoperate. Alternative to centralized platforms like Twitter. Embodies Declaration (federated identity) and Connection (open protocol), though Ground varies by instance.


Digital Humanities Interdisciplinary field using computational methods for humanities research. Includes: text mining, data visualization, digital archives, and tool building. Related to but distinct from Archaeobytology (DH studies the past; Archaeobytology intervenes in the present to shape what the future can know of the past).

Media Archaeology Theoretical field excavating dead media to understand technological change. Philosophical and interpretive. Influences: Foucault, Kittler, Ernst, Parikka. Archaeobytology draws theory from media archaeology but focuses on applied preservation and building.

Library and Information Science (LIS) Professional field managing information collections, providing access, and ensuring preservation. Expertise in metadata, cataloging, and archival standards. Archaeobytology operates in spaces LIS often can't (guerrilla archiving, legal gray areas, radical system-building).

Science and Technology Studies (STS) Interdisciplinary field studying how science/technology shape society and vice versa. Examines power, politics, and social construction. Provides frameworks for analyzing platform power and sovereignty struggles.

Platform Studies Approach examining how platforms shape cultural production through technical affordances and constraints. Example: Twitter's 140 characters shaped brevity culture. Archaeobytology studies platforms as they die, not just as they function.


Key Thinkers and Works

Brewster Kahle Founder of Internet Archive. Advocate for "universal access to all knowledge." Demonstrates how one individual with vision and resources can build institution that preserves culture at scale.

Cory Doctorow Science fiction writer, activist, journalist. Wrote The Internet Con (2023) advocating for interoperability and user sovereignty. Coined "adversarial interoperability." Exemplifies public intellectual bridging technology and policy.

Elinor Ostrom Nobel Prize-winning political economist who studied commons governance. Governing the Commons (1990) established 8 Design Principles for sustainable collective resource management. Applied to digital preservation (Seed Banks) and federated platforms.

Shoshana Zuboff Scholar who coined "surveillance capitalism" and developed the concept at length in The Age of Surveillance Capitalism (2019). Critiques platforms' extraction of behavioral data. Provides framework for understanding what digital sovereignty resists.

Lawrence Lessig Legal scholar who coined "Code is Law" in Code and Other Laws of Cyberspace (1999; revised as Code: Version 2.0, 2006). Argues that software architecture shapes power and rights as much as legal code. Influences Archaeobytology's focus on designing sovereign systems.

Wendy Hui Kyong Chun Media theorist, wrote "The Enduring Ephemeral" (2008) on paradox of digital permanence and decay. Theorizes how digital objects are simultaneously archived forever and constantly dying. Foundational for understanding Archaeobytes.

Matthew Kirschenbaum Digital humanities scholar, wrote Mechanisms (2008) on digital materiality and forensics. Shows that digital objects have physical reality (bits on drives) and forensic recovery is possible. Methodological foundation for excavation practices.


Acronyms and Abbreviations

API — Application Programming Interface (how programs interact with platforms)

CAPTCHA — Completely Automated Public Turing test to tell Computers and Humans Apart (anti-bot system)

CSS — Cascading Style Sheets (web styling language)

DH — Digital Humanities

DMCA — Digital Millennium Copyright Act (US copyright law often used for takedowns)

DNS — Domain Name System (maps domain names to IP addresses)

DRM — Digital Rights Management (encryption protecting copyrighted content)

E2E / E2EE — End-to-End Encryption (communication encrypted so intermediaries can't read)

EFF — Electronic Frontier Foundation (digital rights advocacy organization)

ENS — Ethereum Name Service (blockchain-based domain system)

GDPR — General Data Protection Regulation (EU privacy law including right to be forgotten)

HTML — HyperText Markup Language (web page structure language)

HTTP/HTTPS — HyperText Transfer Protocol (Secure) (web communication protocol)

ICANN — Internet Corporation for Assigned Names and Numbers (controls DNS root)

IPFS — InterPlanetary File System (peer-to-peer distributed storage)

IRB — Institutional Review Board (ethics committee for research)

ISP — Internet Service Provider

LIS — Library and Information Science

LOC — Library of Congress

NARA — National Archives and Records Administration (US)

NEH — National Endowment for the Humanities (US funding agency)

NSF — National Science Foundation (US funding agency)

P2P — Peer-to-Peer (decentralized network architecture)

POSSE — Post On your Site, Syndicate Elsewhere (IndieWeb practice)

RSS — Really Simple Syndication (XML-based format for content feeds)

SMTP — Simple Mail Transfer Protocol (email protocol)

STS — Science and Technology Studies

TOS — Terms of Service (platform rules users agree to)

URI/URL — Uniform Resource Identifier/Locator (web address)

WARC — Web ARChive format (ISO standard for web preservation)

W3C — World Wide Web Consortium (web standards organization)


Concepts from the Textbook Chapters

Archive Sustainability Matrix Three-dimensional framework from Chapter 11 for designing preservation organizations: Funding (grants, donations, subscriptions, services, hybrid), Governance (non-profit, cooperative, foundation, hybrid), Technical Infrastructure (storage, servers, distributed, cloud, hybrid). Must balance all three for 50+ year survival.

Foundry Business Matrix Framework from Chapter 12 for building sovereign businesses: What to Sell (hosting, tools, support, content, infrastructure) × Revenue Model (subscription, freemium, open-core, cooperative, grant-funded) × Business Structure (for-profit, non-profit, cooperative, hybrid). Must embody Three Pillars while remaining financially viable.

Sovereignty Stack Six-layer infrastructure analysis from Chapter 15: (1) Physical (cables, servers, datacenters), (2) Network (TCP/IP, DNS, protocols), (3) Identity (authentication, naming), (4) Storage (data location and control), (5) Application (user-facing platforms), (6) Economic (payment systems, revenue). Sovereignty requires addressing all layers, not just one.

Movement-Building Matrix Five-dimensional framework from Chapter 16 for discipline formation: (1) Knowledge Infrastructure (journals, conferences, textbooks), (2) Institutional Anchors (departments, centers, programs), (3) Professional Pathways (jobs, certification), (4) Public Visibility (books, media, advocacy), (5) Policy Advocacy (legislation, testimony, model laws). Timeline: 10-20 years to maturity.

Public Intellectual Toolkit Five skills from Chapter 17 for translating research into impact: (1) Writing for Different Audiences (academic → practitioner → public → policy), (2) Media Engagement (journalists, op-eds, podcasts, TV), (3) Public Speaking (conferences, TED talks, testimony), (4) Platform Building (blog, newsletter, social media with sovereignty), (5) Policy Influence (whitepapers, testimony, legislation). Career trajectory: Years 1-2 build platform, 3-5 gain legitimacy, 5+ high-profile influence.


End of Appendix A: Glossary of Terms
