Module 1: The Business of the Archive

Archaeobytology 300: Institution Building & Strategic Infrastructure

Module Overview

Core Question: What are sustainable business models for preservation work that embody sovereign Ground ownership?

Learning Objective: Students will design a complete preservation organization—including funding model, governance structure, technical architecture, and 10-year sustainability plan—that avoids the failure modes identified in Module 0.

Time: Weeks 3-4

The Challenge

Module 0 diagnosed why preservation institutions fail. Module 1 asks you to build one that doesn't.

The Constraint: You cannot rely on:

- ❌ Platform revenue (you're trying to escape platforms)
- ❌ Surveillance advertising (violates Three Pillars)
- ❌ Heroic volunteer labor (identified as fragile in Module 0)
- ❌ Speculation/financialization (Web3's failure mode)

The Goal: Design a preservation organization that:

- ✅ Can sustain itself for 10+ years
- ✅ Embodies The Three Pillars (Declaration, Connection, Ground)
- ✅ Survives founder departure
- ✅ Resists platform capture or shutdown

Core Reading

Primary Texts

Kahle, B. (1997). "Preserving the Internet." Scientific American.

- Focus: How Internet Archive was initially conceived and funded

Benkler, Y. (2006). The Wealth of Networks. Chapters 1-2.

- Focus: Commons-based peer production vs. market-based models

Doctorow, C. (2023). The Internet Con. Chapter 3: "The Collective Action Problem"

- Focus: Why individual solutions fail; need for institutional coordination

Case Study Readings

Internet Archive Annual Reports (2020-2024)

- Revenue sources: Donations (40%), Digital lending (25%), Scanning services (20%), Grants (15%)
- Operating budget: ~$40M/year
- Model: 501(c)(3) non-profit + earned revenue

Software Heritage White Paper (2016)

- Revenue sources: University consortium fees, government grants
- Model: Academic collaboration
- Technical: Distributed storage across partner institutions

Archive Team Field Reports

- Revenue sources: $0 (volunteer labor + personal hardware)
- Model: Anarchist collective
- Technical: Distributed scraping, uploaded to Internet Archive

Perma.cc Case Study

- Revenue sources: Library partnership fees ($500-5000/year per institution)
- Model: University consortium
- Technical: Distributed "vaults" hosted by partner libraries

Lecture: The Archive Sustainability Matrix

There is no single "correct" model for preservation. The Matrix helps you map options:

Dimension 1: Funding Model

| Model | Advantages | Disadvantages | Three Pillars Fit |
|-------|------------|---------------|-------------------|
| Donations (Internet Archive) | Aligned with mission, no strings | Unpredictable, requires constant fundraising | ✅ Declaration (independent) |
| Grants (Software Heritage) | Large sums, academic legitimacy | Bureaucratic, limited duration | ⚠️ Declaration (funder influence) |
| Memberships (co-op model) | Predictable revenue, community buy-in | Limited scale, free-rider problem | ✅ Connection (intentional) |
| Earned Revenue (services) | Sustainable, market validation | Mission drift risk, commercial pressure | ⚠️ Can compromise mission |
| Endowment (university model) | Long-term stability | Requires massive upfront capital | ✅ Ground (self-sustaining) |
| Blockchain (protocol tokens) | Decentralized incentives | Speculation risk, complexity | ❌ Often violates all three |

Key Insight: Most successful institutions use hybrid models (donations + earned revenue + grants).

Dimension 2: Governance Structure

| Structure | Power Distribution | Succession | Three Pillars Fit |
|-----------|-------------------|------------|-------------------|
| Non-profit (501c3) | Board of directors | Formal transition process | ✅ Declaration (mission-locked) |
| For-profit (C-corp) | Shareholders/CEO | Can be sold/acquired | ❌ Ground (subject to market) |
| Benefit Corp (B-corp) | Shareholders + mission | Acquisition allowed if mission preserved | ⚠️ Better than C-corp, but sellable |
| Co-operative | Member-owned | Democratic control | ✅ Connection (community governed) |
| DAO (blockchain) | Token holders | Code-based rules | ❌ Often plutocratic |
| Public Institution (library/museum) | Government | Civil service rules | ⚠️ Declaration (subject to politics) |

Key Insight: Governance determines who the Archive serves. Choose a structure that prevents capture by founders, funders, or speculators.

Dimension 3: Technical Architecture

| Architecture | Resilience | Cost | Control | Three Pillars Fit |
|--------------|-----------|------|---------|-------------------|
| Centralized (single datacenter) | Fragile (single point of failure) | Efficient | Total control | ❌ Ground (vulnerable) |
| Replicated (multiple datacenters) | Better redundancy | Higher cost | Managed complexity | ⚠️ Ground (still centralized control) |
| Federated (independent nodes) | Decentralized | Variable | Distributed | ✅ Ground (no single owner) |
| P2P (IPFS, BitTorrent) | Very resilient | Cheap (volunteer bandwidth) | No central authority | ✅ Ground (fully distributed) |
| Blockchain (Filecoin) | Cryptographically secured | Expensive | Trustless | ⚠️ Cost barrier |

Key Insight: Technical decentralization ≠ institutional sustainability. You still need governance and funding.

Framework: The Archive Business Canvas

Your assignment will use this structured framework:

Section 1: Mission & Vision (The "Why")

- Mission Statement: What specific preservation problem are you solving?
- Artifact Scope: What are you preserving? (Personal homepages? Indie games? Underground zines?)
- Three Pillars Alignment: How does your work embody Declaration, Connection, Ground?

Example (hypothetical):

Mission: Preserve the "Digital Garden" movement—personal knowledge bases built on sovereign ground (Obsidian, Roam, TiddlyWiki) before they're lost to link rot.
Scope: Public digital gardens (with owner permission), tooling documentation, community knowledge.
Pillars: Declaration (gardens are sovereign identity), Connection (intercommunication via backlinks), Ground (self-hosted on personal domains).

Section 2: Governance Model (The "Who")

- Legal Structure: Non-profit? Co-op? For-profit? Public institution?
- Decision-Making: Who has power? Board? Members? Community vote?
- Succession Plan: What happens when the founder leaves?
- Accountability: Who ensures mission fidelity?

Key Questions:

1. Can this structure be captured by a single person/funder?
2. Does it allow community input?
3. Will it outlive the founder?

Section 3: Revenue Model (The "How It's Funded")

Design a hybrid model with multiple revenue streams:

Primary Revenue Stream

Choose one as your foundation:

- Donations: Individual donors, crowdfunding, major gifts
- Memberships: Annual fees from individuals or institutions
- Services: Consulting, preservation-as-a-service, tooling
- Grants: Foundations, government, academic partnerships

Secondary Revenue Streams

Add 1-2 supplementary sources:

- Earned Revenue: API access, premium features, workshops
- Partnerships: Corporate sponsorships, university affiliations
- Endowment: Long-term investment fund
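
If you are considering an endowment, it helps to size it before committing to it. The sketch below is a back-of-envelope calculation, not a financial model: it assumes the fund pays out a fixed fraction of its principal each year, and the 4% payout rate, the function name, and the $250k budget are all illustrative assumptions.

```python
def required_endowment(annual_budget: float, payout_rate: float) -> float:
    """Principal needed so that payout_rate * principal covers annual_budget.

    Assumes a constant payout rate and ignores inflation, fees, and
    market volatility, so treat the result as a lower bound.
    """
    if not 0 < payout_rate < 1:
        raise ValueError("payout_rate must be a fraction between 0 and 1")
    return annual_budget / payout_rate


# Hypothetical small archive: $250k/year budget, assumed 4% payout.
small_archive = required_endowment(250_000, 0.04)
print(f"Small archive: ${small_archive:,.0f}")

# The ~$40M/year operating budget cited for Internet Archive, same assumption.
large_archive = required_endowment(40_000_000, 0.04)
print(f"Large archive: ${large_archive:,.0f}")
```

Under these assumptions, even a small archive needs roughly $6.25M in principal, and a $40M/year operation needs about $1B. That is why an endowment usually works as a secondary stream built up over time rather than a starting point.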

Budget Template:

| Revenue Source | Year 1 | Year 3 | Year 5 | Year 10 | % of Total |
|---------------|--------|--------|--------|---------|-----------|
| Primary Stream | $XXk | $XXk | $XXk | $XXk | XX% |
| Secondary 1 | $XXk | $XXk | $XXk | $XXk | XX% |
| Secondary 2 | $XXk | $XXk | $XXk | $XXk | XX% |
| Total Revenue | $XXk | $XXk | $XXk | $XXk | 100% |

Expense Budget:

| Expense Category | Year 1 | Year 3 | Year 5 | Year 10 | % of Total |
|-----------------|--------|--------|--------|---------|-----------|
| Staff Salaries | $XXk | $XXk | $XXk | $XXk | XX% |
| Infrastructure (servers, storage) | $XXk | $XXk | $XXk | $XXk | XX% |
| Development (software, tools) | $XXk | $XXk | $XXk | $XXk | XX% |
| Marketing/Outreach | $XXk | $XXk | $XXk | $XXk | XX% |
| Overhead (legal, admin) | $XXk | $XXk | $XXk | $XXk | XX% |
| Total Expenses | $XXk | $XXk | $XXk | $XXk | 100% |

Sustainability Question: When does revenue exceed expenses? What's the minimum viable budget?
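
One way to answer the sustainability question is to separate two things that are easy to conflate: the first year in which revenue covers expenses, and the total deficit you accumulate before then. The sketch below does both. All figures are hypothetical (in $k) and the function names are illustrative, so substitute your own projections.

```python
from typing import Optional


def first_break_even_year(revenue: dict[int, float],
                          expenses: dict[int, float]) -> Optional[int]:
    """Return the first year in which revenue covers expenses, else None."""
    for year in sorted(revenue):
        if revenue[year] >= expenses[year]:
            return year
    return None


def cumulative_position(revenue: dict[int, float],
                        expenses: dict[int, float]) -> dict[int, float]:
    """Running total of (revenue - expenses) at the end of each year."""
    running = 0.0
    position = {}
    for year in sorted(revenue):
        running += revenue[year] - expenses[year]
        position[year] = running
    return position


# Hypothetical five-year projection, in thousands of dollars.
revenue = {1: 80, 2: 120, 3: 170, 4: 220, 5: 260}
expenses = {1: 150, 2: 160, 3: 175, 4: 190, 5: 210}

position = cumulative_position(revenue, expenses)
print("Operating break-even year:", first_break_even_year(revenue, expenses))
print("Deepest cumulative deficit ($k):", min(position.values()))
```

In this example the organization breaks even on an operating basis in Year 4, but it has accumulated a $115k deficit by the end of Year 3. That deepest deficit is the minimum seed funding or reserve your plan has to account for.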

Section 4: Technical Architecture (The "How It Works")

Storage Layer

- Where is data stored? (Cloud? On-prem? P2P?)
- How much redundancy? (LOCKSS: "Lots of Copies Keep Stuff Safe")
- Cost projections? ($X per TB per year)
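
Redundancy and cost pull against each other: every extra copy multiplies the storage bill. The sketch below projects annual storage cost under stated assumptions. The collection size, growth rate, copy count, price per TB, and price decline are all hypothetical placeholders, and the function name is illustrative.

```python
def storage_cost_projection(initial_tb: float,
                            annual_growth: float,
                            copies: int,
                            cost_per_tb_year: float,
                            annual_price_decline: float,
                            years: int) -> list[tuple[int, float, float]]:
    """Return (year, unique TB held, annual cost) for each year.

    Cost = unique TB * number of copies * price per TB-year.
    Growth and price decline are applied once per year.
    """
    rows = []
    tb = initial_tb
    price = cost_per_tb_year
    for year in range(1, years + 1):
        rows.append((year, tb, tb * copies * price))
        tb *= 1 + annual_growth          # the collection grows
        price *= 1 - annual_price_decline  # storage gets cheaper
    return rows


# Hypothetical: 50 TB, growing 20%/year, 3 copies, $60 per TB-year,
# with prices falling 10%/year, projected over 10 years.
for year, tb, cost in storage_cost_projection(50, 0.20, 3, 60, 0.10, 10):
    print(f"Year {year:>2}: {tb:7.1f} TB unique, ${cost:,.0f}/year")
```

Try changing `copies` from 3 to 5 and compare the Year 10 figure. This is the number that belongs in the infrastructure row of your expense budget.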

Access Layer

- How do people retrieve preserved artifacts?
- Read-only mirrors? Interactive emulation? API access?
- Public or restricted? (Open access vs. institutional partnerships)

Preservation Pipeline

- How are artifacts ingested? (Automated scraping? Manual curation? Community submission?)
- What formats? (WARC? Git repos? Database dumps?)
- Metadata standards? (Dublin Core? Schema.org? Custom?)

Diagram Your Architecture:

```
[Artifact Sources] → [Ingestion Pipeline] → [Validation/Triage] → [Storage Layer] → [Access Layer] → [End Users]
```

Section 5: Community & Partnerships (The "Who Helps")

Stakeholders

- Contributors: Who submits artifacts or metadata?
- Volunteers: Who helps with curation, tagging, development?
- Partners: What institutions collaborate? (Libraries, universities, archives)
- Funders: Who provides financial support?
- Users: Who accesses the preserved artifacts?

Engagement Strategy

- How do you recruit contributors?
- How do you acknowledge/compensate labor?
- How do you prevent burnout (identified in Module 0)?

Section 6: Risk Analysis & Mitigation

Identify your biggest vulnerabilities (from Module 0 failure modes) and how you address them:

| Risk | Likelihood | Impact | Mitigation Strategy |
|------|-----------|--------|---------------------|
| Founder departure | High | Critical | Formal succession plan, institutional governance |
| Funding loss | Medium | High | Multiple revenue streams, reserve fund |
| Platform shutdown | Low | Medium | Self-hosted on owned infrastructure |
| Volunteer burnout | High | Medium | Paid staff + sustainable contributor policies |
| Legal challenge (copyright) | Medium | High | Legal defense fund, fair use expertise |
| Technical obsolescence | Medium | Medium | Open standards, format migration plan |

Question: What's your "single point of failure"? How do you eliminate or mitigate it?
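
A simple way to find your most urgent vulnerability is to score each risk as likelihood times impact and sort. The sketch below applies that to the example risk table. The numeric scales are an assumption (the table only gives qualitative labels), so adjust the mappings to suit your own judgment.

```python
# Assumed numeric scales; the risk table itself uses only qualitative labels.
LIKELIHOOD = {"Low": 1, "Medium": 2, "High": 3}
IMPACT = {"Low": 1, "Medium": 2, "High": 3, "Critical": 4}

# (risk, likelihood, impact) rows taken from the example risk table.
RISKS = [
    ("Founder departure", "High", "Critical"),
    ("Funding loss", "Medium", "High"),
    ("Platform shutdown", "Low", "Medium"),
    ("Volunteer burnout", "High", "Medium"),
    ("Legal challenge (copyright)", "Medium", "High"),
    ("Technical obsolescence", "Medium", "Medium"),
]


def rank_risks(risks: list[tuple[str, str, str]]) -> list[tuple[str, int]]:
    """Score each risk as likelihood * impact and sort highest first."""
    scored = [(name, LIKELIHOOD[likelihood] * IMPACT[impact])
              for name, likelihood, impact in risks]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


for name, score in rank_risks(RISKS):
    print(f"{score:>2}  {name}")
```

With these scales, founder departure scores twice as high as any other risk. A multiplicative score is a blunt instrument, so use it to prioritize discussion, not to replace it.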

Case Study Deep-Dives

Case 1: Internet Archive (The Centralized-But-Massive Model)

Revenue Model (2023):

- Donations: ~$16M (40%)
- Digital Lending: ~$10M (25%)
- Scanning Services: ~$8M (20%)
- Grants: ~$6M (15%)
- Total: ~$40M
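
To put a number on how diversified this revenue mix is, one option is the Herfindahl-Hirschman index (HHI), a standard concentration measure that is not part of the course framework and is used here only as an illustration. It sums the squared revenue shares: 1.0 means a single source, and lower values mean more diversification. The function names are hypothetical; the figures are the approximate ones listed above.

```python
def herfindahl(amounts: dict[str, float]) -> float:
    """Sum of squared revenue shares (1.0 = one source, lower = more diverse)."""
    total = sum(amounts.values())
    return sum((value / total) ** 2 for value in amounts.values())


def largest_share(amounts: dict[str, float]) -> tuple[str, float]:
    """Name and share of the biggest single revenue source."""
    total = sum(amounts.values())
    name = max(amounts, key=amounts.get)
    return name, amounts[name] / total


# Approximate figures from the case study, in $M.
ia_revenue = {
    "Donations": 16,
    "Digital lending": 10,
    "Scanning services": 8,
    "Grants": 6,
}

hhi = herfindahl(ia_revenue)
name, share = largest_share(ia_revenue)
print(f"HHI: {hhi:.3f}")
print(f"Effective number of equal-sized sources: {1 / hhi:.2f}")
print(f"Largest source: {name} ({share:.0%})")
```

Run the same calculation on your own Year 1 and Year 10 revenue tables. If the index rises over time, your plan is becoming more dependent on a single stream, not less.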

Governance:

- 501(c)(3) non-profit
- Board of directors
- Brewster Kahle as founder/digital librarian

Technical:

- Centralized in San Francisco (PRIMARY VULNERABILITY)
- ~70 PB of data
- WARC format for web archives

Strengths:

- ✅ Massive scale (800B+ pages)
- ✅ Hybrid revenue (not dependent on single source)
- ✅ Public trust and legitimacy

Weaknesses:

- ❌ Single physical location (one disaster away from catastrophic loss)
- ❌ Founder-centric (succession plan unclear)
- ❌ US copyright law vulnerability (ongoing lawsuits from publishers)

Student Discussion:

1. If you were redesigning IA, how would you decentralize storage without losing coordination?
2. Should preservation organizations take legal risks (controlled digital lending) to advance mission?

Case 2: Perma.cc (The University Consortium Model)

Revenue Model:

- Partner libraries pay $500-$5,000/year based on size
- ~150 institutional partners
- Total: ~$300k/year
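
These three figures can be checked against each other, which is a habit worth bringing to your own budget. The sketch below uses only the approximate numbers listed above; the variable names are illustrative.

```python
# Approximate figures from the case study.
partners = 150
fee_low, fee_high = 500, 5_000   # annual fee range per institution ($)
reported_total = 300_000         # approximate annual revenue ($)

# Bounds if every partner paid the minimum or the maximum fee.
floor = partners * fee_low
ceiling = partners * fee_high

# Average fee implied by the reported total.
implied_average = reported_total / partners

print(f"Possible range: ${floor:,} to ${ceiling:,}")
print(f"Implied average fee: ${implied_average:,.0f}")
print("Consistent with fee range:", floor <= reported_total <= ceiling)
```

The reported total falls inside the possible range and implies an average fee of about $2,000, toward the lower end of the scale. That suggests most partners pay the smaller tiers, which matters when you estimate what a membership model could realistically raise.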

Governance:

- Harvard Law School Library hosts
- Advisory board of partner institutions

Technical:

- Federated: Each partner library runs a "preservation vault"
- Partner-hosted ensures distributed redundancy
- Simple web interface for link preservation

Strengths:

- ✅ Distributed storage (no single point of failure)
- ✅ Institutional partnerships (academic legitimacy)
- ✅ Sustainable revenue (predictable membership fees)

Weaknesses:

- ❌ Limited scope (only preserves links submitted by partners)
- ❌ Dependent on university funding (vulnerable to budget cuts)
- ❌ Not fully open access (partners prioritized)

Student Discussion:

1. Could this model scale to millions of artifacts (not just links)?
2. How do you balance "open access" mission with "membership fee" sustainability?

Case 3: Archive Team (The Anarchist Collective Model)

Revenue Model:

- $0 official budget
- Volunteers contribute personal bandwidth, storage, time

Governance:

- No formal structure
- IRC channel coordination
- Jason Scott as informal leader/advocate

Technical:

- Distributed: Volunteers run scraping scripts on personal hardware
- Upload results to Internet Archive
- Focus: "Rescue missions" when platforms announce shutdowns

Strengths:

- ✅ Extremely agile (can mobilize in hours)
- ✅ No financial overhead
- ✅ Ideologically aligned (pure preservation mission)

Weaknesses:

- ❌ Dependent on volunteer goodwill (burnout risk)
- ❌ No long-term guarantees (people disappear)
- ❌ Limited scope (only reactive, not systematic)

Student Discussion:

1. Can you "professionalize" Archive Team without losing its anarchist spirit?
2. What would a "hybrid" look like? (Paid coordinator + volunteer contributors?)

Assignment: Design Your Archive

Objective: Create a complete preservation organization design that avoids Module 0 failure modes.

Deliverable: Archive Business Plan (3000-4000 words)

Required Sections:

1. Executive Summary (250 words)

- Mission statement
- Artifact scope
- Why this preservation gap exists
- Three Pillars alignment

2. Governance Model (500 words)

- Legal structure (with justification)
- Decision-making process
- Succession plan
- Accountability mechanisms

3. Revenue Model (750 words)

- Primary revenue stream (detailed projection)
- Secondary streams
- 10-year budget (revenue + expenses)
- Break-even analysis
- Sustainability argument

4. Technical Architecture (750 words)

- Storage strategy (centralized/federated/P2P)
- Redundancy plan
- Access model (who can retrieve artifacts? how?)
- Preservation pipeline (ingestion → validation → storage)
- Cost projections

5. Community Strategy (500 words)

- Stakeholder map (contributors, partners, users)
- Engagement plan (how do you recruit and retain?)
- Labor model (volunteers? paid staff? mix?)

6. Risk Analysis (500 words)

- Identify 5 biggest risks (from Module 0 taxonomy)
- Mitigation strategy for each
- "Single point of failure" test: What would shut you down?

7. 10-Year Roadmap (250 words)

- Years 1-2: Proof of concept (what do you build first?)
- Years 3-5: Growth phase (scaling up)
- Years 6-10: Maturity (self-sustaining institution)

Evaluation Criteria:

| Criterion | Points | What We're Looking For |
|-----------|--------|------------------------|
| Three Pillars Alignment | 25 | Does governance/funding/tech embody Declaration, Connection, Ground? |
| Financial Realism | 20 | Is the budget plausible? Does revenue cover expenses? |
| Institutional Resilience | 20 | Did you address Module 0 failure modes? Can it survive founder departure? |
| Technical Feasibility | 15 | Is the architecture buildable? Did you address storage/access/preservation? |
| Originality | 10 | Is this just copying Internet Archive, or a genuinely new model? |
| Writing Quality | 10 | Is it clear, organized, persuasive? |

Total: 100 points

Optional Extension: Pitch Competition

Students with the strongest designs may present to a panel of external reviewers:

- Internet Archive staff
- Library/museum professionals
- Grant program officers
- Impact investors (for for-profit models)

Prize: $5,000 seed funding to prototype the design (if feasible).

Discussion Questions for Seminar

1. The Revenue Dilemma: Can preservation work ever be fully self-sustaining, or does it require perpetual fundraising? Is "earned revenue" always mission drift?

2. The Access Question: Should preserved artifacts be fully open (public good) or restricted (sustainability via membership)? Can you have both?

3. The Scale Problem: Internet Archive is massive but fragile. Mastodon is distributed but chaotic. Is there a "Goldilocks zone"?

4. The Legal Risk: If your preservation work violates copyright (like IA's controlled digital lending), is that justified? Who decides?

5. The Founder Problem: Can you really design an institution that doesn't need you? Or is "founder-centricity" inevitable in early stages?

Module Deliverables

By the end of Module 1, students will have:

1. ✅ Completed Reading Responses (3 texts on funding/governance models)
2. ✅ Case Study Analysis (comparative analysis of IA, Perma.cc, Archive Team)
3. ✅ Archive Business Plan (3000-4000 words, complete institutional design)
4. ✅ 10-Year Budget Projection (revenue/expenses, break-even analysis)
5. ✅ Institutional Resilience Self-Assessment (Module 0 failure modes checklist)

Looking Ahead: Module 2

Next week, we flip from Archive to Anvil. Module 2: The Economics of the Anvil asks:

"How do you monetize sovereignty in an economy optimized for platform rent-seeking?"

You'll design a "foundry" business—not preserving artifacts, but forging new ones (domains, monuments, frameworks) in a way that's sustainable, principled, and resistant to platform capture.

Instructor Notes

- Guest Speaker: Invite Internet Archive CFO or Perma.cc director to discuss real operational challenges
- Bring Real Budgets: Share actual financial reports (anonymized if needed) so students see what "sustainable" looks like
- Peer Review: Have students exchange drafts and critique each other's risk mitigation strategies
- Balance Idealism and Pragmatism: Students should design ambitious institutions that could actually launch (not just thought experiments)

End of Module 1