Foundations Series / Vol 01 Est. 2025

Chapter 5: Triage Methodology — The Custodial Filter and Ethical Preservation


Opening: The Impossible Choice

October 27, 2016. Twitter announces that Vine will shut down in the coming months. Archive Team mobilizes immediately, but the math is brutal:

Even working around the clock, they can't save everything. They must choose.

Do they prioritize:

Every choice means something else dies. Every video saved means another left behind.

This is triage—borrowed from battlefield medicine, where doctors must decide which wounded soldiers to treat first when resources are scarce. In emergency rooms, triage saves lives by allocating attention efficiently. In digital preservation, triage saves culture by allocating effort strategically.

But triage is agony. It forces us to confront uncomfortable truths:

This chapter explores how to make those impossible choices—not perfectly (perfection is impossible), but ethically, systematically, and transparently.

We call this framework the Custodial Filter: a methodology for deciding what to preserve, when to preserve it, and when—painfully—to let go.


Part I: The Ethics of Triage

Why Triage Is Necessary

Infinite Culture, Finite Resources

The internet produces content at a rate no human effort can fully capture:

Even with unlimited storage (which doesn't exist), the labor of curation—adding metadata, providing context, ensuring accessibility—is scarce.

Platform Death Accelerates Urgency

When a platform announces shutdown, the timeline collapses:

In crisis mode, triage becomes life-or-death for artifacts.

Preservation Requires Stewardship

Saving bits is relatively cheap (storage costs drop constantly). But meaningful preservation requires:

These activities consume human time and expertise—resources that will always be scarce.

The Ethical Stakes of Triage

Who Decides What's Worth Saving?

Triage decisions encode power and values:

Every triage framework embodies ethical commitments, whether explicit or not.

The Permanence of Loss

Physical artifacts can be rediscovered—buried ruins excavated, manuscripts found in attics. But digital artifacts vanish completely when platforms shut down. There's no archaeological dig 100 years later to recover what we failed to save.

Triage decisions are irreversible. What we don't preserve now is lost forever.

The Burden of Custodianship

To preserve is to claim custodial responsibility:

This burden can't be escaped. Even choosing not to preserve is a choice with consequences.


Part II: The Custodial Filter — A Five-Question Framework

The Custodial Filter is a systematic methodology for triage. Before preserving any artifact, ask five questions:

Question 1: Cultural Significance

Does this artifact represent a community, movement, or cultural moment that would otherwise be lost?

Criteria:

High Significance Examples:

Lower Significance Examples:

Challenge: Whose Significance?

What's "significant" is contested:

Best Practice: Default to over-preservation when significance is uncertain. We can't predict what future scholars will want to study.

Question 2: Technical Fragility

How close to disappearance is this artifact?

Fragility Spectrum:

Critical (Hours/Days)

High (Weeks/Months)

Medium (Years)

Low (Decades)

Triage Priority:

Example: GeoCities vs. Library of Congress

When both GeoCities and LOC's web archive need attention:

Question 3: Rescue Difficulty

How hard is this artifact to preserve?

Ease Assessment:

Easy (Can automate)

Medium (Requires manual effort)

Hard (Technical barriers)

Very Hard (Near-impossible)

Triage Tension:

Should you spend 100 hours preserving one hard artifact, or preserve 100 easy artifacts in the same time?

No universal answer, but factors to consider:

Example: Flash Games

Flashpoint Project prioritized Flash games (medium-hard difficulty) because:

They chose one hard project over many easy ones—and succeeded.

Question 4: Existing Redundancy

Is someone else already preserving this?

Check for Redundancy:

Redundancy Matrix:

Situation                                        | Action
No one preserving                                | Urgent priority (you might be the only chance)
One fragile preservation                         | Valuable redundancy (create backup of backup)
Multiple stable institutions                     | Lower priority (focus elsewhere unless you add unique value)
Already in Internet Archive + LOC + universities | Deprioritize (unless you're doing a different kind of curation)

Exception: "Preserve Differently"

Even if something is archived, you might preserve it differently:

Example: Vine

Internet Archive scraped Vine comprehensively (quantity). Individual fans created curated collections (quality—"Best Vines 2013-2017"). Both were valuable.

Question 5: Ethical Considerations

Should we preserve this?

This is the hardest question—and the one most often skipped. Just because you can preserve something doesn't mean you should.

Ethical Red Flags:

Privacy Violations

Potential for Harm

Contested Consent

Cultural Sensitivity

Trauma and Re-traumatization

The Ethical Tension:

Preservation often conflicts with privacy/consent:

No Easy Resolution, but principles to guide:

Principle 1: Minimize Harm

Principle 2: Respect Explicit Deletion

Principle 3: Restrict Access When Appropriate

Principle 4: Community Consultation

Principle 5: Transparent Decision-Making

Case Study: Tumblr NSFW Purge

In 2018, Tumblr banned all "adult content," deleting millions of posts, many of which were:

Ethical Dilemma: Should archivists preserve purged content?

Arguments FOR:

Arguments AGAINST:

What Actually Happened:

Ethical Assessment:


Part III: The Triage Decision Matrix

Combine all five questions into a scoring system to prioritize artifacts systematically.

Scoring Framework (0-5 scale for each dimension)

Cultural Significance (0 = spam, 5 = irreplaceable cultural artifact)

Technical Fragility (0 = stable/safe, 5 = will disappear in hours)

Rescue Feasibility (0 = impossible, 5 = trivial to preserve; this is the inverse of rescue difficulty, so easier rescues score higher)

Redundancy Gap (0 = many redundant copies, 5 = unique, no other preservation)

Ethical Clarity (0 = serious ethical problems, 5 = clearly ethical to preserve)

Example Triage Matrix: Vine Shutdown

Artifact                     | Significance | Fragility | Feasibility | Redundancy | Ethics | Total | Priority
Viral memes (already copied) | 4            | 5         | 5           | 2          | 5      | 21    | Medium
Small creator archives       | 5            | 5         | 4           | 5          | 5      | 24    | High
Corporate brand accounts     | 2            | 5         | 5           | 1          | 5      | 18    | Low
Private accounts             | 3            | 5         | 3           | 5          | 2      | 18    | Low (ethics)
Representative sample        | 4            | 5         | 5           | 4          | 5      | 23    | High

Priority Ranking:

  1. Small creator archives (24 points) — Unique voices, no other preservation, highly fragile

  2. Representative sample (23 points) — Cultural cross-section, high feasibility

  3. Viral memes (21 points) — Significant but already widely copied

  4. Private accounts (18 points) — Ethical concerns override other factors

  5. Corporate accounts (18 points) — Low cultural value
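
The scoring above can be reproduced in a few lines. The following is a minimal Python sketch, not part of the framework itself: the Artifact class, the tie-break on significance, and the ethics_floor threshold are illustrative assumptions. It sums the five scores from the example matrix and flags any artifact with a low Ethical Clarity score, so that a person reviews it regardless of its total.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Artifact:
    name: str
    significance: int
    fragility: int
    feasibility: int
    redundancy_gap: int
    ethics: int

    @property
    def total(self) -> int:
        # Plain sum of the five 0-5 scores, as in the example matrix.
        return (self.significance + self.fragility + self.feasibility
                + self.redundancy_gap + self.ethics)


def rank(artifacts, ethics_floor=2):
    """Return (artifact, needs_ethics_review) pairs, highest total first.

    Assumptions: ties are broken by significance, and any ethics score at
    or below ethics_floor is flagged. The chapter specifies neither rule.
    """
    ordered = sorted(artifacts, key=lambda a: (a.total, a.significance), reverse=True)
    return [(a, a.ethics <= ethics_floor) for a in ordered]


# Scores copied from the Vine example matrix.
VINE = [
    Artifact("Viral memes", 4, 5, 5, 2, 5),
    Artifact("Small creator archives", 5, 5, 4, 5, 5),
    Artifact("Corporate brand accounts", 2, 5, 5, 1, 5),
    Artifact("Private accounts", 3, 5, 3, 5, 2),
    Artifact("Representative sample", 4, 5, 5, 4, 5),
]

for artifact, review in rank(VINE):
    flag = " (ethics review)" if review else ""
    print(f"{artifact.total:>2}  {artifact.name}{flag}")
```

The flag is kept separate from the total on purpose: a sum can hide a single low score, and the ranking above treats the ethics score as something that can override the arithmetic.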

Triage in Action: Three Scenarios

Scenario 1: Imminent Shutdown (48 hours)

Situation: Small forum announces shutdown in 2 days. 10,000 posts, no warning.

Triage Decision:

Action: Immediate scrape. Archive everything, sort out curation later. In crisis, preservation > perfection.

Method:

  1. Use wget or HTTrack to scrape visible content

  2. Ask community members for database dump (if possible)

  3. Archive now, curate later (when not under deadline)
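
Step 1 might look like the following. This is a sketch under assumptions: the forum URL and output name are placeholders, GNU Wget 1.14 or later is assumed to be installed (needed for WARC output), and the one-second delay is an arbitrary courtesy setting. The helper only builds and prints the command; it does not start a crawl.

```python
import shlex


def build_wget_command(start_url: str, warc_name: str, wait_seconds: int = 1) -> list[str]:
    """Build a wget invocation that mirrors a site and also writes a WARC file."""
    return [
        "wget",
        "--mirror",                   # recursive download with timestamping
        "--page-requisites",          # also fetch images, CSS, and scripts
        "--adjust-extension",         # save HTML pages with an .html suffix
        "--convert-links",            # rewrite links so the copy browses offline
        "--no-parent",                # stay at or below the starting path
        f"--wait={wait_seconds}",     # pause between requests
        "--random-wait",              # vary the pause to reduce server load spikes
        f"--warc-file={warc_name}",   # keep raw HTTP responses in a WARC archive
        start_url,
    ]


# Hypothetical forum address; replace with the real one.
cmd = build_wget_command("https://forum.example.org/", "forum-example-org")
print(shlex.join(cmd))
# To run it for real: import subprocess; subprocess.run(cmd, check=True)
```

Writing a WARC alongside the browsable mirror keeps the original response headers, which helps later when the scrape is handed to an institution for curation.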

Scenario 2: Declining Platform (1-2 years warning)

Situation: In October 2018, Google announces that consumer Google+ will shut down in 2019, giving close to a year of notice (the date was later moved up to April 2019).

Triage Decision:

Action: Systematic preservation with community partnership

Method:

  1. Partner with Internet Archive for Wayback crawls

  2. Create guides for users to export their own data

  3. Identify high-value communities (e.g., Photography+ had professional communities)

  4. Curate sample collections (not everything, but representative)

  5. Take full year to do it right (not crisis mode)

Scenario 3: Ongoing Platform with Contested Content

Situation: Twitter still operational, but waves of account suspensions. Some suspended accounts have historically important content.

Triage Decision:

Action: Selective proactive archiving with ethical review

Method:

  1. Identify accounts with high historical/cultural value (activists, journalists, politicians)

  2. Proactively archive (before suspension) using tools like Twitter Archiver

  3. For already-suspended: check if Internet Archive captured (Wayback Machine)

  4. Ethical case-by-case: Don't archive hate groups, do archive wrongfully suspended activists

  5. Restrict access for contentious material (researcher-only)
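
For step 3, the Wayback Machine offers an availability endpoint that reports the closest capture of a URL. A minimal sketch follows, assuming that endpoint's JSON shape (an archived_snapshots object containing a closest entry). The helper names and the sample payload are illustrative, and the sample is made up so the example runs without network access.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_url(page_url, timestamp=None):
    """Build the query URL; timestamp is an optional YYYYMMDD[hhmmss] string."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{AVAILABILITY_ENDPOINT}?{urlencode(params)}"


def closest_snapshot(payload):
    """Return the snapshot URL from a decoded response, or None if absent."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


def fetch_closest(page_url):
    """Query the live endpoint (needs network access; not called below)."""
    with urlopen(availability_url(page_url), timeout=30) as response:
        return closest_snapshot(json.load(response))


# Made-up payload in the shape the endpoint is assumed to return.
sample = {
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20200101000000/https://twitter.com/example",
            "timestamp": "20200101000000",
            "status": "200",
        }
    }
}
print(closest_snapshot(sample))
```

A None result only means no capture was found for that exact URL. It is worth trying variants (with and without www, mobile hostnames) before concluding the account was never archived.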


Part IV: Practical Triage Workflows

Workflow 1: Crisis Triage (Platform Shutdown Imminent)

Phase 1: Assess (Hours 1-4)

  1. How much time until shutdown?

  2. How much content exists?

  3. Who else is archiving?

  4. What tools are available?
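
The first two questions reduce to simple arithmetic: content divided by time gives the rate you must sustain. As a rough sketch, using the 10,000-post, 48-hour forum from Scenario 1 and assuming (hypothetically) one page fetch per post and a one-second pause between requests:

```python
def required_rate_per_hour(items: int, hours_left: float) -> float:
    """Minimum sustained capture rate needed to finish before the deadline."""
    return items / hours_left


def crawl_hours(items: int, seconds_per_item: float, workers: int = 1) -> float:
    """Estimated wall-clock hours for a crawl split evenly across workers."""
    return items * seconds_per_item / workers / 3600


posts, deadline_hours = 10_000, 48
rate = required_rate_per_hour(posts, deadline_hours)
single = crawl_hours(posts, seconds_per_item=1.0)
print(f"Need {rate:.0f} posts/hour; one polite crawler needs {single:.1f} hours")
# prints: Need 208 posts/hour; one polite crawler needs 2.8 hours
```

If the estimate exceeds the time left, that is the signal to recruit more workers in Phase 2 or to narrow the scope before the crawl starts.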

Phase 2: Mobilize (Hours 4-24)

  1. Recruit volunteers (Archive Team, Twitter, Reddit)

  2. Set up infrastructure (servers, storage, coordination)

  3. Divide labor (different people scrape different sections)

Phase 3: Execute (Remaining time)

  1. Quantity over quality: Save everything you can

  2. Metadata is secondary (just get the bits)

  3. Accept losses (you won't get everything)

Phase 4: Post-Shutdown

  1. Consolidate scraped data

  2. Remove duplicates

  3. Begin curation (add metadata, organize)

  4. Make accessible (upload to Internet Archive, create search interface)
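
Step 2 is commonly done by content hash: when several volunteers scrape overlapping sections, byte-identical files produce the same digest. A minimal sketch, assuming the consolidated scrape sits in one local directory tree. The function name is hypothetical, and it only reports duplicates rather than deleting anything.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicates(root: Path) -> dict[str, list[Path]]:
    """Group files under root by SHA-256 digest; return groups with 2+ files."""
    by_digest = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        digest = hashlib.sha256()
        with path.open("rb") as handle:
            # Read in 1 MiB chunks so large media files do not fill memory.
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                digest.update(chunk)
        by_digest[digest.hexdigest()].append(path)
    return {d: paths for d, paths in by_digest.items() if len(paths) > 1}
```

Calling find_duplicates(Path("consolidated-scrape")) on a hypothetical directory returns a mapping from digest to the paths that share it. Keeping the decision about which copy to delete as a separate, human-reviewed step avoids losing the one copy that carried useful metadata. Note that this catches only exact duplicates; two captures of the same page taken at different times usually differ by a few bytes.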

Example: GeoCities Rescue

Workflow 2: Anticipatory Preservation (Platform Declining)

Phase 1: Monitor (Ongoing)

Phase 2: Plan (When decline evident)

  1. Identify most valuable content (communities, creators, cultural artifacts)

  2. Assess redundancy (what's already archived?)

  3. Develop curation strategy (can't save everything, but can save representative sample)

Phase 3: Execute (Before crisis)

  1. Methodical crawling (not frantic scraping)

  2. Add metadata as you go

  3. Coordinate with platform (ask for data dumps, export tools)

Phase 4: Maintain (After shutdown)

  1. Preserve archives long-term (storage, format migration)

  2. Make accessible (search, browse, context)

  3. Document (write history of the platform for future scholars)

Example: LiveJournal Migration

Workflow 3: Continuous Curation (Ongoing Platforms)

Phase 1: Define Scope

Phase 2: Automate

Phase 3: Curate Regularly

Phase 4: Respond to Crises

Example: Internet Archive's Wayback Machine


Part V: Ethical Edge Cases

Edge Case 1: The Deleted Tweet from a Public Figure

Scenario: A politician tweets something racist, then deletes it 20 minutes later. Should you archive it?

Ethical Considerations:

FOR Archiving:

AGAINST Archiving:

Custodial Filter Analysis:

Recommendation: Preserve with context

Edge Case 2: The Fan Fiction Archive

Scenario: A LiveJournal community for a specific fandom (slash fiction, LGBTQ+ content) is abandoned. Creators have scattered. Should you archive?

Ethical Considerations:

FOR Archiving:

AGAINST Archiving:

Custodial Filter Analysis:

Recommendation: Archive with restrictions

  1. Scrape the content (preserve the bits)

  2. Don't make fully public (no Google indexing)

  3. Researcher access only (require application, explain use)

  4. Allow author-requested takedowns (if someone says "please remove my fic," do it)

  5. Document the community culture (not just stories, but context of why this mattered)

Edge Case 3: The Hate Forum

Scenario: A white supremacist forum announces shutdown. It documents radicalization pathways and extremist organizing. Should you archive?

Ethical Considerations:

FOR Archiving:

AGAINST Archiving:

Custodial Filter Analysis:

Recommendation: Very restricted archive, if at all

Option A (Maximum Security):

  1. Archive for research only (no public access)

  2. Require IRB approval + academic credentials to access

  3. Redact personal information of victims

  4. Coordinate with law enforcement (if active threats)

  5. Provide to hate-monitoring organizations (ADL, SPLC)

Option B (Don't Archive):

Most archivists choose Option A: Preserve but lock down tightly. History includes ugly things, and understanding extremism requires evidence.

Edge Case 4: The Private Message Leak

Scenario: Someone leaks a trove of private Discord messages revealing corporate malfeasance. The messages are newsworthy but were shared under expectation of privacy. Should you archive?

Ethical Considerations:

FOR Archiving:

AGAINST Archiving:

Custodial Filter Analysis:

Recommendation: Selective archive with redaction

  1. Archive the newsworthy messages (evidence of wrongdoing)

  2. Redact personal information of non-involved parties (people who just happened to be in the server)

  3. Remove sensitive personal details (even of wrongdoers—focus on the malfeasance, not their kids' names)

  4. Make available to journalists and researchers

  5. Consider a time embargo (publish the newsworthy messages now, release the full archive in 10 years when the people involved are less vulnerable)

Principle: Public interest can override privacy, but minimize collateral damage.


Part VI: Institutional Triage Policies

Building a Triage Policy for Your Organization

If you're creating an archive, museum, or preservation institution, codify your triage principles:

Policy Components:

1. Mission Statement

2. Significance Criteria

3. Ethical Red Lines

4. Restricted Access Guidelines

5. Takedown Process

6. Transparency Commitment

Example: The Internet Archive's Policy (Simplified)

Example: A Hypothetical Trans Archive's Policy


Part VII: When to Let Go

The Hardest Lesson: Accepting Loss

Not everything can be saved. Sometimes, the ethical choice—or the practical choice—is to let something die.

When to Let Go:

1. Ethical Harm Outweighs Value

2. No Viable Path to Preservation

3. Resources Better Spent Elsewhere

4. Respecting Intentional Ephemerality

The Grief of Triage

Letting artifacts die is painful. You're choosing what future generations can never know. You're accepting that some stories will be lost, some voices silenced, some memories erased.

This grief is unavoidable. The role of the Archaeobytologist includes mourning.

But: Grief that paralyzes is counterproductive. Mourn, then act. Save what you can. Document what you couldn't save (at least record that it existed). Move forward.

The Triage Paradox:

The better you get at triage, the more aware you become of loss. Beginners think they can save everything. Experts know they can't—and carry the weight of every choice.

This is the burden of custodianship.


Conclusion: Triage as Ethical Practice

The Custodial Filter isn't a formula—it's a framework for ethical deliberation. It forces you to ask hard questions:

Every triage decision is an ethical act. You're deciding what the future can know about the past. You're allocating scarce resources (time, labor, storage, attention). You're potentially overriding someone's wishes (to be forgotten, to be private).

These decisions should be:

The Custodial Filter provides structure for these decisions—not certainty, but rigorous ethical thinking.

In the next chapter, we'll explore the boundaries of Archaeobytology as a discipline—how it differs from adjacent fields, what makes it distinct, and why it deserves recognition as its own domain of study.

But first, practice triage. Look at your own digital life. What would you save if you had 48 hours to archive everything? What would you let go? And how would you justify those choices?

The Custodial Filter begins with seeing your own values clearly.


Discussion Questions

  1. Personal Triage: If your email account announced shutdown in 48 hours, what would you prioritize saving? Why? What would you let go?

  2. Ethical Boundaries: Where do you draw the line? What content should never be archived, even if historically significant?

  3. Competing Values: How do you balance (a) preserving everything for future research vs. (b) respecting privacy and consent?

  4. Bias and Representation: How can triage avoid reproducing systemic biases (racism, sexism, class privilege)? Is "objective" triage possible?

  5. Institutional vs. Individual: Should triage decisions be made by institutions (museums, archives) or individuals (you with your hard drive)? What are the pros/cons of each?

  6. Future Regret: Imagine it's 2075. What digital culture from the 2020s do you think future historians will wish we'd preserved but didn't?


Exercise: Conduct a Triage Simulation

Scenario: You have 72 hours and 1TB of storage to archive a dying platform before it shuts down. The platform has:

You cannot save everything. Conduct triage.

Part 1: Define Your Values (300 words)

Part 2: Apply the Custodial Filter (500 words)

Create a triage matrix for these artifact types:

  1. Viral posts (high engagement, widely seen)

  2. Marginalized community content (LGBTQ+, disability, etc.)

  3. Long-form creative work (fiction, art, tutorials)

  4. Personal journaling/diaries

  5. Corporate/brand accounts

Score each on:

Part 3: Make Decisions (500 words)

Part 4: Reflect (200 words)



End of Chapter 5

Next: Chapter 6 — Discipline Formation and Boundaries: Why Archaeobytology Needs to Exist