The Triage Workflow: From Discovery to Access

Transform Panic into Systematic Rescue Operations

When a platform announces a shutdown, time is the enemy. The 8-phase workflow below replaces the scramble with a rehearsed sequence that runs from the first warning sign to a documented, accessible archive.

Derived from Chapter 10: Triage Workflow and the Archaeobytology Protocol v1.0, this guide operationalizes preservation ethics into actionable steps.


The 8-Phase Workflow

Phase 1
Discovery

Detect endangerment via "Canary" monitoring systems. Watch for:

  • Official shutdown notices
  • Mass user exodus patterns
  • Terms of Service changes signaling platform pivot
  • Leadership turnover or acquisition announcements
  • Sudden removal of API access or export tools
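
A Canary can start as a simple keyword scan over announcement text. The sketch below is a minimal illustration, not a production monitor: `SIGNALS` and `detect_signals` are hypothetical names, and the phrases are assumptions to tune per platform and language. Signs such as a mass user exodus need activity metrics rather than text matching and are not covered here.

```python
import re

# Hypothetical signal phrases, loosely grouped by the warning signs listed above.
# These are illustrative assumptions, not an authoritative vocabulary.
SIGNALS = {
    "shutdown_notice": [r"\bshut(?:ting)?\s*down\b", r"\bsunset(?:ting)?\b",
                        r"\bend of life\b", r"\bdiscontinu\w*"],
    "tos_change": [r"\bterms of service\b", r"\bupdated (?:our )?terms\b"],
    "ownership_change": [r"\bacqui(?:red|sition)\b", r"\bstepping down\b"],
    "export_removal": [r"\bAPI\b.*\b(?:deprecat\w*|remov\w*|retir\w*)",
                       r"\bexport\b.*\b(?:disabled|remov\w*)"],
}


def detect_signals(text: str) -> list[str]:
    """Return the names of signal categories whose patterns match the text."""
    hits = []
    for name, patterns in SIGNALS.items():
        if any(re.search(p, text, flags=re.IGNORECASE) for p in patterns):
            hits.append(name)
    return hits


if __name__ == "__main__":
    notice = "We are shutting down on March 1 and the API will be removed."
    print(detect_signals(notice))
```

Feed it blog posts, changelog entries, or status-page text. Any non-empty result is a prompt for a human to look, not a verdict.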

Phase 2
Assessment (The Go/No-Go Decision)

Determine three critical factors:

  • Scope: How much data exists (gigabytes or terabytes), and how much load will capturing it place on the server?
  • Urgency: How many days remain? (Critical if fewer than 7 days)
  • Ethics: Does this pass the Custodial Filter?

Decision Point:

Apply the 5-Step Ethical Decision Matrix (see below). If the artifact fails the ethics check, do not proceed.
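
The scope and urgency factors reduce to simple arithmetic. The following sketch shows one way to combine them, under stated assumptions: `assess`, its parameters, and the idea of comparing estimated transfer time against the time remaining are illustrative, and only the 7-day critical threshold comes from the workflow. The ethics result is passed in as a boolean because that judgment belongs to people, not code.

```python
from datetime import date

CRITICAL_DAYS = 7  # from the workflow: critical if fewer than 7 days remain


def assess(shutdown: date, today: date, size_gb: float,
           throughput_mbps: float, passes_ethics: bool) -> dict:
    """Summarize scope, urgency, and a tentative go/no-go for one rescue target."""
    days_left = (shutdown - today).days
    # GB -> megabits (x 8000, decimal units), divide by Mbps for seconds, then hours.
    transfer_hours = size_gb * 8000 / throughput_mbps / 3600
    return {
        "days_left": days_left,
        "critical": days_left < CRITICAL_DAYS,
        "transfer_hours": round(transfer_hours, 1),
        # Assumption: "go" requires the ethics check to pass AND the
        # transfer to fit inside the time that remains.
        "go": passes_ethics and transfer_hours <= days_left * 24,
    }


if __name__ == "__main__":
    print(assess(date(2025, 3, 10), date(2025, 3, 5), 450, 100, True))
```

Use a throughput figure you can sustain without stressing the source server, not your peak line speed.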

The 5-Step Ethical Decision Matrix

Before preserving or publishing an artifact, you must apply the following filter:

1. Cultural Significance

Does this artifact represent a community, movement, or moment that would otherwise be lost?

Priority: Elevate marginalized voices and grassroots movements over corporate or mainstream content.

2. Technical Fragility

How close is this to extinction?

Action: Critical fragility (48 hours or less until shutdown) overrides lower-priority concerns.

3. Rescue Difficulty

Is it technically feasible to save this with current resources?

Trade-off: Do not spend 100 hours on one low-value artifact if that costs you the chance to save 1,000 high-value ones.

4. Existing Redundancy

Is the Internet Archive or Library of Congress already saving this?

Rule: Do not duplicate effort unless you are adding unique context or fidelity.

5. Consent and Ethics (The Hardest Question)
  • The Harm Principle: Does preserving this cause direct harm (doxxing, revenge porn)? If yes, do not preserve.
  • The Right to be Forgotten: Did the creator explicitly delete this? If so, respect the deletion unless there is an overriding public interest (e.g., public official accountability).
  • Context Collapse: Will preserving this expose a private community to public scrutiny they did not consent to?

Phase 3
Mobilization

Assemble the rescue team and select appropriate tools:

  • For static sites: wget, HTTrack
  • For dynamic content: Selenium, Puppeteer
  • For API-accessible data: Custom API scrapers
  • For social media: Platform-specific tools (e.g., gallery-dl, yt-dlp)

Phase 4
Capture (The Scrape)

Strategy:

  1. Breadth-First: Capture the list of all URLs first (the "index"). This ensures you know what exists before the server dies.
  2. Depth-Second: Download heavy media/content after ensuring the index is safe.

⚠️ Critical Rule:

Respect rate limits to avoid crashing the dying server. You are preserving, not attacking. Use delays between requests and monitor server response times.
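
The index-first strategy and the rate-limit rule can be sketched together. This is a minimal illustration under assumptions: `save_index`, `polite_fetch`, and the delay values are hypothetical, and the `fetch` callable stands in for whatever downloader you actually use (wget, a browser driver, an API client). It writes the URL list to disk before fetching anything, then waits between requests and doubles the wait whenever the server responds slowly.

```python
import time
from pathlib import Path
from typing import Callable, Iterable


def save_index(urls: Iterable[str], path: Path) -> list[str]:
    """Breadth-first step: persist the URL list before fetching heavy content."""
    index = list(dict.fromkeys(urls))  # de-duplicate while keeping order
    path.write_text("\n".join(index) + "\n", encoding="utf-8")
    return index


def polite_fetch(
    urls: Iterable[str],
    fetch: Callable[[str], bytes],
    base_delay: float = 1.0,       # assumed starting pause, in seconds
    slow_threshold: float = 2.0,   # assumed "server is struggling" response time
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict[str, bytes]:
    """Depth step: fetch each URL with a pause that adapts to response time."""
    delay = base_delay
    results: dict[str, bytes] = {}
    for url in urls:
        start = clock()
        results[url] = fetch(url)
        elapsed = clock() - start
        if elapsed > slow_threshold:
            delay = min(delay * 2, max_delay)    # server is slow: back off
        else:
            delay = max(base_delay, delay / 2)   # server is healthy: recover
        sleep(delay)
    return results
```

Injecting `sleep` and `clock` keeps the loop testable without real waiting. A real run would also need error handling and retry limits, which this sketch omits.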

Phase 5
Validation

Verify data integrity to ensure nothing was corrupted during capture:

  • Generate and store checksums for all files (SHA-256 preferred; MD5 catches accidental corruption but is not collision-resistant)
  • Verify WARC file integrity if using Web ARChive format
  • Spot-check random samples to ensure content is complete and readable
  • Document any missing or corrupted files
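
The checksum and gap-reporting steps above can be sketched with the Python standard library. The function names and the manifest layout (relative path mapped to SHA-256 hex digest) are assumptions for illustration; WARC-level verification needs a dedicated tool and is not shown.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large captures do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): sha256_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Report files that are missing or whose digest no longer matches."""
    missing, corrupted = [], []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file():
            missing.append(rel)
        elif sha256_file(target) != expected:
            corrupted.append(rel)
    return {"missing": missing, "corrupted": corrupted}
```

Build the manifest immediately after capture and store it alongside the data. Re-run the verification after every copy or migration.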

Phase 6
Storage

The 3-2-1 Rule
  • 3 copies of the data
  • 2 different storage formats/media types
  • 1 copy stored off-site (different physical location)
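
A storage plan can be checked against the rule mechanically. The sketch below is illustrative only: the `StoredCopy` record, its fields, and the `primary_site` parameter are hypothetical names, and "off-site" is taken to mean any location other than the primary one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredCopy:
    location: str  # physical site, e.g. "office" (hypothetical label)
    medium: str    # storage medium, e.g. "hdd", "lto-tape", "cloud-object"


def satisfies_3_2_1(copies: list[StoredCopy], primary_site: str) -> bool:
    """True if the plan has 3+ copies, 2+ media types, and 1+ off-site copy."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.medium for c in copies}) >= 2
    has_offsite = any(c.location != primary_site for c in copies)
    return enough_copies and enough_media and has_offsite


if __name__ == "__main__":
    plan = [
        StoredCopy("office", "hdd"),
        StoredCopy("office", "lto-tape"),
        StoredCopy("partner-archive", "cloud-object"),
    ]
    print(satisfies_3_2_1(plan, primary_site="office"))
```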

Apply LOCKSS Principles:

Lots of Copies Keep Stuff Safe. Distribute copies to multiple trusted stewards when possible. Single points of failure doom artifacts to eventual loss.

Phase 7
Access

Determine appropriate access level based on ethical considerations:

  • Public Archive: Open web access (appropriate for public content with no privacy concerns)
  • Restricted Access: Researchers only, requires authentication (for sensitive but historically valuable content)
  • Dark Archive: Preserved but sealed (content with privacy concerns but historical value; accessible only with special permission or after time delay)

Phase 8
Documentation (Provenance)

Record the Chain of Custody:

Without provenance, the artifact is just a file. Documentation transforms it into a historical record.

Document the following:

  • Who: Who performed the capture? (Individual or organization)
  • When: Date and time of capture
  • How: What tools were used? What parameters?
  • Where: Original source URLs and server information
  • Why: Context about the platform shutdown and preservation rationale
  • Integrity: Checksums and verification methods
  • Completeness: Known gaps or limitations in the capture
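
The fields above map naturally onto a small machine-readable record stored next to the capture. The following sketch assumes a JSON layout and a `provenance_record` helper that are both hypothetical; adapt the field names to whatever metadata standard your institution already uses.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def provenance_record(
    who: str,
    how: dict,
    where: list[str],
    why: str,
    checksums: dict[str, str],
    known_gaps: list[str],
    when: Optional[datetime] = None,
) -> str:
    """Serialize the chain-of-custody fields as a JSON document."""
    record = {
        "who": who,
        # Default to the current UTC time if no capture time is supplied.
        "when": (when or datetime.now(timezone.utc)).isoformat(),
        "how": how,            # tools and parameters used
        "where": where,        # original source URLs
        "why": why,            # shutdown context and rationale
        "integrity": {"algorithm": "sha256", "checksums": checksums},
        "completeness": {"known_gaps": known_gaps},
    }
    return json.dumps(record, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(provenance_record(
        who="Example Rescue Team",                       # placeholder value
        how={"tool": "wget", "parameters": "--mirror"},  # placeholder value
        where=["https://example.org/"],                  # placeholder value
        why="Platform announced shutdown",
        checksums={},
        known_gaps=["user avatars not captured"],
    ))
```

Writing the record at capture time, rather than reconstructing it later, keeps the "How" and "Completeness" fields honest.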

Summary: When Time is the Enemy

Platform deaths are unpredictable. The difference between a successful rescue and permanent loss often comes down to having a rehearsed workflow.

This 8-Phase system ensures that when the alarm sounds, your team moves with purpose rather than panic.

Remember: Preservation is an act of power. Wield it responsibly.


Source Note

This workflow is derived from "Chapter 10: Triage Workflow," "Chapter 9: The Custodial Filter," and the "Archaeobytology Protocol v1.0."