Foundations Series / Vol 01 Est. 2025

Chapter 8: Digital Forensics for Archaeobytologists


Opening: The Crime Scene Is Digital

In 2019, a hard drive arrived at the Internet Archive. It had been recovered from a dumpster behind a defunct web hosting company. The company had gone bankrupt, its servers sold for scrap, its customer data—thousands of personal websites from the early 2000s—abandoned.

The hard drive was physically intact but logically corrupted. The file system was damaged. File names were mangled or missing. Timestamps were wrong. Some files were partially overwritten with random data. But somewhere in those magnetic sectors were websites that existed nowhere else—personal blogs, family photos, amateur art portfolios. Digital artifacts on the verge of permanent loss.

This required digital forensics: the practice of recovering, analyzing, and authenticating digital evidence from damaged, deleted, or deliberately obscured sources.

Digital forensics emerged from law enforcement (recovering deleted files from criminals' computers) and IT security (analyzing malware, investigating breaches). But Archaeobytologists need these same skills for different purposes:

Unlike law enforcement, we're not building criminal cases. Unlike IT security, we're not defending against attacks. We're rescuing cultural artifacts from technological decay.

This chapter teaches digital forensics adapted for Archaeobytological practice. You'll learn:

By the end, you'll be able to take a corrupted hard drive, deleted website, or mysterious file format and systematically extract whatever cultural value remains.


Part I: Foundations of Digital Forensics

The Digital Artifact as Evidence

Physical artifacts are tangible—you can touch a clay pot, examine it with eyes and hands. Digital artifacts are abstract—they're electromagnetic patterns interpreted by software.

This abstraction creates both challenges and opportunities:

Challenges:

Opportunities:

The Forensic Workflow

Digital forensics follows a systematic process:

  1. Acquisition (get a copy without altering the original)

  2. Preservation (create forensic images, maintain chain of custody)

  3. Analysis (examine the data, extract information)

  4. Documentation (record findings, methods, provenance)

  5. Presentation (make findings accessible to non-technical audiences)

This workflow ensures:


Part II: File System Forensics — Finding the Lost

Understanding File Systems

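When you delete a file, it doesn't vanish immediately. On a traditional hard drive, the operating system marks the space as "available" but doesn't erase the data until something overwrites it. This is why "deleted" files can often be recovered. (SSDs that support TRIM may wipe freed blocks much sooner, so recovery from them is less reliable.)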

Common File Systems:

FAT32 (old Windows, USB drives)

NTFS (modern Windows)

ext4 (Linux)

APFS (modern macOS)

HFS+ (older macOS)

Data Carving: Recovering Files Without Metadata

When file systems are severely damaged (corrupted directory structure, missing file allocation table), you can't rely on the filesystem to tell you where files are. Instead, you use data carving: scanning raw disk sectors looking for file signatures.

How It Works:

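Most binary file types have a signature (magic bytes) at the beginning:

Data carving tools scan the entire disk, looking for these signatures. When one is found, the tool extracts data from that point until it reaches a matching footer, a size limit, or the next signature.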
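The scan described above can be sketched in a few lines of Python. This is a minimal illustration, not how PhotoRec or any other tool actually works: it assumes the whole image fits in memory, looks only for JPEGs, and the function name carve_jpegs and the 20 MB size cap are hypothetical choices made for the example.

```python
JPEG_HEADER = b"\xff\xd8\xff"  # start-of-image marker plus the first marker byte
JPEG_FOOTER = b"\xff\xd9"      # end-of-image marker


def carve_jpegs(raw: bytes, max_size: int = 20 * 1024 * 1024) -> list[bytes]:
    """Return candidate JPEG byte runs found in a raw disk image."""
    carved = []
    pos = 0
    while True:
        start = raw.find(JPEG_HEADER, pos)
        if start == -1:
            break  # no more headers anywhere in the image
        end = raw.find(JPEG_FOOTER, start + len(JPEG_HEADER), start + max_size)
        if end == -1:
            # No footer within the size limit: skip this header and keep scanning.
            pos = start + 1
            continue
        end += len(JPEG_FOOTER)
        carved.append(raw[start:end])
        pos = end
    return carved


# Simulated "disk": zero padding, a fake JPEG, junk bytes, a second fake JPEG.
disk = (b"\x00" * 16 + b"\xff\xd8\xff\xe0" + b"first" + b"\xff\xd9"
        + b"\x13\x37" * 8 + b"\xff\xd8\xff\xe1" + b"second" + b"\xff\xd9")
found = carve_jpegs(disk)
print(len(found), "candidate files carved")
```

The sketch stops at the first footer it sees, so a real JPEG with an embedded thumbnail would be cut short, and a fragmented file would come out corrupted. Dedicated carving tools handle such cases with format-aware parsing.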

Tools:

Example: Carving a Corrupted USB Drive

Result: PhotoRec writes the recovered files into numbered recup_dir subdirectories inside the output directory. Files get generic names (f0001.jpg, f0002.png) because the original file-system metadata is lost.

Limitations:

Archaeological Application:

When you recover an old hard drive from a defunct web hosting company, data carving may be your only option. You won't know which files belong to which user or what they were originally named, but you'll have the actual content—which is better than nothing.

File System Timeline Analysis

Even when files aren't deleted, timestamps reveal important information:

MAC Times:

Plus NTFS adds:

Why Timestamps Matter:

Example 1: Identifying Original Creator

Example 2: Detecting Tampering

Tools:

Example: Creating a Timeline
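As a rough sketch of the idea, assuming you are working on a mounted copy of the evidence and only need the timestamps the operating system exposes, a timeline can be assembled with Python's standard library. The function name build_timeline is hypothetical, and dedicated forensic tools read far more than this (including deleted entries).

```python
import os
import tempfile
from datetime import datetime, timezone


def build_timeline(root: str) -> list[tuple[str, str, str]]:
    """Return (UTC timestamp, event, path) tuples in chronological order."""
    events = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            # st_ctime is the metadata-change time on Unix; on Windows it has
            # historically reported the creation time instead.
            for label, ts in (("modified", st.st_mtime),
                              ("accessed", st.st_atime),
                              ("changed", st.st_ctime)):
                events.append((ts, label, path))
    events.sort()  # sort on the raw numeric timestamp
    return [(datetime.fromtimestamp(ts, tz=timezone.utc).isoformat(), label, path)
            for ts, label, path in events]


# Demo on a throwaway directory with two files given known timestamps.
with tempfile.TemporaryDirectory() as root:
    for name, mtime in (("index.html", 1_000_000_000), ("photo.jpg", 1_100_000_000)):
        path = os.path.join(root, name)
        with open(path, "wb") as fh:
            fh.write(b"demo")
        os.utime(path, (mtime, mtime))  # set access and modification times
    timeline = build_timeline(root)
    modified = [os.path.basename(p) for _t, label, p in timeline if label == "modified"]

for stamp, label, path in timeline:
    print(stamp, label, os.path.basename(path))
```

Run this only against a forensic copy, ideally mounted read-only. Walking the original media can itself update access times and contaminate the evidence.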

Archaeological Application:

When analyzing a preserved platform, timeline analysis reveals:


Part III: Metadata Forensics — The Hidden Stories

What Is Metadata?

Metadata is "data about data"—information embedded in files describing their creation, modification, and context.

Types of Metadata:

1. File System Metadata (from OS)

2. Embedded Metadata (inside file)

3. Application Metadata (created by software)

Extracting Metadata

Tool: ExifTool (universal metadata reader)
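One way to script ExifTool, sketched here under the assumption that the exiftool command is installed and on the PATH, is to ask it for JSON output and parse that in Python. The helper names run_exiftool and sensitive_fields, the short list of tags treated as sensitive, and the sample record are all illustrative assumptions, not an authoritative inventory of what ExifTool reports.

```python
import json
import subprocess

# Illustrative subset of tags that often carry personal information.
SENSITIVE_TAGS = ("GPSLatitude", "GPSLongitude", "SerialNumber", "Artist", "OwnerName")


def run_exiftool(path: str) -> list[dict]:
    """Run ExifTool on one file and return its JSON output (a list of records)."""
    result = subprocess.run(["exiftool", "-json", path],
                            capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


def sensitive_fields(record: dict) -> dict:
    """Return only the tags in a record that may identify a person or place."""
    return {tag: record[tag] for tag in SENSITIVE_TAGS if tag in record}


# Made-up record in the shape ExifTool's JSON output takes, so the filtering
# step can be tried without ExifTool installed.
sample = json.loads(
    '[{"SourceFile": "photo.jpg", "Make": "ExampleCam", '
    '"GPSLatitude": "40.0", "Artist": "A. Person"}]'
)
flagged = sensitive_fields(sample[0])
print(flagged)
```

Separating extraction from review like this makes it easier to keep the full metadata in the private archive while deciding field by field what a published copy should expose.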

Example Output (JPEG from phone):

What This Reveals:

Privacy and Metadata

Ethical Dilemma: Metadata often contains personally identifiable information (PII):

Archaeobytologist's Responsibility:

When preserving:

When publishing:

Case Study: Geotagged Photos from Protests

Photos from 2020 Black Lives Matter protests contain GPS metadata. Should archivists preserve it?

Arguments FOR:

Arguments AGAINST:

Compromise:

Metadata as Provenance

Metadata helps establish provenance—the history and origin of an artifact.

Example: Authenticating a Leaked Document

Someone claims to have a "leaked internal memo from Facebook, dated 2016."

Forensic Analysis:

Output reveals:

Findings:

Conclusion: Document is likely fabricated or misdated. Further investigation needed.

Forensic Best Practice:


Part IV: Format Forensics — Identifying the Unknown

The Format Identification Problem

You receive a folder of files from a defunct platform. Many have no file extensions, or wrong extensions (.dat, .tmp, .db). How do you figure out what they are?

Don't trust extensions. Extensions are metadata (easily changed). Instead, examine the file signature.

Magic Numbers and File Signatures

Most binary file formats have a magic number: specific bytes at the beginning that identify the type. Plain-text formats usually have none, so their type has to be inferred from content.

Common Signatures:

Format   Hex Signature                                      ASCII
JPEG     FF D8 FF                                           ...
PNG      89 50 4E 47 0D 0A 1A 0A                            ‰PNG....
GIF      47 49 46 38                                        GIF8
PDF      25 50 44 46                                        %PDF
ZIP      50 4B 03 04                                        PK..
MP3      49 44 33 or FF FB                                  ID3 or ÿû
EXE      4D 5A                                              MZ
SQLite   53 51 4C 69 74 65 20 66 6F 72 6D 61 74 20 33 00    SQLite format 3.
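The signatures in the table translate directly into a small lookup. The following is a minimal sketch that covers only the formats listed above. The names SIGNATURES, identify, and identify_file are hypothetical, and real identification tools consult much larger signature databases.

```python
# (signature bytes, format name) pairs taken from the table above.
SIGNATURES = [
    (bytes.fromhex("FFD8FF"), "JPEG"),
    (bytes.fromhex("89504E470D0A1A0A"), "PNG"),
    (bytes.fromhex("47494638"), "GIF"),
    (bytes.fromhex("25504446"), "PDF"),
    (bytes.fromhex("504B0304"), "ZIP"),
    (bytes.fromhex("494433"), "MP3"),
    (bytes.fromhex("FFFB"), "MP3"),
    (bytes.fromhex("4D5A"), "EXE"),
    (b"SQLite format 3\x00", "SQLite"),
]


def identify(header: bytes) -> str:
    """Match the leading bytes of a file against the known signatures."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"


def identify_file(path: str) -> str:
    """Read only the first 16 bytes of a file and identify its format."""
    with open(path, "rb") as fh:
        return identify(fh.read(16))


print(identify(b"%PDF-1.4\n"))
print(identify(b"\x89PNG\r\n\x1a\n" + b"\x00" * 8))
print(identify(b"just some text"))
```

A match is a starting point, not a verdict. The ZIP signature, for instance, also begins container formats built on ZIP, such as DOCX and EPUB, so a second look inside the file is often needed.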

Tool: file command (Unix)

Tool: DROID (UK National Archives)

Obsolete and Proprietary Formats

The Hardest Cases:

1. Proprietary formats with no documentation

2. Custom binary formats

3. Encrypted or obfuscated formats

Strategies:

A. Search for Format Documentation

B. Reverse Engineer

C. Find Old Software

D. Convert via Emulation

Example: Recovering WordPerfect 5.1 Documents

WordPerfect was dominant in the 1980s and 1990s. Many legal documents, dissertations, and novels exist only in .wpd format.

Solution:

  1. Download WordPerfect 5.1 (abandonware)

  2. Run in DOSBox emulator

  3. Open .wpd files

  4. Export to ASCII or RTF (WordPerfect can do this)

  5. Import to modern word processor

Alternative: LibreOffice can open some WordPerfect formats (but not perfectly).


Part V: Emulation and Compatibility

When Files Require Specific Environments

Some digital artifacts aren't just files—they're experiences that require specific software, hardware, or operating systems.

Categories:

1. Software Applications

2. Websites with Complex JavaScript

3. Games

4. Interactive Art

Emulation Strategies

Strategy 1: OS Emulation

Run the entire original operating system in a virtual machine.

Tools:

Example: Running Windows 95 Software

  1. Download Windows 95 ISO (abandonware/legally gray)

  2. Create VirtualBox VM

  3. Install Windows 95

  4. Install old software (from CD image or floppy disk image)

  5. Take VM snapshot (preserve working state)

  6. Users can run the VM and experience the software as originally intended

Strategy 2: Browser-Based Emulation

Internet Archive's approach: run emulators in web browser.

Technologies:

Example: Internet Archive's Software Collection

Strategy 3: Format Migration

Convert old formats to modern equivalents (lossy but pragmatic).

Examples:

Trade-offs:

Best Practice: Do both when possible. Preserve original + create migrated version.


Part VI: Authentication and Chain of Custody

Proving a Digital Artifact Is Authentic

Physical artifacts can be authenticated through material analysis (carbon dating, paint chemistry). Digital artifacts can be copied perfectly: a copy is bit-for-bit identical to the original. So how do you prove authenticity?

Cryptographic Hashing

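A cryptographic hash is a fixed-length fingerprint computed from a file's contents. Change one bit, and the hash changes completely. Hashes are not strictly unique, but for a strong function such as SHA-256, finding two different files with the same hash is computationally infeasible.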

Common Hash Functions:

Example: Computing SHA-256 Hash
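A minimal sketch using Python's standard hashlib module is shown here. The helper name sha256_of_file and the sample content are hypothetical. The file is read in chunks so that large disk images do not have to fit in memory.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


original = b"Welcome to my homepage!"
flipped = bytes([original[0] ^ 0x01]) + original[1:]  # same content, one bit flipped

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "index.html")
    with open(path, "wb") as fh:
        fh.write(original)
    file_hash = sha256_of_file(path)

flipped_hash = hashlib.sha256(flipped).hexdigest()
print(file_hash)
print(flipped_hash)
```

The two printed digests share no obvious resemblance even though the inputs differ by a single bit, which is what makes a recorded hash useful as evidence that a file has not changed.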

Use Cases:

1. Proving Integrity

2. Detecting Tampering

3. Chain of Custody
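To illustrate how hashes can support a chain of custody, here is a hedged sketch in which every hand-off of an artifact is logged together with its SHA-256 hash. The CustodyEntry fields, the function names, and the handler names are assumptions made for this example, not a standard record format.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class CustodyEntry:
    timestamp: str  # when the hand-off happened (UTC, ISO 8601)
    handler: str    # who held the artifact
    action: str     # what they did with it
    sha256: str     # hash of the artifact at that moment


def record(log: list, handler: str, action: str, data: bytes) -> CustodyEntry:
    """Append a custody entry for the artifact's current bytes."""
    entry = CustodyEntry(datetime.now(timezone.utc).isoformat(), handler, action,
                         hashlib.sha256(data).hexdigest())
    log.append(entry)
    return entry


def custody_intact(log: list) -> bool:
    """True if every recorded hash is identical (artifact never changed)."""
    return len({entry.sha256 for entry in log}) <= 1


image = b"raw disk image bytes"  # stand-in for a real forensic image
log = []
record(log, "A. Archivist", "acquired from donor", image)
record(log, "B. Analyst", "received for analysis", image)
intact_before = custody_intact(log)

record(log, "C. Courier", "returned to storage", image + b"\x00")  # altered copy
intact_after = custody_intact(log)
print(intact_before, intact_after)
```

A log like this only shows that the hashes agree. It is convincing to outsiders only if the log itself is protected from later edits, for example by signing it or depositing copies with independent parties.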

Digital Signatures

For legally significant documents, cryptographic signatures prove:

How It Works:

  1. Author creates document

  2. Author signs with private key (generates signature)

  3. Anyone can verify signature with author's public key

  4. Signature proves: (a) the signer held the private key, (b) the document is unchanged since signing
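The sign-and-verify steps above can be demonstrated with textbook RSA using only Python's built-in integers. This is strictly a teaching toy: the key is built from two small, publicly known primes, there is no padding scheme, and all names here are hypothetical. For real archives, use an established tool or a vetted cryptography library.

```python
import hashlib

# Toy key built from two known Mersenne primes. Far too small for real use.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (needs Python 3.8+)


def digest_as_int(document: bytes) -> int:
    """Hash the document and reduce the digest into the range of the modulus."""
    return int.from_bytes(hashlib.sha256(document).digest(), "big") % N


def sign(document: bytes) -> int:
    """Author's step: transform the digest with the private exponent."""
    return pow(digest_as_int(document), D, N)


def verify(document: bytes, signature: int) -> bool:
    """Anyone's step: undo the transform with the public exponent and compare."""
    return pow(signature, E, N) == digest_as_int(document)


memo = b"Internal memo, archived on receipt."
signature = sign(memo)
print(verify(memo, signature))            # unchanged document
print(verify(memo + b" (edited)", signature))  # altered document
```

Only the holder of D can produce a signature that verifies, while anyone holding N and E can check it. That asymmetry is what lets an archive prove when and in what state it received a document.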

Tools:

Archaeological Application:

When archiving controversial or historically important documents (leaked memos, government records, deleted tweets), sign them immediately. This proves:


Part VII: Forensic Documentation

Recording Your Process

Forensic work is worthless if you can't explain what you did. Document everything:

Acquisition Documentation

Record:

Example Log:

Analysis Documentation

Record:

Example Analysis Notes:


Part VIII: Ethical Boundaries in Forensics

When Forensics Becomes Invasion

Digital forensics is powerful—but power requires ethical limits.

Scenarios Where Forensics Is Inappropriate

1. Private Communications

2. Intimate Content

3. Trade Secrets / Proprietary Information

4. Ongoing Harm

Forensic Ethics Framework

Ask before analyzing:

1. Consent

2. Public Interest

3. Harm Potential

4. Alternative Methods

Example: The Deleted Political Tweet

A politician deletes a tweet. You have forensic tools to recover it from cached data. Should you?

Analysis:

Conclusion: Ethical to recover and publish (accountability > privacy for public officials).

Example: The Abandoned Teenager's Blog

You recover a hard drive with a teenager's private blog from 2005 (they're now 35). Should you publish it?

Analysis:

Conclusion: Don't publish without consent. Document that it existed, archive privately, contact them if possible.


Conclusion: The Forensic Archaeobytologist

Digital forensics transforms you from passive archivist to active investigator. You don't just accept what platforms give you—you dig deeper, recover what was lost, authenticate what's dubious, and extract meaning from the opaque.

Every corrupted hard drive, every deleted file, every mysterious format is a puzzle. Your forensic skills determine whether that puzzle is solved or remains forever mysterious.

But with great power comes great responsibility. Forensics can invade privacy, resurrect deliberately forgotten content, and cause harm. The Custodial Filter applies here too: just because you can recover something doesn't mean you should.

In the next chapter, we'll explore the ethics of preservation in depth—examining the hardest dilemmas Archaeobytologists face, and building frameworks for navigating them.

For now, practice your forensic skills. Find an old hard drive, a corrupted file, a mysterious binary. Apply these methods. Document your process. And ask: What stories are hidden in these bits?

The artifacts are waiting. Now go uncover them.


Discussion Questions

  1. Metadata Privacy: You're archiving a photo collection from a defunct platform. GPS coordinates reveal protesters' locations. Do you strip the metadata or preserve it for research?

  2. Format Obsolescence: You find files in a proprietary format with no documentation. Do you spend weeks reverse-engineering it, or accept that some content will be lost?

  3. Deleted Content: A user intentionally deleted their account and content. You have a backup. Do you preserve it?

  4. Authentication: Someone claims a document is a "leaked corporate memo." Your forensic analysis shows metadata inconsistencies. How do you publish your findings without enabling misinformation?

  5. Emulation vs. Migration: Is it better to maintain perfect fidelity through emulation (expensive, complex) or accept some loss through format migration (pragmatic, sustainable)?

  6. Chain of Custody: How do you prove to skeptics that an archived artifact is authentic and unaltered?


Exercise: Forensic Recovery Project

Task: Conduct a forensic analysis of a digital artifact.

Part 1: Acquire an Artifact (Choose one)

Part 2: Forensic Analysis (1000 words)

Document:

  1. Acquisition: How did you obtain it? Document device info, date, method

  2. Hashing: Compute SHA-256, document hash

  3. Format Identification: What type of file? Use file command or DROID

  4. Metadata Extraction: What metadata exists? Use ExifTool

  5. Content Analysis: What's inside? Can you open it? Recover data?

  6. Timeline: When was it created, modified, accessed?

  7. Findings: What did you learn? Any surprises?

Part 3: Ethical Reflection (500 words)

Part 4: Documentation (Create forensic report)


Further Reading

On Digital Forensics Methods

On Format Preservation

On Emulation

Tools Documentation


End of Chapter 8

Next: Chapter 9 — The Custodial Filter: Ethics of Preservation