# theleagueofhonor.com — Recovery Summary

**Date:** 2026-05-12
**Source:** Internet Archive Wayback Machine (every byte came from `web.archive.org`)
**Working directory:** `C:\Users\Jason\Desktop\CursorProjects\theleagueofhonor`

---

## How to view the recovery

With the local web server running on port 8080, open these URLs in Chrome or Edge:

- **🎬 Flash intro viewer** — http://localhost:8080/05_local_rebuild/index.html
  Plays `Intro_TLOH_Finalv2.swf` in Ruffle (the WebAssembly Flash emulator). Click the "▶ Play in Ruffle" button.

- **🌐 Stitched HTML site** — http://localhost:8080/04_stitched_site/index.html
  Every recovered HTML page, grouped by era (2010–2012 WordPress, 2013–2017, 2018–2021, 2022–2024 hand-coded). Click any page to view it with images and styling intact.

- **📁 Raw recovered files** — http://localhost:8080/03_best_versions/
  Browse the deduplicated canonical copy of each file (HTML, images, SWF, MP3, other).

- **🔧 JPEXS .swf asset dumps** — http://localhost:8080/04_extracted_assets/
  ActionScript, frames, shapes, and binary data extracted from the Flash intro.
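
Before opening a browser, the four paths above can be probed from a script. This is a minimal sketch using only the Python standard library; the `probe` and `check_all` names are illustrative, and it assumes the server listens on `localhost:8080` as described above.

```python
import urllib.error
import urllib.request

BASE = "http://localhost:8080"  # port taken from this summary
PATHS = [
    "/05_local_rebuild/index.html",
    "/04_stitched_site/index.html",
    "/03_best_versions/",
    "/04_extracted_assets/",
]


def probe(url, timeout=5.0):
    """Return the HTTP status for url, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx/5xx responses still carry a status code
    except (urllib.error.URLError, OSError):
        return None  # connection refused, timeout, DNS failure, etc.


def check_all(paths=PATHS, base=BASE, fetch=probe):
    """Map each path to the status returned by fetch(base + path)."""
    return {path: fetch(base + path) for path in paths}


if __name__ == "__main__":
    for path, status in check_all().items():
        print(f"{status!s:>4}  {path}")
```

A status of `None` for every path usually means the server is not running; a `404` on a single path points at a missing directory instead.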

---

## Recovery statistics

| Metric | Value |
|---|---|
| CDX index rows pulled from Wayback | **1,184** |
| Eligible rows (HTTP status 200 or blank) | 798 |
| Unique original URLs identified | 649 |
| URLs successfully fetched | **640 (98.6%)** |
| URLs failed | 9 (non-critical: 8 daily blog archive index pages 2010-08-03 through 2010-08-10, plus 1 webmail CSS file) |
| Unique files after dedup | **615** |
| Total disk size of recovered assets | 35 MB |
| Earliest Wayback snapshot | 2010-07-16 |
| Latest Wayback snapshot | 2025-02-20 |
| Domains crawled | `theleagueofhonor.com` plus its `www.`, `mail.`, `webmail.`, and `cpanel.` subdomains |
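
The first three rows of the table are successive filters over the CDX index. The sketch below shows one way such filtering could work. It assumes the index was requested as JSON from the public CDX endpoint with `collapse=digest`, and that a blank status appears as `-` or an empty string; the exact query used for this recovery is not recorded here. The function names are illustrative and the sample rows are synthetic.

```python
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_query_url(domain):
    """Build a CDX query for every capture under a domain (assumed parameters)."""
    params = {
        "url": domain,
        "matchType": "domain",   # include subdomains such as www. and mail.
        "output": "json",
        "collapse": "digest",    # collapse adjacent captures sharing a digest
        "fl": "timestamp,original,statuscode,digest",
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def eligible_rows(cdx_json):
    """Keep rows whose status is 200 or blank; the first JSON row is the header."""
    header, *rows = cdx_json
    records = [dict(zip(header, row)) for row in rows]
    return [r for r in records if r["statuscode"] in ("200", "-", "")]


def unique_originals(records):
    """Return the distinct original URLs, sorted for stable output."""
    return sorted({r["original"] for r in records})


# Synthetic rows in the CDX JSON shape: header first, then one list per capture.
sample = [
    ["timestamp", "original", "statuscode", "digest"],
    ["20100716000000", "http://example.com/", "200", "AAA"],
    ["20110101000000", "http://example.com/", "200", "BBB"],
    ["20110102000000", "http://example.com/old", "301", "CCC"],
    ["20120101000000", "http://example.com/logo.png", "-", "DDD"],
]
kept = eligible_rows(sample)
print(len(kept), unique_originals(kept))
```

In the sample, the redirect row is dropped and the two homepage captures collapse to one original URL, mirroring the rows → eligible → unique progression in the table.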

### Files by type
| Bucket | Count |
|---|---|
| HTML pages | 495 |
| Images (PNG/JPG/GIF/SVG/ICO) | 93 |
| Flash (.swf) | 1 |
| Media (MP3) | 1 |
| Other (CSS/JS/XML/PHP/etc.) | 25 |
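
These buckets were assigned by file signature rather than by URL extension. The sketch below illustrates that kind of classification using well-known magic bytes. It is not the script that ran; the `classify` name, the 512-byte window, and the text heuristics for HTML and SVG are assumptions.

```python
def classify(data: bytes) -> str:
    """Return a bucket name based on leading bytes, ignoring any file extension."""
    head = data[:512]
    text = head.lstrip().lower()

    if head.startswith(b"\x89PNG\r\n\x1a\n"):
        return "images"  # PNG
    if head.startswith(b"\xff\xd8\xff"):
        return "images"  # JPEG
    if head.startswith((b"GIF87a", b"GIF89a")):
        return "images"  # GIF
    if head.startswith(b"\x00\x00\x01\x00"):
        return "images"  # ICO
    if head[:3] in (b"FWS", b"CWS", b"ZWS"):
        return "swf"     # uncompressed, zlib, or LZMA Flash
    if head.startswith(b"ID3"):
        return "mp3"     # ID3v2 tag
    if len(head) >= 2 and head[0] == 0xFF and (head[1] & 0xE0) == 0xE0:
        return "mp3"     # MPEG audio frame sync bits
    if text.startswith((b"<!doctype html", b"<html")):
        return "html"
    if b"<svg" in text:
        return "images"  # SVG is XML text, so look for the root tag
    return "other"


print(classify(b"GIF89a" + b"\x00" * 10))          # images
print(classify(b"  <!DOCTYPE html><html></html>"))  # html
print(classify(b"body { color: red }"))             # other
```

Checking the binary signatures first means a file served with a misleading extension (for example, an HTML error page saved as `.png`) lands in the bucket that matches its actual content.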

---

## What was recovered

### Flash content
- **`Intro_TLOH_Finalv2.swf`** (319 KB, captured 2014-04-27) — the original Flash intro animation.
  - JPEXS extraction produced: 8 ActionScript files, 2 frame definitions, 1 shape, 2 binary data blobs.
  - No embedded raster images or audio in the SWF itself — it's pure vector animation + code.
  - Plays in Ruffle with pixel-perfect fidelity to the original.

### Character art & site imagery
Character photos and promotional graphics recovered from the 2011 WordPress era (multiple resolution variants each):
- **William "Texas" Tyler** — `wtt-aiming-670` (12 variants)
- **Jules Dean** (Creator) — `Jules_Dean2` (12 variants)
- **Jake Davis / JD** — `J-D-aiming-670` (12 variants)
- **Coming Soon graphic** — `LOH_ComingSoon2` (9 variants)
- **Savannah** — `Savannah-fincolor4-site9` (6 variants)
- **"Only the Best" promo** — `onlythebest-CAPS2` (6 variants)
- **Creative team photo** — `creative_team2` (6 variants)
- **Alex** — `alex-standing.resized2` (3 variants)
- **Justin** — `Justin-4web10` (3 variants)
- **Montana Slater** — `montanacloseup2` (multiple variants)

Plus theme images: `logo_dark`, `header_dark`, `footer_dark`, `nav_bg_dark`, social-media icons, button sprites (Juggernaut WordPress theme).

### Hand-coded era buttons (2022 redesign)
- `home2.png`, `characters2.png`, `Blog2.png`, `Store2.png`, `Contact_Us2.png`, `Design_Team2.png`, `THELOH.png` — 3 size variants each.

### Animations (animated GIFs)
- `2-LOH-animation.gif` — site animation, 3 size variants
- `CosmicGelButton2.gif` — animated button, 3 size variants

### HTML pages (495 total)
The pages span 14 years of site history: homepage versions, character profile pages (`/characters/montana-slater`, etc.), blog posts dating back to 2010, sitemaps, RSS feeds, the 2022 hand-coded relaunch, and webmail/cpanel auxiliary pages.

---

## What was NOT recovered (and why)

- **8 daily-blog-archive index pages** for 2010-08-03 through 2010-08-10. These were retried against alternate timestamps, and Wayback has no working copy of any of them. The individual blog posts from those dates were recovered separately; only the date-based index pages are missing.
- **1 mail.theleagueofhonor.com CSS file** — webmail styling, irrelevant to site content.
- **External CDN-hosted assets** (if any) — any fonts or embeds served from third-party CDNs are not included in this recovery. No specific instances were identified during recovery.

**Recovery completeness: 98.6%** (640 of the 649 unique original URLs identified in the Wayback index for this domain).
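
The headline figure can be re-derived from the statistics reported earlier in this summary. A quick arithmetic check, using only numbers that appear in the tables above:

```python
unique_urls = 649    # unique original URLs identified
fetched = 640        # URLs successfully fetched
unique_files = 615   # unique files after dedup
buckets = {"html": 495, "images": 93, "swf": 1, "mp3": 1, "other": 25}

failed = unique_urls - fetched                        # URLs that could not be fetched
completeness = round(100 * fetched / unique_urls, 1)  # percent of unique URLs recovered
duplicates_removed = fetched - unique_files           # files dropped by deduplication
bucket_total = sum(buckets.values())                  # should equal unique_files

print(failed, completeness, duplicates_removed, bucket_total)
# prints: 9 98.6 25 615
```

The bucket counts sum to the post-dedup file count, and the 9 failures match the two categories of missing URLs listed above (8 index pages plus 1 CSS file).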

---

## Pipeline (what ran)

1. **Phase 1 — Setup.** Installed Python packages (`waybackpack`, `requests`, `tqdm`, `playwright`), Playwright Chromium browser, JPEXS Free Flash Decompiler, Microsoft OpenJDK 21 (required by JPEXS), Ruffle Nightly, and Ruffle self-hosted WASM bundle.
2. **Phase 2 — Inventory.** Pulled the full Wayback CDX index for the domain (1,184 unique-by-digest captures, 2010-07-16 to 2025-02-20). Built `inventory.csv` and `target_urls.txt`.
3. **Phase 3.1 — Bulk mirror.** `waybackpack` pulled 40 timestamp variants of the homepage before Wayback rate-limited it.
4. **Phase 3.2 — Targeted fetch.** A Python script using `requests` (fast path) with a Playwright fallback fetched 640 of the 649 unique original URLs. It used Wayback's `id_` modifier so that each file is the exact original bytes, with no toolbar injection.
5. **Phase 3.3 — Retry sweep.** Re-attempted the 9 failures against alternate timestamps. None recoverable.
6. **Phase 4 — Rebucket + dedupe.** Re-classified every file by **file signature** (magic bytes), not URL extension. Deduplicated by SHA-1, keeping the earliest timestamp, which reduced 640 fetched files to 615 unique files.
7. **Phase 4b — Stitch.** For each unique original URL, picked the largest valid copy across all captures. Built a directory tree under `04_stitched_site/` mirroring the URL structure. Rewrote Wayback URL prefixes in HTML to point to local relative paths.
8. **Phase 5 — JPEXS extraction.** Decompiled the Flash intro into individual frames, scripts, shapes, and binary data under `04_extracted_assets/`.
9. **Phase 6 — Ruffle viewer.** Generated `05_local_rebuild/index.html` and per-`.swf` wrapper pages that embed Ruffle to play the recovered Flash file in a modern browser.
10. **Phase 7 — Verification.** Started local HTTP server on port 8080. Confirmed all key URLs return HTTP 200.
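
Two of the steps above lend themselves to a short illustration: building an `id_` capture URL (Phase 3.2) and deduplicating by SHA-1 while keeping the earliest timestamp (Phase 4). This is a minimal sketch, not the script that ran. The function names and the `(timestamp, original_url, body)` tuple shape are assumptions, and the sample captures are synthetic.

```python
import hashlib


def raw_capture_url(timestamp: str, original: str) -> str:
    """Wayback URL whose `id_` modifier returns the capture's original bytes."""
    return f"https://web.archive.org/web/{timestamp}id_/{original}"


def dedupe_keep_earliest(captures):
    """Collapse captures with identical content.

    captures: iterable of (timestamp, original_url, body_bytes).
    Returns {sha1_hex: (timestamp, original_url)} with the earliest timestamp
    kept for each distinct digest.
    """
    best = {}
    for timestamp, original, body in captures:
        digest = hashlib.sha1(body).hexdigest()
        current = best.get(digest)
        # 14-digit Wayback timestamps sort chronologically as strings.
        if current is None or timestamp < current[0]:
            best[digest] = (timestamp, original)
    return best


# Synthetic captures: two byte-identical copies and one distinct file.
captures = [
    ("20120101000000", "http://example.com/a.png", b"same bytes"),
    ("20100716000000", "http://example.com/a.png", b"same bytes"),
    ("20130101000000", "http://example.com/b.png", b"other bytes"),
]
unique = dedupe_keep_earliest(captures)
print(len(unique))
print(raw_capture_url("20100716000000", "http://example.com/a.png"))
```

Hashing the body rather than comparing URLs means the same image served under several resized filenames or hosts still collapses to a single canonical copy.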

---

## Next steps (Phase 9 — when ready)

Publishing to a Brilliant Directories (BD) site is blocked until Jason provides the new BD site URL and API key. When ready:
- Upload `04_stitched_site/` to the BD media library
- Create BD web pages mirroring the site structure
- Embed the Ruffle player on the intro page via CDN: `https://unpkg.com/@ruffle-rs/ruffle`
- Set up a top-level "archive" menu
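
For the embed step, Ruffle's script replaces `<embed>` and `<object>` Flash tags on any page that loads it. The sketch below generates a wrapper page of that kind. The function name, the default dimensions, and the relative SWF path are placeholders rather than values from the recovery, and whether BD pages accept external scripts has not been verified here.

```python
from html import escape

RUFFLE_CDN = "https://unpkg.com/@ruffle-rs/ruffle"  # CDN URL named in the list above


def render_ruffle_page(swf_url, title, width=800, height=600):
    """Return a standalone HTML page that plays swf_url through Ruffle."""
    return f"""<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>{escape(title)}</title>
  <script src="{RUFFLE_CDN}"></script>
</head>
<body>
  <embed src="{escape(swf_url)}" width="{int(width)}" height="{int(height)}">
</body>
</html>
"""


# Placeholder path and size; adjust to wherever the SWF ends up being hosted.
page = render_ruffle_page("Intro_TLOH_Finalv2.swf", "Flash intro")
print(page)
```

Relying on the `<embed>` replacement keeps the page free of custom JavaScript, which may matter if the target platform restricts inline scripts.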

---

## Critical files

- [04_stitched_site/index.html](../04_stitched_site/index.html) — entry point to recovered HTML site
- [05_local_rebuild/index.html](../05_local_rebuild/index.html) — Ruffle viewer for the Flash intro
- [03_best_versions/provenance.csv](../03_best_versions/provenance.csv) — every recovered file ↔ its Wayback origin
- [01_inventory/inventory.csv](../01_inventory/inventory.csv) — full CDX index (1,184 rows)
- [06_reports/losses.csv](losses.csv) — the 9 URLs that couldn't be recovered
- [06_reports/losses_final.csv](losses_final.csv) — same 9 after retry sweep
