TwinkLoads Scraper

Summary

A Python scraper for twinkloads.com with a built-in scene code cache for instant lookups. It extracts scene codes from filenames (e.g., tld0022) and resolves them to full scene metadata.

Source URL

Features

  • Scene scraping by URL - Scrape any twinkloads.com video URL
  • Scene scraping by code - Extract scene codes from filenames (e.g., tld0022_720p.mp4) with instant cache lookup
  • Scene scraping by name - Search scenes by title via sitemap matching
  • Performer scraping by URL - Basic performer details
  • Built-in code cache - Pre-built mapping of 66 scene codes to URLs (tld0001-tld0068)
  • Auto-refresh cache - Automatically rebuilds the cache weekly from the sitemap
  • Fast lookups - Resolving a scene code to its URL needs no network request; only the scene page itself is fetched

What It Does

The Twink Loads scraper uses a local cache file to instantly resolve scene codes to URLs. Since twinkloads.com has no search API, the scraper:

  1. Maintains a cached mapping of scene codes (like tld0022) to video URLs
  2. Extracts scene codes from filenames automatically
  3. Looks up the scene URL in the cache and scrapes the full metadata
  4. Rebuilds the cache weekly to keep it up-to-date
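
Steps 1 to 3 can be sketched roughly as follows. This is a minimal illustration, not the code from TwinkLoads.py: the function names, the regular expression, and the example cache entry are assumptions, and the real code_cache.json layout may differ.

```python
import re

# Assumed pattern: "tld" followed by 3-5 digits, matched case-insensitively
# and not embedded in a longer alphanumeric run.
SCENE_CODE_RE = re.compile(r"(?<![a-z0-9])tld\d{3,5}(?!\d)", re.IGNORECASE)


def extract_scene_code(filename):
    """Return the lowercased scene code found in filename, or None."""
    match = SCENE_CODE_RE.search(filename)
    return match.group(0).lower() if match else None


def resolve_scene_url(filename, cache):
    """Look up the scene URL for a filename in a code -> URL mapping."""
    code = extract_scene_code(filename)
    return cache.get(code) if code else None


# Hypothetical cache entry with a placeholder URL, for illustration only.
example_cache = {"tld0022": "https://example.com/placeholder-scene-url"}
print(resolve_scene_url("TLD0022_720p.mp4", example_cache))
```

Lowercasing the match before the lookup means differently cased filenames resolve to the same cache key.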

Example filename:

tld0022_720p.mp4

The scraper extracts the code tld0022 and fetches:

  • Title: Scene title from page
  • Studio: TwinkLoads
  • Code: tld0022
  • Description: Scene description
  • Performers: Full cast with images
  • Cover: High-quality poster image

Installation (Docker)

Files required:

  • TwinkLoads.yml (attached)
  • TwinkLoads.py (attached)
  • code_cache.json (attached)

Folder Structure

docker/
├── docker-compose.yml
└── scrapers/
    └── TwinkLoads/
        ├── TwinkLoads.yml
        ├── TwinkLoads.py
        └── code_cache.json

Docker Compose Configuration

Add the following under the Stash service's volumes in your docker-compose.yml:

- ./scrapers/TwinkLoads:/root/.stash/scrapers/TwinkLoads

Recreate the Stash container so the new volume is mounted:

docker compose up -d

The scraper will appear in Stash’s scraper list.

Usage

Scraping by URL

  1. Edit a scene in Stash
  2. Enter a twinkloads.com URL in the URL field
  3. Click “Scrape with… > TwinkLoads”

Scraping by Scene Code (from filename)

  1. Make sure the filename contains the scene code (e.g., tld0022_720p.mp4)
  2. Click “Scrape with… > TwinkLoads”
  3. The scraper extracts tld0022, looks it up in the cache, and fetches the scene

Scraping by Search

  1. Edit a scene in Stash
  2. Click the search icon next to “Scrape with…”
  3. Search for keywords from the scene title
  4. Select the correct scene from results

Technical Details

  • Language: Python 3
  • Dependencies: requests, beautifulsoup4 (auto-installed via py_common)
  • Platform: Barebackplus (HTML pages)
  • Cache file: code_cache.json - maps scene codes to URLs
  • Cache refresh: Automatically rebuilds weekly by fetching the sitemap
  • Scene code pattern: tld + 3-5 digits (e.g., tld0022, tld0068)
  • Total scenes: 66 videos cached

Cache Management

The scraper automatically manages the cache:

  • Loads code_cache.json on startup
  • Forces a cache rebuild if a scene code isn’t found
  • Rebuilds the cache weekly (once code_cache.json is more than 7 days past its last modification)
  • During a rebuild, fetches the sitemap and scrapes each page’s og:image tag to extract scene codes
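
The weekly check and the og:image step could look something like the sketch below. It is an illustration under stated assumptions rather than the scraper's actual code: the helper names are hypothetical, it assumes the scene code appears somewhere in the og:image URL, and it uses the standard-library HTML parser instead of beautifulsoup4 so the example is self-contained.

```python
import os
import re
import time
from html.parser import HTMLParser

CACHE_MAX_AGE_SECONDS = 7 * 24 * 60 * 60  # one week
SCENE_CODE_RE = re.compile(r"tld\d{3,5}", re.IGNORECASE)  # assumed pattern


def cache_is_stale(path, now=None):
    """True if the cache file is missing or was last modified over 7 days ago."""
    if not os.path.exists(path):
        return True
    now = time.time() if now is None else now
    return (now - os.path.getmtime(path)) > CACHE_MAX_AGE_SECONDS


class OgImageParser(HTMLParser):
    """Records the content of the first <meta property="og:image"> tag."""

    def __init__(self):
        super().__init__()
        self.og_image = None

    def handle_starttag(self, tag, attrs):
        if tag == "meta" and self.og_image is None:
            attr_map = dict(attrs)
            if attr_map.get("property") == "og:image":
                self.og_image = attr_map.get("content")


def code_from_page(html):
    """Return the lowercased scene code found in the page's og:image URL, or None."""
    parser = OgImageParser()
    parser.feed(html)
    parser.close()
    if not parser.og_image:
        return None
    match = SCENE_CODE_RE.search(parser.og_image)
    return match.group(0).lower() if match else None
```

A rebuild would then call code_from_page on each page listed in the sitemap and store the resulting code alongside that page's URL.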

To manually rebuild the cache:

docker exec stash python /root/.stash/scrapers/TwinkLoads/TwinkLoads.py rebuildCache

Notes

  • Twink Loads has no search API, so the cache is essential for code-based lookups
  • The cache includes codes from tld0001 to tld0068 (with gaps in numbering)
  • Performer images are extracted from lazy-loaded data-src attributes
  • The scraper automatically converts scene codes to lowercase for consistency
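
As a rough illustration of the lazy-loading note above, the sketch below collects image URLs from data-src attributes and falls back to src when data-src is absent. The class name and sample markup are hypothetical, and it uses the standard-library parser rather than beautifulsoup4 to stay self-contained; the real page structure and selectors may differ.

```python
from html.parser import HTMLParser


class LazyImageParser(HTMLParser):
    """Collects image URLs, preferring data-src over src on <img> tags."""

    def __init__(self):
        super().__init__()
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        # Lazy-loaded images keep the real URL in data-src; src is often a placeholder.
        url = attr_map.get("data-src") or attr_map.get("src")
        if url:
            self.image_urls.append(url)


def extract_image_urls(html):
    """Return the image URLs found in an HTML fragment, in document order."""
    parser = LazyImageParser()
    parser.feed(html)
    parser.close()
    return parser.image_urls
```

Reading src alone on a lazy-loaded page would typically return the placeholder image, which is why data-src is checked first.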

Attachments:

Important: rename code_cache.json.txt to code_cache.json after downloading (Discourse does not support uploading JSON files).

Enjoy!

Submitted to CommunityScrapers.