Twink Loads Scraper
Summary
A Python scraper for twinkloads.com with built-in scene code cache for instant lookups. Extracts scene codes from filenames (e.g., tld0022) and resolves them to full scene metadata.
Source URL
Features
- Scene scraping by URL - Scrape any twinkloads.com video URL
- Scene scraping by code - Extract scene codes from filenames (e.g., tld0022_720p.mp4) with instant cache lookup
- Scene scraping by name - Search scenes by title via sitemap matching
- Performer scraping by URL - Basic performer details
- Built-in code cache - Pre-built mapping of 66 scene codes to URLs (tld0001-tld0068)
- Auto-refresh cache - Automatically rebuilds the cache weekly from the sitemap
- Fast lookups - Code-based scraping resolves the scene URL from the local cache, so no search request is needed
What It Does
The Twink Loads scraper uses a local cache file to instantly resolve scene codes to URLs. Since twinkloads.com has no search API, the scraper:
- Maintains a cached mapping of scene codes (like tld0022) to video URLs
- Extracts scene codes from filenames automatically
- Looks up the scene URL in the cache and scrapes the full metadata
- Rebuilds the cache weekly to keep it up-to-date
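The lookup steps above can be sketched in a few lines of Python. This is a minimal illustration, not the code from TwinkLoads.py: the function names are hypothetical, the cache is assumed to be a flat JSON object mapping code to URL, and the URL shown is a placeholder.

```python
import json
import re
from typing import Optional

# Assumed pattern: "tld" followed by 3-5 digits, matched case-insensitively.
CODE_RE = re.compile(r"tld\d{3,5}", re.IGNORECASE)


def extract_code(filename: str) -> Optional[str]:
    """Return the lowercased scene code found in a filename, or None."""
    match = CODE_RE.search(filename)
    return match.group(0).lower() if match else None


def resolve_url(filename: str, cache: dict) -> Optional[str]:
    """Resolve a filename to a scene URL using a code -> URL mapping."""
    code = extract_code(filename)
    if code is None:
        return None
    return cache.get(code)


# Hypothetical cache contents; the real code_cache.json layout may differ.
cache = json.loads('{"tld0022": "https://example.invalid/scene-22"}')
url = resolve_url("TLD0022_720p.mp4", cache)
```

If the code is missing from the mapping, resolve_url returns None. Per the cache notes in this post, that is the point at which the real scraper forces a rebuild.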
Example filename:
tld0022_720p.mp4
Extracts code tld0022 and fetches:
- Title: Scene title from page
- Studio: TwinkLoads
- Code: tld0022
- Description: Scene description
- Performers: Full cast with images
- Cover: High-quality poster image
Installation (Docker)
Files required:
- TwinkLoads.yml (attached)
- TwinkLoads.py (attached)
- code_cache.json (attached)
Folder Structure
docker/
├── docker-compose.yml
└── scrapers/
    └── TwinkLoads/
        ├── TwinkLoads.yml
        ├── TwinkLoads.py
        └── code_cache.json
Docker Compose Configuration
Add the following under volumes in your docker-compose.yml:
- ./scrapers/TwinkLoads:/root/.stash/scrapers/TwinkLoads
Restart the Stash container:
docker compose up -d
The scraper will appear in Stash’s scraper list.
Usage
Scraping by URL
- Edit a scene in Stash
- Enter a twinkloads.com URL in the URL field
- Click “Scrape with… > TwinkLoads”
Scraping by Scene Code (from filename)
- Your file is named something like tld0022_720p.mp4
- Click “Scrape with… > TwinkLoads”
- The scraper extracts tld0022, looks it up in the cache, and fetches the scene
Scraping by Search
- Edit a scene in Stash
- Click the search icon next to “Scrape with…”
- Search for keywords from the scene title
- Select the correct scene from results
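For anyone curious how title search can work without a search API, here is a rough sketch of sitemap matching. It assumes the sitemap is a standard sitemaps.org urlset and that each scene URL ends in a slug derived from the title. Both are assumptions; the function names are hypothetical and the matching in TwinkLoads.py may differ.

```python
import re
import xml.etree.ElementTree as ET
from typing import List
from urllib.parse import urlparse

# Standard sitemaps.org namespace; assumed to be what the site uses.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text: str) -> List[str]:
    """Collect every <loc> URL from a sitemap document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall(".//sm:loc", SITEMAP_NS) if loc.text]


def search_by_title(query: str, urls: List[str]) -> List[str]:
    """Return URLs whose last path segment contains every query word."""
    words = [w for w in re.split(r"[^a-z0-9]+", query.lower()) if w]
    results = []
    for url in urls:
        # Last path segment, e.g. ".../videos/some-title/" -> "some-title"
        slug = urlparse(url).path.rstrip("/").rsplit("/", 1)[-1].lower()
        slug_words = set(re.split(r"[^a-z0-9]+", slug))
        if words and all(w in slug_words for w in words):
            results.append(url)
    return results
```

Requiring every keyword to appear keeps the result list short; a looser match would return more candidates to pick from.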
Technical Details
- Language: Python 3
- Dependencies: requests, beautifulsoup4 (auto-installed via py_common)
- Platform: Barebackplus HTML platform
- Cache file: code_cache.json - maps scene codes to URLs
- Cache refresh: Automatically rebuilds weekly by fetching the sitemap
- Scene code pattern: tld + 3-5 digits (e.g., tld0022, tld0068)
- Total scenes: 66 videos cached
Cache Management
The scraper automatically manages the cache:
- Loads code_cache.json on startup
- If a scene code isn’t found, forces a cache rebuild
- Rebuilds the cache weekly (7 days since last modification)
- Fetches the sitemap and scrapes each page’s og:image tag to extract scene codes
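The weekly refresh rule and the og:image step could look roughly like the sketch below. It is illustrative only: the function and class names are hypothetical, it uses the standard library's html.parser rather than the beautifulsoup4 dependency the scraper lists, and it assumes the scene code appears somewhere in the og:image URL.

```python
import os
import re
import time
from html.parser import HTMLParser
from typing import Optional

MAX_AGE_SECONDS = 7 * 24 * 60 * 60  # 7 days since last modification
CODE_RE = re.compile(r"tld\d{3,5}", re.IGNORECASE)  # assumed code pattern


def cache_is_stale(path: str, now: Optional[float] = None) -> bool:
    """True if the cache file is missing or older than 7 days."""
    if not os.path.exists(path):
        return True
    now = time.time() if now is None else now
    return now - os.path.getmtime(path) > MAX_AGE_SECONDS


class OgImageParser(HTMLParser):
    """Records the content of the first <meta property="og:image"> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.og_image: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag == "meta" and self.og_image is None:
            attr = dict(attrs)
            if attr.get("property") == "og:image":
                self.og_image = attr.get("content")


def code_from_page(html: str) -> Optional[str]:
    """Extract a lowercased scene code from a page's og:image URL, if any."""
    parser = OgImageParser()
    parser.feed(html)
    if not parser.og_image:
        return None
    match = CODE_RE.search(parser.og_image)
    return match.group(0).lower() if match else None
```

A rebuild would then call code_from_page on each page listed in the sitemap and write the resulting mapping back to code_cache.json, which also resets the file's modification time.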
To manually rebuild the cache:
docker exec stash python /root/.stash/scrapers/TwinkLoads/TwinkLoads.py rebuildCache
Notes
- Twink Loads has no search API, so the cache is essential for code-based lookups
- The cache includes codes from tld0001 to tld0068 (with gaps in numbering)
- Performer images are extracted from lazy-loaded data-src attributes
- The scraper automatically converts scene codes to lowercase for consistency
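To illustrate the lazy-loading note: on pages that defer image loading, the src attribute usually holds a placeholder and the real URL sits in data-src. A minimal sketch of reading it with the standard library follows. The names are hypothetical, and the actual scraper presumably does this with beautifulsoup4.

```python
from html.parser import HTMLParser
from typing import List


class LazyImageParser(HTMLParser):
    """Collects image URLs, preferring data-src over src."""

    def __init__(self) -> None:
        super().__init__()
        self.images: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr = dict(attrs)
        # Lazy-loaded images keep the real URL in data-src;
        # fall back to src for images that load normally.
        url = attr.get("data-src") or attr.get("src")
        if url:
            self.images.append(url)


def image_urls(html: str) -> List[str]:
    """Return the image URLs found in an HTML fragment."""
    parser = LazyImageParser()
    parser.feed(html)
    return parser.images
```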
Attachments:
Important note! Rename code_cache.json.txt to code_cache.json after downloading (Discourse does not support uploading .json files).
- code_cache.json.txt (6.2 KB)
- TwinkLoads.yml (635 Bytes)
- TwinkLoads.py (12.2 KB)
Enjoy!