Stash-scrape-ci

:placard: Summary real-instance stash scraper validation
:link: Repository https://github.com/feederbox826/stash-scrape-ci

Presenting the greatest thing for scraper testing since sliced bread (in my biased opinion):

an automated way to test scrapers and share their results

  • uses the CommunityScrapers index to search for and automatically install scrapers
    • if more than one scraper covers the same URL, none is auto-installed
  • uses uv-py for (almost) seamless dependency installs
  • Images excluded since they are too large
  • 1d retention for failed runs, 7d retention for successful ones
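
The auto-install and retention rules in the list above can be sketched roughly as follows. This is only an illustration, not the actual stash-scrape-ci code: the index layout, the host-based matching, and the names `EXAMPLE_INDEX`, `pick_scraper` and `is_expired` are all assumptions made up for the example.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlparse

# Hypothetical index: scraper ID -> hosts it claims to cover.
# The real CommunityScrapers index may be structured differently.
EXAMPLE_INDEX = {
    "ExampleSiteA": ["site-a.example"],
    "ExampleNetwork": ["site-b.example", "site-c.example"],
    "ExampleSiteC": ["site-c.example"],
}


def pick_scraper(url, index):
    """Return the only scraper covering this URL, or None if zero or several match."""
    host = urlparse(url).hostname or ""
    matches = [scraper_id for scraper_id, hosts in index.items() if host in hosts]
    # Ambiguous matches are not auto-installed.
    return matches[0] if len(matches) == 1 else None


def is_expired(finished_at, succeeded, now):
    """Failed runs are kept for 1 day, successful runs for 7 days."""
    keep_for = timedelta(days=7 if succeeded else 1)
    return now - finished_at > keep_for


now = datetime(2025, 1, 10, tzinfo=timezone.utc)
print(pick_scraper("https://site-a.example/scene/1", EXAMPLE_INDEX))
print(pick_scraper("https://site-c.example/scene/1", EXAMPLE_INDEX))
print(is_expired(now - timedelta(days=2), succeeded=False, now=now))
```

Here `site-c.example` is claimed by two scrapers, so nothing is picked and the choice is left to the user.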

Demos

Edge Cases

Types:

Since people have been using it now: a quick user guide

Head to https://scrape.feederbox.cc/upload to create a scrape entry and let it run (progress bar coming soon :tm:). The authentication key only needs to be entered once; it is loaded from localStorage on subsequent runs. The backend uses a real stash instance in Docker, with https://discourse.stashapp.cc/t/http-s-proxy-for-backers/4071 as an HTTP(S) proxy. (Not running on the node itself due to CDP RAM requirements.)

Select the type of media you’d like to scrape and hit submit. Optionally, you can hit “Update Scrapers” to force all existing scrapers to update to the newest version (via CommunityScrapers); this is also triggered automatically when a scrape is submitted.

After a bit, you’ll get a shareable link that shows the process of scraping, along with all the information needed to debug the scrape. This includes:

  • Last 30 log entries
  • Error message (if any)
  • Scraper ID (detected from URL) and version (hash)
  • Stash version and git hash
  • Run date
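
For anyone curious how a report like this could be put together, here is a minimal sketch. It is not the service's real data model: the `ScrapeReport` class and its field names are hypothetical, and only the listed items (last 30 log entries, error, scraper ID and version, stash version and git hash, run date) come from the description above.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional

MAX_LOG_ENTRIES = 30  # only the last 30 log entries are kept


@dataclass
class ScrapeReport:
    # Hypothetical field names, chosen for this example only.
    scraper_id: str        # detected from the URL
    scraper_version: str   # hash of the scraper
    stash_version: str
    stash_git_hash: str
    run_date: str
    error: Optional[str] = None
    # A bounded deque drops the oldest entry once the limit is reached.
    logs: deque = field(default_factory=lambda: deque(maxlen=MAX_LOG_ENTRIES))

    def log(self, line):
        self.logs.append(line)


report = ScrapeReport(
    scraper_id="ExampleSiteA",
    scraper_version="abc1234",
    stash_version="v0.0.0",
    stash_git_hash="deadbeef",
    run_date="2025-01-10",
)
for i in range(45):
    report.log(f"line {i}")
print(len(report.logs), report.logs[0])
```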

This was not originally designed or intended as a service to download or scrape from, but in future it might be made free for backers.