Per-Studio Scraper Configuration

I have been working on an improved Clips4Sale scraper, and one of the features it supports is per-studio configuration! This allows a dependent scraper to specify a studio link so that scene-by-name searches can use that studio’s own search rather than the site-wide search (which usually returns a lot of unhelpful results). It also allows regexes to be specified that strip format and resolution metadata from titles, so that listings of the same clip in different formats/resolutions can be merged into one result!
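To make the title-merging idea concrete, here is a minimal Python sketch. It is not the code from my scraper: the patterns, the function names (`normalize_title`, `merge_results`), and the shape of the result dicts are all assumptions made up for illustration, and a real studio would need its own patterns.

```python
import re

# Hypothetical per-studio patterns (assumptions for illustration only).
# Each one strips a format or resolution tag, with or without brackets.
STRIP_PATTERNS = [
    r"\s*[\(\[]?\b(?:4K|HD|SD|\d{3,4}p)\b[\)\]]?",
    r"\s*[\(\[]?\b(?:MP4|WMV|MOV)\b[\)\]]?",
]


def normalize_title(title, patterns=STRIP_PATTERNS):
    """Remove every configured pattern, then tidy leftover spaces and dashes."""
    for pattern in patterns:
        title = re.sub(pattern, "", title, flags=re.IGNORECASE)
    return re.sub(r"\s+", " ", title).strip(" -")


def merge_results(results, patterns=STRIP_PATTERNS):
    """Keep the first result for each normalized title (case-insensitive)."""
    merged = {}
    for result in results:
        clean = normalize_title(result["title"], patterns)
        merged.setdefault(clean.lower(), {**result, "title": clean})
    return list(merged.values())


results = merge_results([
    {"title": "Example Clip (HD MP4)"},
    {"title": "Example Clip (SD WMV)"},
    {"title": "Another Clip"},
])
titles = [r["title"] for r in results]
```

With the three made-up titles above, the two "Example Clip" variants collapse into a single entry and "Another Clip" is left alone.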

The current method that I am using requires a separate scraper for each studio’s Clips4Sale page! This is rather tedious, and it requires the user to keep track of which scraper (in a potentially long list of them) they should be using for a given scene to get the best results! I also intend to contribute my scrapers to the community repository when I feel they are ready, but doing so now may have the unintended effect of artificially inflating the overall number of scrapers if others choose to contribute their own studio configurations!

The best solution for this, in my opinion, is what I call “Scrapelets”! Essentially, scrapers would be able to define a set of optional config parameters, which a user would be able to add to a given studio (similar to custom fields on a performer, but with a schema pre-defined by an installed scraper). Then, when the scraper is used on an asset (like a scene or image) from that studio, the corresponding Scrapelet would be passed along with the other data used for scraping! Perhaps there could even be a system in place for automatically acquiring community Scrapelets for your studios/scrapers!
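As a rough illustration of what I have in mind, here is a Python sketch of a Scrapelet schema and of how the matching Scrapelet could be attached to the scraper input. None of this exists today: the schema format, the field names (`studio_url`, `title_strip_patterns`), the `scrapelet` key, and both function names are hypothetical, and the URL is a placeholder.

```python
# Hypothetical schema that a scraper would declare (an assumption, not a real API).
SCRAPELET_SCHEMA = {
    "studio_url": {"type": "string"},
    "title_strip_patterns": {"type": "list"},
}

# Maps the schema's type names to Python types for checking.
TYPES = {"string": str, "list": list}


def validate_scrapelet(values, schema=SCRAPELET_SCHEMA):
    """Return a list of problems; an empty list means the Scrapelet is valid."""
    errors = []
    for key, value in values.items():
        if key not in schema:
            errors.append(f"unknown field: {key}")
        elif not isinstance(value, TYPES[schema[key]["type"]]):
            errors.append(f"{key}: expected {schema[key]['type']}")
    return errors


def build_scrape_input(scene, scrapelets_by_studio):
    """Attach the studio's Scrapelet (or an empty dict) to the scraper input."""
    scrapelet = scrapelets_by_studio.get(scene.get("studio_id"), {})
    return {**scene, "scrapelet": scrapelet}


# Example: one studio has a Scrapelet, so its scenes carry it along.
scrapelets = {
    "12": {
        "studio_url": "https://example.com/studio/12",
        "title_strip_patterns": [r"\(HD\)"],
    }
}
scrape_input = build_scrape_input({"title": "Example Clip", "studio_id": "12"}, scrapelets)
```

All fields are optional in this sketch, so a studio without a Scrapelet simply yields an empty dict and the scraper can fall back to its default behavior.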

An ever-increasing number of creators and studios are ditching their own websites in favor of centralized platforms like Clips4Sale, so it seems to me that a feature like this would become increasingly valuable as time goes on! I’m just one person, though, and I don’t know if other scraper devs would find it as useful as I would, so I would love to hear any thoughts or feedback you might have!

If it is determined that this suggestion is too complex or not useful enough to be implemented, then I just ask that sceneByName searches be modified to include basic info about the scene they are being run on (even just the database ID), as that would allow me to implement a rudimentary version of this proposal myself!
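For that fallback, the scraper-side logic could look something like the Python sketch below. It assumes the search input gains a `scene_id` field, which it does not have today, and it takes the studio lookup as a plain callable so the sketch stays self-contained; the names `config_for_search`, `lookup_studio`, and `studio_configs` are hypothetical.

```python
def config_for_search(search_input, lookup_studio, studio_configs):
    """Pick a per-studio config for a scene-by-name search, if one applies.

    search_input   -- dict passed to the scraper; "scene_id" is the assumed new field
    lookup_studio  -- callable mapping a scene ID to a studio name (or None)
    studio_configs -- dict of studio name -> config kept alongside the scraper
    """
    scene_id = search_input.get("scene_id")
    if scene_id is None:
        # No scene context was provided, so use the site-wide defaults.
        return {}
    return studio_configs.get(lookup_studio(scene_id), {})


# Example with stand-in data; a real lookup would query the local database.
scene_to_studio = {"101": "Example Studio"}
configs = {"Example Studio": {"studio_url": "https://example.com/studio/12"}}

config = config_for_search(
    {"name": "Example Clip", "scene_id": "101"},
    scene_to_studio.get,
    configs,
)
```

The per-studio configs would still have to be maintained by hand in this version, which is why I consider it only a rudimentary substitute for the full proposal.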