I have a bunch of scenes from the website Give Me Pink. The parent studio is listed as either Sapphix, SapphicErotica, or Perfect Gonzo; I'm not sure whether it matters which. I looked on StashDB and there are 82 scenes there, but I have a total of 338, so StashDB is missing 256 scenes.
The problem isn't really with the scenes, but with the performers. They are all given generic single names like "Alexis" or "Angel". ThePornDB apparently has 409 scenes under the Give Me Pink studio. My downloads are a bit old, so it's possible I'm just missing many newer ones. In any case, it seems like I should be able to use the performer data from ThePornDB to add them to StashDB, but how can I streamline that process?
I found a plugin called Stash Matched Performer Scrape which seems to correctly add all the ThePornDB data to the performer when I scrape the scene, but then how should I add the performer to StashDB? How do I ensure they don't already exist on StashDB under a different name?
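One way to sanity-check for existing entries before submitting anything: pull a list of candidate performers (for example from a StashDB name search) and compare the new performer's name and aliases against each candidate's name and aliases, case-insensitively. This is only a sketch of the comparison step; the dict field names here are illustrative, not StashDB's actual schema, and fetching the candidates is left out:

```python
def normalize(name):
    """Lowercase and collapse whitespace so 'Alexis ' and 'alexis' compare equal."""
    return " ".join(name.lower().split())

def find_possible_matches(new_name, new_aliases, candidates):
    """Return names of candidates whose name or aliases overlap the new performer's."""
    targets = {normalize(new_name)} | {normalize(a) for a in new_aliases}
    matches = []
    for cand in candidates:
        known = {normalize(cand["name"])} | {normalize(a) for a in cand.get("aliases", [])}
        if targets & known:
            matches.append(cand["name"])
    return matches

# Hypothetical candidates, e.g. the results of searching StashDB for "Alexis":
candidates = [
    {"name": "Alexis Crystal", "aliases": ["Alexis", "Anouk"]},
    {"name": "Angel Wicky", "aliases": []},
]
print(find_possible_matches("Alexis", [], candidates))  # → ['Alexis Crystal']
```

Any hit here is only a lead, not proof of a duplicate; with single names you would still verify against photos and scene lists before merging.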
Add the performers first with the site as the disambiguation; we can merge performers later if we find that they are duplicates.
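The convention above (single name plus site as disambiguation, merge later) might look like this in miniature. The field names and records are hypothetical, just mirroring StashDB's disambiguation idea:

```python
# Two single-name "Alexis" entries kept apart by site disambiguation;
# if they later turn out to be the same person, they can be merged.
performers = [
    {"name": "Alexis", "disambiguation": "Give Me Pink"},
    {"name": "Alexis", "disambiguation": "Sapphic Erotica"},
]

def display_name(p):
    """Render a performer as 'Name (Disambiguation)' when a disambiguation is set."""
    if p.get("disambiguation"):
        return f'{p["name"]} ({p["disambiguation"]})'
    return p["name"]

for p in performers:
    print(display_name(p))  # → Alexis (Give Me Pink), then Alexis (Sapphic Erotica)
```

The point of the convention is that the two entries stay distinguishable in search results until someone does the research to confirm they are the same performer.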
Indexxx is a good site for some of these performers. I would search for the site name, go to the text-based list of performers, and, if you are lucky, there will be an alias listed for the site.
From that you might find that they also appear on another site and someone has already added them to the database.
I will answer your question on Discord, as I have posts there that I can reference. I did a full download of Sapphix and Perfect Gonzo (Dev8) in Feb of 2022. This was my first automated studio download of metadata and content.
I quickly discovered that it is not practical to put my scraped data directly into StashDB, since I would first need to identify the performers.
I scraped Indexxx first and tried to link single-name performers to entries in StashDB. Single names are typical for Euro studios, and I had a few of these; you have to match performers like Alexis or Angel to existing entries somehow.
I spent a LOT of time thinking about this and hundreds of hours trying to solve it.