Requiring a primary source is tricky, because often that’s easier said than done. Scrapers break, paywalls go up, more studios start geoblocking… it’s a lot to keep track of. And if the scene’s been taken down, or if the website simply doesn’t exist anymore, you’d better hope you can find a snapshot on the Wayback Machine.
So unfortunately, cobbling together scraps from secondary sources like IAFD, Data18, Indexxx, and (yes) TPDB could be your best option. Depending on the situation, any one of those could be more useful than the others. Each one has its own strengths, weaknesses, and blind spots, so it’s going to be different for every scene and studio.
(Sidenote: We’ve discussed using the Ministry category here to track quirks for particular studios. Maybe it would be worth doing the same for secondary sources too. Strengths vs. weaknesses, tips for finding things quickly, quirks to look out for, that sort of thing.)
I don’t think we’ve ever established a firm consensus on this sort of thing. If the studio link is publicly accessible and easily scraped, then yes, I think it’s fair to expect that to be used as the primary source in a scene creation. And seeing those scene creations use a secondary source alone (typically TPDB) is frustrating when you can see all the eccentricities that introduces (weird casing, altered URLs, under-sized covers, missing tags, etc.), especially when you know fixing those could be as easy as clicking a single button.
But is that enough to justify a downvote? I’m not so sure. There’s no explicit guideline that’s being violated here. Maybe it falls under the umbrella of a “low-effort” submission. But like I said earlier, there’s a lot to keep track of. Are you sure they’re blindly scraping TPDB because they’re lazy? Or is there something else (broken scraper, paywall, geoblock) preventing them from scraping the studio directly? Because if it turns out there was no better option, then a downvote would be counter-productive. Do we expect OPs to know the difference? Do we expect voters to?
For me, I’ll only downvote if I can tell the data doesn’t match the primary source and they don’t say where the data came from instead. Because then the violation is actually an insufficient comment, not the use of a secondary source or a general lack of effort.
But if I know the studio link is scrapable and they comment something simple like “Scraped from TPDB” without any other explanation? Then I’ll leave an Abstain vote (ensures I get notifications), list any mismatched data I’ve noticed, and point out how scraping the studio link would’ve easily fixed all of that. Basically, I’ll resort to nagging them instead. That way it doesn’t block the edit with a downvote, doesn’t assume it’s either laziness or ignorance from the OP, provides a list of requests that are immediately actionable, and hopefully encourages higher quality edits moving forward.
Hopefully that’s helpful. I usually try not to dictate when voters “must” vote one way or another. Too much of the edit queue isn’t so black-and-white, requiring voters to be more flexible and use their best judgment. So without a hard guideline to fall back on, these are just my own thoughts on the matter. Whether you downvote as “low-effort”, upvote because it’s close enough, or (respectfully) annoy them in the comments is, to me, all a matter of opinion and personal preference.