I have a working SQL query that analyzes performer tag frequency in my local stash database. It retrieves performers along with:
- Total scenes they appear in
- Number of scenes tagged with a specific tag (like “Interracial”)
- The percentage of their scenes with that tag
The query groups by performer and sorts by scenes with the tag, helping identify performers most associated with specific tags.
Here’s my current query for local stash:
SELECT
p.id,
p.name,
COUNT(DISTINCT ps.scene_id) AS total_scenes,
COUNT(DISTINCT CASE WHEN st.tag_id = t.id THEN ps.scene_id END) AS scenes_with_tag,
ROUND(
100.0 * COUNT(DISTINCT CASE WHEN st.tag_id = t.id THEN ps.scene_id END) / COUNT(DISTINCT ps.scene_id), 0
) AS percentage
FROM
performers p
JOIN
performers_scenes ps ON p.id = ps.performer_id
LEFT JOIN
scenes_tags st ON ps.scene_id = st.scene_id
LEFT JOIN
tags t ON t.name = 'Interracial'
GROUP BY
p.id, p.name
HAVING
total_scenes > 0
ORDER BY
scenes_with_tag DESC;
How can I perform this same type of analysis using stash-box instead of my local database? I want to query the centralized stash-box instance to get similar insights about performer tag associations across the community dataset.