Let’s say Stash fails to auto-identify a scene, but I know the studio and the duration. Is there a way to view all scenes for a specific studio pulled from StashDB, sorted by duration, so I can quickly match my unknown scene manually?
Here are a couple of workflows I've thought of, though I'm not sure what's actually possible:
Is it possible to import all scene metadata for a studio from StashDB into StashApp, and then sort/filter those scenes by duration within StashApp?
Alternatively, does anyone have experience using the StashDB API and a custom script to fetch scenes for a given studio, then output a sorted list by duration for manual review?
Ideally, I just want a fast way to scan through all available scenes for a studio, ordered by duration, to narrow down likely matches.
Any advice, scripts, or tips are highly appreciated! Thanks!
How to List All Scenes for a Studio from StashDB Sorted by Duration
If Stash fails to auto-identify a scene but you know the studio and duration, you can use a script to pull all scenes for that studio from StashDB and sort them by duration. This streamlines manual matching significantly.
Recommended Workflow using CommunityScript
A user-provided script in the Stash CommunityScripts collection (referenced in the GitHub issue) enables you to:
Query StashBox or StashDB for all scenes associated with a given studio.
Retrieve relevant metadata, including scene duration.
Output a list of scenes sorted by duration, which can then be easily scanned for matches.
Example Approach
Use the API: The script interacts with the Stash or StashBox API to pull all scenes for a specified studio.
Parse and Sort: It gathers scene data and sorts the resulting list by duration in ascending or descending order.
Display or Export: Outputs the sorted list (typically in JSON or CSV), which you can examine or load into a spreadsheet for efficient searching.
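The parse-and-sort step above can be sketched in a few lines of plain Python, assuming the API has already returned a list of scene dicts with a `duration` field in seconds (the sample records here are illustrative, not real StashDB data; `duration` may be missing for some scenes):

```python
from datetime import timedelta

# Hypothetical scene records, shaped like StashDB query results;
# 'duration' is in seconds and may be None for some scenes.
scenes = [
    {"title": "Scene A", "duration": 1865},
    {"title": "Scene B", "duration": None},
    {"title": "Scene C", "duration": 2410},
]

# Sort descending by duration, treating missing durations as 0
# so they sink to the bottom of the list.
scenes.sort(key=lambda s: s["duration"] or 0, reverse=True)

for s in scenes:
    print(f"{timedelta(seconds=s['duration'] or 0)}  {s['title']}")
```

Printing durations as `timedelta` values gives readable `H:MM:SS` output that is easy to eyeball against your unknown file's runtime.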
Why This Works
Efficient Matching: By sorting scenes by duration, you can quickly cross-reference your unknown scene’s duration with possible matches.
Flexible Search: You can further filter by performer or other attributes if needed.
No Need for Manual Browsing: Eliminates repetitive searching in the Stash UI, saving substantial time.
Is This Script Available?
As of the latest discussion, this functionality exists as a feature suggestion or as community scripts/plugins, so you may need to write or adapt a script against the Stash GraphQL API yourself. Python and JavaScript are the most common choices for this. Community members have discussed sharing such scripts for batch metadata import, which would make searching and sorting scenes simpler.
Alternative Tips
Use a spreadsheet: Once scenes are exported (via the API or script), sorting and filtering by duration in Excel or Google Sheets is fast and user-friendly.
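As a sketch of the spreadsheet route, fetched scenes can be written to CSV with the standard library alone. The hardcoded list here stands in for whatever the API or script returns; the field names mirror the scene fields used in the script below:

```python
import csv

# Stand-in for scenes fetched from StashDB (see the full script below).
scenes = [
    {"id": "abc-123", "title": "Scene A", "release_date": "2021-04-01", "duration": 1865},
    {"id": "def-456", "title": "Scene B", "release_date": "2021-05-15", "duration": 2410},
]

# Write a CSV that Excel or Google Sheets can open and sort by the
# 'duration' column directly.
with open("scenes.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["id", "title", "release_date", "duration"])
    writer.writeheader()
    writer.writerows(scenes)
```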
Batch import metadata: Consider batch-import plugins or scripts if available, to automate the initial data-gathering phase.
Getting Started
Familiarize yourself with the Stash and StashDB GraphQL APIs.
Search the Community Scripts repo for existing scripts that match your workflow.
If one does not exist, adapt an existing script using the scene query and sorting logic described above.
This method significantly streamlines identification when only the studio and duration are known, and avoids the inefficiency of browsing scenes individually.
import requests
from typing import Dict, List, Optional
import logging
from datetime import timedelta

STASHBOX_GRAPHQL_ENDPOINT = "https://stashdb.org/graphql"
STASHBOX_API_KEY = ""
STUDIO_ID = ""

# Configure logging
logging.basicConfig(level=logging.DEBUG, format='%(asctime)s - %(levelname)s - %(message)s')


def execute_graphql_query(query: str, variables: Optional[Dict] = None) -> Dict:
    headers = {
        "Content-Type": "application/json",
        "ApiKey": STASHBOX_API_KEY
    }
    payload = {"query": query}
    if variables:
        payload["variables"] = variables
    logging.debug(f"Executing GraphQL query: {query}")
    logging.debug(f"Variables: {variables}")
    try:
        response = requests.post(STASHBOX_GRAPHQL_ENDPOINT, json=payload, headers=headers)
        response.raise_for_status()
    except requests.exceptions.RequestException as e:
        logging.error(f"Error executing GraphQL query: {e}")
        raise
    logging.debug(f"Response: {response.json()}")
    return response.json()["data"]


def get_studio_scenes(studio_id: str) -> List[Dict]:
    query = """
    query Scenes($input: SceneQueryInput!) {
        queryScenes(input: $input) {
            count
            scenes {
                id
                release_date
                title
                duration
            }
            __typename
        }
    }
    """
    page = 1
    all_scenes: List[Dict] = []  # Accumulates scenes across all pages
    while True:
        variables = {
            "input": {
                "direction": "DESC",
                "page": page,
                "parentStudio": studio_id,
                "per_page": 100,
                "sort": "DATE",
            }
        }
        logging.debug(f"Fetching scenes for page {page}")
        try:
            result = execute_graphql_query(query, variables)
        except Exception as e:
            logging.error(f"Error fetching scenes for page {page}: {e}")
            break
        scenes = result["queryScenes"]["scenes"]
        total_scenes = result["queryScenes"]["count"]
        logging.debug(f"Page {page}: Fetched {len(scenes)} scenes, total scenes: {total_scenes}")
        all_scenes.extend(scenes)  # Append the fetched scenes to the running list
        # Stop once every scene has been fetched
        if len(all_scenes) >= total_scenes:
            break
        page += 1
    return all_scenes


def format_duration(duration: int) -> str:
    return str(timedelta(seconds=duration))


def main() -> None:
    scenes = get_studio_scenes(STUDIO_ID)
    if not scenes:
        logging.warning("No scenes fetched")
        return
    # Sort by duration (longest first); missing durations count as 0
    scenes.sort(key=lambda x: x['duration'] or 0, reverse=True)
    for scene in scenes:
        print(
            f"Duration: {format_duration(scene['duration'] or 0)}, "
            f"Release Date: {scene['release_date']}, "
            f"Title: {scene['title']}, Scene ID: {scene['id']}"
        )


if __name__ == "__main__":
    main()
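Once the sorted list is available, the matching step itself can be narrowed automatically. This hypothetical helper (not part of the script above) filters the fetched scenes to those within a tolerance of your unknown file's duration, closest match first:

```python
def scenes_near_duration(scenes, target_seconds, tolerance_seconds=30):
    """Return scenes whose duration is within tolerance_seconds of the
    target, ordered closest-first. Scenes with no duration are skipped."""
    candidates = [
        s for s in scenes
        if s.get("duration") is not None
        and abs(s["duration"] - target_seconds) <= tolerance_seconds
    ]
    return sorted(candidates, key=lambda s: abs(s["duration"] - target_seconds))


# Example: the unknown file is 31 minutes 10 seconds long (1870 s).
sample = [
    {"title": "Scene A", "duration": 1865},
    {"title": "Scene B", "duration": 2410},
    {"title": "Scene C", "duration": 1880},
]
matches = scenes_near_duration(sample, target_seconds=31 * 60 + 10)
```

A tolerance of around 30 seconds absorbs small differences from intros, outros, or encoding trims while still keeping the candidate list short.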