Let’s say Stash fails to auto-identify a scene, but I know the studio and the duration. Is there a way to view all scenes for a specific studio pulled from StashDB, sorted by duration, so I can quickly match my unknown scene manually?
Here are a couple of workflows I've thought of, though I'm not sure what's actually possible:
Is it possible to import all scene metadata for a studio from StashDB into StashApp, and then sort/filter those scenes by duration within StashApp?
Alternatively, does anyone have experience using the StashDB API and a custom script to fetch scenes for a given studio, then output a sorted list by duration for manual review?
Ideally, I just want a fast way to scan through all available scenes for a studio, ordered by duration, to narrow down likely matches.
Any advice, scripts, or tips are highly appreciated! Thanks!
How to List All Scenes for a Studio from StashDB Sorted by Duration
If Stash fails to auto-identify a scene but you know the studio and duration, you can use a script to pull all scenes for that studio from StashDB and sort them by duration. This streamlines manual matching significantly.
Recommended Workflow using CommunityScript
A user-provided script in the Stash CommunityScripts collection (referenced in the GitHub issue) enables you to:
Query StashBox or StashDB for all scenes associated with a given studio.
Retrieve relevant metadata, including scene duration.
Output a list of scenes sorted by duration, which can then be easily scanned for matches.
Example Approach
Use the API: The script interacts with the Stash or StashBox API to pull all scenes for a specified studio.
Parse and Sort: It gathers scene data and sorts the resulting list by duration in ascending or descending order.
Display or Export: Outputs the sorted list (typically in JSON or CSV), which you can examine or load into a spreadsheet for efficient searching.
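The parse-and-sort step above can be sketched in a few lines of plain Python, assuming the API has already returned a list of scene dicts with a `duration` field in seconds (the sample records here are illustrative, not real StashDB data; `duration` may be missing for some scenes):

```python
from datetime import timedelta

# Hypothetical scene records, shaped like StashDB query results;
# 'duration' is in seconds and may be None for some scenes.
scenes = [
    {"title": "Scene A", "duration": 1865},
    {"title": "Scene B", "duration": None},
    {"title": "Scene C", "duration": 2410},
]

# Sort descending by duration, treating missing durations as 0
# so they sink to the bottom of the list.
scenes.sort(key=lambda s: s["duration"] or 0, reverse=True)

for s in scenes:
    print(f"{timedelta(seconds=s['duration'] or 0)}  {s['title']}")
```

Printing durations as `timedelta` values gives readable `H:MM:SS` output that is easy to eyeball against your unknown file's runtime.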
Why This Works
Efficient Matching: By sorting scenes by duration, you can quickly cross-reference your unknown scene’s duration with possible matches.
Flexible Search: You can further filter by performer or other attributes if needed.
No Need for Manual Browsing: Eliminates repetitive searching in the Stash UI, saving substantial time.
Is This Script Available?
As of the latest discussion, this functionality exists as a feature suggestion or as community scripts/plugins, so you may need to write or adapt a script against the Stash GraphQL API yourself. Python and JavaScript are the most common choices for this. Community members have discussed sharing such scripts for batch metadata import, which would make searching and sorting scenes simpler.
Alternative Tips
Use a spreadsheet: Once scenes are exported (via the API or script), sorting and filtering by duration in Excel or Google Sheets is fast and user-friendly.
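As a sketch of the spreadsheet route, fetched scenes can be written to CSV with the standard library alone. The hardcoded list here stands in for whatever the API or script returns; the field names mirror the scene fields used in the script below:

```python
import csv

# Stand-in for scenes fetched from StashDB (see the full script below).
scenes = [
    {"id": "abc-123", "title": "Scene A", "release_date": "2021-04-01", "duration": 1865},
    {"id": "def-456", "title": "Scene B", "release_date": "2021-05-15", "duration": 2410},
]

# Write a CSV that Excel or Google Sheets can open and sort by the
# 'duration' column directly.
with open("scenes.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["id", "title", "release_date", "duration"])
    writer.writeheader()
    writer.writerows(scenes)
```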
Batch import metadata: Consider batch-import plugins or scripts if available, to automate the initial data-gathering phase.
Getting Started
Familiarize yourself with the Stash and StashDB GraphQL APIs.
Search the Community Scripts repo for existing scripts that match your workflow.
If one does not exist, adapt an existing script using the scene query and sorting logic described above.
This method significantly streamlines identification when only the studio and duration are known, and avoids the inefficiency of browsing scenes individually.
import requests
from typing import Dict, List, Optional
import logging
from datetime import timedelta

STASHBOX_GRAPHQL_ENDPOINT = "https://stashdb.org/graphql"
STASHBOX_API_KEY = ""
STUDIO_ID = ""

# Configure logging
logging.basicConfig(level=logging.DEBUG, format='%(asctime)s - %(levelname)s - %(message)s')


def execute_graphql_query(query: str, variables: Optional[Dict] = None) -> Dict:
    headers = {
        "Content-Type": "application/json",
        "ApiKey": STASHBOX_API_KEY
    }
    payload = {"query": query}
    if variables:
        payload["variables"] = variables
    logging.debug(f"Executing GraphQL query: {query}")
    logging.debug(f"Variables: {variables}")
    try:
        response = requests.post(STASHBOX_GRAPHQL_ENDPOINT, json=payload, headers=headers)
        response.raise_for_status()
    except requests.exceptions.RequestException as e:
        logging.error(f"Error executing GraphQL query: {e}")
        raise
    logging.debug(f"Response: {response.json()}")
    return response.json()["data"]


def get_studio_scenes(studio_id: str) -> List[Dict]:
    query = """
    query Scenes($input: SceneQueryInput!) {
        queryScenes(input: $input) {
            count
            scenes {
                id
                release_date
                title
                duration
            }
            __typename
        }
    }
    """
    page = 1
    all_scenes: List[Dict] = []  # Accumulates scenes across all pages
    while True:
        variables = {
            "input": {
                "direction": "DESC",
                "page": page,
                "parentStudio": studio_id,
                "per_page": 100,
                "sort": "DATE",
            }
        }
        logging.debug(f"Fetching scenes for page {page}")
        try:
            result = execute_graphql_query(query, variables)
        except Exception as e:
            logging.error(f"Error fetching scenes for page {page}: {e}")
            break
        scenes = result["queryScenes"]["scenes"]
        total_scenes = result["queryScenes"]["count"]
        logging.debug(f"Page {page}: Fetched {len(scenes)} scenes, total scenes: {total_scenes}")
        all_scenes.extend(scenes)  # Append the fetched scenes to the running list
        # Stop once every scene has been fetched
        if len(all_scenes) >= total_scenes:
            break
        page += 1
    return all_scenes


def format_duration(duration: int) -> str:
    return str(timedelta(seconds=duration))


def main() -> None:
    scenes = get_studio_scenes(STUDIO_ID)
    if not scenes:
        logging.warning("No scenes fetched")
        return
    # Sort by duration (longest first); missing durations count as 0
    scenes.sort(key=lambda x: x['duration'] or 0, reverse=True)
    for scene in scenes:
        print(
            f"Duration: {format_duration(scene['duration'] or 0)}, "
            f"Release Date: {scene['release_date']}, "
            f"Title: {scene['title']}, Scene ID: {scene['id']}"
        )


if __name__ == "__main__":
    main()
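Once the sorted list is available, the matching step itself can be narrowed automatically. This hypothetical helper (not part of the script above) filters the fetched scenes to those within a tolerance of your unknown file's duration, closest match first:

```python
def scenes_near_duration(scenes, target_seconds, tolerance_seconds=30):
    """Return scenes whose duration is within tolerance_seconds of the
    target, ordered closest-first. Scenes with no duration are skipped."""
    candidates = [
        s for s in scenes
        if s.get("duration") is not None
        and abs(s["duration"] - target_seconds) <= tolerance_seconds
    ]
    return sorted(candidates, key=lambda s: abs(s["duration"] - target_seconds))


# Example: the unknown file is 31 minutes 10 seconds long (1870 s).
sample = [
    {"title": "Scene A", "duration": 1865},
    {"title": "Scene B", "duration": 2410},
    {"title": "Scene C", "duration": 1880},
]
matches = scenes_near_duration(sample, target_seconds=31 * 60 + 10)
```

A tolerance of around 30 seconds absorbs small differences from intros, outros, or encoding trims while still keeping the candidate list short.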