Forgive me for another long, rambling brain dump. Writing everything out like this helps me organize my own thoughts.
Concept Outline
I remember it taking me a while to wrap my head around the concept of recordings vs. tracks and release groups vs. releases on MusicBrainz, so depending on the implementation I think Echo is right that there’s a risk of over-complicating the average editing workflow. The more a system like that can sit in the background, easily ignored when it isn’t relevant to the current edit, the better, I think. You wouldn’t want it to either overwhelm and scare away new users or editors who don’t immediately grasp the concept, or to add extra layers of unnecessary tedium to otherwise straightforward edits.
But with that said, I think it’s still worth gaming out how specific we could get with our hierarchies here just to get a better view of the big picture. Think of this more as a concept map than a design outline. Then, we can figure out how to translate those concepts into features within Stash-Box. From smallest to largest, I think the full hierarchy would look something like this:
1. Release
   - The atomic unit, same as a “Scene” in Stash and Stash-Box
   - Digital vs. DVD, original vs. remaster
   - Different releases could feature different edits, even to the point of changing the content
   - UNIQUE DATA:
     - Release Date
     - Title
     - Scene Aliases
     - Description
     - Scene Cover
     - Distributor
       - Studio behind the release, not necessarily the same as the studio that originally shot it
     - Studio code
     - Links
     - Duration
       - Edits that share content may have different durations due to title sequences / credits
2. Cut / Edit / Version
   - Bundles together individual releases that share identical content, splitting up extended digital releases vs. truncated DVD releases, wet vs. dry, censored vs. uncensored, maybe even discrete scenes vs. compilations
   - Could label these as “Edits”, but that could be confusing since modifying data in stash-box is also an “edit”
   - UNIQUE DATA:
     - Performers
       - Possible that different cuts add, remove, or even replace performers
     - Scene Tags
       - Needs to be unique to each cut to reflect changes in content, like wet vs. dry, censored vs. uncensored, abridged scene vs. extended scene
3. Production / Shoot / Release Group
   - Bundles together all various releases and edits created from the same content
   - UNIQUE DATA:
     - Production Date
     - Director
     - Production Studio
       - Studio that originally made the scene, different from distributor, may not be possible to find in all cases
4. Group
   - Bundles together releases/edits that are somehow related to each other, such as a movie, series, etc.
   - UNIQUE DATA:
     - Group Tags
       - Movie, Series, Mini-Series, etc.
5. Collection
   - Bundles together multiple groups, such as Movie Series #1-10, or even multiple releases of the same movie (DVD vs. Blu-ray vs. VOD, etc.)
   - UNIQUE DATA:
     - Collection Tags
       - Movie Collection, Mini-Series Collection, etc.
Again, this is just a conceptual hierarchy to define the different tiers and outline the relationships between them, not a design recommendation for Stash-Box.
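To make the tiers a bit more concrete, here is a rough Python sketch of the outline above as nested data classes. Every class and field name is my own illustration and not anything that exists in Stash or Stash-Box. It also assumes that a Group holds Productions directly, which is one of the questions still open.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Release:
    """Tier 1: the atomic unit, equivalent to a Scene."""
    title: str
    release_date: Optional[date] = None
    distributor: Optional[str] = None
    studio_code: Optional[str] = None
    duration_seconds: Optional[int] = None


@dataclass
class Cut:
    """Tier 2: releases that share identical content."""
    label: str
    performers: list[str] = field(default_factory=list)
    tags: list[str] = field(default_factory=list)
    releases: list[Release] = field(default_factory=list)


@dataclass
class Production:
    """Tier 3: every cut and release created from the same shoot."""
    production_date: Optional[date] = None
    director: Optional[str] = None
    production_studio: Optional[str] = None  # may be unknown
    cuts: list[Cut] = field(default_factory=list)


@dataclass
class Group:
    """Tier 4: related content such as a movie or series.
    Holding Productions here is an assumption, not a settled answer."""
    name: str
    group_tags: list[str] = field(default_factory=list)
    productions: list[Production] = field(default_factory=list)


@dataclass
class Collection:
    """Tier 5: a bundle of groups, such as Movie Series #1-10."""
    name: str
    collection_tags: list[str] = field(default_factory=list)
    groups: list[Group] = field(default_factory=list)


# One shoot released as an extended digital cut and a truncated DVD cut.
digital = Cut("extended", tags=["uncensored"], releases=[Release("Scene (VOD)")])
dvd = Cut("truncated", tags=["censored"], releases=[Release("Scene (DVD)")])
shoot = Production(production_date=date(2020, 1, 15), cuts=[digital, dvd])
movie = Group("Example Movie", group_tags=["Movie"], productions=[shoot])
series = Collection("Example Movie Series", ["Movie Collection"], [movie])
```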
I didn’t go into as much detail with the last two concepts, Groups and Collections. Mostly I wanted to point out that there is precedent for having a group that contains other groups, but there are also several unanswered questions around how those would be handled. Would a Movie collect specific releases of its scenes? Or would it collect the Cuts / Productions / Release Groups? How would they handle Releases that include the entire movie in a single video? Do we have one, somewhat generic object that combines every version of a single Movie, or do we have separate objects for each variation? Depending on these answers, we could end up with even more tiers to the hierarchy, splitting out Movies from Movie Releases, etc.
Object Design
The biggest questions, of course, are how does all of this apply to Stash-Box and how will it link up with Stash? I’ll expand on each point later, but for now my recommendation would boil down to this:
- Use the same flexible Group concept from Stash to handle Movies, Series, etc. That would mean two-way hierarchies, Group Tags, and labels for parent-sub relationships. I don’t believe inheritance would be useful here.
- Create a stand-alone Production object, separate from Groups. These would be lightweight and only have fields for production date, director, and studio. Scenes attached to a Production would inherit its prod. date and director, but not the studio. We could start with one-way inheritance and expand on it later if necessary.
- Cuts would be handled as a separate object, if at all. They would also be lightweight, containing only tags and performers for attached scenes to inherit. The ability for scenes to share the same tags and performers automatically is really the only advantage this concept gives us. Less demand and higher stakes, so definitely lower priority compared to Groups and Productions.
- Inheritance should sit in the background as much as possible. Unlike the MusicBrainz model, scenes should be able to exist without an attached Production or Cut. Requiring editors to create or attach one to every new scene — or having Stash-Box create them automatically when missing — would create more confusion than necessary, especially since not every scene would benefit from an attached Cut or Production.
#1. Groups
I haven’t been able to spend much time with it yet, but Stash’s recent move from a limited “Movies” category to a more flexible “Groups” concept is the most obvious approach for Stash-Box to use for bundling scenes together. You can attach scenes, parent Groups, sub-Groups, or all three simultaneously. That flexibility plus the inclusion of Group Tags allows for a wide variety of uses for the same category of objects, while still clearly labeling and defining each particular usage. It shifts many decisions from questions of database design to questions of content moderation, while making sure the two platforms are still as closely aligned as possible.
The part where the flexible group concept breaks down for me is inheritance. The whole situation would get incredibly complicated from a moderation standpoint if there were no data inheritance baked in between levels of the hierarchy. But on the other hand, I’m sure it would also get incredibly complicated to implement an inheritance system on top of these generic, flexible Groups. So assuming the juice isn’t worth the squeeze (which is basically what Infinite said up top), then we need a different solution for the tiers of the hierarchy that need data inheritance.
Using my outline from earlier, the flexible Groups from Stash are still the best fit for Movies, Series, Mini-Series, Movie Collections, Mini-Series Collections, and basically any other kind of custom Playlist. From a moderation perspective, I expect StashDB will be fine with using one generic “Movie” object to bundle together the various releases of that movie, so I wouldn’t worry about needing a Production / Recording / Release Group concept to bundle Groups together as well.
Since “Release” is just another name for “Scene”, the only tiers left to address are “Productions” (serving the same function as a Release Group) and “Cuts” (representing different sets of content pulled from a single Production).
#2. Productions
Productions wouldn’t need to carry much metadata of their own. The only pieces of data that would be identical for all scenes from the same Production would be Production Date and Director. And since that relationship is guaranteed by definition, scenes could inherit both of them from their production.
The only other relevant field would be Studio. Unlike Production Date and Director, this field shouldn’t be inherited because not all scenes sharing the same Production would share the same studio. I referred to separate fields named Production Studio and Distribution Studio in the outline, but attaching different studios to a production vs. a scene is functionally the same thing.
Every other piece of data (title, duration, description, tags, performers, aliases) would depend on the particular release. We could add fields for some of these too, but they would likely be borrowed from the original release and wouldn’t be strictly necessary for the concept to work.
#3. Cuts
Cuts would carry a different set of data. Even though all scenes under the same Cut would share the same Production, not all scenes under a Production will share the same Cut.
Cuts are defined and differentiated by content, meaning tags and performers would be identical for every scene under the same Cut. The primary advantage would be the ability to keep those tags and performers in sync automatically. Without that inheritance in place, the feature wouldn’t be worth it.
For me, the two concepts would need to exist as separate object categories. My concept outline further up puts Cut on a tier below Production, but I don’t think that strict hierarchy needs to be reflected in the design. Instead, each scene could be added to a Production, a Cut, both, or neither. Inheritance of prod. dates and directors could be hard-coded into the Production concept, and inheritance of tags and performers could be hard-coded into Cuts.
Trying to combine both concepts into a single object — let’s call it a Release Group — sounds like a bigger headache to me. Sure, it would be simpler for situations where every scene released from the same Production shares the same content. A single Release Group could keep the prod. date, director, tags, and performers in sync. But if there is a difference in content, what do you do? You could still add every scene to the Release Group, but you’d have to be able to ensure that only the prod. date and director are inherited. And if you want the tags and performers to stay in sync too, you’d still need to create additional Release Groups for each Cut. At that point you’re doing the same work as if you had separate objects for Cuts and Productions, except now there are more inheritable fields and moving parts, making it easier to mess something up.
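As a minimal sketch of the separate-objects approach, assuming a scene simply stores two optional links: the set of inherited fields falls out of which links are present. The names `Scene`, `production_id`, `cut_id`, and `inherited_fields` are hypothetical and only here for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hard-coded per object type, as described above.
PRODUCTION_FIELDS = ("production_date", "director")
CUT_FIELDS = ("tags", "performers")


@dataclass
class Scene:
    title: str
    production_id: Optional[str] = None  # hypothetical link to a Production
    cut_id: Optional[str] = None         # hypothetical link to a Cut


def inherited_fields(scene: Scene) -> tuple[str, ...]:
    """Return the scene fields that are controlled by an attached object."""
    fields: tuple[str, ...] = ()
    if scene.production_id is not None:
        fields += PRODUCTION_FIELDS
    if scene.cut_id is not None:
        fields += CUT_FIELDS
    return fields


standalone = Scene("Original release")
redistribution = Scene("Redistribution", production_id="prod-1")
same_content = Scene("Same content, new logo", production_id="prod-1", cut_id="cut-1")
```

A combined Release Group object would need extra per-scene switches to get the middle case, where only the production date and director are shared.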
#4. Inheritance
So we have our flexible groups, we have our inflexible Productions and Cuts, and now we have to untangle the biggest knot in this design. Inheritance.
Again, ideally these meta-objects should sit in the background as much as possible. Scenes should be able to exist without requiring an attached Cut or Production, simplifying the creation process. It should be as intuitive as possible for editors to open the scene, make their edits, and move on without needing to understand how these concepts relate to each other. The system should be set up in a way that these considerations are handled for them, doing the work they may not be aware needs to be done, and preventing mistakes they haven’t learned to avoid.
If we don’t require scenes to have an underlying Production (a la MusicBrainz), I believe that also means we’re talking about a different type of inheritance. The MB model would have us automatically creating an underlying Production object for every existing scene. At the same time, any existing production dates would migrate out of the scene and into the Production. All production dates would now be attached to the production and not the scene. This could be described more as a relational inheritance process. The production date is only found in one place, the Production, and any time you need a scene’s production date you’re really asking for the date of the attached Production.
So, any time a scrape or a filter calls for the production date of a scene, the call would be redirected to the attached Production instead. Anytime someone wants to add or correct a scene’s production date, they would have to edit the date of the attached Production. And for any re-releases, we would have to link them together by essentially merging the two Productions together, which might not be the most intuitive process.
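Here is how I picture that relational model, as a small Python sketch. It is my own simplification with made-up names, not how MusicBrainz or Stash-Box actually store anything. The scene has no production date of its own, so every read is redirected to the mandatory Production.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Production:
    production_date: Optional[date] = None


@dataclass
class Scene:
    title: str
    production: Production  # mandatory in this model

    @property
    def production_date(self) -> Optional[date]:
        # The date lives in one place only; reads are redirected.
        return self.production.production_date


shared = Production(date(2019, 6, 1))
original = Scene("Original", shared)
rerelease = Scene("Re-release", shared)

# Correcting the date means editing the Production, never the scene.
shared.production_date = date(2019, 5, 20)
```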
Instead, I’m thinking about an inheritance system that actively submits edits to linked objects. If the data in Object B is inherited from Object A, then any time the data in Object A changes, an edit is submitted to make the same change to Object B. To say it a different way, we’re triggering two separate write operations, one for Object A and one for Object B.
That way, we would be able to continue saving production dates directly to the scene. Scenes could continue to exist without an attached Production. And we would only need to create a Production when we want to link the production dates of two scenes together. This process sounds more intuitive to me, at least from an editor’s perspective. You wouldn’t have to constantly deal with this new concept bolted onto the side of every scene. Instead, you’d be linking scenes together by adding them to a shared group, just like adding scenes to a Movie.
In the Stash-Box interface, this could look like a Merge edit. Somebody submits an edit to Production A. That edit appears in the queue, showing links to Release X, Release Y, and Release Z, all of which will inherit that data. Notifications would trigger for anyone following one of the affected objects. The edit would also appear in the edit history of every affected object. Once the edit passes, Stash-Box applies the new set of data to Production A as well as Releases X, Y, and Z.
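A rough sketch of that write-propagating flow, with the caveat that I'm guessing at the shape of everything here: `Edit`, `submit_edit`, and `apply_edit` are hypothetical names, and the real edit queue would involve voting and review that this skips entirely. The point is only that one approved edit results in writes to the Production and to each linked scene, while the studio stays un-inherited.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Any, Optional

INHERITED = {"production_date", "director"}  # studio is deliberately excluded


@dataclass
class Scene:
    title: str
    production_date: Optional[date] = None
    director: Optional[str] = None
    studio: Optional[str] = None
    edit_history: list = field(default_factory=list, repr=False, compare=False)


@dataclass
class Production:
    production_date: Optional[date] = None
    director: Optional[str] = None
    studio: Optional[str] = None
    scenes: list[Scene] = field(default_factory=list)
    edit_history: list = field(default_factory=list, repr=False, compare=False)


@dataclass
class Edit:
    target: Production
    changes: dict[str, Any]
    affected: list[Scene]  # shown in the queue as linked releases
    applied: bool = False


def submit_edit(production: Production, **changes: Any) -> Edit:
    """Queue an edit; nothing is written until it passes."""
    return Edit(production, changes, list(production.scenes))


def apply_edit(edit: Edit) -> None:
    """Write to the Production, then copy inherited fields to each scene."""
    for name, value in edit.changes.items():
        setattr(edit.target, name, value)
        if name in INHERITED:
            for scene in edit.affected:
                setattr(scene, name, value)
    for obj in (edit.target, *edit.affected):
        obj.edit_history.append(edit)
    edit.applied = True


x = Scene("Release X", studio="Distributor X")
y = Scene("Release Y", studio="Distributor Y")
solo = Scene("Standalone", production_date=date(2021, 3, 3))  # no Production needed
prod = Production(scenes=[x, y])

pending = submit_edit(prod, production_date=date(2020, 2, 2), studio="Studio A")
apply_edit(pending)
```

Because the date is still saved on each scene, anything that reads a scene's production date keeps working unchanged, and the standalone scene is untouched.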
Editing Workflow
Now that we have an understanding of how inheritance could work generally, how could this look in practice for our two new objects, Productions and Cuts?
Productions
Say you create a new scene. It appears to be an original release, so there’s no need to create or attach any Productions or Cuts to it. For now, it stands alone. This situation should be identical to the current state of Stash-Box. The production date, director, tag, and performer fields can be freely edited. Nothing is inherited from anywhere else.
A few months later, say someone adds a redistribution to the stash-box. The video has the same content as before, just a different production logo at the beginning. An editor notices this, creates a Production, and links both Releases to it. Now we have a few questions to answer.
What does the creation process for a Production look like?
I imagine it looks something like the Merge edit forms in Stash-Box. At the top, you type into a box to find and select the scenes that should be linked together. Below that, you have a few fields for production date, studio, and director. I think those three fields should automatically fill with suggestions based on the scenes added to the new Production. In the event of a conflict, it can grab the studio and director from the oldest release date, then ignore the order of release to grab the oldest available production date. The editor can use those suggestions as-is or write over them if necessary.
I also think the only two hard-coded requirements for creating a Production are that it contains at least two scenes, and that it contains a production date. Directors are only credited by a handful of studios, and the original production studio itself isn’t always known. For example, most Euro studios seem to license content from a few unnamed production houses, creating a ton of redistributions without a clear “original” release. To reflect that ambiguity, those Productions should probably leave the studio blank.
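The suggestion and validation rules from this answer could be sketched like this. Two assumptions of mine: every scene has a release date, and "from the oldest release" means the earliest release that actually has a value for that field. The function names are made up.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Scene:
    title: str
    release_date: date
    studio: Optional[str] = None
    director: Optional[str] = None
    production_date: Optional[date] = None


def suggest_production_fields(scenes: list[Scene]) -> dict:
    """Pre-fill the form; the editor can still overwrite any value."""
    by_release = sorted(scenes, key=lambda s: s.release_date)
    # Studio and director come from the earliest release that has one.
    studio = next((s.studio for s in by_release if s.studio), None)
    director = next((s.director for s in by_release if s.director), None)
    # The production date ignores release order: oldest known date wins.
    known_dates = [s.production_date for s in scenes if s.production_date]
    return {
        "studio": studio,
        "director": director,
        "production_date": min(known_dates) if known_dates else None,
    }


def validate_production(scenes: list[Scene], production_date: Optional[date]) -> list[str]:
    """Return the hard requirements that are not met (empty list = valid)."""
    problems = []
    if len(scenes) < 2:
        problems.append("a Production needs at least two scenes")
    if production_date is None:
        problems.append("a Production needs a production date")
    return problems


original = Scene("Original", date(2015, 3, 1), studio="Studio A")
redistribution = Scene(
    "Redistribution", date(2018, 7, 9),
    studio="Studio B", director="Jane Doe", production_date=date(2015, 1, 10),
)
suggestion = suggest_production_fields([redistribution, original])
```

Studio and director are not validated at all, which leaves room for the Euro-studio case where the Production's studio stays blank.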
How do Releases inherit data from the Production?
My first thought was that as soon as a scene is attached to a Production, the production date and director should be locked. Those fields don’t belong to the scene anymore, they belong to the Production. This makes the inheritance system cleaner, in my mind. Data only flows in one direction, downstream. Editors who are unaware of a scene’s connection to a Production, who might not even understand what a Production is, would be prevented from changing those values. Only editors who understand what a Production is and how it functions would know to edit the Production instead whenever those fields need updating.
But, that method wouldn’t be the most intuitive either. You could have users who recognize that a scene’s production date is inaccurate, try to correct it, and find that the stash-box won’t let them. They’ve been able to either learn or intuit how production dates work, but in order to apply that knowledge they must now figure out how Productions work as well. Does that overcomplicate the process?
The other option would be to leave those inherited fields unlocked. But in order to keep that data in sync with the Production and other Releases, that means the inheritance would have to flow both ways. Editing the production date of one scene would update that field in the connected Production, then the updated Production would pass it on to the other Releases.
Now I wouldn’t know from experience, but two-way inheritance sounds a lot harder to develop to me. It also raises the question, do we want it to be easier to edit one scene and affect multiple? That concern could be mitigated with a few simple safety measures though. An edit that modifies an inherited field could be considered destructive, lengthening the minimum amount of time spent in the queue. The same locking mechanism from before could also be made manual. For example, a mod could lock the production date within the Production and the production dates of all the Releases underneath it could be locked as a result.
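For what it's worth, here is a bare-bones sketch of the unlocked, two-way option with both safety measures. This is me guessing at the logic, not a claim about how hard it would be inside Stash-Box. `edit_scene` returns a flag that a queue could use to treat the edit as destructive, and `locked_fields` stands in for a manual mod lock. All names are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Any, Optional

INHERITED = {"production_date", "director"}


class LockedFieldError(Exception):
    """Raised when an edit targets a field a moderator has locked."""


@dataclass
class Production:
    production_date: Optional[date] = None
    director: Optional[str] = None
    locked_fields: set[str] = field(default_factory=set)
    scenes: list["Scene"] = field(default_factory=list)


@dataclass
class Scene:
    title: str
    production_date: Optional[date] = None
    director: Optional[str] = None
    production: Optional[Production] = field(default=None, repr=False, compare=False)


def attach(scene: Scene, production: Production) -> None:
    """Link a scene and copy the Production's values down to it."""
    production.scenes.append(scene)
    scene.production = production
    for name in INHERITED:
        setattr(scene, name, getattr(production, name))


def edit_scene(scene: Scene, name: str, value: Any) -> bool:
    """Apply an edit made on a scene. Returns True when the edit was
    destructive, meaning it also changed the Production and its scenes."""
    production = scene.production
    if production is None or name not in INHERITED:
        setattr(scene, name, value)
        return False
    if name in production.locked_fields:
        raise LockedFieldError(f"{name} is locked on this Production")
    setattr(production, name, value)  # upstream
    for linked in production.scenes:  # then back downstream
        setattr(linked, name, value)
    return True


prod = Production(production_date=date(2018, 4, 1))
a, b = Scene("Release A"), Scene("Release B")
attach(a, prod)
attach(b, prod)

destructive = edit_scene(a, "production_date", date(2018, 3, 15))
prod.locked_fields.add("production_date")  # a mod locks the field
```

The one-way, always-locked version is the same code with every inherited field treated as locked for scene-level edits.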
Cuts
The same considerations from before would apply here as well. Locking vs. unlocking, one-way vs. two-way, etc. The only difference here is that the inheritable fields would be tags and performers. In comparison, production dates and directors are much more niche than tags and performers.
So even though the unanswered questions and possible approaches are the same, the stakes feel higher. More users will be frustrated or confused if they can’t edit a scene’s tags or performers directly anymore. Conversely, it would be a bigger deal if an unwanted change to a scene’s tags or performers automatically changed those fields for several other scenes as well.
All of that plus lower demand (I believe this thread is the first time the idea’s come up) is why Cuts would be lower priority than Productions and Groups. This design leaves room for us to add them after Productions if we want — benefitting from any lessons learned from implementing the other feature first — but we could just as easily skip this concept entirely.
Well, that’s the write-up. If you’ve made it this far, thanks for sticking with me. I think this all makes sense, but without a background in software design I realize the whole idea could be built on top of false assumptions or unrealistic expectations. On the other hand I’ve been going back and forth on this for 2 or 3 days now — re-writing and re-arranging to try to make it easier to follow — so at least I can say it’s thorough if nothing else.