InTheCrack - Consistency & Consensus

While manually filling in StashDB gaps for InTheCrack scenes, I’ve noticed general inconsistency in the ways scenes are designated and titled. I’d like to draw attention to the different methods and get a sense for the consensus on how to organize these moving forward.

ITC is organized somewhat differently from other sites, in that each “session” with a model contains one set of photos and about 3-6 individual scenes with their own (cheesy ass) titles and descriptions, all wrapped up with a session title that includes the sequence number and model’s name. Most of these sessions are given a single entry in StashDB with a duration equal to the sum of the individual scenes and an aggregation of their descriptions. This session, for example, has both. I don’t see any way on ITC’s site to download the scenes of a session as one single video file.

The entry in the orange box is the parent session which the other pictured scenes derive from.

Sometimes scenes are given their own entries, sometimes some but not all scenes are combined, and sometimes only the parent session is given an entry.

When individual scenes are given an entry, their titles are given a variety of formats, some including reference to the parent session, others not.

More examples: A, B, C

It seems to me that individual scenes should each be given their own entry, as ITC differentiates between them with unique descriptions and download files, but that they should be grouped together as they share a gallery and are taken at roughly the same time.

The clearest naming format seems to be something like: session # model name scene #: scene title
So the Anastasia Knight scenes pictured above would be
1527 Anastasia Knight 1: Ditz and Ass
1527 Anastasia Knight 2: Buzzy Place to Knight
etc.

Are there any problems with this approach?

And, does anyone have a good way to automate it?


I agree with you that they probably should be separate entries, but a lot of usenet sources and some torrent sources are combining them now (where in the past they were usually separated).

While technically this violates how the site presents them and allows you to download them, not all sources provide them in individual formats.

Then you have files that have all individual scenes in them and you have to pick which of the n number of scenes you want to associate it with.

My guess is people have been combining them because it’s about 20-30 minutes in total.

I’m not sure people are going to be happy whichever way it ends up going. While separating out each one is the 1:1 accurate way based on the site’s presentation, certain sources are combining them and scraping them at a collection level, so having separate entries for each collection item will make others unhappy as well.

The only metadata that should be going to StashDB is the metadata that the studio provides.

Using the Anastasia Knight Collection example (“session” in the OP), there should be only four entries on StashDB, as InTheCrack released only four videos.

InTheCrack explicitly provides the video title and description, which StashDB submitters should be using (and not changing or combining).

In this Anastasia Knight example, the four videos all have the same PERFORMER, DATE, STUDIO, and URL. (The DATE field is found on the Collections page(s)).

These four videos have unique scene TITLEs and DETAILS, each provided by InTheCrack. In this case:

TITLE: Ditz and Ass.
DETAILS: Anastasia’s blue lingerie really stands out against the entirely all gray room. She comes off a little ditzy as she occasionally spouts cliche dirty talk but she is quite fascinating to watch as she continually moves around sometimes striking poses worthy of a gymnast.

TITLE: The 19 Sex Tease.
DETAILS: Anastasia continues to roll around on the rug while fully nude giving multiple revealing views of her private places.

TITLE: Fappy to Meat you.
DETAILS: Anastasia’s finger masturbation technique is quite unique and becomes quite intense after she gets wound up. Her fast machine-like hand movement generates some characteristic slapping noises while she masturbates. You definitely cannot accuse her of faking it as she really gets into it and has two very obvious moments of orgasm.

TITLE: Buzzy Place to Knight.
DETAILS: Continuing in the same vein Anastasia a couple more orgasms now using a vibrating toy. Her orgasms here are maybe not as intense in action but are probably more interesting with obvious muscle contractions and oozing pussy juice. By the end her vagina is soaked in slimy pussy goo which is beautifully displayed as she digs out strands of goo with her fingers.

The combined scenes should not be on StashDB at all as they are fan-edits of the studio releases.

Edited to add: Collecting the Collections, so to speak, into “scenes” or “movies” is best reserved for StashBox Groups.


I will attempt to update the scenes in StashDB. This can generate a lot of edits and a lot of new scenes. Here are some examples:

Edit: StashDB
New scenes: StashDB StashDB StashDB

Is it in StashDB’s interest to generate that many creates and edits?

I am very confused. The community seems to think that the scenes should be edited into one large scene. StashDB StashDB StashDB

This is clearly against the guidelines. Am I wrong?

From memory, the practice of adding the fan-compiled scenes / collections to StashDB started very early on. I don’t remember being part of that conversation myself; someone would have to dig through old Discord messages to find the specifics. I believe this was before the guidelines were written down anywhere, and before I had any kind of admin privileges. We were all just feeling things out as we went along back then, and the project was small enough that a lot of decisions were made on the fly after a brief conversation on Discord. This is where the term “working consensus” came in.

That “working consensus” still does a lot of heavy lifting for us, since a sizable portion of the guidelines still haven’t been formally approved. That’s what those “unconfirmed guideline” messages are all about. They carry the same weight as the “confirmed” guidelines, at least until they’re replaced by a formal vote down the line. Ultimately, StashDB is just too damn big to handle any other way. As comprehensive as I’ve tried to make the guidelines site, it will never cover every edge case, exception, and inconsistency. The best we can do is build a framework that covers most of the database — as best as we can tell anyway — and rely on these kinds of conversations to build that “working consensus” to fill in the gaps.

Okay, with that history lesson out of the way… what do we do now?

From my perspective, the compiled entries qualify as an exception to our typical requirements for scene eligibility. The working consensus has been in place for a long time, so there’s a lot of time and effort spent building this studio into the state it’s in today. We don’t want to erase that. Furthermore, splitting up those entries would run counter to a lot of the expectations users have for this studio, because of how StashDB’s handled these scenes up until now, because of how the studio’s fanbase shares these scenes with each other online, and because of how the studio’s website appears to package them together in everything but a compiled video. So removing / rewriting / banning these compiled entries is a non-starter for me. (@yekq that means reverting this edit and any others like it, by the way.)

However, I also don’t think that banning the split “chapters” is the right approach either. That would effectively give preference to users with fan-created and largely pirated files over paying subscribers. No, those clips are official and I don’t see why they shouldn’t be considered eligible for StashDB as well. I get it, it’s a lot of scenes, but they also have unique titles and covers to differentiate them at a glance.

That just leaves the thread’s original question about consistency. Honestly this is not a studio I’m particularly familiar with, so I’m not in a position to dictate a strict format for everything. Really all that matters to me is clarity and consistency. I think the compiled entries already have that covered for the most part, so I’ll move on to the individual chapters.

From what I’ve seen, these clips already have a specific order in addition to unique titles, covers, and descriptions. That gives us a lot to work with already. Of those, it’s really just the titles that aren’t consistent on StashDB yet. My first instinct is something along the lines of Collection's title #1/3 - Clip title. I’m not married to including the total number of clips here, but the use of a pound sign # and hyphen - should be easier to see at a glance than a single digit and colon :. Again, the exact formatting doesn’t matter to me much, so it may make more sense to lean towards whatever the dominant pattern is for any existing clip entries (if there is one) to save us time and effort, limit confusion, etc.

One question I had, I think I saw somewhere that even though every clip uses a single URL for the entire collection, the studio still releases each clip one at a time? I’m not sure what that would mean for release dates, but I would be interested in hearing more about how that works.

Either way, once we’ve established a set of studio-specific conventions, we’ll want to write up a summary as its own thread under the Ministry of Truth category. We still need to solidify a template of some kind for those posts, so for now any kind of structured rundown would be fine.


Thanks again for the detailed explanation and the historical context — that helps a lot, and I understand the role that long-standing working consensus plays here.

To clarify one of the open questions from above:
ITC does not release the individual clips simultaneously, nor on a fixed daily schedule.
Instead, the clips belonging to a collection are released sequentially over roughly the span of a week, but at irregular intervals. Even though all clips share a single collection URL on the website, new clips become available at different, non-uniform times.

From the studio’s perspective, the collection is only considered “complete” once the final clip has been released, and that date is what ITC effectively treats as the release date of the full collection.

At the same time, each individual clip still has:

  • its own title

  • its own cover

  • its own description

  • a clearly defined internal order

So while the website presentation groups them together, the clips themselves are clearly distinguishable releases with their own identity. However, the release date shown on the website changes each time a new clip is released, so it always reflects the most recent clip.


Clarification on coexistence

Before going further, I’d like to explicitly confirm one core point to avoid future confusion:

Is the intended approach that both the compiled collection entries and the individual clip scenes should coexist in StashDB?
That is, compiled entries remain as long-standing exceptions under the working consensus, while the officially released individual clips are also treated as eligible scenes and maintained in parallel.

If this dual representation is the expected end state, I’m completely fine aligning future edits accordingly — I just want to be sure this is the shared understanding.


Naming proposal for individual clips

Based on your feedback regarding clarity and consistency, I’d suggest the following studio-specific naming convention:

Collection Title #X – Clip Title

Example based on InTheCrack:

  • 461 Adrienne Manning #1 – If the Suit Fits Wear it

  • 461 Adrienne Manning #2 – A Smoothie in the Bender

  • 461 Adrienne Manning #3 – Ass knot what you can do for your Cuntry

  • 461 Adrienne Manning #4 – Twat’s open Round here?

  • 461 Adrienne Manning #5 – Just Pooling Around

Rationale:

  • #X makes ordering immediately visible at a glance

  • The hyphen clearly separates structural context from the clip’s actual title

  • It scales well for larger collections

  • It avoids unnecessary complexity in the title itself

I would personally avoid including the total number of clips (e.g. #1/3), unless there’s a strong preference for that, as it’s not strictly required for ordering and may add maintenance overhead.

Yes, this is my recommendation. Keep both compiled collections and separate chapters on StashDB. And moving forward, new collections and new chapters can both be added to StashDB.

If this approach still proves to be controversial, I would need a formally approved guideline to endorse a new approach. Otherwise, this is my interpretation of the current guidelines and consensus.

So to clarify, the studio displays the release date of the most recent clip? That’s at least more helpful than the alternative, inheriting the first clip’s release date and never updating it.

Again, I’m not super familiar with the studio or its website, so I’m not sure how easy this will be to pull off. But, this sounds like it should be possible to use unique release dates for each of the individual clips, instead of just inheriting whatever date the collection uses and calling that good enough.

How difficult would that be to manage, particularly for older clips? Would it require digging through Wayback Machine snapshots and social media posts, or is there a more convenient source somewhere that tracks the release dates of individual clips?

We can also use the Missing Date tag to mark any clips when we can’t find the individual release date.

This looks good to me. I agree with removing the total number of clips too. Since they release each clip separately over time, I can see how it could be difficult to know how many clips are planned before the collection is complete. Either way, I don’t think it’s necessary to include it.

For the clip titles, I also agree that we should try to preserve their idiosyncratic capitalization. We do the same for normal scene titles after all, so this shouldn’t be any different. However, I would also extend that to preserving the odd periods . they like to attach to the end. Your example helpfully includes a clip that uses a question mark ? instead. Since they didn’t tack on an unnecessary period after it, it makes their punctuation look more like intentional choices instead of some weird quirk of their HTML formatting or whatever. Small issue, I know, just something I noticed.

Speaking of small issues… I don’t know if this was intentional, but the dash should probably stay as a small hyphen - instead of the mid-sized en dash you seem to have used here. While I agree that it looks cleaner with a larger dash, that’s going to take a lot of extra work to maintain consistency across the large number of scenes it will affect. A standard hyphen - is a much easier ask since it’s actually on a standard English keyboard, unlike the en dash and em dash. (I have a simple AutoHotkey script to make it easier, but I don’t expect every StashDB editor to do the same.)

So with those notes in mind, I would describe the format as Collection Title #X - Clip Title, which in practice would look like this:

  • 461 Adrienne Manning #1 - If the Suit Fits Wear it.
  • 461 Adrienne Manning #2 - A Smoothie in the Bender.
  • 461 Adrienne Manning #3 - Ass knot what you can do for your Cuntry.
  • 461 Adrienne Manning #4 - Twat’s open Round here?
  • 461 Adrienne Manning #5 - Just Pooling Around.
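For anyone who wants to script this, here is a rough sketch of the format in Python. The function names are made up for illustration and nothing here is tied to an existing scraper or to StashDB itself. It only builds the string and, as a second helper, swaps an en dash or em dash separator back to a plain hyphen. Capitalization and trailing punctuation in the clip title are left exactly as given.

```python
import re


def format_clip_title(collection_title: str, clip_number: int, clip_title: str) -> str:
    """Build 'Collection Title #X - Clip Title'.

    Only surrounding whitespace is trimmed; the studio's capitalization
    and trailing punctuation are kept as-is.
    """
    return f"{collection_title.strip()} #{clip_number} - {clip_title.strip()}"


# An en dash (U+2013) or em dash (U+2014) used as the separator after '#<number>'.
_DASH_SEPARATOR = re.compile(r"(#\d+)\s*[\u2013\u2014]\s*")


def normalize_separator(title: str) -> str:
    """Replace an en/em dash separator after '#X' with a plain hyphen."""
    return _DASH_SEPARATOR.sub(r"\1 - ", title, count=1)


print(format_clip_title("461 Adrienne Manning", 5, "Just Pooling Around."))
print(normalize_separator("461 Adrienne Manning #5 \u2013 Just Pooling Around."))
```

Both calls print `461 Adrienne Manning #5 - Just Pooling Around.`. Titles that already use a plain hyphen pass through `normalize_separator` unchanged.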


InTheCrack.pdf (1.7 MB)

Regrettably ITC overrides release dates each time they release a clip. I have included an image and a PDF to illustrate this and show where information can be scraped.

I agree with the use of a small hyphen - for the reasons you stated above. It is also logical to include the period.


I don’t know how the current scraper works, or what’s possible to grab, but I noticed a few things while inspecting the webpage.

Again, I don’t know what the current status or limitations are for the scraper, but this at least looks promising for grabbing clip-specific details.

I discovered that the Postman tool can provide the following information. Although I am not a coder, the process was somewhat complex. I had to examine the source of my Safari browser while the set was open to retrieve the curl command. Subsequently, I was able to extract the bearer token required for the ITC API. This enabled me to obtain a response via Postman. I am unable to determine how to integrate this into a scraper. Here is what I could extract so far:

{
    "id": 408,
    "title": "Adrienne Manning",
    "description": "There's not too much hardcore action in this video because that's not really her thing but few girls can look at sexy and sophisticated as Adrienne just posing nude. This is a great video for softcore tease and beauty with plentiful inthecrack style close up pussy and ass viewing mixed in.",
    "active": true,
    "regionId": 1,
    "shootDate": "2009-12-04",
    "shootLocation": "Los Angeles",
    "showOnWebsite": true,
    "clips": [
        {
            "id": 2628,
            "thumbnail": "jpg",
            "active": true,
            "description": "Adrienne looks really elegant in one of the most unique and sexy dresses we have ever seen. The rear view is quite stunning with her lacy beige panties visible just under her hemline. There's a super sexy ass tease focusing on the view up her dress with smooth butt cheeks bouncing inside her beautiful form fitting panties. The full body front views are also stunning with very elegant posing against the backdrop of a sexy bachelor pad setting.",
            "length": 507,
            "quality": 1,
            "releaseDate": "2010-04-25",
            "scene": 1,
            "title": "Is Knit Nice?",
            "videos": [
                {
                    "directory": "408",
                    "filename": "408_01_isknitnice1280x720.mp4",
                    "mb": 261,
                    "videoResolutionId": 3
                },
                {
                    "directory": "408",
                    "filename": "408_01_isknitnice1920x1080.mp4",
                    "mb": 515,
                    "videoResolutionId": 4
                },
                {
                    "directory": "408",
                    "filename": "408_01_isknitnice640x360.mp4",
                    "mb": 103,
                    "videoResolutionId": 1
                }
            ]
        },
        {
            "id": 2629,
            "thumbnail": "jpg",
            "active": true,
            "description": "This clip is all just nude posing with multiple views and angles focusing on Adrienne's beautiful pussy and ass. She has a little bit of a knot on her ass hole though otherwise she is completely flawless. At first we get some low angle views both front and back and then she gets up on the table in doggy style and then squatting front view with her ass hanging in mid air. The ass squirming near the end is super sexy with her silky smooth undercarriage hovering in front of your face.",
            "length": 649,
            "quality": 1,
            "releaseDate": "2010-04-26",
            "scene": 1,
            "title": "The New Eye Pad.",
            "videos": [
                {
                    "directory": "408",
                    "filename": "408_02_theneweyepad1280x720.mp4",
                    "mb": 299,
                    "videoResolutionId": 3
                },
                {
                    "directory": "408",
                    "filename": "408_02_theneweyepad1920x1080.mp4",
                    "mb": 660,
                    "videoResolutionId": 4
                },
                {
                    "directory": "408",
                    "filename": "408_02_theneweyepad640x360.mp4",
                    "mb": 132,
                    "videoResolutionId": 1
                }
            ]
        },
        {
            "id": 2630,
            "thumbnail": "jpg",
            "active": true,
            "description": "Adrienne lies back on the table to use her black vibrating dildo. We don't believe there's any real orgasm here. It's probably a fake but she at least looks great doing it. There's some very nice pussy spreading at the end of this clip.",
            "length": 462,
            "quality": 1,
            "releaseDate": "2010-04-26",
            "scene": 1,
            "title": "Manning the Dick.",
            "videos": [
                {
                    "directory": "408",
                    "filename": "408_03_manningthedick1280x720.mp4",
                    "mb": 231,
                    "videoResolutionId": 3
                },
                {
                    "directory": "408",
                    "filename": "408_03_manningthedick1920x1080.mp4",
                    "mb": 470,
                    "videoResolutionId": 4
                },
                {
                    "directory": "408",
                    "filename": "408_03_manningthedick640x360.mp4",
                    "mb": 94,
                    "videoResolutionId": 1
                }
            ]
        },
        {
            "id": 2631,
            "thumbnail": "jpg",
            "active": true,
            "description": "Adrienne goes for a swim in the beautiful indoor swimming pool. It's quite softcore with lots of excellent posing and one really nice pussy and ass close up when she lifts her ass out of the water. She is like a vision from a James Bond movie as she floats around in the water with her hair and make up perfect in spite of what she is doing.",
            "length": 550,
            "quality": 1,
            "releaseDate": "2010-04-27",
            "scene": 1,
            "title": "Wet Dreams.",
            "videos": [
                {
                    "directory": "408",
                    "filename": "408_04_wetdreams1280x720.mp4",
                    "mb": 284,
                    "videoResolutionId": 3
                },
                {
                    "directory": "408",
                    "filename": "408_04_wetdreams1920x1080.mp4",
                    "mb": 559,
                    "videoResolutionId": 4
                },
                {
                    "directory": "408",
                    "filename": "408_04_wetdreams640x360.mp4",
                    "mb": 112,
                    "videoResolutionId": 1
                }
            ]
        }
    ],
    "galleryImages": [
        {
            "filename": "408_001.jpg",
            "imageType": 1,
            "directory": "408 Adrienne Manning/main/"
        },
        {
            "filename": "408_001.jpg",
            "imageType": 2,
            "directory": "408 Adrienne Manning/main_large/"
        },
        {
            "filename": "408_002.jpg",
            "imageType": 1,
            "directory": "408 Adrienne Manning/main/"
        },
        {
            "filename": "408_002.jpg",
            "imageType": 2,
            "directory": "408 Adrienne Manning/main_large/"
        },
        {
            "filename": "408_003.jpg",
            "imageType": 1,
            "directory": "408 Adrienne Manning/main/"
        },
        {
            "filename": "408_003.jpg",
            "imageType": 2,
            "directory": "408 Adrienne Manning/main_large/"
        },
        {
            "filename": "408_004.jpg",
            "imageType": 1,
            "directory": "408 Adrienne Manning/main/"
        },
        {
            "filename": "408_004.jpg",
            "imageType": 2,
            "directory": "408 Adrienne Manning/main_large/"
        },
        {
            "filename": "408_005.jpg",
            "imageType": 1,
            "directory": "408 Adrienne Manning/main/"
        },
        {
            "filename": "408_005.jpg",
            "imageType": 2,
            "directory": "408 Adrienne Manning/main_large/"
        },
        {
            "filename": "408_006.jpg",
            "imageType": 1,
            "directory": "408 Adrienne Manning/main/"
        },
        {
            "filename": "408_006.jpg",
            "imageType": 2,
            "directory": "408 Adrienne Manning/main_large/"
        },
        {
            "filename": "408_007.jpg",
            "imageType": 1,
            "directory": "408 Adrienne Manning/main/"
        },
        {
            "filename": "408_007.jpg",
            "imageType": 2,
            "directory": "408 Adrienne Manning/main_large/"
        },
        {
            "filename": "408_008.jpg",
            "imageType": 1,
            "directory": "408 Adrienne Manning/main/"
        },
        {
            "filename": "408_008.jpg",
            "imageType": 2,
            "directory": "408 Adrienne Manning/main_large/"
        },
        {
            "filename": "408_009.jpg",
            "imageType": 1,
            "directory": "408 Adrienne Manning/main/"
        },
        {
            "filename": "408_009.jpg",
            "imageType": 2,
            "directory": "408 Adrienne Manning/main_large/"
        },
        {
            "filename": "408_010.jpg",
            "imageType": 1,
            "directory": "408 Adrienne Manning/main/"
        },
        {
            "filename": "408_010.jpg",
            "imageType": 2,
            "directory": "408 Adrienne Manning/main_large/"
        },
        {
            "filename": "408_011.jpg",
            "imageType": 1,
            "directory": "408 Adrienne Manning/main/"
        },
        {
            "filename": "408_011.jpg",
            "imageType": 2,
            "directory": "408 Adrienne Manning/main_large/"
        },
        {
            "filename": "408_012.jpg",
            "imageType": 1,
            "directory": "408 Adrienne Manning/main/"
        },
        {
            "filename": "408_012.jpg",
            "imageType": 2,
            "directory": "408 Adrienne Manning/main_large/"
        },
        {
            "filename": "408_013.jpg",
            "imageType": 1,
            "directory": "408 Adrienne Manning/main/"
        },
        {
            "filename": "408_013.jpg",
            "imageType": 2,
            "directory": "408 Adrienne Manning/main_large/"
        },
        {
            "filename": "408_014.jpg",
            "imageType": 1,
            "directory": "408 Adrienne Manning/main/"
        },
        {
            "filename": "408_014.jpg",
            "imageType": 2,
            "directory": "408 Adrienne Manning/main_large/"
        },
        {
            "filename": "408_015.jpg",
            "imageType": 1,
            "directory": "408 Adrienne Manning/main/"
        },
        {
            "filename": "408_015.jpg",
            "imageType": 2,
            "directory": "408 Adrienne Manning/main_large/"
        },
        {
            "filename": "408_016.jpg",
            "imageType": 1,
            "directory": "408 Adrienne Manning/main/"
        },
        {
            "filename": "408_016.jpg",
            "imageType": 2,
            "directory": "408 Adrienne Manning/main_large/"
        },
        {
            "filename": "408_017.jpg",
            "imageType": 1,
            "directory": "408 Adrienne Manning/main/"
        },
        {
            "filename": "408_017.jpg",
            "imageType": 2,
            "directory": "408 Adrienne Manning/main_large/"
        },
        {
            "filename": "408_018.jpg",
            "imageType": 1,
            "directory": "408 Adrienne Manning/main/"
        },
        {
            "filename": "408_018.jpg",
            "imageType": 2,
            "directory": "408 Adrienne Manning/main_large/"
        },
        {
            "filename": "408_019.jpg",
            "imageType": 1,
            "directory": "408 Adrienne Manning/main/"
        },
        {
            "filename": "408_019.jpg",
            "imageType": 2,
            "directory": "408 Adrienne Manning/main_large/"
        },
        {
            "filename": "408_020.jpg",
            "imageType": 1,
            "directory": "408 Adrienne Manning/main/"
        },
        {
            "filename": "408_020.jpg",
            "imageType": 2,
            "directory": "408 Adrienne Manning/main_large/"
        },
        {
            "filename": "408_021.jpg",
            "imageType": 1,
            "directory": "408 Adrienne Manning/main/"
        },
        {
            "filename": "408_021.jpg",
            "imageType": 2,
            "directory": "408 Adrienne Manning/main_large/"
        },
        {
            "filename": "408_022.jpg",
            "imageType": 1,
            "directory": "408 Adrienne Manning/main/"
        },
        {
            "filename": "408_022.jpg",
            "imageType": 2,
            "directory": "408 Adrienne Manning/main_large/"
        },
        {
            "filename": "408_023.jpg",
            "imageType": 1,
            "directory": "408 Adrienne Manning/main/"
        },
        {
            "filename": "408_023.jpg",
            "imageType": 2,
            "directory": "408 Adrienne Manning/main_large/"
        },
        {
            "filename": "408_024.jpg",
            "imageType": 1,
            "directory": "408 Adrienne Manning/main/"
        },
        {
            "filename": "408_024.jpg",
            "imageType": 2,
            "directory": "408 Adrienne Manning/main_large/"
        },
        {
            "filename": "408_025.jpg",
            "imageType": 1,
            "directory": "408 Adrienne Manning/main/"
        },
        {
            "filename": "408_025.jpg",
            "imageType": 2,
            "directory": "408 Adrienne Manning/main_large/"
        },
        {
            "filename": "408_026.jpg",
            "imageType": 1,
            "directory": "408 Adrienne Manning/main/"
        },
        {
            "filename": "408_026.jpg",
            "imageType": 2,
            "directory": "408 Adrienne Manning/main_large/"
        },
        {
            "filename": "408_027.jpg",
            "imageType": 1,
            "directory": "408 Adrienne Manning/main/"
        },
        {
            "filename": "408_027.jpg",
            "imageType": 2,
            "directory": "408 Adrienne Manning/main_large/"
        },
        {
            "filename": "408_028.jpg",
            "imageType": 1,
            "directory": "408 Adrienne Manning/main/"
        },
        {
            "filename": "408_028.jpg",
            "imageType": 2,
            "directory": "408 Adrienne Manning/main_large/"
        },
        {
            "filename": "408_029.jpg",
            "imageType": 1,
            "directory": "408 Adrienne Manning/main/"
        },
        {
            "filename": "408_029.jpg",
            "imageType": 2,
            "directory": "408 Adrienne Manning/main_large/"
        },
        {
            "filename": "408_030.jpg",
            "imageType": 1,
            "directory": "408 Adrienne Manning/main/"
        },
        {
            "filename": "408_030.jpg",
            "imageType": 2,
            "directory": "408 Adrienne Manning/main_large/"
        },
        {
            "filename": "408_031.jpg",
            "imageType": 1,
            "directory": "408 Adrienne Manning/main/"
        },
        {
            "filename": "408_031.jpg",
            "imageType": 2,
            "directory": "408 Adrienne Manning/main_large/"
        },
        {
            "filename": "408_032.jpg",
            "imageType": 1,
            "directory": "408 Adrienne Manning/main/"
        },
        {
            "filename": "408_032.jpg",
            "imageType": 2,
            "directory": "408 Adrienne Manning/main_large/"
        },
        {
            "filename": "408_033.jpg",
            "imageType": 1,
            "directory": "408 Adrienne Manning/main/"
        },
        {
            "filename": "408_112.jpg",
            "imageType": 2,
            "directory": "408 Adrienne Manning/main_large/"
        }
    ],
    "models": [
        3
    ],
    "zipFiles": [
        {
            "filename": "408 Adrienne Manning Small.zip",
            "mb": 17
        },
        {
            "filename": "408 Adrienne Manning Large.zip",
            "mb": 64
        }
    ]
}

I had to delete some entries as I exceeded the character limit for Discourse.

There is a lot of useful data here actually, with some stuff that isn’t even displayed in the public page:

  • Clip release dates
  • Clip ID numbers
  • Full-size cover images for clips
  • Collection’s production date (labelled “shootDate”)
  • Unique collection descriptions? (I don’t see these listed anywhere else)
    Never mind, I was looking at the earlier example, which didn’t have one. These are also displayed on the webpage.

That’s all in addition to the stuff we can already copy by hand, like collection ID, collection title, clip titles, clip descriptions, etc.

I’ll forward this to the Scrapers channel in Discord and see what they can do with it. One outstanding question is still how to scrape the details from just a single clip since we only have a URL for the whole collection.

Another question: I noticed that all of these clips are labelled as “scene 1”, which is also displayed on the webpage above the list of clips. Do you have any examples of a collection with a “scene 2”? How are those typically handled in StashDB?

I believe the scene label was discontinued in 2008. I only have access to older collections where multiple scenes are present. An example from StashDB is: StashDB

I have included the corresponding page as a PDF and the API response for that collection.

105.pdf (3.1 MB)

105.txt (57.3 KB)

From what you’re saying here and what I’m seeing poking around early StashDB releases, it’s probably best to just ignore the “scene” numbers then, at least for the individual clips.

It only affects a small set of releases from the site’s first few years, seems to signify only a change in location (a different room) while filming, and would complicate the naming scheme. Plus, “scene” means something different in the context of Stash, so it could be confusing.

This early release’s description (linked below) would be as far as I’d go with incorporating them. They’re essentially just used as headings above the clip titles and descriptions. Right now the scraper ignores those scene numbers entirely, but I don’t think it would hurt to include them in a new API scraper’s code when scraping the full collection.

I was just discussing in Discord how the scraper could work and noticed that TPDB has been scraping the individual clips too.

Their formatting is different from what I would prefer, and from what would make the most sense for StashDB, but there are still a few points worth noting:

  • Clips are separated into their own sub-studio
  • Studio codes are a combination of the collection’s ID number and the clip’s ID number
  • Shoot dates are included in the description, along with other information

I’m not a huge fan of the changes they’ve made to the clip titles and descriptions, but the sub-studio and studio codes could be worth adding to the scraper. TPDB could also be a more convenient source for adding those production dates to StashDB compared to re-scraping each time.

Example: ThePornDB

While browsing ITC, I noticed that downloading the clip information directly from the website via a browser is not particularly difficult. If it is helpful, I could download all 2022 clip metadata files over the course of the next few days.

One potential issue for an automated scraper is that it likely needs an authenticated session to access this information. I could also extract the data via a script; in that case, I would just need to know which target format is preferred and how this data should be imported, either into my local Stash instance or directly into StashDB.

Splitting this into a separate studio feels somewhat odd to me, but I do see the benefit of having a clearer distinction between community-compiled scenes and the official ITC clips. As a possible approach, I would propose that the compiled scenes get their own studio (by renaming the current one), and that a new studio be created specifically for the clips.

For the clip-based studio, I would suggest the following mappings:

  • Studio code: id.clips.id
  • Release date: id.clips.releaseDate
  • Scene cover:
    https://api.inthecrack.com/FileStore/images/posters/clips/<filename>-<resolution>.jpg
  • Description structure:
    • Shoot Location: id.shootingLocation
    • Shoot Date: id.shootDate
    • Set Description: id.description
    • Clip Description: id.clips.description
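To make the mapping above concrete, here is a rough Python sketch of how one clip could be turned into a scene record. It assumes the API response is a dict with the keys named in the list (`clips`, `id`, `releaseDate`, `shootingLocation`, `shootDate`, `description`). The output keys (`code`, `date`, `image`, `details`), the function name, and the `poster_filename` and `resolution` arguments are placeholders of mine, since I don’t know where the poster filename or the valid resolutions come from.

```python
from typing import Any

# URL pattern quoted in the mapping above; filename and resolution are filled in by the caller.
POSTER_URL = (
    "https://api.inthecrack.com/FileStore/images/posters/clips/"
    "{filename}-{resolution}.jpg"
)


def clip_to_scene(
    collection: dict[str, Any],
    clip: dict[str, Any],
    poster_filename: str,
    resolution: str,
) -> dict[str, str]:
    """Build one scene record for a single clip (hypothetical output shape)."""
    # Collection-level fields come first, the clip's own description last.
    details = "\n".join(
        [
            f"Shoot Location: {collection.get('shootingLocation', '')}",
            f"Shoot Date: {collection.get('shootDate', '')}",
            f"Set Description: {collection.get('description', '')}",
            f"Clip Description: {clip.get('description', '')}",
        ]
    )
    return {
        "code": str(clip["id"]),           # studio code = clip ID
        "date": str(clip["releaseDate"]),  # passed through unchanged
        "image": POSTER_URL.format(filename=poster_filename, resolution=resolution),
        "details": details,
    }
```

Calling this once per entry in `collection["clips"]` would give one record per clip. Any date reformatting is left out because I haven’t checked what format the API returns.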

Jumping in here again. I’ve read through the current discussion and would like to note that while clips are still separated, they are seen as one to two “scenes” on the website. This is why I agree with @AdultSun that while TPDB might not have the best formatting, they have the right of it IMO by separating out into two studios.

“Collections” of clips (technically a “scene” by ITC’s standards) should go in the “ITC” studio, whereas “ITC Clips” would hold each individual clip. Older collections sometimes had 5+ clips split across two scenes, but nowadays it seems to always be a single “scene”. When I download these I just concat the videos together, partly because there is no per-clip URL: they’re all defined as a single collection (a “scene” or group of “scenes”). The site’s design creates an issue one way or another.

Below is a screenshot of one of the latest collections, showing that the site still groups them together as a “Scene”:

I guess I’m not clear on whether a StashDB entry is supposed to be a “scene” or a “video”. If it’s a “scene”, then technically this “collection” on the site is a single “scene”, and it’s up to the submitter to concat the videos together to create that scene before scraping and submitting. If it’s a “video” (or clip), then it’s up to the submitter to make sure each clip’s metadata is matched only to the corresponding clip of the scene.

I added some suggestions into the re-work of the scraper after the site’s redesign, so my question is this:

Let’s say you have a new clip that you want to submit. If you put the collection URL into the scene in your local instance and then scrape it, how does it know it’s only a single clip of that overall scene?

I opted to use the whole collection as a “scene” when doing the re-work because the site represents it as such and for practical reasons it makes more sense to me that you can scrape a URL that returns the whole scene’s information. It does require that submitters concat the clips into a single scene though (and technically older collections include multiple scenes, so those would need to be reworked into what the site considers a “scene”).
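For anyone unsure what “concat the clips” involves, here is a minimal Python sketch that prepares the input for ffmpeg’s concat demuxer. The function names and file names are made up for illustration. Stream copy (`-c copy`) only works cleanly when all clips share the same codecs and parameters, which I’m assuming holds for clips from one collection.

```python
def build_concat_list(clip_paths: list[str]) -> str:
    """Return the text of a list file for ffmpeg's concat demuxer."""
    lines = []
    for path in clip_paths:
        # Inside a single-quoted entry, a literal quote is written as '\''.
        escaped = path.replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    return "\n".join(lines) + "\n"


def build_ffmpeg_command(list_file: str, output: str) -> list[str]:
    """Return the ffmpeg arguments that join the listed clips without re-encoding."""
    return [
        "ffmpeg",
        "-f", "concat",   # use the concat demuxer
        "-safe", "0",     # allow arbitrary paths in the list file
        "-i", list_file,
        "-c", "copy",     # copy streams as-is, no re-encode
        output,
    ]
```

Write the result of `build_concat_list(...)` to a text file, with the clips in playback order, then run the command from `build_ffmpeg_command(...)` (for example via `subprocess.run`). Note that the joined file will have a different duration and hash than any individual clip.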

Also worth mentioning here: when ITC re-did their website, they made it so all the codes matched up. This has also created a significant number of mismatches in StashDB, because nearly all previous studio codes are now inaccurate and the URLs also don’t match.

The good news is that collection 1234 now uses the 1234 URL, so it’s 1:1. What remains is updating all the existing StashDB records so that each URL and studio code matches the actual collection.
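A rough way to find the records that need updating, as a Python sketch. This assumes the collection ID is the last path segment of the collection URL, which I haven’t confirmed; the record shape (`code`, `url`) and the example.com URLs in the usage note are placeholders, not the real StashDB or ITC formats.

```python
from typing import Optional
from urllib.parse import urlparse


def collection_id_from_url(url: str) -> Optional[str]:
    """Return the trailing numeric path segment of a URL, or None if there isn't one."""
    segments = [part for part in urlparse(url).path.split("/") if part]
    if segments and segments[-1].isdigit():
        return segments[-1]
    return None


def find_mismatches(records: list[dict[str, str]]) -> list[dict[str, str]]:
    """Return records whose studio code differs from the ID found in their URL."""
    return [
        record
        for record in records
        if collection_id_from_url(record["url"]) != record["code"]
    ]
```

For example, a record with code `987` and URL `https://example.com/collections/1001` would be flagged, while code `1234` with a URL ending in `/1234` would not. Records whose URL has no numeric ending are flagged too, so they get a manual look.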