Hi, I’ve been using Stash for a few weeks and I’m slowly getting the hang of it. I’ve been testing and learning how to use Stash and its plug-ins, and I’m ready to feed it a large collection of ~100 TB with maybe 100k files. Before I do, I want to understand more about how it works so I can optimize performance.
Right now when I bring in files I go to Settings > Task > Scan, then Settings > Task > Identify. During the scan I only have Stash generate scene covers and phashes, which takes a long time.
Generating phashes seems to be the most time-consuming part of the process. I’m assuming it is fairly CPU- and disk-read-intensive. Is the phash function optimized for multi-core / multi-threaded processors? If so, is there a point of diminishing returns for core / thread count? Right now I’m running it on a spare 8th-gen Intel box. I have some other spare PCs around, and I’m wondering whether it’s worthwhile to throw a 12- or 16-core processor at it, since I’ll be processing a large library. As for the Identify part of the task, is it just Stash sending the phash to StashDB and fetching metadata if a match is found? Modern home broadband shouldn’t bottleneck the Identify step, right?
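For context, my mental model of why phashing is expensive is something like the classic DCT-based 64-bit pHash below. This is just a sketch of the general technique, not Stash's actual code (I believe Stash hashes frames extracted from the video, but the 32×32 input, naive DCT, and 8×8 low-frequency crop here are illustrative assumptions on my part):

```python
import math

def dct_lowfreq(block, k=8):
    """Type-II 2D DCT, computing only the top-left k x k (low-frequency)
    coefficients of an n x n block. Even this reduced form is
    O(k^2 * n^2) multiplies and cosines per frame -- this arithmetic,
    plus decoding/reading the frames, is why phashing is CPU- and
    disk-bound."""
    n = len(block)
    out = []
    for u in range(k):
        for v in range(k):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out.append(s)
    return out

def phash64(gray32):
    """64-bit hash: DCT a 32x32 grayscale frame, keep the 8x8
    low-frequency corner, set each bit by comparing against the
    median coefficient."""
    low = dct_lowfreq(gray32, k=8)
    med = sorted(low)[len(low) // 2]
    h = 0
    for c in low:
        h = (h << 1) | (1 if c > med else 0)
    return h

def hamming(a, b):
    """Bit distance between two hashes; small distance = likely match."""
    return bin(a ^ b).count("1")
```

The useful property is that identical files always hash identically, and near-duplicates land a small Hamming distance apart, which is presumably how the StashDB lookup tolerates re-encodes.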
The other question is whether it makes more sense to bring in scenes in batches. Right now the library is on a NAS, and Stash is running on a spare PC. If I want to start adding this whole ~100 TB collection, does it make sense to move a few TB at a time to the machine running Stash, have Stash run the scan / identify / rename / organize tasks off a local SSD, and, once done, move the files back to the NAS? If so, how should I set up the library, and what would the workflow look like?
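To make the batch idea concrete, the loop I have in mind is roughly the sketch below. Only the batch-sizing part is real code; `walk_nas` and the 2 TB cap are made-up placeholders, and the copy/scan steps are just comments marking where rsync and the Stash tasks would go:

```python
def make_batches(files, cap_bytes):
    """Greedy first-fit packing: group (path, size) pairs into batches
    whose total size stays under cap_bytes, so each batch fits on the
    local SSD. A single file larger than the cap gets its own batch."""
    batches, current, total = [], [], 0
    for path, size in files:
        if current and total + size > cap_bytes:
            batches.append(current)
            current, total = [], 0
        current.append(path)
        total += size
    if current:
        batches.append(current)
    return batches

# Hypothetical driver -- walk_nas() and the per-batch steps don't exist,
# they just mark the workflow I'm asking about:
#
# for batch in make_batches(walk_nas(), cap_bytes=2 * 1024**4):
#     # 1. copy the batch from the NAS to the local SSD
#     # 2. run Stash's Scan + Identify against the SSD library path
#     # 3. move the renamed/organized files back to the NAS
```

For example, with a 7-byte cap, files of sizes 3, 4, 2, and 6 pack into three batches: [3, 4], [2], [6].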
Thanks