Algolia scraper replaces https://www .* with https://*
Is there reason for that? Many sites seems to have redirect to www. site, so this behavior is leading duplicate url’s if url is copied from studio site. Just wondering if fixing that breaks some sites?
Makes parsing easier and technically more flexible since it leave it up to the site to redirect (if it does)
stripped out here to make domain parsing easier
CURRENT_TIME)
# Failed to get new key
if application_id is None:
sys.exit(1)
api_url = f"https://tsmkfa364q-dsn.algolia.net/1/indexes/*/queries?x-algolia-application-id={application_id}&x-algolia-api-key={api_key}"
#log.debug(HEADERS)
#log.debug(FRAGMENT)
URL_DOMAIN = None
if SCENE_URL:
URL_DOMAIN = re.sub(r"www\.|\.com", "", urlparse(SCENE_URL).netloc).lower()
log.info(f"URL Domain: {URL_DOMAIN}")
if "validName" in sys.argv and SCENE_URL is None:
sys.exit(1)
if SCENE_URL and SCENE_ID is None:
log.debug(f"URL Scraping: {SCENE_URL}")
else:
log.debug(f"Stash ID: {SCENE_ID}")
log.debug(f"Stash Title: {SCENE_TITLE}")
and not reinserted here