I have at least one movie that I initially thought was failing to be indexed. As it turns out, it has been identified as a higher quality version of a different movie in the same collection, which was actually quite difficult to find. I found it more by luck than anything else.
I have Fast and Furious (2009) in 4k. I also have The Fast and the Furious: Tokyo Drift in 1080p
Fast and Furious (2009) should be indexed as its own movie within the same collection but instead it’s indexed as a 4k version of Tokyo Drift.
Anyone know how I can fix it?
I’d also like to understand how this happens, so that I can avoid it in future and work out whether it’s happening to anything else, as I don’t think this is the only one.
Excellent, thank you! Confirming that fixed it, and no worries on working out the cause since it sounds like it is uncommon. I had assumed this might have been something I had done, or some setting somewhere rather than an uncommon issue
Although if anyone else sees something like this, I also now realise I could have edited the metadata on that particular version, which seems to fix it. I’ve also had a small challenge in that the old filename is still showing, but I’m certain that is something on my end and just needs a bit more time.
If you care about Infuse/TMDB metadata mismatches, this is an important post for you! I strongly suggest you read it all.
Why Properly Titled Movie Filenames Sometimes Still Return the Wrong Results
What’s a filename with regard to a metadata search? It’s everything. The filename is “scraped” to identify the words used in it and a four-digit year representing the title’s release year, if one is present. The search algorithm takes these search terms (the words in the title and the four-digit year) and queries TMDB’s gigantic metadata database for the movie titles that best match them.
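To make the scraping step concrete, here is a rough Python sketch of how a filename might be split into title words and a release year. This is only an illustration: the function name `parse_filename` and the regular expression are my own assumptions, not Infuse's actual parser.

```python
import re
from pathlib import PurePath

# Assumption: a "year" is any standalone 19xx or 20xx number, optionally
# wrapped in parentheses or square brackets.
YEAR_PATTERN = re.compile(r"[\(\[]?\b((?:19|20)\d{2})\b[\)\]]?")


def parse_filename(filename):
    """Return (title, year) guessed from a media filename (hypothetical helper)."""
    stem = PurePath(filename).stem  # drop the extension, e.g. ".mkv"
    matches = list(YEAR_PATTERN.finditer(stem))
    title_part, year = stem, None
    if matches:
        last = matches[-1]  # use the last year-like token as the release year
        candidate = stem[:last.start()]
        if candidate.strip(" ._-"):  # keep titles that are only a number, e.g. "1917"
            title_part, year = candidate, int(last.group(1))
    # Treat dots and underscores as word separators.
    title = re.sub(r"[._]+", " ", title_part).strip()
    return title, year


print(parse_filename("Fast & Furious (2009).mkv"))
print(parse_filename("Fast.and.Furious.2009.mkv"))
```

Both spellings give 2009 as the year, but the title words differ ("&" versus "and"), and those words are what get sent to the search.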
What’s a properly titled file? It’s a media or metadata file whose filename matches, as exactly as possible, the title and release year of the movie it represents, as listed at TMDB.
Getting this wrong might get your film identified incorrectly.
You are free to disagree with TMDB, but you’ll never change their minds. They are zealots who proselytize from their own literal bible. Conform or be outcast. Resistance is futile.
Always match filenames to both the complete title and the release year listed per TMDB.
This happened because the misidentified file was a 4K file. Infuse scans each file to determine its video and audio characteristics. This is independent of any metadata received from TMDB.
But it is an interesting example of the flaws in TMDB’s algorithms. See the discussion below.
Okay — so the first issue with this title is that the official title contains an ampersand. It should be titled “Fast & Furious (2009).mkv” as demonstrated by the results of the two different searches shown here:
… Now, let’s discuss the more interesting (and broken) parts of TMDB’s search algorithm:
Under ordinary circumstances, “Fast and Furious (2009).mkv” would still have been the first search result shown because the presence of the film’s release year in the filename would normally prioritize the only Fast and Furious film to be released in 2009 — and that’s why you should always include the release year in filenames. Including release years provides additional data points that help the algorithm resolve potential ambiguities.
But that did not happen in this case. Both “The Fast and the Furious: Tokyo Drift (2006)” and “The Fast and the Furious (2001)” were determined to be better matches than “Fast and Furious (2009)”. Why should that be?
#1 — The search algorithm does not automatically prioritize exact title matches. Any other film whose title includes all the words entered into the search may score exactly the same as (and often higher than, if it’s the more popular film) the film whose title matches the search terms exactly. Neither film is prioritized over the other based on search terms alone, even though the wrong titles include extra, not-searched-for words that ought to be exclusionary.
So why is that?
Presumably because lots of people don’t clean their filenames, and they leave lots of other data in them related to the download’s specs and sources and all such other things … or they use the word “and” instead of an ampersand when the latter is the term that should have been entered … or they name the file “The Fast and Furious 4 - Fast & Furious”.
For this reason, TMDB seems to disregard anything extra in a filename, so as to avoid excluding possibly correct matches, which could happen if it limited its search results strictly to exact title matches.
#2 — Adding the release year should prevent that, right?
You’d think so. But here’s where the really big flaw comes in: TMDB does not prioritize its own chosen primary theatrical premiere date, as shown in every movie’s page header, over any other release date type. That includes not just delayed theatrical releases in other countries, but even the issuance to retailers, decades later, of ‘new’ editions of the film’s collectible media (DVDs and Blu-ray Discs).
So why were those other two Fast and Furious films, neither of which premiered in 2009, returned above “Fast and Furious (2009)”?
That’s right. Because those two earlier films in the series were re-issued on Blu-ray in 2009 to capitalize on the release of Fast 4, and that information was added to the database, they are not excluded even though their actual release dates were 2001 and 2006, respectively. Any listed release date of any type recorded in TMDB’s database can break searches when the wrong films’ titles include all the searched-for terms.
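Here is a toy Python model of the two flaws described above, in case it helps to see them side by side. To be clear, this is not TMDB's real algorithm. The `Movie` class, the `candidates` function, the `primary_only` switch, and the idea of treating "&" and "and" as the same word are all assumptions made up for illustration. The release years are just the ones mentioned in this thread.

```python
from dataclasses import dataclass, field


@dataclass
class Movie:
    title: str
    primary_year: int  # the premiere year shown in the page header
    other_release_years: set = field(default_factory=set)  # reissues etc.


def words(text):
    # Assumption: "&" and "and" are treated as the same word.
    return set(text.lower().replace("&", "and").replace(":", " ").split())


def candidates(movies, query, year, primary_only=False):
    """Titles containing every query word and having a release in `year`."""
    wanted = words(query)
    hits = []
    for movie in movies:
        if not wanted <= words(movie.title):
            continue  # extra words in the title do NOT exclude it
        years = {movie.primary_year}
        if not primary_only:
            years |= movie.other_release_years  # any release type counts
        if year in years:
            hits.append(movie.title)
    return hits


MOVIES = [
    Movie("The Fast and the Furious", 2001, {2009}),
    Movie("The Fast and the Furious: Tokyo Drift", 2006, {2009}),
    Movie("Fast & Furious", 2009),
]

print(candidates(MOVIES, "Fast and Furious", 2009))
print(candidates(MOVIES, "Fast and Furious", 2009, primary_only=True))
```

In this toy version the first call returns all three films, because the 2009 reissue entries let the 2001 and 2006 films through the year check. Restricting the check to the primary year (or removing the reissue entries) leaves only the film that actually premiered in 2009.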
To prove this point, here is the result of a search I performed immediately after I deleted the release-date entries recording the 2009 Swedish reissues that had been added to both the 2001 and 2006 F&F franchise films, which stopped either of them from being ranked as a better match for the terms “Fast and Furious (2009)” than the F&F film actually released in 2009:
Thanks, that was pretty interesting. I figured the and/ampersand had something to do with it, although not for the same reasons. I tried changing it to an ampersand and rescanning, which didn’t make any change. I guess this is more of an issue with the catalog updating, as I found above, rather than anything else.
I don’t recall how I ended up with “and” in the title, although given this is stored on a linux machine it’s probably to avoid escaping the ampersand tbh.
Well, I learned something about how it all fits together so thanks for taking the time to explain (and fix) it!
Once a file is identified (correctly or incorrectly), Infuse won’t update the identification unless you delete the identification (by switching its containing folder to local metadata so the ID is discarded — and you are shown a warning to that effect) or manually change it via “edit metadata”.
But if you edit via “edit metadata”, Infuse remembers the new ID you give it and will keep returning that ID first every time you rescan the file. It stores your manual changes in persistent memory, so when the Apple TV erases the metadata cache and Infuse needs to rebuild its database, it won’t keep misidentifying titles you’ve already corrected.
Does that make sense?
The website search results generally match the API search results in my testing. That’s why, when a filename doesn’t work, I’ll often test potential replacements on the website before renaming my actual files. I try to never use edit metadata because I want my filenames to return correct results even if I delete ALL of Infuse’s data, including its record of my manual changes.
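If you would rather test against the API than the website, a small Python sketch like this builds the request URL for TMDB's v3 movie search endpoint. Some assumptions to flag: `build_search_url` is a hypothetical helper, `YOUR_API_KEY` is a placeholder for your own key, and I don't know which year parameter (if any) Infuse itself sends. As I understand the TMDB docs, `year` and `primary_release_year` are both accepted, with the latter limited to the primary release date.

```python
from urllib.parse import urlencode

TMDB_SEARCH_URL = "https://api.themoviedb.org/3/search/movie"


def build_search_url(title, year=None, api_key="YOUR_API_KEY", primary_only=False):
    """Build a TMDB movie-search URL for a candidate title (hypothetical helper)."""
    params = {"api_key": api_key, "query": title}
    if year is not None:
        # Assumption: which of these two parameters Infuse uses is unknown.
        key = "primary_release_year" if primary_only else "year"
        params[key] = str(year)
    # urlencode escapes "&" and spaces so the title survives as one parameter.
    return TMDB_SEARCH_URL + "?" + urlencode(params)


print(build_search_url("Fast & Furious", 2009))
print(build_search_url("Fast and Furious", 2009, primary_only=True))
```

Opening the printed URLs in a browser (with a real key) lets you compare the result order for each candidate filename before renaming anything.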