I have noticed the same thing recently. Most of the time when I have a mismatch it’s on titles that contain only one word. I had the same issue with “Emily (2022)” and the other issue I had last week was with “Moon (2009)”. It was recognized as “The Twilight Saga: New Moon (2009)”. It’s kind of annoying to have a file named properly with the exact movie title and year and the matching algorithm selects a partial match instead.
With Emily, the top result was 7 in popularity while the one you want was 6900, so it’s trying to get what most people would be searching for. For Moon, the match should be correct now (just tried it).
Yup, had that too. At the time, the only thing I could get to work was to use the Estonian alternative title “Kuu”.
Another frustrating one was the 2016 movie “Lion”, which kept getting hijacked by the 1994 movie “The Lion King”, even with the year filter, which made absolutely no sense.
Why? Because somewhere in the world it was aired on TV in 2016 (or maybe released on DVD). I just deleted that entry. No one has seemed to miss it.
I’ve had at least two dozen titles in my 3500 movies that won’t scrape correctly even when properly named.
Emily is the second one offered for “Emily y:2022” on the web interface (which is more accurate than the scraper Infuse uses), but it still prioritizes “Emily the Criminal” from the same year and I fracking hate that.
If we wanted “Emily the Criminal” then we’d include the words “the Criminal”. TMDB nonsense.
For a file for which you only have partial name matches, I can understand using the popularity to pick one. But in the case of a properly named file for which you have an EXACT match for the title and the year, I don’t understand why you wouldn’t want to use that match and instead pick another movie because it’s more popular.
Yes, I understand the frustration, but take it up with TMDb. It’s their algorithm that returns the match.
I have.
They aren’t the most flexible bunch.
I had a quick look at their API documentation and support forum, and the ability to perform an exact search has been requested multiple times over the years. There is talk of adding this capability to their website and their API. Some of the work appears to be tagged as “Done”, but I’m not sure about the API. Maybe someday they will do something about it.
For reference: Search movie should be more accurate - Talk — The Movie Database (TMDB)
Yup yup.
It’s obviously /capable/ of doing an exact search. I just can’t for the life of me figure out why they don’t automatically always prioritize exact matches when they are found.
Maybe they prioritize exact matches when it is more than 1 word searches?
Maybe. But the longer the name, the more accurate the search. If you were ever going to need to prioritize exact matches, it’s with the short titles.
So if I understand correctly, TMDb is unlikely to fix this in the very near future.
One temporary solution, for advanced users, would be to force the TMDB ID of the video, in the same way one can force the metadata themselves with a properly formatted XML file.
One possibility is to use the same XML file (name of the video file but with .xml as extension) and to indicate the TMDB ID in it.
That would cause the Infuse metadata parser to use this TMDB ID instead of querying for it from the file name.
If you think this is not too stupid an idea, I will put it in the correct place for suggestions.
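To make the suggestion concrete, here is a rough Python sketch of what “look for a sidecar .xml and use the ID in it” could look like. Everything here is an assumption for illustration: the `tmdb_id` element name, the `<media>` wrapper, and both function names are hypothetical, and this is not how Infuse actually parses its XML files.

```python
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import Optional


def parse_forced_id(xml_text: str) -> Optional[int]:
    """Return the ID found in a hypothetical <tmdb_id> element, or None."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return None  # malformed sidecar: fall back to normal name matching
    node = root.find(".//tmdb_id")  # hypothetical element name
    if node is None or node.text is None:
        return None
    try:
        value = int(node.text.strip())
    except ValueError:
        return None
    return value if value > 0 else None


def forced_tmdb_id(video_path: str) -> Optional[int]:
    """Look for '<video name>.xml' next to the video and read the forced ID."""
    sidecar = Path(video_path).with_suffix(".xml")
    if not sidecar.is_file():
        return None
    return parse_forced_id(sidecar.read_text(encoding="utf-8"))
```

A scraper built along these lines would call `forced_tmdb_id(...)` first and only fall back to a title search when it returns `None`.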
My experience has shown that Infuse builds a completely separate database when users include .nfo files colocated with their video files (presumably it does the same with .xml but I’ve never used that flavor). This was only noticeable to me after Infuse recently-ish added support for custom cast & crew.
After the change, if I searched for an actor, Infuse returned two identical results for each one — the actor as listed in my .nfo files (sourced from TMDB), and the (same) actor as indexed on addition during Infuse’s background querying of TMDB.
What I found was that the TMDB (normal Infuse database) was generated on Import as normal; but the .nfo-based database was only indexed “on-demand” — after a file is navigated to and shown onscreen.
This results in one of the actor/crew results displayed in search (and the actor/crew photos listed on movie & episode details / info pages for titles where you’ve included local .nfo) to have incomplete information. An actor who has been in a lot of movies in your collection might show 20 titles on one listing, and only 12 on the other, for example.
…
For this reason, I’m not sure Infuse beginning to support including TMDB ID numbers in local .nfo / .xml will be the solution we’re looking for, since then our libraries will include two titles for each movie — the correct one as determined by our .nfo; and the incorrect one as determined by Infuse on adding the title on Import through the API.
Even if the incorrect movie isn’t shown in the library, it will likely be returned in searches for titles or especially cast and crew. If the wrongly identified title is embarrassing, that can be problematic: you wouldn’t even know it was there until you (or someone else) stumbled over it.
…
Firecore’s current solution to this issue is the “Edit Metadata” button, which allows the user to manually query TMDB and select the correct movie. Infuse will then keep that movie associated with the title until you change the filename or delete the local metadata database.
This solution does mean you have to repeat the process for every incorrectly identified movie every time you rebuild the database; and every time you rebuild the database, different titles are at risk of being misidentified thanks to users at TMDB adding new data (or editing or deleting old data) that newly confuses the API.
Edit: Updated link to skip to the part of thread where I better understood the issue.
If Infuse is able to query TMDB to obtain a list of possible choices to show us, what’s preventing Infuse from automatically going through that list to see if one of the choices is an exact match (instead of picking the first one)? Or am I missing something?
Because many users have hundreds and other users many thousands of titles, and going through each manually would be a nightmare.
The ability to fix broken titles (a very small percentage of all titles scanned) one at a time is the appropriate compromise.
Forgive me, I just realized you might not have meant “automatically” as in automatically let the user decide during import … but automatically have Infuse, by itself, compare the list of all possible title matches.
I think probably the biggest problem is that the API that identifies video titles only returns the single match that TMDB determines is correct.
To make Infuse work as well as possible they support many different file-naming schemes and don’t necessarily penalize for poor spelling or putting parts of some titles in the wrong order.
I guess the only way to automate this would be for them to add a new module that compares the movie title TMDB returns with the file name as submitted, and tags ALL non-exact matches for the user to manually review?
What I generally do is hide all my custom art by renaming all my .jpgs and then scanning visually through the library looking for unfamiliar posters searching for titles I know I don’t have in my library — and unexpected double posters next to each other for files misidentified as titles I DO have in my collection.
Not ideal, but it works.
I mentioned the edit function of Infuse just to show that Infuse is capable of retrieving a list of possible matches for a movie using the file name. What I’m trying to say is that when Infuse detects a new movie, it should retrieve that same list and see if one of the movies on that list is an exact match. Ideally it would pick that exact match instead of what TMDB finds most “popular”. All of that without any user intervention.
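Roughly what I have in mind, as a minimal Python sketch. It assumes the app already has the candidate list in hand, and the `Candidate` fields, the function names, and the popularity numbers in the usage note are all made up for illustration. It is not Infuse’s or TMDB’s actual logic.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    title: str
    year: int
    popularity: float


def normalize(title: str) -> str:
    # Lowercase and collapse punctuation/whitespace so "Moon" matches "moon ".
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def pick_match(query_title: str, query_year: int,
               candidates: list[Candidate]) -> Optional[Candidate]:
    """Prefer an exact title+year match; otherwise fall back to popularity."""
    if not candidates:
        return None
    wanted = normalize(query_title)
    exact = [c for c in candidates
             if normalize(c.title) == wanted and c.year == query_year]
    # If several exact matches exist, popularity still breaks the tie.
    pool = exact or candidates
    return max(pool, key=lambda c: c.popularity)
```

With a list like `[Candidate("The Twilight Saga: New Moon", 2009, 50.0), Candidate("Moon", 2009, 20.0)]`, calling `pick_match("Moon", 2009, ...)` returns the plain “Moon” entry even though the other one is more popular. Popularity only decides when there is no exact match.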
Thx.
Not clear to me what a good solution would be.
I imagine this would incredibly lengthen the time it takes for Infuse to import collections (annoying most users) and perhaps be a violation of TMDB’s API terms.
It would be interesting if Infuse could simply tag items it found that matched suspiciously (significant title / release year discrepancies) for human attention or review after import (without performing multiple queries per title) … but not sure how Firecore would settle on the threshold of what constitutes a suspicious match — since at least a few obscure movies in my collection correctly have identical titles and release years — and clearly those would never be tagged as suspicious no matter how strict the threshold is.
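A sketch of that kind of post-import check in Python, using only the single match that came back (no extra queries). The function name and the 0.8 similarity threshold are arbitrary assumptions on my part, which is exactly the hard part Firecore would have to settle.

```python
from difflib import SequenceMatcher
from typing import Optional


def is_suspicious(file_title: str, file_year: Optional[int],
                  matched_title: str, matched_year: int,
                  min_ratio: float = 0.8) -> bool:
    """Flag a match whose title or year differs noticeably from the file name."""
    # A file name without a year cannot produce a year mismatch.
    year_mismatch = file_year is not None and file_year != matched_year
    ratio = SequenceMatcher(None, file_title.lower(),
                            matched_title.lower()).ratio()
    return year_mismatch or ratio < min_ratio
```

This would flag “Lion” (2016) matched to “The Lion King” (1994) and “Moon” (2009) matched to “The Twilight Saga: New Moon” (2009). As noted above, it cannot catch two different movies that share the same title and year.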