My continued struggle to understand Infuse's TV tagging logic: now with pictures!

Part One: Background and what I figured out so far.
TL:WR - Just skip to the next post for the parts I need help with.

I have been busting my brain trying to figure out the logic used (or not) by Infuse to scrape TV series media information from TMDB.org. The “metadata 101” page is a joke. I’d happily write a better one if I could only figure out something helpful to say. SOMEONE must have written this software and knows what it does. Or what it is supposed to do. I’m not here to tell them they are doing it wrong, or tell them how to do it better. I just want to figure out (or preferably be told) what they did, so I can learn to do that, too.

Not having local control over the metadata for my own media library is driving me bonkers. Ideally I want control over my library. Barring that, I at least want to figure out how I’m supposed to name and organize my files so Infuse will correctly identify them ON ITS OWN every time it decides to rebuild my library because it (or Apple) decides to wipe out my metadata cache. Here’s an idea — if you can’t convince Apple not to randomly nuke our libraries on what is already an essentially Infuse-dedicated Apple TV, why not offer the option to store the cache instead on the same device (such as the local server) that hosts the content?

I’m going to continue with most of the same television series I already reported problems with, and for which I have to date utterly failed to find the magic formula.

I’d like to point out that, to make everything as confusion-free as possible, I did indeed get rid of all existing metadata that theoretically might have been causing Infuse problems. To effect this change, I nuked all of the .jpg & .nfo files associated with my collection (the images I purposefully downloaded from various quality sources so as to cater both to my aesthetic whims and significant OCD, while the metadata was newly sourced these past few months in its entirety exclusively from TMDB.org — to play nice with Infuse — via Kodi for Windows PC on my laptop). Hibernating the .jpgs and .nfo files was easy enough; I just used a batch renamer to change all .jpg extensions to .jxx and .nfo extensions to .nxx. TMDB’s higher quality thumbnails are indeed nice. Their poor support for special features content still leaves a lot to be desired. But, that’s just a minor quibble.
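For anyone who wants to try the same "hibernate the sidecar files" trick without a batch renamer, here is a minimal Python sketch. The function names (`hibernated_name`, `hibernate_tree`) are made up for illustration; only the .jpg to .jxx and .nfo to .nxx swaps come from the post. It defaults to a dry run, so nothing is renamed until you ask for it.

```python
from pathlib import Path

# Extension swaps described in the post: .jpg -> .jxx and .nfo -> .nxx
HIBERNATE = {".jpg": ".jxx", ".nfo": ".nxx"}


def hibernated_name(path: Path) -> Path:
    """Return the hibernated path for a sidecar file, or the path unchanged."""
    new_suffix = HIBERNATE.get(path.suffix.lower())
    return path.with_suffix(new_suffix) if new_suffix else path


def hibernate_tree(root: Path, dry_run: bool = True) -> list:
    """Plan (and optionally perform) the renames for every file under root."""
    plan = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            target = hibernated_name(path)
            if target != path:
                plan.append((path, target))
                if not dry_run:
                    path.rename(target)
    return plan
```

Calling `hibernate_tree(Path("/path/to/library"))` only returns the list of planned renames; pass `dry_run=False` once the plan looks right. Swapping the keys and values of `HIBERNATE` would undo the change.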

Nuking the images and .nfo files sure was enlightening when it came to the movie side of my library. It’s wonderful that Infuse displays your preferred -fanart and -poster .jpgs and reads your .nfo files and displays some of that content on screen (including your preferred title and genre). But sadly this is only skin deep — it still ignores your local .nfo files entirely when it comes to looking up your titles in TMDB for the purpose of its searching your collection by actor or crew member or “collection” — both of which sadly are locked exclusively to the whims of the management of TMDB. I’ve got close to 3000 movies in my library; all pre-processed and verified accurate by me. It was surprising how many of those failed to be identified correctly on initial import. But at least I was able to figure out why, in most instances. Initially the biggest problems (for me) came from release year mismatches and Infuse’s apparent inability to parse maybe 20% of titles that I’d long ago renamed for easier alphabetical sorting / file handling (which Kodi never had issues with, even using the same content source of TMDB) — ex: “Hunt for Red October, The (1990)” instead of “The Hunt for Red October (1990)”. In the end, I did learn something interesting:

Unlike Kodi, Infuse doesn’t seem to care what you name the folder that contains a movie or TV series. I found it essential all my files be named explicitly — and with movies, this (and getting the correct year according to TMDB) seemed to do the trick at least 95% of the time. It’s gotta be “The Hunt for Red October (1990).mkv /.jpg / .srt” etc. But perfectly fine to call the folder “Hunt for Red October, The (1990)”. “Crazy Ivan Roll Tide 8675-309” works too. ¯_(ツ)_/¯ Whatever. As long as I can determine the rules, I’ll happily play by them.

^ More on this later.
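Since the fix for the movie files was to put the article back at the front ("Hunt for Red October, The (1990)" becoming "The Hunt for Red October (1990)"), that step can be sketched in a few lines of Python. This is only an illustration: the function name `article_first` is hypothetical, and it only handles the English articles The, A and An.

```python
import re

# Matches "Title, The (1990)" style names: title, trailing article, optional year.
TRAILING_ARTICLE = re.compile(
    r"^(?P<title>.+), (?P<article>The|A|An)(?P<year> \(\d{4}\))?$"
)


def article_first(name: str) -> str:
    """Move a trailing 'The', 'A' or 'An' back to the front of a title."""
    match = TRAILING_ARTICLE.match(name)
    if not match:
        return name  # nothing to fix, leave the name alone
    return f"{match['article']} {match['title']}{match['year'] or ''}"


print(article_first("Hunt for Red October, The (1990)"))
# prints: The Hunt for Red October (1990)
```

Apply it to the file's stem (the name without the extension); the folder can keep whatever name sorts best.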

Way harder to figure out why TV shows won’t scrape correctly.

I suspect a large part of it has to do with TMDB returning ZERO HITS if you search for a television series with a query that includes the series’ release year after the title.

REALLY, TMDB? FireCore? What gives?

Example: Search for “The Twilight Zone” on TMDB.org (or via the “Edit Metadata” option in Infuse), and you get seven hits for movies and five for TV series — including the 1959 original, and the 1985, 2002, and 2019 revivals. Makes sense. That’s what you would expect.

But search “The Twilight Zone (2019)” — with or without the parentheses — and you suddenly get zero hits. Well, damn. :man_facepalming:

I’ve got all four series. How’s that supposed to work? You can’t put the date in the name of the video file — before the season and episode number, at least — and if you include it after, or in the folder name, it’s just ignored anyway. I’ve tried everything. I can’t find any way to have these series scrape correctly. I thought, hey, I’ll just keep the folders as “The Twilight Zone (1985)” and “The Twilight Zone (2002)” etc. but name all the files “The Twilight Zone · S01E01.mkv” but then they all scrape to the same series, even being in differently named folders; so that didn’t work. Best I’ve come up with to make sure episodes from all four series scrape as different series (and don’t return zero hits) is naming them “The Twilight Zone B · S01E01” and “The Twilight Zone C · S01E01” and “The Twilight Zone D · S01E01” in their respective folders (which include the year, as above). But these still need to be edited manually every time the library is rebuilt or I need to move or rename a containing folder on my NAS. So ridiculously inane and tedious!

Who chooses to ignore the 2nd most important data point (the release year) to properly identify a piece of media??

I don’t know who is to blame for this fiasco — TMDB or FireCore — I mean, the TVDB definitely lets you refine your television search query by year. As does, and has, IMDB, Rotten Tomatoes, Wikipedia, etc. etc. since like, always.

But there is one thing that’s even worse. Ignoring the SINGLE MOST IMPORTANT data point. The title.

Yes, the title.

All of the following searches fail, even though a perfect title match exists on TMDB. And these don’t even miss by a little bit — like getting confused by two different shows with identical names (which, when you are throwing away the release year data point, and completely ignoring potential matches based on episode titles, or number of seasons in a series, or episodes in a series — more information available if you only look for it) … they miss by a LOT.

Peruse and enjoy:

My troublesome titles, as identified by Infuse on both my iPhone and Apple TV:
(note how the Twilight Zone fails to suggest ANY hits if the series release year is included in the query)

After I appended the letters “B”, “C”, and “D” to Twilight Zone filenames in place of the crash-causing years — at least now I don’t have to manually type in the file name again; but the first episodes of all three subsequent series are identified as being part of the same (arbitrary) edition.

Does not compute… ¯_(ツ)_/¯

And compared to the top results returned when I searched on the TMDB.org webpage:
(The arrows indicate the only times I noticed TMDB got it wrong — meaning the correct choice wasn’t listed first. Obviously, even these mistakes could easily be avoided if the series’ first production year was part of the query.)


Highlighting some of the bonkers identifications:
(note the filename that prompted the search at the top, and all the scrolling up one needs to do to find the best match — almost always listed first BUT NOT CHOSEN)



^ Obviously that’s a really long post.

What I find most infuriating is that if you do an “edit metadata” search after importing a title that Infuse scrapes incorrectly, and …

(if you don’t have a date or anything else in the video’s filename that trips the search query up, such as the “ · Extended Version” part of “The Hateful Eight · Extended Version”, which results in zero hits via Infuse — even though on the TMDB.org website it results in only one — the correct one)

… almost invariably you will be presented with a list of what I can only assume are the “most likely matches” in descending order … with Infuse’s nonsensical selection checked when it is quite obviously not even judged the best choice by TMDB since it isn’t first on the list. Scroll up as high as you can go, and, almost always, the top item, the one you hit when you can’t scroll any further, is the one you wanted.

Why??

Haven’t read that “thing”, but here are my two cents

  1. what worked perfectly for me using Infuse to scrape and build the library is to
  • Movie: name the file like it’s shown under original title on IMDB plus adding the year in brackets
  • Show: folder named like the show, subfolder Season 01, filename like the show plus S01E01
  2. If you want full control over metadata, use Plex / Emby to manage the library and make Infuse use the Plex / Emby library
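The layout in point 1 can be written down as a small Python sketch. The helper names (`episode_path`, `movie_filename`) and the default `mkv` extension are assumptions for illustration; the folder and file pattern itself is the one described in the list above.

```python
from pathlib import PurePosixPath


def episode_path(show: str, season: int, episode: int, ext: str = "mkv") -> PurePosixPath:
    """Build <Show>/Season NN/<Show> SNNENN.<ext>."""
    code = f"S{season:02d}E{episode:02d}"
    return PurePosixPath(show) / f"Season {season:02d}" / f"{show} {code}.{ext}"


def movie_filename(title: str, year: int, ext: str = "mkv") -> str:
    """Build '<Title> (<Year>).<ext>' for a movie."""
    return f"{title} ({year}).{ext}"


print(episode_path("The Twilight Zone", 1, 1))
# prints: The Twilight Zone/Season 01/The Twilight Zone S01E01.mkv
print(movie_filename("The Hunt for Red October", 1990))
# prints: The Hunt for Red October (1990).mkv
```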

I can’t seem to replicate the issues with TV shows here, with or without the year in the filename. The correct matches are getting found automatically for a few of your examples I tested.

  1. What connection type are you using to connect to the NAS (SMB, NFS, UPnP, etc…)? This can be found by viewing the share settings in Infuse
  2. Is Metadata Fetching enabled in Infuse > Settings > General?
  3. Is Embedded Metadata enabled in Infuse > Settings > General?
  4. Have any of your files/folders been tagged to use local metadata?

Thanks for responding, James.

I use Infuse on an iPhone X (exclusively to more easily troubleshoot my library scraping issues) and on either of two AppleTV 5 4K units (to watch stuff). Each pulls content over my home network from a single NFS share on a Synology DS418play. The AppleTVs are on gigabit ethernet connections and the iPhone is (of course) on wifi. Connection speed hasn’t been a problem — nor has playback of content, generally, which of course is nice. A few times I’ve needed to exit a stream and begin a different one in order to get audio that had drifted out-of-sync to reset, but that’s all.

Metadata Fetching is enabled, and Embedded Metadata is disabled. No folders are tagged to use Local Metadata.

In your test, did you need to scroll to select the correct option or was it already selected correctly? Because I can’t for the life of me get those titles to scrape correctly on their own.

Yes, once I manually set a single item, all the rest of the items in that folder (i.e. all episodes of that series) properly adjust themselves automatically. I would just really like to figure out how I’m supposed to name those series so they scrape correctly without my needing to fix them all manually at the end.

Thank you.

So I just took a closer look at your images and noticed you didn’t include any separators (hyphens, etc.) between the series title and S##E## … so, I tried that, and of course, it works.

A hyphen as a separator also works.

Unfortunately, my preferred spacing character [ · ] — the middot (or “interpunct”, according to Wikipedia) — does not. That is to say, I have now narrowed down the cause of the failures listed in this thread to my use of the tiny bullet-like separator.

I’m very curious as to what it is about these series titles in particular, that putting this tiny little thing (which I’ve otherwise been using in just this context for approximately the past 40 or so years) in the “wrong” place … immediately gums up the works and results in the bizarre scrape failures noted above. By which I mean … why is it the middots cause failures only on these titles but not on the greater majority of the titles in my wider collection that don’t care I’ve used middots as separators?

All of my files (till now) have been organized the same way:

Series Title (Year if Necessary) · S##E##

[ Nothing appended after the Episode Number seems to affect search results positively or negatively, so the trailing bullet I use when including additional info (episode titles and original air-date year) is fine. It’s only the bullet before the S## that wreaks havoc. ]

I prefer using · to - in my file names for aesthetic reasons but clearly I will have to compromise as these failed scrapes have annoyed me to a far greater extent. So, no worries, I will move on.
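For the record, the compromise boils down to swapping only the middot that sits right before S##E##. A rough Python sketch of that rename rule follows; `fix_separator` is a hypothetical name, and the choice of " - " as the replacement is just one of the separators reported to work in this thread.

```python
import re

# A middot (U+00B7) with optional surrounding whitespace, directly before S##E##.
MIDDOT_BEFORE_EPISODE = re.compile(r"\s*\u00b7\s*(?=S\d{2}E\d{2})")


def fix_separator(filename: str, replacement: str = " - ") -> str:
    """Swap only the middot that precedes the S##E## token; later ones stay."""
    return MIDDOT_BEFORE_EPISODE.sub(replacement, filename, count=1)
```

A name like "The Twilight Zone (1985) · S01E01 · Shatterday.mkv" comes out as "The Twilight Zone (1985) - S01E01 · Shatterday.mkv", with the trailing bullet untouched. Names without a middot before the episode code are returned unchanged.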

But if anyone can hypothesize what the bug in the process might be that causes these scrapes to choke on those little tiny bullet points, please tell. I’m terribly curious.

This may shed a little light on why it causes problems.

A plain period or a space are what I’ve found to cause the least amount of headaches.

No, certainly not colons. Just the middot / middle-dot / interpunct thing. ¯_(ツ)_/¯

I understand, it does seem strange but after several years of dealing with oddities with names I’ve determined that the logic to naming is not to use logic. That’s why I’ve just settled on using spaces or periods. Period. :rofl:

I’m just guessing but there’s probably an operating system somewhere in the global chain of the search for metadata that upchucks with the middot.

I just figure that as long as I don’t have to beat a title into submission to find the right metadata I’ll put up with the vanilla titles.

That lets me come back to my suggestion here: Infuse - chosing your own covers for the movies - #12 by elchupete

Filesystems act like bitc…s. And although modern filesystems are supposed to handle upper/lower case characters, spaces, and special characters, it turns out that they do not in all cases.

Using the middle dot in filenames is a bad idea. Best is only lower case characters, no special characters and no spaces. so_i_end_up_in_these.kind_of_name.extension

I have had no issues with following these three simple rules ever since

And obviously this is not only true for filesystems but also for Databases and sql queries
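Those three rules (lower case, no special characters, no spaces) are easy to automate. Below is a minimal Python sketch; `plain_name` is a hypothetical helper, and replacing everything else with underscores is my own reading of the example name above. Note that it also strips the brackets around a year, so treat it as an illustration rather than a recommendation for Infuse libraries.

```python
import re
import unicodedata


def plain_name(stem: str) -> str:
    """Lowercase a name and reduce it to a-z, 0-9 and underscores."""
    # Decompose accented letters, then drop anything outside ASCII (middot included).
    decomposed = unicodedata.normalize("NFKD", stem)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Collapse every run of other characters (spaces, punctuation) into one underscore.
    return re.sub(r"[^a-z0-9]+", "_", ascii_only.lower()).strip("_")


print(plain_name("The Twilight Zone \u00b7 S01E01"))
# prints: the_twilight_zone_s01e01
```

Run it on the stem only and re-attach the extension afterwards, otherwise the dot before the extension becomes an underscore too.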

Ah, I didn’t catch that in your filenames but that would explain it.

The currently supported separators are: space, period, underscore, dash.

This is noted in the metadata 101 guide.

Period, space, underscore, and dash can all be used interchangeably as separator characters for both movies and TV shows.
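To make the separator rule concrete, here is a small Python sketch of a filename parser that accepts only those four separators before S##E##. To be clear, this is NOT Infuse's actual parser (nothing in this thread shows how that works); the regex and the `parse_episode` name are assumptions for illustration.

```python
import re

# Title, then one or more supported separators (space, period, underscore, dash),
# then the season/episode code.
EPISODE = re.compile(
    r"^(?P<title>.+?)[ ._-]+S(?P<season>\d{1,2})E(?P<episode>\d{1,3})",
    re.IGNORECASE,
)


def parse_episode(filename: str):
    """Return (title, season, episode), or None if no supported separator precedes S##E##."""
    match = EPISODE.match(filename)
    if match is None:
        return None
    # Periods and underscores inside the title are treated as spaces as well.
    title = re.sub(r"[._]+", " ", match["title"]).strip()
    return title, int(match["season"]), int(match["episode"])


print(parse_episode("The.Twilight.Zone.S01E01.mkv"))
# prints: ('The Twilight Zone', 1, 1)
print(parse_episode("The Twilight Zone \u00b7 S01E01.mkv"))
# prints: ('The Twilight Zone ·', 1, 1)
```

In this sketch a middot padded with spaces does not stop the match; it ends up inside the title string instead, so the title that would be searched for is no longer a clean "The Twilight Zone". Whether something similar happens inside Infuse is pure speculation on my part.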

Apart from the characters normally excluded from filenames (<, >, , ? etc.), I’d never encountered issues using any of the other trusty old Windows-1252 characters before, and have leaned on heavy daily use of the middot for decades. Alas, as they say, first time for everything. I’ve since renamed multiple tens of thousands of files (video, image, metadata, and subtitles) accordingly — which was easy enough, thank goodness.

Appreciate that update to the 101. I’ve noted that the only separator that seems to matter is the one which may optionally live between the name of the file and the season and episode numbers (Series - SxxExx) … any subsequent content seems to have no effect.

Did my mentioning the Netflix Extended Version of “Hateful Eight” (presented as a 4-episode TV miniseries) cause someone here to go nuke it off of TMDB?

That’s damn frustrating.

No matter how successful you might ultimately be getting all your content FINALLY scanning correctly, we STILL can have our content orphaned when some jerk at TMDB decides to erase it. THIS is another reason, in my opinion, INFUSE ought to allow us to override TMDB with our own local metadata files. Infuse dropping support for TVDB (I’ve read why the change was made) forced us to rescrape and rename our entire collections with fresh TMDB metadata — which often was different (and inferior) to what we had before. No worries — I get why it happened. But I did my homework; I deleted gigabytes of images and .nfos and renamed series, episodes and movies, and changed the release years when they didn’t match up. Yet it still isn’t enough. Gah!

TMDB nuking the TV series version of “The Hateful Eight” is also yet another really good example how FireCore could make users that much happier by finally allowing Infuse to natively support multiple versions of movies …

Thank you, happy holidays, and blue skies…

Yeah, that seems to be the case. Still very curious though why it only happens on a relatively small subset of searches, and not globally.
Thank you.

There’s probably an old Gateway 486 machine that has cobwebs and dust covering it that is the machine responsible for passing data on 4 or 5 shows and no one has noticed it in the back of a broom closet for the past 20 years. :clown_face:
