The ":" colon character in media titles. A solution, sort of

Many movie and show titles use the colon “:”. However, it is not a valid character in a Windows/iOS/DOS filename, hence the hyphens and other workarounds we see in filenames. But I discovered a colon character in one of the Chinese fonts that Windows/iOS DOES recognize. I am inserting that version HERE - :- However, when I insert it in movie titles, Firecore does not read the character as either a colon or one of the other blank/ignored characters such as hyphens or periods, and thus does not recognize or parse out the correct media title when it searches metadata. So I have to manually edit the metadata. In the Asian titles/articles I have found it in, it does function as a colon, not as a Chinese/Japanese vocabulary word. Is this something that can be fixed/addressed in Firecore’s software, or is it a more fundamental iOS-based issue? Allowing colons to be used for file sorting and naming would clean up the look and clarify the titles immensely.

I feel you. A lot of the APIs use colons (and other special characters) for internal purposes, so that’s likely why your filenames are choking up. I’ve used this character · instead for movie folder names, but I have to make sure that character is replaced with a hyphen inside the folder, on the actual files from which Infuse creates its search queries. Before I started doing that, I had a lot of searches fail.

Luckily I have written some regular-expression heavy folder and file renaming scripts so I can automate the process.

Folder names don’t matter to Infuse for Movies — you can literally name them anything. But they DO matter for TV series. Can’t use any special characters there, or the search query might return the wrong show (or no results at all).

Yes, hence the reason for my post. I have to manually input the movie or TV show name in the edit metadata screen for the entry to be found.

Well, you don’t have to. You choose to, because you prefer the look of the non-standard character on your file system. If you leave the pseudo-colon in the folder name (for a movie), no problem. But you need to either omit it or replace it (with the trusty hyphen) if you want the Infuse/TMDB communication to work correctly, and not need to manually search for everything.

Just curious how often you need to look at file names and can’t find the file in the right order because of the colon. I imagine that the colon would come after a space when sorted; is that what you are looking for, instead of inline with the others? But generally these items would sort together alphabetically anyway. Maybe you could share an example. Personally, I would recommend skipping the whole notion, because it’s a hassle to deal with and not friendly to many tools.

I still wake up from nightmares about the 8.3 name format. :scream:

No. I choose to name my files accurately. From the same East Asian character set, the question mark character works perfectly fine; Infuse ignores its existence. Perhaps I should have made that clear in the beginning. I had also asked in my first post whether this is a deeper iOS or firmware issue, or one that can be corrected within Infuse. I am not sure what the tone of your post is, or what you mean by my liking the look of a non-standard character. It reads, and can be interpreted if one chooses to, as a snooty put-down. For the moment I am choosing NOT to take it that way. The last time I checked, the colon is a very commonly used STANDARD character in the English language, with proper grammatical rules, as well as in many other languages. Peace

LMAO… yes so many apps and programs still get tripped up by the half forgotten rules of ancient legacy operating systems, programming languages and compilers to machine code…

It’s not about the file order. Just like the hyphen and question mark, the character is, or should be, ignored in the metadata searches. Like FLskydiver, I adopted hyphens long ago as well. But in movie titles that use combinations of colons and hyphens, using all hyphens just mucks things up for visual clarity and understanding a title…

My friend, I’ve been there.

I, too, found the question-mark-looking character ( Ɂ ) that works fine in Windows’ file manager. I called that up from memory with a (right-alt) + ( ? ) keystroke because I’ve got a million special characters mapped to my keyboard with a gargantuan AutoHotkey script that I’ve always got running in the background, such that when I have to borrow someone else’s computer, I have no idea how to work it.

I totally get it. I too need my stuff to look EXACTLY how I want it to look; and dang it, I expect all my software to work exactly the way I want it to work, too.

But eventually, at least on this front, I had to give in to reality. It’s not just me they wrote this stuff for. There’s something like 40-50 Kodi users worldwide? I’ve got no idea how many people use Infuse, but I bet it’s a lot. And everyone’s using TMDB. I can’t imagine the bandwidth going through that place every hour of the day. So I understand no one is going to want to risk everything crashing down all over the world just because you and I ask them nicely to please just ignore this little dude ( Ɂ ) or this one ( · ).

I mean, the ( Ɂ ) character has always choked up my old FlashRenamer software (which I’ve been using forever to write and manage my oh-so-clever regex-heavy automagic renaming scripts). FR can’t show the contents of any folder that contains a file utilizing that character. Locks it up. I don’t know why. I mean, apart from it not being in the supported character set. But not all unsupported characters do that. Others just get depicted as blank squares. No worries there. But not Mr. Faux Q.

Yet still, I do indeed prefer the look of this ( · ) to this ( - ) … but unfortunately TMDB doesn’t. Maybe 5% of the time, TMDB chokes on it and spits back nonsense. Nothing I can do about it. Believe me, I’ve tried. Look up my thread history! :smiley:

They’re not going to rewrite the APIs just because a few people like us need things just so.

In the end, I decided it’s more important to me that everything simply work. If that means I ditch my question marks and suck it up with a few unsightly hyphens, well, that’s just what I have to do.

So now I keep myself “entertained” (or at least, busy) making sure every single file scrapes correctly the first time, without my needing to ever intervene. It bugs me if I have to do that. And when I feel compelled to browse, I’ll just do that in Infuse, where the colons and question marks are right where they ought to be.

Yeah, here’s one:

Eagles: Farewell I Tour - Live from Melbourne (2005)

So, my folder is named like this:

Eagles · Farewell I Tour - Live from Melbourne (2005) 8gb ø+

The bit on the end … that’s just for me. It tells me at a glance it’s an ( 8gb ) video encoded in 1080p HEVC ( ø ) that I also own ( + ) on Blu-ray disc.

My regexes know how to make it play nice with TMDB, and the result is not really all that different. Just strip off the bit at the end, and swap in a second hyphen. Done. I don’t even have to think about it.

Eagles - Farewell I Tour - Live from Melbourne (2005).mkv

another…

TMDB: Mission: Impossible - Fallout
Me: Mission: Impossible - Fallout ᶦᵐᵃˣ ⁴ᴷʜᴅʀ
My movie folder: Mission·Impossible 6 · Fallout (2018) IMAX 4K HDR 21gb Ø
The movie filename (the only string that gets sent to TMDB): Mission-Impossible 6 - Fallout (2018) IMAX 4K HDR.mkv

  • I don’t seem to have any issues with anything hanging on beyond the release year. But I keep the special characters away just to be sure.
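For anyone who wants to try the same folder-to-filename step without a dedicated renamer, here is a minimal Python sketch of the two examples above. It is not the actual script described in this post; the pattern and the `folder_to_filename` name are assumptions, and it only handles the suffix shape shown here (a size such as `8gb` followed by personal markers).

```python
import re

# Assumption: the personal suffix always starts with a size like "8gb" or "21gb".
SIZE_AND_MARKERS = re.compile(r"\s+\d+(?:\.\d+)?gb\b.*$", re.IGNORECASE)

def folder_to_filename(folder_name, ext=".mkv"):
    """Derive a scraper-friendly file name from a decorated folder name."""
    base = SIZE_AND_MARKERS.sub("", folder_name)  # drop size and trailing markers
    base = base.replace("\u00b7", "-")            # middle dot -> plain hyphen
    return base + ext

print(folder_to_filename("Eagles \u00b7 Farewell I Tour - Live from Melbourne (2005) 8gb \u00f8+"))
print(folder_to_filename("Mission\u00b7Impossible 6 \u00b7 Fallout (2018) IMAX 4K HDR 21gb \u00d8"))
```

Spacing around the middle dot is left alone, so `Mission·Impossible` becomes `Mission-Impossible` while `Eagles · Farewell` becomes `Eagles - Farewell`, matching the file names listed above.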

Thank you, I will try the dot option. In the meantime, try this question mark character on your system and see how it works; just copy and paste it: “?”. It does look like the standard Windows/DOS “?” but it’s from the East Asian font set, and happily it works perfectly fine for filenames and in Infuse. Infuse treats it as a blank space or ignores it during metadata searches. That is what prompted me to investigate and ask about the “:” from the same font set. Annoyingly, Infuse behaves as if that’s a valid letter, so it ‘reads’ a different ‘word’, causing automatic metadata searches to fail. The question mark character you are inputting is not from the East Asian font set and probably causes the same problem with Infuse as the “:” does. I hope this gets picked up by someone from Firecore so they can poke around with it… As I said, if Infuse simply ignored the “:” character, just like the East Asian Q?, it would not mean huge crashes or rewrites of databases or searches. Hm … full quotation mark equivalents would be nice too… let me see
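Since several look-alike characters are floating around this thread, it can help to check exactly which one has been pasted into a name. Below is a minimal Python sketch using the standard `unicodedata` module. It assumes, and this is a guess because no code points are given here, that the East Asian colon and question mark are the fullwidth forms U+FF1A and U+FF1F; `describe` is a hypothetical helper name.

```python
import unicodedata

def describe(text):
    """Return (code point, Unicode name, East Asian width) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"), unicodedata.east_asian_width(ch))
        for ch in text
    ]

# Assumed characters (not confirmed by the thread): fullwidth colon and
# fullwidth question mark, followed by their plain ASCII counterparts.
for row in describe("\uff1a\uff1f:?"):
    print(row)
```

The fullwidth forms report a width class of `F`, while the ASCII colon and question mark report `Na`, so two characters that look the same on screen can be told apart.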

Ooh, yeah — I like that one! Neat that it doesn’t confuse scraping.

Unfortunately, my renaming software doesn’t feel the same. In its defense, however, I should note that it hasn’t been updated since 2016 and chokes on anything that isn’t included in the Windows-1252 character set. Which is perfectly understandable, considering how the only language that existed back then was English prime, which came into the universe fully formed at the beginning of time; right here in the good ol’ U.S. of A. where it was first spoken by the first creatures who ever spoke. All those other weird languages you hear weirdos speaking all over the rest of the world didn’t get invented until billions of years later in October of 2018, when a dorm full of math nerds at Stanford got high and invented speaking in tongues as a drinking game.
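One way to predict which names a Windows-1252-only tool might stumble over is to try encoding them with that code page. This is a minimal sketch, assuming Python's built-in `cp1252` codec is a fair stand-in for what such a tool supports (the renamer's actual internals are unknown); `outside_cp1252` is a hypothetical helper name.

```python
def outside_cp1252(name):
    """Return the characters in name that Windows-1252 cannot represent."""
    bad = []
    for ch in name:
        try:
            ch.encode("cp1252")
        except UnicodeEncodeError:
            bad.append(ch)
    return bad

# U+00B7 is the middle dot; U+0241 is assumed to be the question-mark look-alike.
print(outside_cp1252("Eagles \u00b7 Farewell I Tour"))
print(outside_cp1252("Who Framed Roger Rabbit\u0241"))
```

The middle dot is part of Windows-1252, so the first name comes back clean, while U+0241 and the fullwidth forms fall outside it.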

I have been using a shareware program called Total Commander since my Win95 days. It’s a really powerful file manager that, among many other magical goodies, has a bulk file rename function built in and can handle alternate (Unicode) language font sets. The rename function is very versatile: it can hunt for character strings, renumber, change extensions… It can redo huge entire folders instantly. It’s constantly updated; the latest version is 10.51 and will support Win11.

We have no way of knowing that. Note that APIs and security features often purposely restrict, or strip from queries, any characters that are not on an approved list for a given input, likely to prevent invalid or unused characters from tripping up parsing algorithms and to limit possible vectors for hacking. I’d imagine special characters might also be reserved for the code that runs the databases and API queries, and these probably vary across all the different environments (different user and server operating systems, different user software suites generating the queries, and the many different languages and regional variations that must be matched on both ends)….

I’m not a programmer so I’ve probably botched most of this but if anyone who actually knows what they’re talking about wants to chime in, that help would be appreciated! :grimacing::grin:
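To make the approved-list idea concrete, here is a purely illustrative Python sketch. It is not how TMDB or Infuse actually work (nobody in this thread knows that); the allowed character set and the `sanitize_query` name are assumptions picked for the example.

```python
import re

# Hypothetical allowlist: ASCII letters, digits, space, hyphen, apostrophe,
# parentheses and period. Everything else is treated as a separator.
NOT_ALLOWED = re.compile(r"[^A-Za-z0-9 \-'().]")

def sanitize_query(title):
    """Replace characters outside the allowlist with spaces, then tidy whitespace."""
    cleaned = NOT_ALLOWED.sub(" ", title)
    return re.sub(r"\s+", " ", cleaned).strip()

print(sanitize_query("Mission: Impossible - Fallout"))
```

Under a scheme like this, any colon variant would simply turn into a space. A parser that instead treats unknown characters as letters would build a different search string, which is one possible explanation for the failed lookups described earlier.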

For ( “ )
Just use two apostrophes ( ‘’ ).

I need something that lets me create really long scripts of replacements followed sequentially, with robust support of regular expressions.

My renaming scripts will go through dozens of steps automatically, from the arrival of files (with all possible variations in naming schemes accounted for) to perfectly formatted folders and filenames that all match my preferred schemes.

Bunch of files go in one side, come out the other fully cooked. Subtitles simply marked 4_Eng or 16_Eng with no indication whether they are forced, standard, or SDH get parsed and labeled accordingly, no matter whether there are 1, 2, or 3 such named subtitle files included (based on relative file size, even though the algorithm itself can’t parse more than one file at a time nor be told anything about any other files that might exist alongside; every rename happens in isolation, and yet I found a way :smiley:). 4K files, HDR and DV files are recognized and get named accordingly. Folder sizes are appended. (Codecs are assumed, as I’m only adding x265 now.) Folders get named for alphabetical sorting, but files keep the articles in the front because doing otherwise often trips up the APIs. Folders, video files, metadata files, subtitles, Extra content, posters, and fanart are all processed (each in their unique ways) at the same time. I’ve come up with ways to recognize which abbreviations and acronyms should keep their periods and which periods need to be excluded. I’ve got ways to reinsert apostrophes where they might not have originally been included. I parse and fix casing issues. When video files need to be named with “alternative titles” because primary titles fail scraping, I have code that sets up file names such that the folders maintain the original name but the files only contain the alternate. I can toggle that back and forth infinitely. And note that I can do all that utilizing only the file and folder names themselves to preserve the data. There’s no external list of exceptions or fringe cases or variables that my scripts can look to for help. … And on and on.
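For readers curious what a chain of sequential replacements can look like, here is a very small Python sketch of the general idea. It is not the script described above, which is far more involved; the three rules, the article handling, and the names `RULES`, `apply_rules` and `folder_sort_name` are all assumptions made for illustration, and the naive dot rule would wrongly flatten acronyms that should keep their periods.

```python
import re

# Ordered (pattern, replacement) pairs; each rule sees the previous rule's output.
RULES = [
    (re.compile(r"[._]+"), " "),            # separator dots/underscores -> spaces
    (re.compile(r"\s{2,}"), " "),           # collapse runs of whitespace
    (re.compile(r"\s(\d{4})$"), r" (\1)"),  # bare trailing year -> (year)
]

def apply_rules(name):
    """Run every rule in order on a single name, in isolation from other files."""
    for pattern, replacement in RULES:
        name = pattern.sub(replacement, name)
    return name.strip()

LEADING_ARTICLE = re.compile(r"^(The|An|A)\s+(.+?)(\s\(\d{4}\))?$")

def folder_sort_name(title):
    """Move a leading article behind the title so folders sort alphabetically."""
    match = LEADING_ARTICLE.match(title)
    if not match:
        return title
    article, rest, year = match.group(1), match.group(2), match.group(3) or ""
    return f"{rest}, {article}{year}"

file_title = apply_rules("The.Matrix.1999")   # files keep the article in front
print(file_title)
print(folder_sort_name(file_title))           # folders move it to the end
```

Keeping the rules in a list makes the order explicit, which matters because later rules depend on what earlier ones have already cleaned up.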

I’ve probably spent 300x more time writing these scripts and getting them working perfectly with every possible exception as I encounter them than I ever spend watching the actual titles in my collection. And I know there are smarter programs available that can make this job a lot easier (by scraping the databases and using information found there to rename to spec) but I consider that cheating. I really enjoy the puzzles of working in that limited, 100% manual environment.

I always coded my web projects, even complicated ones, in Notepad (instead of WYSIWYG code-writers) because that way (though not the fastest way) was the best way to ensure I knew exactly how everything worked. I’m weirdly inefficient like that, but wouldn’t have it any other way.