Auto Sync Subtitles

hey all - i think it would be a great feature to have auto sync for subtitles. often you find films (usually old ones) that are difficult to adjust using the time offset feature. this becomes especially hard if the film is in another language than the subtitles, which is also very common. to eliminate all the fuss of pausing and trying a time offset, an auto sync feature would be great to have.

That would be awesome … but how would it work?

Maybe a lip reading pixie with a rheostat hidden in the new ATV? :clown_face:

I’d throw down some cash for that. :slight_smile:

i mean i’m not a developer so i have no clue, but i’m guessing some sort of algorithm that matches a sound sample with text?

Same.

Thing with subtitles is they are all set to show up at a certain time code during a movie’s playback — this bit of text is displayed at “00 hours, 05 minutes, 42.250 seconds” and erased at “00 hours, 05 minutes, 47.100 seconds”. This next bit of text at … etc.
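
To make that concrete, here is a minimal Python sketch of reading one cue's display window. It assumes the common SubRip (.srt) timestamp layout `HH:MM:SS,mmm --> HH:MM:SS,mmm`, which nobody in this thread has named, and `parse_timestamp` and `parse_cue_times` are hypothetical helper names, not anything Infuse exposes.

```python
import re

# Assumption: SubRip-style timestamps such as "00:05:42,250".
_TIMESTAMP = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def parse_timestamp(text):
    """Convert 'HH:MM:SS,mmm' into integer milliseconds."""
    match = _TIMESTAMP.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a timestamp: {text!r}")
    hours, minutes, seconds, millis = (int(group) for group in match.groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


def parse_cue_times(line):
    """Split a 'start --> end' line into (start_ms, end_ms)."""
    start, end = line.split("-->")
    return parse_timestamp(start), parse_timestamp(end)


# The example cue from the post: shown at 00:05:42.250, erased at 00:05:47.100.
print(parse_cue_times("00:05:42,250 --> 00:05:47,100"))  # (342250, 347100)
```

Integer milliseconds are used here only to keep the arithmetic exact; a real player may store cue times differently.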

The reason they get out of sync is because we are mixing and matching subtitles from different sources (for example, subs ripped off a blu-ray paired with a video and audio file ripped from a streaming service) … and those various sources also vary a bit in length depending on the length of pre-video-proper content (format, streamer, or distributor logos, content warnings, etc.) and how those were trimmed (or not) from the files we ultimately have.

To sync automagically, I guess you’d need the player (Infuse) to first read the subtitle files, figure out where the cue points are, and then scan the audio tracks in order to attempt to detect human speech; and create a cue list of when it thinks it hears speech. And then compare the two cue lists for possible synchronicity, and when found, magically shift the subtitle cues to overlap.
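
The "compare the two cue lists" step could be sketched roughly like this in Python. It assumes the hard part is already done, meaning some speech detector has produced a list of (start, end) intervals in milliseconds, and it only looks for a single constant shift. The names `overlap_ms`, `best_offset`, and `shift_cues`, plus the search range and step size, are all made up for illustration; this is not how Infuse or any particular tool does it.

```python
def overlap_ms(cues, speech, offset):
    """Total milliseconds where the shifted cues overlap detected speech."""
    total = 0
    for cue_start, cue_end in cues:
        start, end = cue_start + offset, cue_end + offset
        for speech_start, speech_end in speech:
            total += max(0, min(end, speech_end) - max(start, speech_start))
    return total


def best_offset(cues, speech, max_shift_ms=10_000, step_ms=50):
    """Brute-force the constant shift (in ms) that maximizes the overlap.

    Ties are broken in favor of the smallest absolute shift.
    """
    candidates = range(-max_shift_ms, max_shift_ms + 1, step_ms)
    return max(
        candidates,
        key=lambda offset: (overlap_ms(cues, speech, offset), -abs(offset)),
    )


def shift_cues(cues, offset):
    """Apply the chosen shift to every cue."""
    return [(start + offset, end + offset) for start, end in cues]


# Hypothetical data: the subtitles run 2 seconds ahead of the detected speech.
speech = [(3000, 5000), (8000, 9500), (15000, 18000)]
cues = [(1000, 3000), (6000, 7500), (13000, 16000)]

offset = best_offset(cues, speech)
print(offset)                    # 2000
print(shift_cues(cues, offset))  # cues now line up with the speech intervals
```

A brute-force scan like this only handles a fixed offset. If the two sources also differ in frame rate, the drift grows over time and a single shift won't fix it.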

First, I don’t suppose it’s an easy processing task to identify human speech in an otherwise chaotic sonic environment. This would likely take some very complicated and specialized coding and considerable hardware resources. Ever watch any of Google’s attempts to auto-subtitle YouTube vids? Not terribly effective; and they’ve got TONS of processing power and proprietary machine learning algorithms to throw at it.

Second, I suspect it would require subtitles that are written to be displayed only while someone is talking and pulled down immediately when they stop, which doesn’t always happen, for readability’s sake … though maybe it could be that only the start point matters?

Third, if someone is speaking in one breath more words than can be displayed onscreen at once, you’ll have multiple pages of subtitles but only one detected block of speech. Added complication.

Fourth; hearing impaired stuff. Text that isn’t synced to speech at all (thunder, ominous musical cues, door slams, etc.)

Fifth; Translations of displayed foreign text (during opening titles and various points inside a movie, for plot-specific street/shop signs, documents, computer screens and what not) that have no audio counterpart at all.

Sixth; subtitler credits. More subtitles to throw off the cue synchronicity that don’t match either audio OR on-screen displayed text.
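
Some of these could maybe be filtered out before matching. Here is a rough Python sketch that drops cues which look like sound descriptions. It assumes such cues are written entirely inside square brackets or parentheses, or consist only of music-note symbols, which is a common convention but far from universal. `is_probably_speech` is a hypothetical helper, and it would not catch translated on-screen text or subtitler credits.

```python
import re

# Assumption: non-speech cues are wrapped entirely in [...] or (...),
# or contain nothing but music-note markers. Real files vary widely.
_NON_SPEECH = re.compile(r"^\s*(?:[\[(][^\])]*[\])]|[♪♫\s]+)\s*$")


def is_probably_speech(text):
    """Return False for cues that look like sound or music descriptions."""
    return _NON_SPEECH.match(text) is None


# Hypothetical cues as (start_ms, end_ms, text).
cues = [
    (1000, 3000, "[thunder rumbles]"),
    (4000, 6000, "Who's there?"),
    (7000, 9000, "(door slams)"),
    (9500, 11000, "♪ ♪"),
    (12000, 14000, "[sighs] Fine."),
]

speech_cues = [cue for cue in cues if is_probably_speech(cue[2])]
print(speech_cues)
```

Cues that mix a description with dialogue, like the last one, are kept, since part of the text still matches audible speech.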

In short, I imagine this being a pretty difficult job. But maybe someone here who’s much smarter could come up with a far simpler concept to tackle the problem? :man_shrugging:t2:

well clearly you have a clue :wink: i haven’t seen any other media player being able to pull this off, so i imagine if infuse managed to, they would indeed have created some magic.

Nah, I just make guesses.

I love Infuse, but I’ve got lots of things I’d like to change about it — some that Firecore might get around to at a later point, and some they frustratingly disagree with me about whether they ought to be changed at all.

Firecore, being an Apple-only bit of kit, hews close to the Apple ethos of “Keep it Simple”.

Many suggestions that could presumably be easily implemented are not given the time of day because they are either “niche” requests or would result in an ever-expanding settings menu, which many non-OCD types might find intimidating.

I do very much want them to create an “Expert” mode (similar to Kodi) where the default menu covers simply the basic, necessary settings utilized by all users … and the “Expert” toggle enables a whole slew of other customization options, oft-requested in these forums, that don’t change the fundamentals of how Infuse works but let users have more control over how it looks and how they interact with it …

… a few such items that come up a lot lately include alternate placement locations for logo art, the ability to show more actors on TV show and movie info pages (they are already in the Infuse database, but for whatever reason Infuse won’t ever show more than 15), the ability to hide specific genre tabs from the genre bar, and the ability to use posters instead of fanart on the “Up Next List”.

I just feel like if I have an idea how my requests might be implemented, I’m more likely to get them to listen to me and consider doing something about it.

I mean it hasn’t really worked so far, but I’m an optimist. :laughing:

that’s exactly what this library does so this is possible: GitHub - sc0ty/subsync: Subtitle Speech Synchronizer

Cool. I checked it out. So did they steal my ideas from several years ago verbatim, and am I therefore owed some juicy royalties for their infringement of my rights; or did I just dream this stuff up and share it online one night, in ignorant bliss that some other genius-level intellect with access to a Time Machine was watching me?

haha it was the latter, as the subsync project was started at least 6 years ago. i started using it this year and it’s been such a timesaver compared to syncing subtitles manually
