How to automatically split/merge Whisper subtitles in a smarter way? #8452

Open
GOvEy1nw opened this issue May 30, 2024 · 2 comments

@GOvEy1nw

GOvEy1nw commented May 30, 2024

Perhaps this is already possible, but I'm struggling to get 'neater' and more 'logical' splits/merges in an automatic way for generated subtitles.

When I generate subtitles with faster-whisper, it mostly looks great, but there are often a good few parts where it's not quite as well split/merged as it could be. Here are a few examples:

  1. Bad splitting of sentences.
74
00:05:43,370 --> 00:05:47,069
You take two hydrogen atoms, you ram them
together, and what's left over is a helium

75
00:05:47,070 --> 00:05:48,070
atom.

Ideally, something like this would work out better:

74
00:05:43,370 --> 00:05:45,851
You take two hydrogen atoms,
you ram them together

75
00:05:45,876 --> 00:05:48,070
and what's left over is a helium atom.

Perhaps an 'automated' way to do this would be a function that does the following (although I'm sure there's a simpler way to do this!):

  1. Check whether any section contains fewer than 3 words.
    If a section meets that condition, then...
  2. Check whether the section before it ends less than 50 ms earlier.
    If so, then...
  3. Check whether merging the two would exceed the maximum line length. If it would, split some text off the end of the longer previous section and merge that with the shorter section. Otherwise, just merge the shorter section into the longer one.
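
The three checks above could be sketched roughly as follows. This is only an illustration and is not how Subtitle Edit or Whisper work internally: `Cue`, `split_index`, `merge_short_cues` and the default `max_chars=84` (two lines of 42 characters) are all hypothetical names and values. It assumes each cue's text is held on a single line (line wrapping being a separate step) and, because an SRT file carries no per-word timing, it estimates the new boundary from the share of characters on each side.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    start: float  # seconds
    end: float    # seconds
    text: str     # cue text on one line; wrapping is handled elsewhere


def split_index(words):
    """Pick a word index to split at, preferring a spot just after punctuation."""
    total = len(" ".join(words))
    positions = list(range(1, len(words)))
    after_punct = [i for i in positions if words[i - 1][-1] in ",.;:!?"]
    candidates = after_punct or positions
    # Choose the candidate that leaves the two halves closest in length.
    return min(candidates, key=lambda i: abs(len(" ".join(words[:i])) - total / 2))


def merge_short_cues(cues, min_words=3, max_gap=0.05, max_chars=84):
    """Merge cues with fewer than min_words words into the previous cue,
    rebalancing the pair when the merged text would exceed max_chars."""
    out = []
    for cue in cues:
        prev = out[-1] if out else None
        short = 0 < len(cue.text.split()) < min_words
        # Steps 1 and 2: only short cues that follow closely are candidates.
        if prev is None or not short or cue.start - prev.end >= max_gap:
            out.append(Cue(cue.start, cue.end, cue.text))
            continue
        combined = f"{prev.text} {cue.text}".strip()
        words = combined.split()
        # Step 3, simple case: the merged text fits, so just merge.
        if len(combined) <= max_chars or len(words) < 2:
            prev.end, prev.text = cue.end, combined
            continue
        # Step 3, rebalance: re-split the merged text at a more natural point.
        i = split_index(words)
        first, second = " ".join(words[:i]), " ".join(words[i:])
        # Assumption: speech is spread evenly over the characters.
        boundary = prev.start + (cue.end - prev.start) * len(first) / len(combined)
        prev.end, prev.text = boundary, first
        out.append(Cue(boundary, cue.end, second))
    return out
```

Fed cues 74 and 75 from the first example, this splits after "together," so the second cue reads "and what's left over is a helium atom." The estimated boundary will not match real speech exactly; word-level timestamps would give a better split time.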

——————————————

  2. Sections with long gaps between lines (kind of the opposite issue of the above)
71
00:05:25,800 --> 00:05:32,440
And it's created by one of the most violent
reactions in the universe... nuclear fusion.

Ideally should be more like this:

71
00:05:25,800 --> 00:05:30,253
And it's created by one of the most
violent reactions in the universe...

72
00:05:31,186 --> 00:05:33,113
nuclear fusion.

Perhaps this is more of an issue with Whisper (maybe there's a setting to fix it?), but it'd be great to be able to automatically check if there are sections with gaps in dialogue longer than 1 second, and if so, split the part that comes after the gap off into its own section.
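
A gap check like the one described could be sketched as below, assuming word-level timestamps are available (faster-whisper can return them when transcribing with `word_timestamps=True`). The `Word` class, the `split_on_gaps` function and the sample timings are hypothetical and only illustrate the idea; `max_gap` is the 1 second threshold suggested above and would need tuning.

```python
from dataclasses import dataclass


@dataclass
class Word:
    start: float  # seconds
    end: float    # seconds
    text: str


def split_on_gaps(words, max_gap=1.0):
    """Group words into cues, starting a new cue whenever the silence
    between two consecutive words is longer than max_gap seconds."""
    groups = []
    for word in words:
        if groups and word.start - groups[-1][-1].end <= max_gap:
            groups[-1].append(word)
        else:
            groups.append([word])
    # Each cue is (start, end, text); Whisper words often carry a leading space.
    return [
        (g[0].start, g[-1].end, " ".join(w.text.strip() for w in g))
        for g in groups
    ]


# Invented timings, for illustration only.
words = [
    Word(0.0, 0.4, "in"),
    Word(0.4, 0.6, "the"),
    Word(0.6, 1.3, "universe..."),
    Word(2.6, 3.0, "nuclear"),
    Word(3.0, 3.6, "fusion."),
]
for cue in split_on_gaps(words):
    print(cue)
```

With these sample timings the pause before "nuclear" is 1.3 seconds, so "nuclear fusion." ends up in its own cue.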

As mentioned above, there may already be an automatic fix for this, but I haven't managed to find one using 'fix common errors' etc. Please let me know if an automatic solution already exists.

@gru123

gru123 commented May 30, 2024

[Screenshot: 2024-05-30 22_50_15-Clipboard]
Click 'advanced' and read the description, especially '--sentence', '--max...' and '--min...'. This should solve the problem at least partially.

@ANewDawn

I use the highlight words option in advanced. It underlines words as they are spoken, which makes the subtitles much easier to follow.
