subsync
=======

[![Build Status](https://travis-ci.org/smacke/subsync.svg?branch=master)](https://travis-ci.org/smacke/subsync)
[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)

Language-agnostic automatic synchronization of subtitles with video,
so that subtitles are aligned to the correct starting point within the video.

The implementation for this project was started during HackIllinois 2019,
for which it received an **_Honorable Mention_**
(ranked in the top 5 projects, excluding projects that won company-specific prizes).

Turn this:                       |  Into this:
:-------------------------------:|:-------------------------:
![](tearing-me-apart-wrong.gif)  |  ![](tearing-me-apart-correct.gif)

Install
-------
First, make sure ffmpeg is installed. On macOS, this looks like:
~~~
brew install ffmpeg
~~~
Next, grab the script. It should work with both Python 2 and Python 3:
~~~
pip install git+https://github.com/smacke/subsync
~~~

Usage
-----
~~~
subsync video.mp4 -i unsynchronized.srt > synchronized.srt
~~~

or

~~~
subsync video.mp4 -i unsynchronized.srt -o synchronized.srt
~~~

Although `subsync` can usually work with only the video file, there may be occasions
where you have a correctly synchronized "reference" srt file in a language you are
unfamiliar with, as well as an unsynchronized srt file in your native language.
In this case, it is faster to do the following:

~~~
subsync reference.srt -i unsynchronized.srt -o synchronized.srt
~~~

Whether to perform voice activity detection on the audio or to extract speech
directly from an srt file is determined by the file extension of the reference.
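
The extension-based dispatch can be sketched as follows. This is only an illustration; the function name and mode strings are hypothetical, not `subsync`'s actual API:

```python
import os

def reference_mode(path):
    """Pick how to extract speech from the reference, based on file extension.

    The mode names here are hypothetical; they just illustrate the dispatch.
    """
    ext = os.path.splitext(path)[1].lower()
    if ext == ".srt":
        return "parse-srt"  # read speech intervals directly from the subtitles
    return "run-vad"        # decode the audio and run voice activity detection

print(reference_mode("reference.srt"))  # parse-srt
print(reference_mode("video.mp4"))      # run-vad
```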

VLC Integration
---------------
To demonstrate how one might use `subsync` seamlessly with real video software,
we developed a prototype integration into the popular [VLC](https://www.videolan.org/vlc/index.html)
media player, which was demoed during the HackIllinois 2019 project expo. The resulting patch
can be found in the file [subsync-vlc.patch](https://github.com/smacke/subsync/raw/master/subsync-vlc.patch).
Here are instructions for how to use it.

1. First, clone the 3.0 maintenance branch of VLC and check out version 3.0.6:
~~~
git clone git://git.videolan.org/vlc/vlc-3.0.git
cd vlc-3.0
git checkout 3.0.6
~~~
2. Next, apply the patch:
~~~
wget https://github.com/smacke/subsync/raw/master/subsync-vlc.patch
git apply subsync-vlc.patch
~~~
3. Follow the normal instructions on the
[VideoLAN wiki](https://wiki.videolan.org/VLC_Developers_Corner/)
for building VLC from source. *Warning: this is not easy.*

You should now be able to autosynchronize subtitles using the hotkey `Ctrl+Shift+S`
(enabled only while subtitles are present).

Speed
-----
My experience is that `subsync` usually finishes running in 20 to 30 seconds,
depending on the length of the video. The most expensive step is actually
extraction of raw audio. If you already have a correctly synchronized "reference" srt
file (in which case the video is no longer necessary),
`subsync` typically runs in less than a second.

How It Works
------------
The synchronization algorithm operates in three steps:
1. Discretize the video and subtitles by time into 10 ms windows.
2. For each 10 ms window, determine whether that window contains speech.
   This is trivial for subtitles (we just determine whether any subtitle is "on" during each time window);
   for video, we use an off-the-shelf voice activity detector (VAD) like
   the one built into [webrtc](https://webrtc.org/).
3. Now we have two binary strings: one for the subtitles and one for the video.
   Try to align these strings by matching 0's with 0's and 1's with 1's. We score
   each alignment as (# video 1's matched with subtitle 1's) - (# video 1's matched with subtitle 0's).
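
For example, the subtitle side of step 1 can be sketched like this. This is a toy illustration rather than `subsync`'s actual code; the `(start_ms, end_ms)` interval representation is an assumption:

```python
import numpy as np

WINDOW_MS = 10  # 10 ms windows, as in step 1

def subtitles_to_bits(intervals_ms, total_ms):
    """Mark each window 1 if any subtitle is "on" during it.

    intervals_ms: (start_ms, end_ms) pairs for when subtitles are displayed.
    """
    bits = np.zeros(total_ms // WINDOW_MS, dtype=np.uint8)
    for start_ms, end_ms in intervals_ms:
        bits[start_ms // WINDOW_MS : end_ms // WINDOW_MS] = 1
    return bits

# A subtitle displayed from 20 ms to 50 ms covers windows 2, 3, and 4:
print(subtitles_to_bits([(20, 50)], 100))  # [0 0 1 1 1 0 0 0 0 0]
```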

The best-scoring alignment from step 3 determines how to offset the subtitles in time
so that they are properly synced with the video. Because the binary strings
are fairly long (millions of digits for video longer than an hour), the naive
O(n^2) strategy for scoring all alignments is unacceptable. Instead, we use the
fact that "scoring all alignments" is a convolution operation and can be implemented
with the Fast Fourier Transform (FFT), bringing the complexity down to O(n log n).
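
The FFT trick can be sketched as follows. This is an illustrative reimplementation, not `subsync`'s actual code: mapping subtitle 0's to -1 turns the score above into a single cross-correlation, which the FFT evaluates at every offset in O(n log n):

```python
import numpy as np

def best_offset(video_bits, sub_bits):
    """Return the subtitle shift (in windows) maximizing the score
    (# video 1's matched with subtitle 1's) - (# video 1's matched with subtitle 0's).
    """
    v = np.asarray(video_bits, dtype=float)
    s = 2.0 * np.asarray(sub_bits, dtype=float) - 1.0  # map 0 -> -1, 1 -> +1
    # Full cross-correlation of v with s, computed via FFT as a convolution
    # with the reversed subtitle signal (zero-padded to the full length n).
    n = len(v) + len(s) - 1
    scores = np.fft.irfft(np.fft.rfft(v, n) * np.fft.rfft(s[::-1], n), n)
    # Index len(s) - 1 corresponds to zero shift; larger indices shift the
    # subtitles later relative to the video.
    return int(np.argmax(scores)) - (len(s) - 1)

# Speech occupies windows 3-7 of the video, but the subtitles have it in
# windows 0-4: shifting the subtitles 3 windows later lines them up.
print(best_offset([0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0],
                  [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0]))  # 3
```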

Limitations
-----------
In most cases, inconsistencies between video and subtitles occur when starting
or ending segments present in the video are not present in the subtitles, or vice versa.
This can occur, for example, when a TV episode recap present in the subtitles was pruned
from the video. Subsync typically works well in these cases, and in my experience
this covers >95% of use cases. Handling breaks and splits outside of the beginning
and ending segments is left to future work (see below).

Future Work
-----------
1. I am currently working to extend the synchronization algorithm to handle splits and breaks
in the middle of the video that are not present in the subtitles (or vice versa). It will take some time
before I have a robust solution (assuming one is possible). See
[#10](https://github.com/smacke/subsync/issues/10) for more details.

2. The prototype VLC patch is very experimental; it was developed under pressure
and just barely works. I would love to see this project more robustly
integrated with VLC, either directly in the VLC core or as a plugin.
If you or anyone you know has ideas for how to accomplish this, please let me know!

Credits
-------
This project would not be possible without the following libraries:
- [ffmpeg](https://www.ffmpeg.org/) and the [ffmpeg-python](https://github.com/kkroening/ffmpeg-python) wrapper, for extracting raw audio from video
- The VAD from [webrtc](https://webrtc.org/) and the [py-webrtcvad](https://github.com/wiseman/py-webrtcvad) wrapper, for speech detection
- [srt](https://pypi.org/project/srt/) for operating on [SRT files](https://en.wikipedia.org/wiki/SubRip#SubRip_text_file_format)
- [numpy](http://www.numpy.org/) and, indirectly, [FFTPACK](https://www.netlib.org/fftpack/), which power the FFT-based algorithm for fast scoring of alignments between subtitles (or between subtitles and video)
- [sklearn](https://scikit-learn.org/) for its data pipeline API
- Other excellent Python libraries like [argparse](https://docs.python.org/3/library/argparse.html) and [tqdm](https://tqdm.github.io/), not related to the core functionality, but which enable much better experiences for developers and users

License
-------
Code in this project is [MIT licensed](https://opensource.org/licenses/MIT).