Use WhisperX to get SRT output with accurate timestamps but not-so-accurate subtitles, then ask Gemini to "proofread" the subtitles using both the SRT and the audio as input. It works so well.
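For anyone who wants to try this, here is a rough sketch of the merge step, assuming you only want the model's corrected text and want to keep the WhisperX timestamps untouched. The names `Cue`, `parse_srt`, `format_srt`, `proofread`, and the `correct_texts` callback are all made up for illustration. `correct_texts` is a stand-in for whatever call sends the cue texts and the audio to Gemini, so no real API is shown here.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Cue:
    index: int
    timing: str  # e.g. "00:00:01,000 --> 00:00:02,500", kept verbatim
    text: str


def parse_srt(srt: str) -> List[Cue]:
    """Split SRT text into cues; cues are separated by blank lines."""
    cues = []
    for block in re.split(r"\n\s*\n", srt.lstrip("\ufeff").strip()):
        lines = block.strip().splitlines()
        if len(lines) < 2 or "-->" not in lines[1]:
            continue  # skip blocks that do not look like a cue
        cues.append(Cue(int(lines[0].strip()), lines[1].strip(), "\n".join(lines[2:])))
    return cues


def format_srt(cues: List[Cue]) -> str:
    """Serialize cues back to SRT text."""
    return "\n\n".join(f"{c.index}\n{c.timing}\n{c.text}" for c in cues) + "\n"


def proofread(srt: str, correct_texts: Callable[[List[str]], List[str]]) -> str:
    """Replace cue text with corrected text, keeping the original timings.

    correct_texts is a hypothetical hook: it receives the cue texts in order
    and must return the same number of corrected texts.
    """
    cues = parse_srt(srt)
    fixed = correct_texts([c.text for c in cues])
    if len(fixed) != len(cues):
        # Refuse to merge if the model dropped or merged cues,
        # since that would shift text onto the wrong timestamps.
        raise ValueError(f"expected {len(cues)} cues, got {len(fixed)}")
    merged = [Cue(c.index, c.timing, t.strip()) for c, t in zip(cues, fixed)]
    return format_srt(merged)
```

The count check is the one design choice that matters: if the model returns a different number of cues, the text no longer lines up with the timestamps, so it is safer to fail and retry than to merge.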
It was working okay, but never well enough.
Then, the second breakthrough was when Gemini became good enough to take audio as input and produce translated subtitles directly (about this time last year).