Subtitle Synchronization Using Whisper ASR Model

P, Thara and Azneed, Mohammed and Sanas, Muhammad and M, Jithu Prakash P and Naik, Harsh P and B, Divya (2024) Subtitle Synchronization Using Whisper ASR Model. In: UNSPECIFIED.

Full text not available from this repository.

Abstract

This paper introduces a novel approach to subtitle synchronization using the Whisper ASR (Automatic Speech Recognition) model from OpenAI. The primary aim of this research is to synchronize subtitles with audio content accurately and robustly, even in the presence of uniform or non-uniform delays. The project leverages the Whisper model, which is trained on diverse audio datasets and supports multiple tasks, including multilingual speech recognition, speech translation, and language identification. Key features of the project include Whisper ASR integration, SRT modification, and accuracy enhancement. The system uses FFmpeg for audio extraction and preprocessing, followed by speech recognition with Whisper. Preprocessing techniques are applied to improve the precision of the generated timestamps. The methodology then adjusts timestamps based on a text comparison between the input SRT file and the transcribed SRT file, producing accurately synchronized subtitles. The project offers a user-friendly interface for input acquisition and interaction, guiding users through the synchronization process. While promising, the system has limitations: it does not detect actions or predict character names, and it has difficulty handling repeated words. Nonetheless, the Whisper ASR-based Subtitle Synchronization project presents a reliable solution for enhancing subtitle accuracy and accessibility in various video content scenarios. © 2025 Elsevier B.V., All rights reserved.
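
The full text is not available here, so the following is only a minimal sketch of what "timestamp adjustment based on text comparison" could look like, not the authors' implementation. It assumes the input SRT and the Whisper transcript have already been parsed into timed cues. The names `Cue`, `similarity`, `resync`, and `format_ts`, the use of Python's `difflib` for text comparison, and the 0.6 match threshold are all illustrative assumptions.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Cue:
    """One subtitle entry; times are in seconds (hypothetical structure)."""
    start: float
    end: float
    text: str


def similarity(a: str, b: str) -> float:
    """Case- and whitespace-insensitive similarity ratio in [0, 1]."""
    norm_a = " ".join(a.lower().split())
    norm_b = " ".join(b.lower().split())
    return SequenceMatcher(None, norm_a, norm_b).ratio()


def resync(input_cues, asr_cues, threshold=0.6):
    """Shift each input cue to the start of its best-matching ASR segment.

    Each cue gets its own offset, so non-uniform delays are handled.
    A cue with no sufficiently similar segment (assumed threshold)
    reuses the offset of the previous matched cue.
    """
    synced = []
    offset = 0.0
    for cue in input_cues:
        scored = [(similarity(cue.text, seg.text), seg) for seg in asr_cues]
        if scored:
            score, seg = max(scored, key=lambda pair: pair[0])
            if score >= threshold:
                offset = seg.start - cue.start
        synced.append(Cue(cue.start + offset, cue.end + offset, cue.text))
    return synced


def format_ts(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


if __name__ == "__main__":
    # Out-of-sync input subtitles (made-up example data).
    subtitles = [
        Cue(1.0, 3.0, "Hello there."),
        Cue(4.0, 6.0, "How are you today?"),
        Cue(7.0, 9.0, "[door slams]"),  # non-speech cue, absent from ASR output
    ]
    # Segments as an ASR model might return them (made-up example data).
    transcript = [
        Cue(2.5, 4.4, "hello there"),
        Cue(6.0, 8.1, "How are you today"),
    ]
    for cue in resync(subtitles, transcript):
        print(f"{format_ts(cue.start)} --> {format_ts(cue.end)}  {cue.text}")
```

In this sketch a non-speech cue such as a sound description has no counterpart in the transcript and simply inherits the previous offset. Because the best match is searched over the whole transcript, a line that occurs more than once can be matched to the wrong segment, which is one way the difficulty with repeated words mentioned in the abstract could arise.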

Item Type: Conference or Workshop Item (Paper)
Subjects: Computer Science > Artificial Intelligence
Divisions: Arts and Science > Vinayaka Mission's Kirupananda Variyar Arts & Science College, Salem > Computer Science
Depositing User: Unnamed user with email techsupport@mosys.org
Last Modified: 27 Nov 2025 06:42
URI: https://vmuir.mosys.org/id/eprint/1728
