Conference/Proceedings: Multimedia Content Access: Algorithms and Systems, IS&T/SPIE's Electronic Imaging 2007
Start date: 28.01.2007
End date: 01.02.2007
Address: San Jose, CA, USA
Editor: Alan Hanjalic; Raimondo Schettini; Nicu Sebe
Author(s): Amjad Samour, Mustafa Karaman, Lutz Goldmann, Thomas Sikora
Title: Video to the Rescue of Audio: Shot Boundary Assisted Speaker Change Detection
Abstract: Speaker change detection (SCD) is a preliminary step for many audio applications such as speaker segmentation and recognition, so its robustness is crucial to good performance in the later steps. Misses (false negatives) in particular degrade the results. For some applications, domain-specific characteristics can be used to improve the reliability of the SCD. In broadcast news and discussions, the co-occurrence of shot boundaries and change points provides a robust cue for speaker changes. This paper presents two multimodal approaches that use the results of a shot boundary detection (SBD) step to improve the robustness of the SCD. Both approaches clearly outperform the audio-only approach, but they are applicable only to TV broadcast news and plenary discussions.
Key words: Speaker change detection (SCD), Shot boundary detection (SBD), Mel frequency cepstral coefficients (MFCC), Bayesian information criterion (BIC)
Note: ISBN 9780819466198