| Conference/Proceedings | Multimedia Content Access: Algorithms and Systems, IS&T/SPIE's Electronic Imaging 2007 |
| Start date | 28.01.2007 |
| End date | 01.02.2007 |
| Address | San Jose, CA, USA |
| Organisation | SPIE |
| Editor | Alan Hanjalic; Raimondo Schettini; Nicu Sebe |
| Publisher | SPIE |
| Volume | 6506 |
| Author(s) | Amjad Samour, Mustafa Karaman, Lutz Goldmann, Thomas Sikora |
| Title | Video to the Rescue of Audio: Shot Boundary Assisted Speaker Change Detection |
| Abstract | Speaker change detection (SCD) is a preliminary step for many audio applications such as speaker segmentation and recognition. Thus, its robustness is crucial to achieving good performance in the later steps. Misses (false negatives) in particular affect the results. For some applications, domain-specific characteristics can be used to improve the reliability of the SCD. In broadcast news and discussions, the co-occurrence of shot boundaries and change points provides a robust clue for speaker changes. In this paper, two multimodal approaches are presented that utilize the results of a shot boundary detection (SBD) step to improve the robustness of the SCD. Both approaches clearly outperform the audio-only approach and are applicable only to TV broadcast news and plenary discussions. |
| Key words | Speaker change detection (SCD), Shot boundary detection (SBD), Mel frequency cepstral coefficients (MFCC), Bayesian information criterion (BIC) |
| Note | ISBN: 9780819466198 |
| DOI | 10.1117/12.703114 |