|Philadelphia, PA, USA
|Hyoung-Gook Kim, Daniel Ertelt, Thomas Sikora
|Hybrid Speaker-Based Segmentation System Using Model-Level Clustering
|In this paper, we present a hybrid speaker-based segmentation system that combines metric-based and model-based techniques. Without a priori information about the number of speakers or their identities, the speech stream is segmented in three stages: (1) The most likely speaker changes are detected. (2) To group segments of identical speakers, a two-level clustering algorithm using the Bayesian Information Criterion (BIC) and HMM model scores is performed. Every cluster is assumed to contain only one speaker. (3) The speaker models are reestimated from each cluster using HMMs. Finally, a resegmentation step performs a more refined segmentation using these speaker models. To measure performance, we compare the segmentation results of the proposed hybrid method with those of metric-based segmentation. Results show that the hybrid approach using two-level clustering significantly outperforms direct metric-based segmentation.
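
The abstract does not spell out how the BIC is applied, so the following is only a minimal sketch of the commonly used Delta-BIC test between two feature segments, assuming each segment is modeled by a single full-covariance Gaussian. The function name `delta_bic`, the `penalty_weight` parameter, and the synthetic data are illustrative assumptions, not the authors' implementation. Under this convention, a positive value suggests the two segments come from different speakers (a change point, or a pair of clusters that should not be merged).

```python
import numpy as np


def delta_bic(x, y, penalty_weight=1.0):
    """Delta-BIC for modeling x and y with two Gaussians versus one.

    x, y: arrays of shape (n_frames, n_dims), e.g. cepstral features.
    penalty_weight: hypothetical tuning factor (often written lambda).
    Returns a float; positive favors the two-speaker hypothesis.
    """
    z = np.vstack([x, y])
    n_x, n_y, n_z = len(x), len(y), len(z)
    d = z.shape[1]

    def logdet_cov(a):
        # Maximum-likelihood (biased) covariance estimate of the segment.
        cov = np.atleast_2d(np.cov(a, rowvar=False, bias=True))
        _, logdet = np.linalg.slogdet(cov)
        return logdet

    # Log-likelihood gain from splitting the data into two models.
    gain = 0.5 * (n_z * logdet_cov(z)
                  - n_x * logdet_cov(x)
                  - n_y * logdet_cov(y))

    # Penalty for the extra mean vector and covariance matrix.
    n_params = d + d * (d + 1) / 2
    penalty = 0.5 * penalty_weight * n_params * np.log(n_z)
    return gain - penalty


# Synthetic stand-ins for feature frames (assumption: 4-dim features).
rng = np.random.default_rng(0)
speaker_a = rng.normal(0.0, 1.0, size=(500, 4))
speaker_a_again = rng.normal(0.0, 1.0, size=(500, 4))
speaker_b = rng.normal(5.0, 1.0, size=(500, 4))

different = delta_bic(speaker_a, speaker_b)
same = delta_bic(speaker_a, speaker_a_again)
```

In a metric-based first stage, a test of this kind would typically be evaluated at candidate boundaries along the stream; the same quantity can serve as a merge criterion during clustering, with the HMM-based scoring described in the abstract applied at the second level.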