Conference/Proceedings | IS&T/SPIE's Electronic Imaging 2004 |
Start date | 18.01.2004 |
End date | 22.01.2004 |
Address | San Jose, CA, USA |
Author(s) | Hyoung-Gook Kim, Thomas Sikora |
Title | Performance of MPEG-7 spectral basis representations for retrieval of home video abstract |
Abstract | In this paper, we present a classification and retrieval technique for home video abstracts that uses dimension-reduced, decorrelated spectral features of the audio content. The feature extraction, based on MPEG-7 descriptors, consists of three main stages: computation of the Normalized Audio Spectrum Envelope (NASE), a basis decomposition algorithm, and basis projection, obtained by multiplying the NASE with a set of extracted basis functions. A classifier based on continuous hidden Markov models is applied. For accurate retrieval, the system uses a two-level hierarchical method that combines speech recognition and sound classification. To measure performance, we compare the classification results of the MPEG-7 standardized features with those of Mel-scale Frequency Cepstrum Coefficients (MFCC). Results show that the MFCC features yield better performance than the MPEG-7 features. |
Key words | MPEG-7, Normalized Audio Spectrum Envelope (NASE), basis decomposition algorithm, Mel-scale Frequency Cepstrum Coefficients (MFCC) |
File | 0793Kim2004.ps |