PhD thesis

Author: Marcus Purat
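Title: Zum Einsatz von Wavelet- und Waveletpacket-Transformationen in niederratigen, wahrnehmungsangepaßten Audiocodierverfahren (English: On the Use of Wavelet and Wavelet Packet Transforms in Low Bit Rate, Perceptually Adapted Audio Coding Methods)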
Tutor: Prof. Noll
Abstract: Most audio coding algorithms use time-frequency transforms with a high spectral resolution to exploit the masking phenomena of the human ear and to reduce the redundancy of the signal efficiently. This may lead to time-domain artefacts (pre-echoes) when critical items are coded at low bit rates. To avoid this clearly audible noise, special methods such as hybrid filter banks, window switching, time-domain companding, or temporal noise shaping are used. However, these methods may considerably increase the coding complexity and often come with some loss of coding efficiency.

Against this background, my thesis considers wavelet packet transforms as an alternative tool. These transforms combine a high frequency resolution at low frequencies with a high temporal resolution at high frequencies and thus seem well suited for use in a low bit rate perceptual audio coder. Moreover, the algorithms that underlie the transforms allow for a number of adaptations that are useful in an audio coding system.

Most commonly, the Mallat algorithm is used to implement a wavelet packet transform. This algorithm is based on a filter bank. Different types of filters (QMF, CQF, biorthogonal filters) are investigated and optimized in a coding system that was developed especially for the comparison of different transforms. My thesis shows that using time-variant filter banks is practical, gives an overview of possible realizations of transition filters (periodic extension, Gram-Schmidt orthogonalization, lattice switching), and compares the results of fixed and time-variant filters. Finally, a general analysis makes clear that, in a coding system, practicable filters will always produce strong spectral side lobes that are audible for critical signals.
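The filter bank structure behind the Mallat algorithm can be illustrated with a minimal sketch. It assumes the orthonormal Haar filter pair, chosen here only because it is the shortest perfect-reconstruction pair, and it splits every band at every level (a full wavelet packet tree). The filters and tree shapes studied in the thesis differ, and all function names below are hypothetical.

```python
import math

# Assumption: orthonormal Haar filters, used only for illustration.
# The thesis investigates QMF, CQF and biorthogonal filters instead.
S = 1.0 / math.sqrt(2.0)

def analysis_step(x):
    """Two-channel analysis: low-pass and high-pass outputs, each downsampled by 2."""
    low = [S * (x[i] + x[i + 1]) for i in range(0, len(x), 2)]
    high = [S * (x[i] - x[i + 1]) for i in range(0, len(x), 2)]
    return low, high

def synthesis_step(low, high):
    """Two-channel synthesis: inverts analysis_step exactly."""
    x = []
    for l, h in zip(low, high):
        x.append(S * (l + h))
        x.append(S * (l - h))
    return x

def wavelet_packet(x, depth):
    """Full wavelet packet tree: every band is split again at each level.

    len(x) must be divisible by 2**depth. Bands are returned in the
    natural (tree) order, not sorted by frequency.
    """
    bands = [list(x)]
    for _ in range(depth):
        next_bands = []
        for band in bands:
            low, high = analysis_step(band)
            next_bands.extend([low, high])
        bands = next_bands
    return bands

def inverse_wavelet_packet(bands):
    """Merge neighbouring band pairs level by level until one signal remains."""
    bands = [list(b) for b in bands]
    while len(bands) > 1:
        bands = [synthesis_step(bands[i], bands[i + 1])
                 for i in range(0, len(bands), 2)]
    return bands[0]
```

Calling `wavelet_packet(x, 2)` on an 8-sample signal yields four bands of two samples each, and `inverse_wavelet_packet` recovers the input. A non-uniform tree, with finer frequency resolution at low frequencies, would split only selected bands rather than all of them.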

The concept of Frequency-Varying Modulated Lapped Transforms (FVMLT), which is presented in detail, avoids this major drawback of the Mallat algorithm for audio coding while theoretically maintaining the same time-frequency resolution. It is shown to be more efficient for audio coding by both subjective and objective measures. The possibility of using fast algorithms for the underlying modulated lapped transforms (MLT) and the lower coding delay constitute two more advantages for a practical realization. FVMLTs are conceptually related to Lemarié-Meyer wavelets and temporal noise shaping. These relations are discussed as well.
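For orientation, the underlying fixed MLT can be sketched in its common MDCT formulation with a sine window. This is not the FVMLT of the thesis, only the plain lapped transform it builds on. It is written as a direct O(N²) sum rather than a fast algorithm, and the function names and the zero-padding at the signal edges are assumptions made for this example.

```python
import math

def sine_window(N):
    """Sine window of length 2N; satisfies w[n]^2 + w[n+N]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]

def mdct(frame, window):
    """Windowed MDCT: 2N input samples -> N coefficients (direct sum)."""
    N = len(frame) // 2
    n0 = 0.5 + N / 2.0
    return [sum(window[n] * frame[n]
                * math.cos(math.pi / N * (n + n0) * (k + 0.5))
                for n in range(2 * N))
            for k in range(N)]

def imdct(coeffs, window):
    """Windowed inverse MDCT: N coefficients -> 2N time-aliased samples."""
    N = len(coeffs)
    n0 = 0.5 + N / 2.0
    return [window[n] * (2.0 / N)
            * sum(coeffs[k] * math.cos(math.pi / N * (n + n0) * (k + 0.5))
                  for k in range(N))
            for n in range(2 * N)]

def analyze_synthesize(x, N):
    """Transform x in 50% overlapping frames and overlap-add the inverses.

    len(x) must be a multiple of N. N zeros are padded on each side
    (an assumption of this sketch) so that every sample of x is covered
    by two frames and the time-domain aliasing cancels.
    """
    w = sine_window(N)
    padded = [0.0] * N + list(x) + [0.0] * N
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * N + 1, N):
        y = imdct(mdct(padded[start:start + 2 * N], w), w)
        for n in range(2 * N):
            out[start + n] += y[n]
    return out[N:N + len(x)]
```

Each frame of 2N samples produces only N coefficients, so the transform is critically sampled despite the 50% overlap. The aliasing in one frame's inverse is cancelled by its neighbours in the overlap-add step, so `analyze_synthesize(x, N)` returns `x` up to rounding error.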

Using window switching in both the time and the frequency domain, the FVMLT allows for an efficient algorithm that adapts the time-frequency analysis to the signal characteristics. This adaptation achieves a remarkable gain over fixed transforms, and also a gain over window switching in the time domain only. Results and investigations are given in detail for several modifications of this adaptation, also taking into consideration the concept of a best-basis transform.