Robust Automatic Speech Recognition Employing Phoneme Dependent Multi Environment Enhanced Models Based Linear Normalization



Author: Igmar Hernández Ochoa

language: en

Publisher:

Release Date: 2006







This work presents a robust normalization technique that cascades a speech enhancement method with a feature vector normalization algorithm. Spectral Subtraction (SS) is an efficient speech enhancement scheme that reduces the effect of additive noise by subtracting an estimate of the noise spectrum from the complete speech spectrum. PD-MEMLIN (Phoneme-Dependent Multi-Environment Models based Linear Normalization), a new and promising technique, has also been shown to be effective. PD-MEMLIN is an empirical feature vector normalization that models the clean and noisy spaces with Gaussian Mixture Models (GMMs) and estimates the compensating linear transformations to be applied to clean the signal. This work proposes integrating both approaches. The final design, called PD-MEEMLIN (Phoneme-Dependent Multi-Environment Enhanced Models based Linear Normalization), confirms and improves the effectiveness of both approaches. The results show that for highly degraded speech (between -5 dB and 0 dB), PD-MEEMLIN outperforms SS by between 11.4% and 34.5%, PD-MEMLIN by between 11.7% and 24.84%, and SPLICE by between 6.04% and 22.23%. Furthermore, at moderate SNR, i.e. 15 or 20 dB, PD-MEEMLIN is as good as the PD-MEMLIN and SS techniques.
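The spectral subtraction step described in this abstract can be illustrated with a minimal sketch. It is not the author's implementation. It assumes magnitude-domain subtraction, a noise estimate averaged over the first few frames (assumed to contain no speech), and a spectral floor to avoid negative magnitudes. The function name and the parameters frame_len, hop, noise_frames and floor are all hypothetical.

```python
import numpy as np


def spectral_subtraction(noisy, noise_frames=5, frame_len=256, hop=128, floor=0.01):
    """Basic magnitude spectral subtraction (illustrative sketch only).

    Assumption: the first `noise_frames` frames contain noise without speech.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(noisy) - frame_len) // hop

    # Slice the signal into overlapping, windowed frames.
    frames = np.stack(
        [noisy[i * hop:i * hop + frame_len] * window for i in range(n_frames)]
    )
    spectra = np.fft.rfft(frames, axis=1)
    magnitude = np.abs(spectra)
    phase = np.angle(spectra)

    # Noise spectrum estimate: mean magnitude of the leading frames.
    noise_mag = magnitude[:noise_frames].mean(axis=0)

    # Subtract the estimate; keep a small fraction of the original
    # magnitude as a floor so no bin becomes negative.
    clean_mag = np.maximum(magnitude - noise_mag, floor * magnitude)

    # Resynthesize with the noisy phase and weighted overlap-add.
    clean_frames = np.fft.irfft(clean_mag * np.exp(1j * phase), n=frame_len, axis=1)
    out = np.zeros(len(noisy))
    norm = np.zeros(len(noisy))
    for i in range(n_frames):
        start = i * hop
        out[start:start + frame_len] += clean_frames[i] * window
        norm[start:start + frame_len] += window ** 2
    return out / np.maximum(norm, 1e-8)
```

In a cascade of the kind the abstract describes, the enhanced signal would then be passed to feature extraction and feature vector normalization. That second stage is not sketched here because the listing does not give its details.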

Pattern Recognition and Image Analysis



Author: Joan Martí

language: en

Publisher: Springer Science & Business Media

Release Date: 2007-05-31







Part of a two-volume set, this book constitutes the refereed proceedings of the Third Iberian Conference on Pattern Recognition and Image Analysis, IbPRIA 2007, held in Girona, Spain in June 2007. It covers pattern recognition, human language technology, special architectures and industrial applications, motion analysis, image analysis, biomedical applications, shape and texture analysis, 3D, and image coding and processing.

Self-Learning Speaker Identification



Author: Tobias Herbig

language: en

Publisher: Springer Science & Business Media

Release Date: 2011-06-18







Current speech recognition systems are based on speaker-independent speech models and suffer from inter-speaker variations in speech signal characteristics. This work develops an integrated approach to speech and speaker recognition in order to create room for the system to learn by itself. It introduces a reliable speaker identification method that enables the speech recognizer to create robust speaker-dependent models. In addition, this book gives a new approach to the reverse problem: how to improve speech recognition if speakers can be recognized. Speaker identification allows the speaker adaptation to adjust to different speakers, which results in an optimal long-term adaptation.