Speech intelligibility and quality: a comparative study of speech enhancement algorithms.
Abstract
Mobile devices are widely used for speech communication. They are used in a wide variety of environments, and the background noise in the speaker's environment is often significant. The purpose of speech enhancement is to reduce background noise, ideally to a level at which the listener does not notice it. Although speech enhancement algorithms can significantly reduce the noise in a speech signal and so improve speech quality, it is widely recognized that they can have a negative impact on speech intelligibility. This paper compares the effect of three speech enhancement algorithms on the intelligibility and quality of speech. The work is the initial phase of an investigation into mitigating the impact of speech enhancement algorithms on speech intelligibility. Each algorithm evaluated takes a different approach to noise reduction: a statistical model-based algorithm, a noise estimation algorithm, and a wavelet packet decomposition-based algorithm. Two objective speech intelligibility measures and three objective speech quality measures are used to assess the performance of the enhancement algorithms. The experimental results show that all the speech enhancement algorithms in this study have a negative impact on speech intelligibility, to varying degrees.
Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 Ireland