Human Speech Recognition is Improved by Machine Learning

Scientific research on hearing loss is expanding rapidly as more and more baby boomers develop hearing loss with age.

Researchers study how well individuals recognize speech to better understand how hearing loss affects people. Understanding human speech becomes more difficult when there is reverberation, a hearing impairment, or substantial background noise, such as from multiple talkers or traffic.

Hearing aid algorithms are therefore frequently used to enhance human speech recognition. To evaluate such algorithms, researchers run listening tests that determine the signal-to-noise ratio at which a given percentage of words (often 50%) is recognized. These tests, however, are time-consuming and expensive.
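To make that measurement concrete, here is a minimal Python sketch of how such a threshold can be estimated: it fits a logistic psychometric function to word-recognition scores collected at several signal-to-noise ratios and solves for the SNR at which 50% of words are recognized. The data points are hypothetical and are not from the study.

```python
# A minimal sketch of estimating a speech reception threshold (SRT):
# fit a logistic psychometric function to word-recognition scores
# measured at several SNRs, then read off the SNR at 50% correct.
# The measurements below are hypothetical, for illustration only.
import numpy as np
from scipy.optimize import curve_fit

def psychometric(snr_db, srt_db, slope):
    """Recognition rate as a logistic function of SNR; equals 0.5 at srt_db."""
    return 1.0 / (1.0 + np.exp(-slope * (snr_db - srt_db)))

snr = np.array([-12.0, -9.0, -6.0, -3.0, 0.0, 3.0])      # SNR in dB
score = np.array([0.05, 0.18, 0.42, 0.71, 0.90, 0.97])   # fraction of words correct

(srt, slope), _ = curve_fit(psychometric, snr, score, p0=[-5.0, 0.5])
print(f"Estimated SRT (SNR at 50% words correct): {srt:.1f} dB")
```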

Machine learning (ML), a form of artificial intelligence (AI), allows software to predict outcomes more accurately without being explicitly programmed to do so. Machine learning algorithms use historical data as input to forecast new output values.
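As a bare-bones illustration of that idea (unrelated to the hearing study itself), the sketch below fits a simple regression model to hypothetical historical data and then uses it to forecast the output for an input it has not seen.

```python
# A minimal sketch of learning from historical data: fit a model to
# known input/output pairs, then predict the output for a new input.
# All numbers here are hypothetical.
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1.0], [2.0], [3.0], [4.0]])   # historical inputs
y = np.array([2.1, 3.9, 6.2, 7.8])           # observed outcomes

model = LinearRegression().fit(X, y)         # learn from past data
print(model.predict(np.array([[5.0]])))      # forecast a new output value
```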

In The Journal of the Acoustical Society of America, published on behalf of the Acoustical Society of America by AIP Publishing, researchers from Germany examine a human speech recognition model based on machine learning and deep neural networks.

“The novelty of our model is that it provides good predictions for hearing-impaired listeners for noise types with very different complexity and shows both low errors and high correlations with the measured data,” said author Jana Roßbach, from Carl von Ossietzky University.

Machine learning is significant because it helps develop new products and gives businesses a picture of trends in customer behavior and operational patterns. Machine learning is central to the operations of many of today’s leading companies, such as Facebook, Google, and Uber.

For many businesses, machine learning has become a key competitive differentiator. Using automatic speech recognition (ASR), the researchers determined how many words per sentence a listener can comprehend.

Most people are familiar with ASR through voice-activated assistants such as Alexa and Siri. Eight participants with normal hearing and 20 listeners with hearing loss took part in the study and were exposed to a variety of complex noises that mask speech.
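The sketch below shows, under loose assumptions, one way ASR output can be scored against a reference sentence to count recognized words. It is a simplified stand-in, not the authors’ method; a real evaluation would typically use an alignment-based word error rate rather than the position-independent word matching used here.

```python
# A simplified sketch (not the authors' code) of scoring an ASR
# transcript against a reference sentence. It counts how many
# reference words appear in the transcript, ignoring word order;
# real evaluations usually align the two word sequences first.
from collections import Counter

def words_correct(reference: str, transcript: str) -> float:
    """Fraction of reference words found in the ASR transcript."""
    ref = Counter(reference.lower().split())
    hyp = Counter(transcript.lower().split())
    matched = sum(min(count, hyp[word]) for word, count in ref.items())
    return matched / sum(ref.values())

# Hypothetical example: what the recognizer heard in noise.
print(words_correct("the boy gave the girl a book",
                    "the boy gave a look"))  # 4 of 7 words -> ~0.57
```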

The hearing-impaired listeners were divided into three groups with different degrees of age-related hearing loss.

The model allowed the researchers to predict the human speech recognition performance of hearing-impaired listeners for a range of noise maskers with varying degrees of temporal complexity and similarity to real speech. Each individual’s hearing loss could be taken into account.

“We were most surprised that the predictions worked well for all noise types. We expected the model to have problems when using a single competing talker. However, that was not the case,” said Roßbach.

The model made its predictions for single-ear hearing. Going forward, the researchers will develop a binaural model, since speech comprehension is also affected by hearing with two ears.

Beyond speech intelligibility, the model could potentially be used to predict listening effort or speech quality, as these subjects are closely related.