Perception of vocal characteristics in cochlear-implant users

When many people speak at the same time, listeners can track individual speakers in the mixture by identifying their voices. Voice identification or discrimination relies on a number of cues, but two stand out: first, because they are directly related to the anatomy of the speaker, and second, because they explain a large proportion of the acoustic variability across voices (see Turner et al., 2009). One is the fundamental frequency (F0) and the other is vocal-tract length (VTL). More details about these voice dimensions can be found on the excellent wiki of the CNBH: acousticscale.org.
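As a rough illustration of why VTL is tied to the speaker's anatomy, a common textbook idealization (not taken from the publications listed here) treats the vocal tract as a uniform tube closed at the glottis and open at the lips. Its resonances then scale inversely with its length. The function name, the speed of sound and the two tract lengths below are illustrative assumptions, not measured values.

```python
# Quarter-wavelength resonator: a uniform tube closed at one end (glottis)
# and open at the other (lips) resonates at F_k = (2k - 1) * c / (4 * L).
# This is an idealization; real vocal tracts are not uniform tubes.

SPEED_OF_SOUND_CM_PER_S = 35000.0  # assumed approximate value for warm, humid air


def tube_formants(vtl_cm, n_formants=3):
    """Return the first resonances (Hz) of a uniform tube of length vtl_cm."""
    return [
        (2 * k - 1) * SPEED_OF_SOUND_CM_PER_S / (4.0 * vtl_cm)
        for k in range(1, n_formants + 1)
    ]


# Two hypothetical speakers with different vocal-tract lengths.
long_tract = tube_formants(17.5)
short_tract = tube_formants(14.0)

# Every resonance is shifted by the same factor: the ratio of the lengths.
ratios = [s / l for s, l in zip(short_tract, long_tract)]

print(long_tract)
print(short_tract)
print(ratios)
```

In this idealization, shortening the tract shifts all resonances upward by one common factor, which is why VTL acts as a single scale dimension of the voice, separate from F0.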

Situations where all the members of a crowd are talking at the same time are difficult for everybody, but they seem to be particularly so for most cochlear-implant users. One reason often advanced is that the cochlear implant provides a rather poor pitch experience. One of the two main vocal characteristics, F0, is therefore notoriously weak through the implant. However, very little is known about the other principal dimension: VTL.

The purpose of our research is to describe how cochlear-implant users perceive VTL and what may hinder their perception of that cue. Based on this, we also explore and test potential solutions to improve VTL perception, in the hope that this will contribute to improving speech-on-speech perception.

Publications

  • Gaudrain E., Başkent D. (2015). “Factors limiting vocal-tract length discrimination in cochlear implant simulations.” The Journal of the Acoustical Society of America 137(3), 1298-1308. doi : 10.1121/1.4908235.
  • Fuller C.D., Gaudrain E., Clarke J., Galvin J.J., Fu Q.-J., Free R.H., Başkent D. (2014). “Gender categorization is abnormal in cochlear-implant users.” Journal of the Association for Research in Otolaryngology 15(6), 1037-1048. doi : 10.1007/s10162-014-0483-7.
  • Gaudrain E., Li S., Ban V.S., Patterson R.D. (2009). “The role of glottal pulse rate and vocal tract length in the perception of speaker identity.” Interspeech 2009, vol. 1-5, 152-155.

Some presentations at conferences

  • El Boghdady N., Başkent D., Gaudrain E. (2015). “Effect of frequency allocation on vocal tract length perception in cochlear implant users.” Conference on Implantable Auditory Prostheses, Lake Tahoe, California, USA. Poster.
  • Gaudrain E., Başkent D. (2014). “Vocal-tract length discrimination and spectral resolution in cochlear implants.” 8th International Symposium on Objective Measures in Auditory Implants, Toronto, Canada. Poster.
  • Gaudrain E., Başkent D. (2014). “Factors limiting perception of vocal characteristics in cochlear-implants.” 37th Annual Mid-winter Meeting of the Association for Research in Otolaryngology, San Diego, California, USA. Poster.
  • Gaudrain E., Fuller C.D., Clarke J., Galvin J.J., Fu Q.-J., Free R.H., Başkent D. (2013). “Little Red Riding Hood was a Cochlear Implant User!!” Conference on Implantable Auditory Prostheses, Lake Tahoe, California, USA. Poster.

Tools

For this research, I (re)developed a vocoder that can be easily configured to do all sorts of things. The code is available on GitHub:

  • Gaudrain E. (2016). “Vocoder, v1.0.” doi : 10.5281/zenodo.48120.

The simplified cocktail party

The cocktail party effect is the ability to hold a conversation with someone even when other talkers are babbling around you. One extreme case is when many speakers are very far from you, except for one who is close to you and whom you are trying to understand. In that case, the situation amounts to perceiving speech in noise. Another extreme case is when there are two speakers close to you, and you are trying to hear one of them. This is a case of speech-on-speech perception. Of course, most realistic cocktail parties fall somewhere between these two extremes. But to understand how things work, in the lab we generally study one or the other of these two extremes.

Our research aims to reveal the underlying mechanisms that allow a listener to track what one target speaker is saying in the presence of a competing talker, both in normal-hearing listeners and in listeners with hearing impairment. To make it manageable, we have decomposed the big cocktail party problem into smaller unit mechanisms: (1) simultaneous segregation, (2) sequential segregation, and (3) linguistic interpretation.

Publications

  • Başkent D., Gaudrain E. (2016). “Musician advantage for speech-on-speech perception.” The Journal of the Acoustical Society of America, 139, EL51. doi : 10.1121/1.4942628.
  • Clarke J., Başkent D., Gaudrain E. (2016). “Pitch and spectral resolution: A systematic comparison of bottom-up cues for top-down repair of degraded speech.” The Journal of the Acoustical Society of America, 139, 395. doi : 10.1121/1.4939962.
  • Clarke J., Gaudrain E., Chatterjee M.C., Başkent D. (2014). “T'ain't the way you say it, it's what you say – Perceptual continuity of voice and top–down restoration of speech.” Hearing Research 315, 80-87. doi : 10.1016/j.heares.2014.07.002.
  • Bhargava P., Gaudrain E., Başkent D. (2014). “Top–down restoration of speech in cochlear-implant users.” Hearing Research 309, 113-123. doi : 10.1016/j.heares.2013.12.003.
  • Gaudrain E., Carlyon R.P. (2013). “Using Zebra-speech to study sequential and simultaneous speech segregation in a cochlear-implant simulation.” The Journal of the Acoustical Society of America 133(1), 502-518. doi : 10.1121/1.4770243.
  • Gaudrain E., Grimault N., Healy E.W., Béra J.-C. (2012). “The relationship between concurrent speech segregation, pitch-based streaming of vowel sequences, and frequency selectivity.” Acta Acustica united with Acustica 98(2), 317-327. doi : 10.3813/AAA.918515.
  • Gaudrain E., Grimault N., Healy E.W., Béra J.-C. (2008). “Streaming of vowel sequences based on fundamental frequency in a cochlear implant simulation.” The Journal of the Acoustical Society of America 124(5), 3076-3087. doi : 10.1121/1.2988289.
  • Gaudrain E., Grimault N., Healy E.W., Béra J.-C. (2007). “Effect of spectral smearing on the perceptual segregation of vowel sequences.” Hearing Research, 231, 32-41. doi : 10.1016/j.heares.2007.05.001.
  • Grimault N., Gaudrain E. (2007). “The Consequences of Cochlear Damages on Auditory Scene Analysis.” Current Topics in Acoustical Research 4, 17–24. Online access on researchtrends.net.

Some presentations at conferences

  • Clarke J., Gaudrain E., Başkent D. (2015). “Top-down Repair of Speech: F0 Contours.” 38th Annual Mid-winter Meeting of the Association for Research in Otolaryngology, Baltimore, Maryland, USA. Poster.
  • Clarke J., Gaudrain E., Başkent D. (2014). “Top-down Repair of Speech: Adding Pitch to Spectrally Degraded Speech.” 37th Annual Mid-winter Meeting of the Association for Research in Otolaryngology, San Diego, California, USA. Poster.
  • Clarke J., Gaudrain E., Başkent D. (2013). “Is weak pitch fully responsible for low comprehension of interrupted speech with poor spectral resolution?” Conference on Implantable Auditory Prostheses, Lake Tahoe, California, USA. Poster.
  • Gaudrain E., Carlyon R.P., Clarke J., Başkent D. (2013). “Concurrent speech perception: from streaming to phonemic restoration.” 5th Workshop on Speech in Noise: Intelligibility and Quality, 10-11 January 2013, Vitoria, Spain.
  • Gaudrain E., Carlyon R.P. (2012). “Invite a Zebra to your electro-acoustic cocktail party: sequential concurrent speech segregation in EAS simulation.” 4th Workshop on Speech in Noise: Intelligibility and Quality, Cardiff, UK. Poster.
  • Gaudrain E., Carlyon R.P. (2011). “Zebra speech: investigating the role of sequential segregation for concurrent speech perception in cochlear implants.” Annual Conference of the British Society of Audiology, Nottingham, UK. Poster.

2016-09-08 13:03:45