Detection of Non-native Speaker Status from Backwards and Vocoded Content-masked Speech

Arkadiusz Rojczyk; Andrzej Porzuczek

doi:10.31261/TAPSLA.7714

Published: 2021-01-18

Vol. 6 No. 2 (2020)

Section: Articles

Abstract

This paper addresses the issue of speech rhythm as a cue to non-native pronunciation. In natural recordings, it is impossible to disentangle rhythm from segmental, subphonemic or suprasegmental features that may influence nativeness ratings. However, two methods of speech manipulation, that is, backwards content-masked speech and vocoded speech, allow the identification of native and non-native speech in which segmental properties are masked and become inaccessible to the listeners. In the current study, we use these two methods to compare the perception of content-masked native English speech and Polish-accented speech. Both native English and Polish-accented recordings were manipulated using backwards masked speech and 4-band white-noise vocoded speech. Fourteen listeners classified the stimuli as produced by native or Polish speakers of English. Polish and English differ in their temporal organization, so, if rhythm is a significant contributor to the status of non-native accentedness, we expected an above-chance rate of recognition of native and non-native English speech. Moreover, backwards content-masked speech was predicted to yield better results than vocoded speech, because it retains some of the indexical properties of speakers. The results show that listeners are unable to detect non-native accent in Polish learners of English from backwards and vocoded speech samples.
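As an illustration of what "above-chance" means in a two-alternative classification task (native vs. Polish speaker), the following minimal sketch computes a one-sided exact binomial p-value against a chance level of 0.5. It is not the authors' analysis: the function name `binomial_p_above_chance` and the response counts are hypothetical, and pooling responses across listeners would violate the independence assumption of the test, so this should be read as a per-listener check at most.

```python
from math import comb


def binomial_p_above_chance(correct: int, trials: int, chance: float = 0.5) -> float:
    """One-sided exact binomial p-value: P(X >= correct) if responses were at chance."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must lie between 0 and trials")
    # Sum the binomial probabilities of every outcome at least as good as the observed one.
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )


# Hypothetical counts for a single listener (NOT data from the study).
p_near_chance = binomial_p_above_chance(22, 40)  # 55% correct
p_well_above = binomial_p_above_chance(30, 40)   # 75% correct

print(f"22/40 correct: p = {p_near_chance:.4f}")
print(f"30/40 correct: p = {p_well_above:.4f}")
```

Under these assumed counts, only the second listener would count as classifying the stimuli reliably above chance at the conventional 0.05 level.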

Keywords: accent detection, non-native accent, content-masked speech, vocoded speech, backwards speech

Citation rules

Rojczyk, A., & Porzuczek, A. (2021). Detection of Non-native Speaker Status from Backwards and Vocoded Content-masked Speech. Theory and Practice of Second Language Acquisition, 6(2), 87–105. https://doi.org/10.31261/TAPSLA.7714

Vol. 6 No. 2 (2020), published: 2020-12-23

ISSN: 2450-5455

eISSN: 2451-2125

10.31261/tapsla

Publisher: Wydawnictwo Uniwersytetu Śląskiego | University of Silesia Press
