Relation of pitch glide perception and Mandarin tone identification

Terry L. Gottfried and Annie Staby

Psychology Dept.

Lawrence Univ.

Appleton, WI, USA


Devon Riester

Psychology Dept.

Kenyon College

Gambier, OH, USA

Because Mandarin Chinese is a tonal language, the pitch contour (changes in F0) of syllables is phonemic, differentiating lexical items in the language. In previous studies (Gottfried & Suiter, 1997; Gottfried, 1997), native and non-native speakers of Mandarin Chinese identified the four phonemic tones in syllables spoken by a native speaker. Different portions of the syllable were removed using waveform editing procedures (see Strange, 1989). Native listeners accurately identified the tone, even with the middle portion of the syllable removed (silent-centers). Non-native listeners made significantly more identification errors than native listeners in all conditions, but particularly on silent-centers. This suggests that natives integrated initial and final portions of the syllable over the silent interval to recognize the tone, while non-natives did not. However, individuals with relatively little experience speaking Mandarin varied considerably in their ability to differentiate the tones.

Individual differences in the ability to recognize and reproduce speech contrasts in a second language are related to a variety of subject variables. For example, Flege et al. (1999) noted that native Italian speakers' accuracy in identifying English vowels was related to their age of emigration, the number of years they had spoken English, and the amount of time they currently spent speaking Italian. Other studies have demonstrated the importance of perceived similarity between phonetic categories in the accurate identification of unfamiliar phonemic contrasts (Flege et al., 1997; Strange et al., 1998).

Other factors may also be related to success in perceiving like a native listener. One such factor might be musical ability, especially with respect to ear training. In ear training, students are taught to recognize musical intervals and rhythm, as heard in simple melodic lines as well as in more harmonically complex musical passages. Testing usually requires that the student transcribe (or choose from multiple transcriptions) the musical material, using the correct written musical notation. Although the use of correct musical notation depends on musical training, the actual ability to perceive pitch changes varies among both musically literate and illiterate listeners (see Shuter-Dyson, 1999, for a review of tests of musical ability). Although there are differences between recognition of musical intervals and recognition of lexical tones, both tasks require the listener to use changes in F0 to make the appropriate response. The purpose of this research is to examine how performance on musical and linguistic tasks may be related. We predict that music students will perform better than non-music students on pitch contour identification, and that music students will also perform better than non-music students on Mandarin tone identification, even when they have never studied Mandarin.


A speaker of Mandarin (native of Beijing) produced 10 different syllables (/li/, /liao/, /liu/, /lei/, /la/, /lao/, /lou/, /lu/, /luo/, /lü/) with four different tones: high-level, mid-rising, low-dipping, and high-falling. The syllables were produced in the context of a carrier sentence, varying only in the target syllable. All 40 resulting syllables are words in Mandarin except /lü/ with Tone 1. Figure 1 shows the pitch (fundamental frequency, F0) contour for each of the four tones, averaged over the 10 syllables.

The listeners were native speakers of American English (n=35) at Lawrence University who had not studied Mandarin. Listeners varied in their musical experience: 15 were currently enrolled as students in the conservatory of music; 20 were enrolled in the college in non-music majors. Of these students, 7 had taken music lessons for more than 5 years.

The glide stimuli consisted of 28 sine-wave stimuli, each 400 ms long, starting at 300 Hz or 250 Hz and increasing or decreasing by 5, 10, 15, 20, 30, 40, or 50 Hz. An additional 14 stimuli were unchanging in frequency at 200, 210, 220, 230, 240, 250, 260, 270, 280, 300, 320, 330, 340, and 350 Hz.
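The stimulus set described above can be enumerated and synthesized in a few lines. The following is a minimal sketch, not the procedure used in the study: the sampling rate, the linear shape of the frequency change, and all function names are assumptions introduced for illustration.

```python
import math

SAMPLE_RATE = 22050   # assumed sampling rate in Hz (not stated in the text)
DURATION_S = 0.400    # 400-ms stimuli, as described above

START_FREQS = [300, 250]                 # starting frequencies in Hz
CHANGES = [5, 10, 15, 20, 30, 40, 50]    # size of frequency change in Hz
STEADY_FREQS = [200, 210, 220, 230, 240, 250, 260,
                270, 280, 300, 320, 330, 340, 350]


def build_stimulus_list():
    """Return (start_hz, end_hz, correct_response) for all 42 stimuli."""
    stimuli = []
    for start in START_FREQS:
        for change in CHANGES:
            stimuli.append((start, start + change, "up"))
            stimuli.append((start, start - change, "down"))
    for freq in STEADY_FREQS:
        stimuli.append((freq, freq, "same"))
    return stimuli


def synthesize_glide(start_hz, end_hz,
                     duration_s=DURATION_S, sample_rate=SAMPLE_RATE):
    """Sine wave whose frequency moves linearly from start_hz to end_hz.

    The linear trajectory is an assumption; the text only gives the
    start frequency and the size of the change.
    """
    n = int(round(duration_s * sample_rate))
    samples = []
    phase = 0.0
    for i in range(n):
        samples.append(math.sin(phase))
        # Instantaneous frequency at this sample (linear interpolation).
        freq = start_hz + (end_hz - start_hz) * i / max(n - 1, 1)
        # Accumulate phase so the waveform stays continuous.
        phase += 2.0 * math.pi * freq / sample_rate
    return samples
```

Accumulating phase sample by sample, rather than computing sin(2πft) with a changing f, keeps the waveform free of discontinuities as the frequency moves.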

Listeners heard glide stimuli in random order presented by computer over headphones (using Kay Computerized Speech Lab). Listeners responded by clicking via mouse on the appropriate button on the computer screen: up, same, or down.

A native speaker (from Beijing) produced the Mandarin Chinese stimuli in the context of a carrier sentence. Listeners heard these stimuli (only the test syllable and the following word /dzi/) in random order and responded by clicking via mouse on the appropriate button:

Tone 1 ¯; Tone 2 /; Tone 3 v; Tone 4 \

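There were two tests of syllable identification: one with intact syllables and one with silent-center syllables (the middle portion of the syllable removed).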


Figure 2 shows the mean percentage of correct responses on the glide identification task and the two Mandarin tone identification tasks. The data reported in the abstract indicated that music majors (n=5) were significantly more accurate in their recognition of the direction of glide frequency change than non-music majors (n=17): t(20)=3.290, p<.005. With additional subjects (10 music and 3 non-music), the difference between groups was reduced but remained marginally significant: t(32)=1.867, p<.10. Furthermore, music majors were somewhat more accurate than non-majors in their identification of both intact and silent-center Mandarin syllable tones, F(1,32)=3.396, p<.10.
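The degrees of freedom reported for the first group comparison (5 + 17 - 2 = 20) are consistent with an independent-samples t-test using a pooled variance estimate, although the text does not say which variant was used. As a hedged illustration of that kind of comparison, the sketch below computes the pooled t statistic; the function name is hypothetical and no data from the study are reproduced.

```python
import math
from statistics import mean, variance


def pooled_t(group_a, group_b):
    """Independent-samples t statistic with pooled variance.

    Returns (t, df), where df = n_a + n_b - 2.
    """
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two scores")
    df = n_a + n_b - 2
    # Weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    standard_error = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    t = (mean(group_a) - mean(group_b)) / standard_error
    return t, df
```

Passing the percent-correct scores of the two listener groups as `group_a` and `group_b` yields the statistic and the degrees of freedom needed to look up the p value.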

There was no significant effect of condition: both the intact and the silent-center syllables were difficult for the listeners. However, there was a significant effect of tone, F(1,32)=8.951, p<.01, and a significant tone × condition interaction, F(1,32)=12.167, p<.001. Figure 3 and Figure 4 show the mean percentage correct for each tone in each condition for music and non-music students. Overall, listeners identified Tone 1 least accurately (40.56% correct), followed by Tone 2 (46.42%), Tone 4 (49.95%), and Tone 3 (50.78%). The difference in accuracy among tones was more marked for the silent-center syllables. Music majors were better than non-music majors on all tones except Tone 1 (no significant difference for intact syllables, and marginally poorer performance for silent-centers).

Table 1 displays the correlations between the years of music study and the dependent measures: pitch glide, Mandarin intact tone, and Mandarin silent-center tone identification. Years of study significantly correlated with performance on the pitch glide task, but not with the Mandarin tone tasks. Pitch glide performance correlated significantly with performance on the Mandarin tone tasks.
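Correlations of this kind are typically Pearson product-moment coefficients, though the text does not name the coefficient used. The sketch below shows how such a coefficient and its associated t statistic (with n - 2 degrees of freedom) could be computed; the function names are hypothetical and the code makes no claim about the values in Table 1.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / math.sqrt(sxx * syy)


def r_to_t(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    if abs(r) >= 1.0:
        raise ValueError("t is undefined when |r| equals 1")
    return r * math.sqrt((n - 2) / (1.0 - r * r))
```

For example, `pearson_r(years_of_study, glide_scores)` would give the coefficient for one cell of the table, where both argument names stand for per-listener lists that are not reproduced here.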


Music majors, not surprisingly, performed pitch glide identification somewhat better than non-majors, as measured by percent correct. Further analysis will be done to determine the amount of F0 change necessary for reliable detection of change, and whether listeners genuinely vary in their sensitivity to these changes or whether the differences result from differing response biases. Because the F0 changes in the pitch glides are analogous to those in the Mandarin tones, it is also not surprising that performance on the two tasks is correlated. It is notable that listeners' performance is marginally different depending on their college major. With additional listeners, we hope to examine whether specific differences among the music majors (namely, ear training ability, as determined by music theory placement exams) are related to accuracy in Mandarin tone identification. It will also be interesting to see whether training with particular musical instruments is related to different performance on the Mandarin tones: for example, one might expect that learning instruments requiring tuning adjustment by the performer (string instruments or voice) would be related to greater sensitivity to small pitch changes, and to better identification of Mandarin tones, compared to instruments that do not require such adjustments (keyboard or percussion).

Because students in music theory classes are learning how to recognize pitch changes, it would also be interesting to test these students longitudinally on changes in their ability to identify pitch glide changes and Mandarin tones. With additional listeners, we also hope to see whether musical ability and linguistic experience (with other non-tonal languages) interact in their accurate identification of these unfamiliar contrasts.


  • Flege, J., Bohn, O.-S., & Jang, S. (1997). Effects of experience on non-native speakers' production and perception of English vowels. Journal of Phonetics 25: 437-470.

  • Flege, J. E., MacKay, I. R. A., & Meador, D. (1999). Native Italian speakers' perception and production of English vowels. Journal of the Acoustical Society of America 106: 2973-2987.

  • Gottfried, T. L. (1997). "Identification of Mandarin tones." Poster presented at the International Symposium on Speech Perception by Non-Native Listeners (special session at the Annual Convention of the American Speech-Language-Hearing Association). November 21, 1997. Boston, MA.

  • Gottfried, T. L., & Suiter, T. L. (1997). Effect of linguistic experience on the identification of Mandarin Chinese vowels and tones. Journal of Phonetics 25: 207-231.

  • Shuter-Dyson, R. (1999). Musical ability. In D. Deutsch (Ed.), The Psychology of Music, 2nd Ed. (pp. 627-651). San Diego, CA: Academic Press.

  • Strange, W. (1989). Dynamic specification of coarticulated vowels spoken in sentence context. Journal of the Acoustical Society of America 85: 2135-2153.

  • Strange, W., Akahane-Yamada, R., Kubo, R., Trent, S., Nishi, K., & Jenkins, J. (1998). Perceptual assimilation of American English vowels by Japanese listeners. Journal of Phonetics 26: 311-344.



    Table 1. Correlations between performance on pitch glide task, Mandarin intact tone identification, Mandarin silent-center tone identification, and years of music study.
    Columns: Pitch Glide | Mandarin Intact Identification | Mandarin Silent-center Identification
    Rows: Years of Music | Mandarin Intact Identification | Mandarin Silent-center Identification

    ** means p<.01     * means p<.05