Vowel Dynamics



Many studies over the past two decades have shown that American English vowels are only approximately separable on the basis of static formant frequency information. It appears that the structure of a vowel can be fully specified only when we also describe how its formant frequencies vary from onset to offset. The purpose of this exercise is to explore this important feature of vowel structure by synthesizing vowels with time-changing formant values and comparing them to the static vowels we synthesized in the previous exercise.

Task 1

To do this exercise, we'll need to use the more complicated General Klatt Synthesis interface. Let's start by getting it to work with an example. The code shown below is a script, a simple kind of computer program, for the speech synthesizer. It instructs the synthesizer to set various control parameters to specific values at specific points in time. Lines that begin with an asterisk are comments. Actual script lines begin with a time value (either absolute, like TIME = 000, or relative, like TIME + 20), followed by one or more parameter settings (like F1=592 or AV=72) separated by semicolons. The formant frequency values in this example are those Hillenbrand et al. (1995) found for average male /ae/ productions, but the timing values are fairly arbitrary. So here's the synthesis script:


* Start with formants on slightly high - front /ae/
*
TIME = 000; F1=592; F2=1923; F3=2500; F0=130; AV=30
*
* Ramp AV up over 20 msec to prevent perception of stop
*
TIME + 20; AV=72
*
* Hold formants and amplitude constant for 250 msec.
*
TIME + 250; F1=592; F2=1923; F3=2500; AV=72; F0=120
*
* Over next 200 msec glide into lower - more back /ae/
*
TIME + 200; F1=632; F2=1720; F3=2500; AV=72
*
* Hold constant formants and AV but let F0 drop over 60 msec
*
TIME +  60; F0=90; AV=72
*
* Ramp amplitude back off over final 30 msec
*
TIME + 30; AV=0
END

We're going to make a copy of this script so we can paste it into the synthesizer interface and generate the speech it describes. Use your mouse to select the script by positioning the pointer on the start of the script text, pressing the left mouse button, and dragging the pointer over the rest of the script. Now press ALT-C or use the Edit menu to Copy the selected area. Next, go to the Synthesis Interface script section, click inside the text entry window and select Paste from the Edit menu.
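
To see how the relative TIME + steps in the script add up, the following Python sketch converts the script into a list of absolute-time keyframes. It is only an illustration: the function names `parse_keyframes` and `value_at` are made up for this example, and two behaviors are assumptions inferred from the script's comments rather than documented facts about the synthesizer: that a parameter not mentioned on a line keeps its previous value, and that values change linearly between one script line and the next.

```python
# Sketch: turn the Task 1 script into absolute-time keyframes.
# The parsing rules are inferred from the example script only; they are
# not a specification of the synthesizer's script language.

SCRIPT = """\
* Start with formants on slightly high - front /ae/
TIME = 000; F1=592; F2=1923; F3=2500; F0=130; AV=30
* Ramp AV up over 20 msec to prevent perception of stop
TIME + 20; AV=72
* Hold formants and amplitude constant for 250 msec.
TIME + 250; F1=592; F2=1923; F3=2500; AV=72; F0=120
* Over next 200 msec glide into lower - more back /ae/
TIME + 200; F1=632; F2=1720; F3=2500; AV=72
* Hold constant formants and AV but let F0 drop over 60 msec
TIME +  60; F0=90; AV=72
* Ramp amplitude back off over final 30 msec
TIME + 30; AV=0
END
"""


def parse_keyframes(script):
    """Return a list of (time_ms, {parameter: value}) pairs."""
    keyframes = []
    now = 0
    state = {}
    for raw in script.splitlines():
        line = raw.strip()
        if not line or line.startswith("*"):
            continue  # blank line or comment
        if line.upper() == "END":
            break
        fields = [field.replace(" ", "") for field in line.split(";")]
        time_field, settings = fields[0], fields[1:]
        if time_field.startswith("TIME="):
            now = int(time_field[len("TIME="):])   # absolute time
        elif time_field.startswith("TIME+"):
            now += int(time_field[len("TIME+"):])  # step relative to last line
        else:
            raise ValueError("expected a TIME field: " + raw)
        for setting in settings:
            if setting:
                name, value = setting.split("=")
                state[name] = int(value)
        # ASSUMPTION: parameters not named on this line keep their old value.
        keyframes.append((now, dict(state)))
    return keyframes


def value_at(keyframes, name, t):
    """Value of one parameter at time t (ASSUMES linear interpolation)."""
    if t <= keyframes[0][0]:
        return keyframes[0][1][name]
    for (t0, s0), (t1, s1) in zip(keyframes, keyframes[1:]):
        if t0 <= t <= t1 and t1 > t0:
            return s0[name] + (s1[name] - s0[name]) * (t - t0) / (t1 - t0)
    return keyframes[-1][1][name]


keyframes = parse_keyframes(SCRIPT)
for time_ms, values in keyframes:
    print(time_ms, values)
```

The keyframes fall at 0, 20, 270, 470, 530, and 560 msec, so the whole utterance described by the script lasts 560 msec. Under the interpolation assumption, `value_at(keyframes, "F2", 370)` is halfway through the 200 msec glide, midway between 1923 and 1720.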

Task 2

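Jim Hillenbrand was kind enough to supply a listing of the acoustic measures he and his colleagues found for each talker in their 1995 study. Below are four specific examples from his data, also for instances of the vowel [ae]. F0 and formant frequencies are in Hz and durations are in msec. The 20%, 50%, and 80% rows give formant values measured at those points in the vowel's duration.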

Adult Male - F0=124; Duration=312
    Steady State: F1=627; F2=1910; F3=2488
    20%:          F1=629; F2=1882; F3=2460
    50%:          F1=720; F2=1750; F3=2435
    80%:          F1=757; F2=1563; F3=2527

Adult Female - F0=156; Duration=365
    Steady State: F1=649; F2=2508; F3=3050
    20%:          F1=612; F2=2532; F3=2973
    50%:          F1=736; F2=2419; F3=3082
    80%:          F1=824; F2=2126; F3=3018

Child Male - F0=246; Duration=352
    Steady State: F1=726; F2=2231; F3=2932
    20%:          F1=742; F2=2246; F3=2902
    50%:          F1=767; F2=2003; F3=2873
    80%:          F1=921; F2=1870; F3=2958

Child Female - F0=255; Duration=385
    Steady State: F1=932; F2=2523; F3=3644
    20%:          F1=905; F2=2512; F3=3704
    50%:          F1=977; F2=2325; F3=3434
    80%:          F1=949; F2=2100; F3=3110

Synthesize the male example and one of the other examples. For each talker, synthesize both a steady-state version and a time-changing version of the vowel. Save the text of your final parameter files so we can review them in class.

Note: These are not ready-to-synthesize scripts. You will need to set up scripts like the earlier example, using the F0, duration, and formant frequency values from the table above. On many systems you can run two copies of Netscape at the same time. If you can do that, run one copy with the synthesis interface, and keep this page visible in another copy. Of course, you can always write the synthesis script on paper while viewing the information on this page, then move on to the Synthesis Interface page and enter the script there.
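
One way to do the timing arithmetic is sketched below in Python. It is a rough illustration, not the required solution: the function name `build_script` is hypothetical, the 20 msec amplitude ramps and the AV values of 30 and 72 are simply borrowed from the Task 1 example, F0 is held constant because the table gives only one F0 value per talker, and times are rounded to whole msec. The sketch places each measured formant set at its percentage of the vowel's duration and writes the steps as relative TIME + values.

```python
# Sketch: build a time-changing script from one talker's measurements.
# The timing layout (ramps, constant F0, AV levels) is an illustrative
# assumption borrowed from the Task 1 example, not part of the data.

ADULT_MALE = {
    20: (629, 1882, 2460),  # percent of duration -> (F1, F2, F3) in Hz
    50: (720, 1750, 2435),
    80: (757, 1563, 2527),
}


def build_script(f0, duration_ms, samples, ramp_ms=20, av=72):
    """Return script text; samples maps percent of duration to (F1, F2, F3)."""
    percents = sorted(samples)
    first_target = round(duration_ms * percents[0] / 100)
    last_target = round(duration_ms * percents[-1] / 100)
    hold_end = duration_ms - ramp_ms
    if not (ramp_ms < first_target and last_target < hold_end):
        raise ValueError("ramps overlap the measurement points")

    f1, f2, f3 = samples[percents[0]]
    lines = [
        "* Start on the first measured formants at low amplitude",
        "TIME = 000; F1=%d; F2=%d; F3=%d; F0=%d; AV=30" % (f1, f2, f3, f0),
        "* Ramp AV up over %d msec" % ramp_ms,
        "TIME + %d; AV=%d" % (ramp_ms, av),
    ]
    now = ramp_ms
    for pct in percents:
        target = round(duration_ms * pct / 100)  # absolute time in msec
        f1, f2, f3 = samples[pct]
        lines.append("* Formants measured at %d%% of vowel duration" % pct)
        lines.append("TIME + %d; F1=%d; F2=%d; F3=%d; AV=%d"
                     % (target - now, f1, f2, f3, av))
        now = target
    lines.append("* Hold, then ramp amplitude off over final %d msec" % ramp_ms)
    lines.append("TIME + %d; AV=%d" % (hold_end - now, av))
    lines.append("TIME + %d; AV=0" % ramp_ms)
    lines.append("END")
    return "\n".join(lines)


print(build_script(124, 312, ADULT_MALE))
```

For the adult male (duration 312 msec), the measurement points land at 62, 156, and 250 msec, so the relative steps are 20, 42, 94, 94, 42, and 20 msec, which sum to the full 312 msec. A steady-state version could be sketched the same way by passing a single entry, for example `{50: (627, 1910, 2488)}`, so that the formants never change.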


References

Hillenbrand, J., Getty, L.A., Clark, M.J., and Wheeler, K. (1995). Acoustic
    characteristics of American English vowels. J. Acoust. Soc. Am. 97,
    3099-3111.

Peterson, G.E., and Barney, H.L. (1952). Control methods used in a
    study of the vowels. J. Acoust. Soc. Am. 24, 175-184.


All text and graphics, unless otherwise specified, by H. Timothy Bunnell, Ph.D.
bunnell@asel.udel.edu

Last Modified: November 28, 2015 (htb).