Speech Research Laboratory
A.I. duPont Hospital for Children
and the
University of Delaware

InvTool Recording Tool

InvTool Tutorial

Download InvTool Speech Recording & ModelTalker Speech Synthesis Software

The degree to which the ModelTalker speech synthesizer can produce clear speech that bears the identity of the recorded talker depends crucially on the quality of the speech database it uses. No matter how well the ModelTalker synthesizer software is designed, it will produce very poor synthetic speech if the speech data it needs for synthesis are incorrect or distorted. Thus the most important part of creating a personalized voice is recording a corpus of high-quality digital speech from which the synthesis database can be constructed. The tool used for recording this corpus is InvTool.

There are three aspects of the speech corpus that are important in determining its quality: recording quality (fidelity); the specific words and phrases to be recorded (content); and the accuracy with which the speech sounds and other acoustic features within words and phrases are identified (labeling).

While we have no control over the environment in which recording takes place, we can recommend appropriate hardware and environments. InvTool also helps with calibration procedures.

The inventory to be recorded currently consists of about 1400 words and phrases. We are investigating ways to reduce its size, since 1400 items may be too strenuous for people with weaker voices or short attention spans. InvTool prompts for each utterance with both a written and an auditory prompt. The talker repeats the utterance, and InvTool reports whether the amplitude, the average pitch, and the pronunciation are within acceptable ranges; if they are not, it gives the talker the option of rerecording the utterance. InvTool is also responsible for labeling the speech before storage: it determines both phoneme boundary locations and pitchmark locations.
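To make the idea of per-utterance feedback concrete, here is a minimal Python sketch of an amplitude and average-pitch check. It is not InvTool's actual algorithm, which this page does not describe. The function names, the autocorrelation pitch estimate, and every threshold (`peak_range`, `pitch_range`, `fmin`, `fmax`) are assumptions chosen for illustration, and the pronunciation check is left out entirely.

```python
import math

def peak_amplitude(samples):
    """Largest absolute sample value (samples assumed scaled to -1.0..1.0)."""
    return max(abs(s) for s in samples)

def estimate_pitch_hz(samples, sample_rate, fmin=60.0, fmax=400.0):
    """Crude pitch estimate: pick the lag with the highest autocorrelation.

    fmin and fmax are illustrative search limits, not InvTool settings.
    Returns None if no lag has a positive correlation.
    """
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(samples) - 1)
    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None

def check_utterance(samples, sample_rate,
                    peak_range=(0.1, 0.95),      # hypothetical limits
                    pitch_range=(80.0, 300.0)):  # hypothetical limits, in Hz
    """Return a list of problems; an empty list means the take is acceptable."""
    problems = []
    peak = peak_amplitude(samples)
    if peak < peak_range[0]:
        problems.append("too quiet")
    elif peak > peak_range[1]:
        problems.append("too loud")
    pitch = estimate_pitch_hz(samples, sample_rate)
    if pitch is None or not (pitch_range[0] <= pitch <= pitch_range[1]):
        problems.append("pitch out of range")
    return problems

def sine(freq_hz, amplitude, sample_rate=8000, n=800):
    """Synthetic test tone standing in for a recorded utterance."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n)]

if __name__ == "__main__":
    print(check_utterance(sine(200.0, 0.5), 8000))
    print(check_utterance(sine(200.0, 0.01), 8000))
```

A recording tool built along these lines would offer a rerecording whenever the returned list is non-empty. Real speech would need a more robust pitch tracker than this single-frame autocorrelation, which is only adequate for clean, steady signals.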

 
