Table of Contents Volume 1 ThA1LP -- Opening Ceremony and Plenary Lecture 1 The Comparative Study of Spoken-Language Processing Anne Cutler ThA2L1 -- Large Vocabulary 2 New Developments in the INRS Continuous Speech Recognition System Z. Li, M. Heon, Douglas O'Shaughnessy 6 On Designing Pronunciation Lexicons for Large Vocabulary, Continuous Speech Recognition Lori Lamel, Gilles Adda 10 Word Graph Rescoring Using Confidence Measures Pablo Fetter, Frédéric Dandurand, Peter Regel-Brietzmann 14 A Bottom-up Approach for Handling Unseen Triphones in Large Vocabulary Continuous Speech Recognition X.L. Aubert, Peter Beyerlein, Meinhard Ullrich 18 Discriminative Optimisation of Large Vocabulary Recognition Systems V. Valtchev, P.C. Woodland, S. J. Young 22 Japanese Large-vocabulary Continuous-speech Recognition using a Business-newspaper Corpus Tatsuo Matsuoka, Katsutoshi Ohtsuki, Takeshi Mori, Sadaoki Furui, Katsuhiko Shirai 26 Handling Compound Nouns in a Swedish Speech-understanding System David Carter, Jaan Kaja, Leonardo Neumeyer, Manny Rayner, Fuliang Weng, Mats Wiren 30 Initial Evaluation of a Preselection Module for a Flexible Large Vocabulary Speech Recognition System in Telephone Environment J. Macias-Guarasa, A. Gallardo, J. Ferreiros, Jose M. Pardo, L. Villarrubia ThA2L2 -- Multimodal ASR (Face and Lips) 34 Asynchronous Integration of Visual Information in an Automatic Speech Recognition System Mamoun Alissali, Paul Deleglise, Alexandrina Rogozan 38 Audiovisual Speech Recognition using Multiscale Nonlinear Image Decomposition. I.A. Matthews, J. Bangham, S.J. Cox 42 Robust Audiovisual Integration using Semicontinuous Hidden Markov Models Qin Su, Peter L. Silsbee 46 The Effect of Visual Information on Word Initial Consonant Perception of Dysarthric Speech Richard P. Schumeyer, Kenneth E. Barner 50 A Multiple Deformable Template Approach for Visual Speech Recognition Devi Chandramohan, Peter L. 
Silsbee 54 Speaker Independent Bimodal Phonetic Recognition Experiments P. Cosi, E. Magno Caldognetto, F. Ferrero, M. Dugatto, K. Vagges 58 Speechreading using Shape and Intensity Information Juergen Luettin, Neil A. Thacker, Steve W. Beet 62 Speaker Identification by Lipreading Juergen Luettin, Neil A. Thacker, Steve W. Beet ThA2L3 -- Perception of Words 66 How Word Onsets Drive Lexical Access and Segmentation: Evidence from Acoustics, Phonology and Processing David W. Gow Jr., Janis Melvold, Sharon Manuel 70 RAW: A Real-speech Model for Human Word Recognition David van Kuijk, Peter Wittenburg, Ton Dijkstra 74 How Facilitatory can Lexical Information Be During Word Recognition? Evidence from Moroccan Arabic Mehdi Meftah, Sami Boudelaa 78 Effects of Frequency on the Auditory Perception of Open- Versus Closed-class Words Alette P. Haveman 82 Phonotactic and Metrical Influences on Adult Ratings of Spoken Nonsense Words Michael S. Vitevitch, Paul A. Luce, Jan Charles-Luce, David Kemmerer 86 Lipreading Supplemented by Voice Fundamental Frequency: To What Extent Does the Addition of Voicing Increase Lexical Uniqueness for the Lipreader? Edward T. Auer Jr., Lynne E. Bernstein 90 Strategies Used in Rhyme-Monitoring S. te Riele, S.G. Nooteboom, H. Quené 94 How do Dutch Listeners Process Words with Epenthetic Schwa? Wilma van Donselaar, Cecile Kuijpers, Anne Cutler ThA2P1 -- Phonetics, Transcription, and Analysis 98 Whole-word Phonetic Distances and the PGPfone Alphabet Patrick Juola, Philip Zimmermann 102 Automatic Vowel Quality Description using a Variable Mapping to an Eight Cardinal Vowel Reference Set Shuping Ran, J. 
Bruce Millar, Phil Rose 106 Automatic Detection and Segmentation of Pronunciation Variants in German Speech Corpora Andreas Kipp, Maria-Barbara Wesenick, Florian Schiel 110 ANGIE: A New Framework for Speech Analysis Based on Morpho-phonological Modelling Stephanie Seneff, Raymond Lau, Helen Meng 114 Perceptual Contrast in the Korean and English Vowel System Normalized Byunggon Yang 118 On Phonetic Characteristics of Pause in the Korean Read Speech Yong-Ju Lee, Sook-hyang Lee 121 Cross-Language Effects of Lexical Stress in Word Recognition: The Case of Arabic English Bilinguals Sami Boudelaa, Mehdi Meftah 125 Automatic Generation of German Pronunciation Variants Maria-Barbara Wesenick 129 Estimating the Quality of Phonetic Transcriptions and Segmentations of Speech Signals Maria-Barbara Wesenick, Andreas Kipp 133 An Acoustic Analysis of Contemporary Vowels of the Standard Slovenian Language Bojan Petek, Rastislav Sustarsic, Smiljana Komar 137 Using Decision Trees to Construct Optimal Acoustic Cues Sandrine Robbe, Anne Bonneau, Sylvie Coste, Yves Laprie 141 Maximum Jaw Displacement in Contrastive Emphasis Donna Erickson, Osamu Fujimura 145 Subglottal Pressure and Final Lowering in English Rebecca Herman, Mary Beckman, Kiyoshi Honda 149 Phonological Variation: Epenthesis and Deletion of Schwa in Dutch Cecile Kuijpers, Wilma van Donselaar, Anne Cutler ThA2P2 -- Spoken Language Processing for Special Populations 153 Feedback Considerations for Speech Training Systems James J. Mahshie 157 Clinical Applications of Computer-Based Speech Training for Children with Hearing Impairment Anne-Marie Öster 161 Enhancing Information-rich Regions of Natural VCV and Sentence Materials Presented in Noise Valerie Hazan, Andrew Simpson 165 Speech Perceptual Abilities of Children with Specific Reading Difficulty (Dyslexia) Valerie Hazan, Alan Adlard 169 Bimodal Perception of Spectrum Compressed Speech Larry D. Paarmann, Michael K. 
Wynne 173 Effect of Sentential Context on Syllabic Stress Perception by Hearing-impaired Listeners Dragana Barac-Cikoja, Sally Revoile 176 Applications of Automatic Speech Recognition to Speech and Language Development in Young Children Martin Russell, Catherine Brown, Adrian Skilling, Rob Series, Julie Wallace, Bill Bohnam, Paul Barker 180 Sub-band Adaptive Speech Enhancement for Hearing Aids D. R. Campbell 184 Adapting a TTS System to a Reading Machine for the Blind Thomas Portele, Juergen Kraemer ThA2S1 -- Dialogue Special Session I 188 Modeling of Spoken Dialogue with and without Visual Information Katsuhiko Shirai 192 Multimodal Discourse Modelling in a Multi-user Multi-domain Environment Stephanie Seneff, David Goddeau, Christine Pao, Joseph Polifroni 196 Automatic Acquisition of Probabilistic Dialogue Models Kenji Kita, Yoshikazu Fukui, Masaaki Nagata, Tsuyoshi Morimoto 200 Units of Dialogue Management: An Example Paul Heisterkamp, Scott McGlashan 204 Error Resolution During Multimodal Human-computer Interaction Sharon Oviatt, Robert VanGent 208 Improved Spontaneous Dialogue Recognition Using Dialogue and Utterance Triggers by Adaptive Probability Boosting Ramesh R. Sarukkai, Dana H. Ballard 212 Speech Recognition for Spontaneously Spoken German Dialogues Kai Hübener, Uwe Jost, Henrik Heine 216 Using Prosodic Information to Constrain Language Models for Spoken Dialogue Paul Taylor, Hiroshi Shimodaira, Stephen Isard, Simon King, Jacqueline Kowtko ThP1L1 -- Language Modeling I 220 Combination of Word-based and Category-based Language Models T.R. Niesler, P.C. Woodland 224 A Multi-level Lexical-semantics Based Language Model Design for Guided Integrated Continuous Speech Recognition Francisco J. Valverde-Albacete, Jose M. 
Pardo 228 A Category Based Approach for Recognition of Out-of-Vocabulary Words Florian Gallwitz, Elmar Noeth, Heinrich Niemann 232 Scalable Backoff Language Models Kristie Seymore, Ronald Rosenfeld 236 Modeling Long Distance Dependence in Language: Topic Mixtures vs. Dynamic Cache Models R. Iyer, Mari Ostendorf 240 Bayesian Estimation Methods for N-Gram Language Model Adaptation Marcello Federico ThP1L2 -- Feature Extraction for Speech Recognition I 244 Feature Dimension Reduction Using Reduced-Rank Maximum Likelihood Estimation for Hidden Markov Models Don X. Sun 248 Using Multi-Level Segmentation Coefficients to Improve HMM Speech Recognition Kai Hübener 252 A Comparative Study of Linear Feature Transformation Techniques for Automatic Speech Recognition T. Eisele, R. Haeb-Umbach, D. Langmann 256 Inclusion of Temporal Information into Features for Speech Recognition Ben Milner 260 New Cepstral Representation using Wavelet Analysis and Spectral Transformation for Robust Speech Recognition Hubert Wassner, Gérard Chollet 264 Wavelet Based Feature Extraction for Phoneme Recognition C.J. Long, S. Datta ThP1L3 -- Speech Production - Measurement and Modeling 268 Extraction of Tongue Contours in X-ray Images with Minimal User Interaction Yves Laprie, Marie-Odile Berger 272 Three-dimensional Measurement of the Vocal Tract by MRI Didier Demolin, Thierry Metens, Alain Soquet 276 Syllable Affiliation of Final Consonant Clusters Undergoes a Phase Transition Over Speaking Rates Philip Gleason, Betty Tuller, J. A. Scott Kelso 279 Towards a Biomechanical Model of the Larynx Arthur Lobo, Michael O'Malley 283 Generating Intonation by Superposing Gestures Yann Morlec, Gérard Bailly, Véronique Aubergé 287 Effects of Auditory Feedback on F0 Trajectory Generation Hideki Kawahara, Hiroko Kato, J. C. Williams ThP1P1 -- Speech Coding / HMMs and NNs in ASR 291 On the Effects of Accent and Language on Low Rate Speech Coders I. S. Burnett, J. J. 
Parry 295 VQ Codevector Index Assignment Using Genetic Algorithms for Noisy Channels J.S. Pan, Fergus R. McInnes, Mervyn A. Jack 299 An Improved Vector Quantization Algorithm for Speech Transmission Over Noisy Channels Gavin C. Cawley 302 Very Low Delay and High Quality Coding of 20 Hz-15 kHz Speech Signals at 64 kbit/s C. Murgia, G. Feng, A. Le Guyader, C. Quinquis 306 Application of Speaker Modification Techniques to Phonetic Vocoding Carlos M. Ribeiro, Isabel M. Trancoso 310 Entropy Coded Vector Quantization with Hidden Markov Models Tadashi Yonezaki, Kiyohiro Shikano 314 An Application of Recurrent Neural Networks to Low Bit Rate Speech Coding Minoru Kohata 318 CELP Coding System Based on Mel-Generalized Cepstral Analysis Kazuhito Koishida, Keiichi Tokuda, Takao Kobayashi, Satoshi Imai 322 Wideband Re-synthesis of Narrowband CELP-coded Speech Using Multiband Excitation Model Cheung-Fat Chan, Wai-Kwong Hui 326 Recurrent Neural Networks for Phoneme Recognition Takuya Koizumi, Mikio Mori, Shuji Taniguchi, Mitsutoshi Maruya 330 A Model for the Acoustic Phonetic Structure of Arabic Language using a Single Ergodic Hidden Markov Model M.A. Mokhtar, A. Zein-el-Abddin 334 Modelling Long Term Variability Information in Mixture Stochastic Trajectory Framework Yifan Gong, Irina Illina, Jean-Paul Haton 338 Segmental Phonetic Features Recognition by means of Neural-fuzzy Networks and Integration in an N-best Solutions Post-processing T. Moudenc, R. Sokol, G. 
Mercier 342 Stochastic Trajectory Model with State-Mixture for Continuous Speech Recognition Irina Illina, Yifan Gong 346 Recognition of Spelled Names over the Telephone Hermann Hild, Alex Waibel 350 Optimal Tying of HMM Mixture Densities using Decision Trees Gilles Boulianne, Patrick Kenny 354 Speech Recognition Using an Enhanced FVQ Based on a Codeword Dependent Distribution Normalization and Codeword Weighting by Fuzzy Objective Function Hwan Jin Choi, Yung Hwan Oh 358 Using the Self-Organizing Map to Speed up the Probability Density Estimation for Speech Recognition with Mixture Density HMMs Mikko Kurimo, Panu Somervuo ThP1S1 -- Dialogue Special Session II 362 Combining the Detection and Correction of Speech Repairs Peter A. Heeman, Kyung-ho Loken-Kim, James F. Allen 366 Generating Spontaneous Elliptical Utterance Yuji Sagawa, Wataru Sugimoto, Noboru Ohnishi 370 Developing the Modelling of Swedish Prosody in Spontaneous Dialogue Gösta Bruce, Marcus Filipsson, Johan Frid, Björn Granström, Kjell Gustafson, Merle Horne, David House, Birgitta Lastow, Paul Touati 374 Spoken Language Generation in a Multimedia System Shimei Pan, Kathleen R. McKeown 378 Synthesizing Dialogue Speech of Japanese Based on the Quantitative Analysis of Prosodic Features Keikichi Hirose, Mayumi Sakata, Hiromichi Kawanami 382 Spoken Dialogue Interface in a Dual Task Situation Shuichi Tanaka, Shu Nakazato, Keiichiro Hoashi, Katsuhiko Shirai ThP1S2 -- Neural Models of Speech Processing I * How is Information About Speech Encoded in the Peripheral Auditory System? Eric D. Young * Spectral Shape Analysis in the Central Auditory System Shihab Shamma ThP2L1 -- Language Modeling II 386 Modeling Disfluencies in Conversational Speech Man-hung Siu, Mari Ostendorf 390 Evaluation of a Language Model using a Clustered Model Backoff John Miller, Fil Alleva 394 Language Modeling Using X-grams Antonio Bonafonte, José B. 
Mariño 398 Class Phrase Models For Language Modelling Klaus Ries, Finn Dag Buo, Alex Waibel 402 Introducing Linguistic Constraints into Statistical Language Modeling Petra Geutner 406 Language Modeling with Stochastic Automata Jianying Hu, William Turin, Michael K. Brown ThP2L2 -- Feature Extraction for Speech Recognition II 410 New Fast Wavelet Packet Transform Algorithms for Frame Synchronized Speech Processing Andrzej Drygajlo 414 Frequency-Warping in Speech S. Umesh, L. Cohen, N. Marinovic, D. Nelson 418 Extracting Speech Features from Human Speech-like Noise Daisuke Kobayashi, Shoji Kajita, Kazuya Takeda, Fumitada Itakura 422 Subband-Crosscorrelation Analysis for Robust Speech Recognition Shoji Kajita, Kazuya Takeda, Fumitada Itakura 426 A New ASR Approach Based on Independent Processing and Recombination of Partial Frequency Bands Hervé Bourlard, Stéphane Dupont 430 Frequency and Time Filtering of Filter-bank Energies for HMM Speech Recognition Climent Nadeu, José B. Mariño, Javier Hernando, Albino Nogueiras ThP2L3 -- Vowels 434 Temporal Cues for Vowels and Universals of Vowel Inventories Carrie E. Lang, John J. Ohala 438 Acoustic Variability in Spontaneous Conversational Speech of American English Talkers Ann K. Syrdal 442 Cross-language Speech Perception: Swedish, English, and Spanish Speakers' Perception of Front Rounded Vowels Raquel Willerman, Patricia K. Kuhl 446 Inter-language Vowel Perception and Production by Korean and Japanese Listeners John C.L. 
Ingram, See-Gyoon Park 450 Intelligibility and Acoustic Correlates of Japanese Accented English Vowels Diane Kewley-Port, Reiko Akahane-Yamada, Kiyoaki Aikawa 454 Segmentation Strategies for Spoken Language Recognition: Evidence from Semi-bilingual Japanese Speakers of English Kiyoko Yoneyama ThP2P1 -- NNs and Stochastic Modeling 458 Integrating Connectionist, Statistical and Symbolic Approaches for Continuous Spoken Korean Processing Geunbae Lee, Jong-Hyeok Lee, Kyubong Park, Byung-Chang Kim 462 Towards ASR on Partially Corrupted Speech Hynek Hermansky, Sangita Timberwala, Misha Pavel 466 Parametric Trajectory Models for Speech Recognition Herbert Gish, Kenney Ng 470 Use of Gaussian Selection in Large Vocabulary Continuous Speech Recognition Using HMMs K.M. Knill, M.J.F. Gales, S. J. Young 474 Cross Phone State Clustering using Lexical Stress and Context J. Hogberg, K. Sjolander 478 Likelihood Ratio Decoding and Confidence Measures for Continuous Speech Recognition Eduardo Lleida, Richard C. Rose 482 A Study on Continuous Chinese Speech Recognition Based on Stochastic Trajectory Models Xiaohui Ma, Yifan Gong, Yuqing Fu, Jiren Lu, Jean-Paul Haton 486 A Proposal for a New Algorithm of Reference Interval-free Continuous DP for Real-time Speech or Text Retrieval Yoshiaki Itoh, Jiro Kiyama, Hiroshi Kojima, Susumu Seki, Ryuichi Oka 490 Language Modeling by String Pattern N-gram for Japanese Speech Recognition Akinori Ito, Masaki Kohda 494 Statistical Language Modeling using a Variable Context Length Reinhard Kneser 498 A Comparison of Hybrid HMM Architectures Using Global Discriminative Training Finn Tore Johansen 502 Improved Probability Estimation with Neural Network Models Wei Wei, Etienne Barnard, Mark Fanty 506 A Neural Network Using Acoustic Sub-word Units for Continuous Speech Recognition Ha-Jin Yu, Yung-Hwan Oh 510 On the Error Criteria in Neural Networks as a Tool for Human Classification Modelling Louis F. M. 
ten Bosch, Roel Smits 514 A Non-linear Filtering Approach to Stochastic Training of the Articulatory-acoustic Mapping Using the EM Algorithm Gordon Ramsay 518 A Tool for Automated Design of Language Models Y.P. Yang, J.R. Deller Jr. 522 Acoustic-phonetic Decoding Based on Elman Predictive Neural Networks F. Freitag, E. Monte 526 On Improving Discrimination Capability of an RNN Based Recognizer Tan Lee, P.C. Ching 530 An Evaluation of Statistical Language Modeling for Speech Recognition using a Mixed Category of Both Words and Parts-of-speech Yumi Wakita, Jun Kawai, Hitoshi Iida ThP2S1 -- Dialogue Special Session III 534 A Dialogue Control Strategy Based on the Reliability of Speech Recognition Yasuhisa Niimi, Yutaka Kobayashi 538 SpeechWear: A Mobile Speech System Alexander I. Rudnicky, Stephen Reed, Eric H. Thayer 542 WHEELS: A Conversational System in the Automobile Classifieds Domain Helen Meng, Senis Busayapongchai, James Glass, David Goddeau, Lee Hetherington, Edward Hurley, Christine Pao, Joseph Polifroni, Stephanie Seneff, Victor Zue 546 Effective Human-computer Cooperative Spoken Dialogue: The AGS Demonstrator M.D. Sadek, A. Ferrieux, A. Cozannet, P. Bretier, F. Panaget, J. Simonin 550 Dialog in the RAILTEL Telephone-based System S.K. Bennacef, L. Devillers, S. 
Rosset, Lori Lamel 554 Dialogue Processing in a Conversational Speech Translation System Alon Lavie, Lori Levin, Yan Qu, Alex Waibel, Donna Gates, Marsal Gavaldà, Laura Mayfield, Maite Taboada ThP2S2 -- Neural Models of Speech Processing II 558 Novel Speech Processing Mechanism Derived from Auditory Neocortical Circuit Analysis Boris Aleksandrovsky, James Whitson, Gretchen Andes, Gary Lynch, Richard Granger 562 Modeling Neurons in the Anteroventral Cochlear Nucleus for Amplitude Modulation (AM) Processing: Application to Speech Sound Ping Tang, Jean Rouat 566 Noise Suppression and Loudness Normalization in an Auditory Model-based Acoustic Front-end Halewijn Vereecken, Jean-Pierre Martens 570 A Psychoacoustic Model for the Noise Masking of Voiceless Plosive Bursts Jim Hant, Brian Strope, Abeer Alwan 574 Training Machine Classifiers to Match the Performance of Human Listeners in a Natural Vowel Classification Task Martin Hunke, Thomas Holton 578 A Neural Matrix Model for Active Tracking of Frequency-modulated Tones Kiyoaki Aikawa, Hideki Kawahara, Minoru Tsuzaki Volume 2 FrA1L1 -- Utterance Verification and Word Spotting 582 A User-Configurable System for Voice Label Recognition Richard C. Rose, Eduardo Lleida, G.W. Erhart, R.V. Grubbe 586 Keyword Spotting Enhancement for Video Soundtrack Indexing Philippe Gelin, Chris. J. Wellekens 590 New Efficient Fillers for Unlimited Word Recognition and Keyword Spotting Rachida El Méliani, Douglas O'Shaughnessy 594 Automatic Transcription of General Audio Data: Preliminary Analyses Michelle S. Spina, Victor Zue 598 Transcribing Radio News Francis Kubala, Tasos Anastasakos, Hubert Jin, Long Nguyen, Richard Schwartz 602 Correcting Recognition Errors via Discriminative Utterance Verification Anand R. Setlur, Rafid A. Sukkar, John Jacob FrA1L2 -- Acquisition/Learning Training L2 Learners 606 Does Training in Speech Perception Modify Speech Production? Reiko Akahane-Yamada, Yoh'ichi Tohkura, Ann R. Bradlow, David B. 
Pisoni 610 Phrase-Final Lengthening and Stress-Timed Shortening in the Speech of Native Speakers and Japanese Learners of English Motoko Ueyama 614 Japanese Accentuations by Foreign Students and Japanese Speakers of Non-Tokyo Dialect Nobuko Yamada 618 Devoicing of Japanese Vowels by Taiwanese Learners of Japanese J. Kevin Varden, Tsutomu Sato 622 Fluency and Use of Segmental Dialect Features in the Acquisition of a Second Language (French) by English Speakers Danièle Archambault, Catherine Foucher, Blagovesta Maneva 626 Estimating Child and Adolescent Formant Frequency Values From Adult Data P. Martland, S.P. Whiteside, Steve W. Beet, L. Baghai-Ravary FrA1L3 -- Focus, Stress and Accent 630 Acoustic Correlates of Linguistic Stress and Accent in Dutch and American English Agaath M.C. Sluijter, Vincent J. van Heuven 634 On the Levels of Accentuation in Spoken Japanese Hiroya Fujisaki, Sumio Ohno, Osamu Tomita 638 Tonal Distinctions Between Emphatic Stress and Pretonic Lengthening in Quebec French Linda Thibault, Marise Ouellet 642 Distinction Between 'Normal' Focus and 'Contrastive/Emphatic' Focus Anja (Petzold) Elsner 646 Perception of Tonal Accent by Americans Learning Japanese Yukihiro Nishinuma, Masako Arai, Takako Ayusawa 650 Modeling Intra-Speaker Pitch Range Variation: Predicting F0 Targets when "Speaking Up" Elizabeth Shriberg, D. Robert Ladd, Jacques Terken FrA1P1 -- Spoken Language Dialogue and Conversation 654 Predicting Dialogue Acts for a Speech-To-Speech Translation System Norbert Reithinger, Ralf Engel, Michael Kipp, Martin Klesen 658 Automatic Speech Translation Based on the Semantic Structure Johannes Müller, Holger Stahl, Manfred Lang 662 A Methodology for Application Development for Spoken Language Systems Lewis M. Norton, Carl E. Weir, K.W. Scholz, Deborah A. 
Dahl, Ahmed Bouzid 665 A New Restaurant Guide Conversational System: Issues in Rapid Prototyping for Specialized Domains Stephanie Seneff, Joseph Polifroni 669 Semantic Interpretation of a Japanese Complex Sentence in an Advisory Dialogue - Focused on the Postpositional Word "KEDO," Which Works as a Conjunction Between Clauses Tadahiko Kumamoto, Akira Ito 673 A Korean Morphological Analyzer for Speech Translation System Youngkuk Hong, Myoung-Wan Koo, Gijoo Yang 677 Generic and Domain-specific Aspects of the Waxholm NLP and Dialog Modules Rolf Carlson, Sheri Hunnicutt 681 A Real-Time System for Summarizing Human-Human Spontaneous Spoken Dialogues Megumi Kameyama, Goh Kawai, Isao Arima 685 Evaluation of Spoken Language Understanding and Dialogue Systems Bernd Hildebrandt, Heike Rautenstrauch, Gerhard Sagerer 689 Inter-Speaker Interaction of F0 in Dialogs Kuniko Kakita 693 A Robust Dialogue System for Making an Appointment Hans Brandt-Pook, Gernot A. Fink, Bernd Hildebrandt, Franz Kummert, Gerhard Sagerer 697 Segmentation of Spoken Dialogue by Interjections, Disfluent Utterances and Pauses Kazuyuki Takagi, Shuichi Itahashi 701 A Form-Based Dialogue Manager for Spoken Language Applications David Goddeau, Helen Meng, Joe Polifroni, Stephanie Seneff, Senis Busayapongchai 705 The Design of Complex Telephony Applications Using Large Vocabulary Speech Technology S.J. Whittaker, D.J. Attwater 709 Building 10,000 Spoken Dialogue Systems Stephen Sutton, David G. Novick, Ronald A. Cole, Pieter Vermeulen, Jacques de Villiers, Johan Schalkwyk, Mark Fanty 713 Speaker Intention Modeling for Large Vocabulary Mandarin Spoken Dialogues Yen-Ju Yang, Lee-Feng Chien, Lin-Shan Lee 717 Hybrid Language Models and Spontaneous Legal Discourse P.E. Kenne, Mary O'Kane 721 Topic Change and Local Perplexity in Spoken Legal Dialogue P.E. Kenne, Mary O'Kane 725 Intonational Cues to Discourse Structure in Japanese Jennifer J. 
Venditti, Marc Swerts 729 Principles for the Design of Cooperative Spoken Human-Machine Dialogue Niels Ole Bernsen, Hans Dybkjær, Laila Dybkjær 733 Development and Comparison of Three Syllable Stress Classifiers Karen L. Jenkin, Michael S. Scordilis FrA1P2 -- Speech Disorders 737 Interaction of Speech Disorders with Speech Coders: Effects on Speech Intelligibility D.G. Jamieson, Li Deng, M. Price, Vijay Parsa, J. Till 741 Detecting Arytenoid Cartilage Misplacement through Acoustic and Electroglottographic Jitter Analysis Maurílio N. Vieira, Arnold G. D. Maran, Fergus R. McInnes, Mervyn A. Jack 745 Robust F0 and Jitter Estimation in Pathological Voices Maurílio N. Vieira, Fergus R. McInnes, Mervyn A. Jack 749 Speech Monitoring of Infective Laryngitis F. Plante, H. Kessler, B.M.G. Cheetham, J. Earis 753 Searching for Nonlinear Relations in Whitened Jitter Time Series J. Schoentgen, R. De Guchteneere 757 Vocal Fold Pathology Assessment using AM Autocorrelation Analysis of the Teager Energy Operator Liliana Gavidia-Ceballos, John H.L. Hansen, James F. Kaiser 761 Continuous Positive Airway Pressure (CPAP) in the Treatment of Hypernasality David P. Kuehn 764 Enhancement of Alaryngeal Speech by Adaptive Filtering Carol Y. Espy-Wilson, Venkatesh R. Chari, Caroline B. Huang 768 Simulation of Disordered Speech Using a Frequency-Domain Vocal Tract Model Li Deng, Xuemin Shen, D.G. Jamieson, J. Till 772 A Stochastic Model of Fundamental Period Perturbation and Its Application to Perception of Pathological Voice Quality Yasuo Endo, Hideki Kasuya 776 A Screening Test for Speech Pathology Assessment Using Objective Quality Measures Eric J. Wallen, John H.L. Hansen 780 Recent Advances in Hypernasal Speech Detection using the Nonlinear Teager Energy Operator Douglas A. Cairns, John H.L. Hansen, James F. Kaiser FrA1S1 -- Vocal Tract Geometry I 784 Human Palate and Related Structures: Their Articulatory Consequences Kiyoshi Honda, Shinji Maeda, Michiko Hashi, Jim Dembowski, John R. 
Westbury 788 A Continuum Mechanics Representation of Tongue Deformation Edward P. Davis, Andrew Douglas, Maureen Stone 793 From MRI and Acoustic Data to Articulatory Synthesis: A Case Study of the Lateral Approximants in American English Philbert Bangayan, Abeer Alwan, Shrikanth Narayanan 797 Liquids in Tamil Shrikanth Narayanan, Abigail Kaun, Dani Byrd, Peter Ladefoged, Abeer Alwan FrA2L1 -- Prosody in ASR and Segmentation 801 Modeling Hyperarticulate Speech during Human-computer Error Resolution Sharon Oviatt, Gina-Anne Levow, Margaret MacEachern, Karen Kuhn 805 Using Stress to Disambiguate Spoken Thai Sentences Containing Syntactic Ambiguity Siripong Potisuk, Mary P. Harper, Jackson T. Gandour 809 Use of Prosodic Information to Integrate Acoustic and Linguistic Knowledge in Continuous Mandarin Speech Recognition with Very Large Vocabulary Hung-yun Hsieh, Ren-yuan Lyu, Lin-shan Lee 813 Word Boundary Detection using Pitch Variations G.V. Ramana Rao, J. Srichand 817 Detection of Phrase Boundaries in Japanese by Low-Pass Filtering of Fundamental Frequency Contours Atsuhiro Sakurai, Keikichi Hirose 821 A New Method for Speech Delexicalization, and its Application to the Perception of French Prosody V. Pagel, N. Carbonell, Yves Laprie FrA2L2 -- Acquisition and Learning by Machine 825 Task Adaptation for Dialogues Via Telephone Lines Udo Bub 829 The Influence of Bigram Constraints on Word Recognition by Humans: Implications for Computer Speech Recognition Ronald A. Cole, Yonghong Yan, Troy Bailey 833 ALICE: Acquisition of Language In Conversational Environment - An Approach to Weakly Supervised Training of Spoken Language System for Language Porting Tetsunori Kobayashi 837 Pitch Pattern Clustering of User Utterances in Human-Machine Dialogue Takashi Yoshimura, Satoru Hayamizu, Hiroshi Ohmura, Kazuyo Tanaka 841 Simplifying Language through Error-correcting Decoding J.C. Amengual, E. Vidal, J.M. 
Benedí 845 A Mixed Approach to Speech Understanding Mauro Cettolo, Anna Corazza, Renato De Mori FrA2L3 -- Dialogue Systems 849 Speech Recognition for an Information Kiosk J.L. Gauvain, J.J. Gangolf, L. Lamel 853 Localizing an Automatic Inquiry System for Public Transport Information Helmer Strik, Albert Russel, Henk van den Heuvel, Catia Cucchiarini, Louis Boves 857 Prompt Constrained Natural Language - Evolving the Next Generation of Telephony Services Stephen M. Marcus, Deborah W. Brown, Randy G. Goldberg, Max S. Schoeffler, William R. Wetzel, Richard R. Rosinski 861 Key-Phrase Detection and Verification for Flexible Speech Understanding Tatsuya Kawahara, Chin-Hui Lee, Biing-Hwang Juang 865 Interactive Recovery from Speech Recognition Errors in Speech User Interfaces Bernhard Suhm, Brad Myers, Alex Waibel 869 Estimation of Language Models for New Spoken Language Applications Sunil Issar FrA2P1 -- Speech Enhancement and Robust Processing 873 H-infinity Filtering for Speech Enhancement Xuemin Shen, Li Deng, Anisa Yasmin 877 A Comparative Analysis of Channel-Robust Features and Channel Equalization Methods for Speech Recognition Saeed V. Vaseghi, Ben Milner 881 Robust Speech Recognition Features Based on Temporal Trajectory Filtering of Frequency Band Spectrum Jia-lin Shen, Wen-liang Hwang, Lin-shan Lee 885 Durational Modelling for Improved Connected Digit Recognition Kevin Power 889 Study on the Dereverberation of Speech Based on Temporal Envelope Filtering Carlos Avendano, Hynek Hermansky 893 Estimating Markov Model Structures Thorsten Brants 897 A Fertility Channel Model for Post-Correction of Continuous Speech Recognition Eric K. Ringger, James F. 
Allen 901 Restoration of Wide Band Signal from Telephone Speech using Linear Prediction Error Processing Hiroshi Yasukawa 905 Smoothed Spectral Subtraction for a Frequency-Weighted HMM in Noisy Speech Recognition Hiroshi Matsumoto, Noboru Naitoh 909 A Simple Architecture for using Multiple Cues in Sound Separation William S. Woods, Martin Hansen, Thomas Wittkop, Birger Kollmeier 913 On the Robust Automatic Segmentation of Spontaneous Speech Bojan Petek, Ove Andersen, Paul Dalsgaard 917 Bayesian Adaptation of Speech Recognizers to Field Speech Data C.G. Miglietta, C. Mokbel, D. Jouvet, J. Monné 921 Sub-band Adaptive Filtering Applied to Speech Enhancement A. J. Darlington, D. J. Campbell 925 Noise Robust Estimate of Speech Dynamics for Speaker Recognition J.P. Openshaw, John S. Mason 929 Overview of Speech Enhancement Techniques for Automatic Speaker Recognition Javier Ortega-García, Joaquín González-Rodríguez 933 Dynamic Features for Segmental Speech Recognition Naomi Harte, Saeed V. Vaseghi, Ben Milner 937 Speech Recognition Based on a Model of Human Auditory System Takuya Koizumi, Mikio Mori, Shuji Taniguchi 941 APVQ Encoder Applied to Wideband Speech Coding J.M. Salavedra, E. Masgrau 945 Simple Fast Vector Quantization of the Line Spectral Frequencies Jin Zhou, Yair Shoham, Ali Akansu FrA2S1 -- Vocal Tract Geometry II 949 Speaker Individualities of Vocal Tract Shapes of Japanese Vowels Measured by Magnetic Resonance Images Chang-Sheng Yang, Hideki Kasuya 953 Vocal Tract Acoustics Using the Transmission Line Matrix (TLM) Method S. El-Masri, X. Pelorson, P. Saguet, P. 
Badin 957 Building Sensori-motor Prototypes from Audiovisual Exemplars Gérard Bailly 961 Parameterized VT Area Function Inversion Mats Båvegård, Gunnar Fant 965 An Improved Vocal Tract Model of Vowel Production Implementing Piriform Resonance and Transvelar Nasal Coupling Jianwu Dang, Kiyoshi Honda 969 Pseudo-articulatory Speech Synthesis for Recognition using Automatic Feature Extraction from X-Ray Data C. S. Blackburn, S. J. Young FrP1L1 -- Speaker Adaptation and Normalization I 973 N-best-based Instantaneous Speaker Adaptation Method for Speech Recognition Tomoko Matsui, Sadaoki Furui 977 Mixture Splitting Technic and Temporal Control in a HMM-based Recognition System C. Montacié, M.-J. Caraty, C. Barras 981 A Unified Spectral Transformation Adaptation Approach for Robust Speech Recognition Lei Yao, Dong Yu, Taiyi Huang 985 On-line Adaptive Learning of the Correlated Continuous Density Hidden Markov Models for Speech Recognition Qiang Huo, Chin-Hui Lee 989 Speaker Adaptation by Modeling the Speaker Variation in a Continuous Speech Recognition System Nikko Ström 993 An Enquiring System of Unknown Words in TV News by Spontaneous Repetition (Application of Speaker Normalization by Speaker Subspace Projection) Yasuo Ariki, Shigeaki Tagashira FrP1L2 -- Spoken Language and NLP I 997 Language Understanding using Hidden Understanding Models Richard Schwartz, Scott Miller, David Stallard, John Makhoul 1001 Processing of Semantic Information in Fluently Spoken Language Allen L. Gorin 1005 Automatic Linguistic Segmentation of Conversational Speech Andreas Stolcke, Elizabeth Shriberg 1009 Towards Understanding Spontaneous Speech: Word Accuracy vs. Concept Accuracy M. Boros, W. Eckert, Florian Gallwitz, G. Görz, G. Hanrieder, Heinrich Niemann 1013 A Stochastic Case Frame Approach for Natural Language Understanding Wolfgang Minker, S.K. Bennacef, J.L. 
Gauvain 1017 Improving Speech Understanding by Incorporating Database Constraints and Dialogue History Frank Seide, Bernhard Rüber, Andreas Kellner FrP1L3 -- Spoken Discourse Analysis/Synthesis 1021 A New Discourse Structure Model for Spontaneous Spoken Dialogue Tetsuro Chino, Hiroyuki Tsuboi 1025 An Architecture for Spoken Dialogue Management David Duff, Barbara Gates, Susann LuperFoy 1029 Pausing Strategies in Discourse in Dutch Monique E. van Donzel, Florien J. Koopmans-van Beinum 1033 Filled Pauses as Markers of Discourse Structure Marc Swerts, Anne Wichmann, Robbert-Jan Beun 1037 The Prosodic Analysis of Korean Dialogue Speech - Through a Comparative Study with Read Speech Cheol-jae Seong, Minsoo Hahn 1041 Changing the Topic: How Long Does it Take? Mary O'Kane, P.E. Kenne FrP1P1 -- Acoustic Modeling I 1045 Learning Pronunciation Dictionary from Speech Data Christian-Michael Westendorf, Jens Jelitto 1049 The Trended HMM with Discriminative Training for Phonetic Classification C. Rathinavelu, Li Deng 1053 Improving Decision Trees for Acoustic Modeling Ariane Lazaridès, Yves Normandin, Roland Kuhn 1057 An Improved Training Algorithm in HMM-based Speech Recognition Gongjun Li, Taiyi Huang 1061 Speech Recognition Using a Strong Correlation Assumption for the Instantaneous Spectra J. Ming, P. O'Boyle, J. McMahon, F. J. Smith 1065 On Parameter Filtering in Continuous Subword-unit-based Speech Recognition Pau Pachès-Leal, Climent Nadeu 1069 Estimation of Statistical Phoneme Center Considering Phonemic Environments Shigeki Okawa, Katsuhiko Shirai 1073 Integration of Context-dependent Durational Knowledge into HMM-based Speech Recognition Xue Wang, Louis F. M. ten Bosch, Louis C. W. Pols 1077 Speech Recognition Based on Acoustically Derived Segment Units T. Fukada, M. Bacchiani, K.K. 
Paliwal, Yoshinori Sagisaka 1081 Robust Gender-dependent Acoustic-phonetic Modelling in Continuous Speech Recognition Based on a New Automatic Male/Female Classification Rivarol Vergin, Azarshid Farhat, Douglas O'Shaughnessy 1085 A Codebook Adaptation Algorithm for SCHMM Using Formant Distribution Tae Young Yang, Won Ho Shin, Weon Goo Kim, Dae Hee Youn 1089 Parameter Tying for Flexible Speech Recognition J. Simonin, S. Bodin, D. Jouvet, K. Bartkova 1093 Word-spotting Based on Inter-word and Intra-word Diphone Models Tsuneo Nitta, Shin'ichi Tanaka, Yasuyuki Masai, Hiroshi Matsu'ura 1097 Duration Modeling with Expanded HMM Applied to Speech Recognition Antonio Bonafonte, Josep Vidal, Albino Nogueiras 1101 Different Strategies for Distribution Clustering using Discrete, Semicontinuous and Continuous HMMs in CSR Ricardo de Córdoba, José M. Pardo 1105 Improved HMM Phone and Triphone Models for Realtime ASR Telephony Applications Ilija Zeljkovic, Shrikanth Narayanan 1109 Improved Extended HMM Composition by Incorporating Power Variance Yasuhiro Minami, Sadaoki Furui 1113 Optimal Filtering and Smoothing for Speech Recognition using a Stochastic Target Model Gordon Ramsay, Li Deng 1117 Speech Recognition Using Syllable-Like Units Zhihong Hu, Johan Schalkwyk, Etienne Barnard, Ronald A. Cole FrP1S1 -- Physics and Simulation of the Vocal Tract I 1121 Search for Unexplored Effects in Speech Production C.H. Coker, M.H. Krane, B.Y. Reis, R.A. Kubli * Computational Models for Speech Generation S. Levinson 1125 Articulatory Synthesis from X-rays and Inversion for an Adaptive Speech Robot P. Badin, C. Abry FrP2L1 -- Speaker Adaptation and Normalization II 1129 Adaptive Recognition Method Based on Posterior Use of Distribution Pattern of Output Probabilities Jin-Song Zhang, Beiqian Dai, Changfu Wang, Hingkeung Kwan, Keikichi Hirose 1133 Iterative Unsupervised Adaptation Using Maximum Likelihood Linear Regression P.C. Woodland, D. Pye, M.J.F. 
Gales 1137 A Compact Model for Speaker-Adaptive Training Tasos Anastasakos, John McDonough, Richard Schwartz, John Makhoul 1141 Iterative Unsupervised Speaker Adaptation for Batch Dictation Shigeru Homma, Jun-ichi Takahashi, Shigeki Sagayama 1145 Rapid Unsupervised Adaptation to Children's Speech on a Connected-Digit Task Daniel C. Burnett, Mark Fanty 1149 Speaker Adaptation Using Tree Structured Shared-State HMMs Jun Ishii, Masahiro Tonomura, Shoichi Matsunaga FrP2L2 -- Spoken Language and NLP II 1153 Learning to Parse Spontaneous Speech Finn Dag Buo, Alex Waibel 1157 Spontaneous Speech and Natural Language Processing ALPES: A Robust Semantic-led Parser Jean-Yves Antoine 1161 The Natural Language Processing Module for a Voice Assisted Operator at Telefónica I+D J. Alvarez-Cercadillo, J. Caminero-Gil, C. Crespo-Casas, D. Tapias-Merino 1165 Compound Words in Large-Vocabulary German Speech Recognition Systems André Berton, Pablo Fetter, Peter Regel-Brietzmann 1169 Prosody, Empty Categories and Parsing - A Success Story Anton Batliner, A. Feldhaus, S. Geissler, T. Kiss, Ralf Kompe, Elmar Nöth 1173 "Almost Parsing" Technique for Language Modeling B. Srinivas FrP2L3 -- Duration and Rhythm 1177 From Segmental Duration Properties to Rhythmic Structure: A Study of Interactions Between High and Low Level Constraints Marise Ouellet, Benoît Tardif 1181 Analysis of Context-dependent Segmental Duration for Automatic Speech Recognition Xue Wang, Louis C. W. Pols, Louis F. M. ten Bosch 1185 The Role of the Rhythmic Groups in the Segmentation of Continuous French Speech Delphine Dahan 1189 The Implications of Temporal Patterns for the Prosody of Boundary Signaling in Connected Speech Zita McRobbie-Utasi 1193 Experimental Phonetic Study of the Syllable Duration of Korean with Respect to the Positional Effect Hyunbok Lee, Cheol-jae Seong 1197 Timing of Pitch Movements and Accentuation of Syllables Dik J. 
Hermes FrP2P1 -- Acoustic Analysis 1201 A Probabilistic Approach to AMDF Pitch Detection Goangshiuan S. Ying, Leah H. Jamieson, Carl D. Mitchell 1205 From Sagittal Cut to Area Function: An RMI Investigation Alain Soquet, Véronique Lecuit, Thierry Metens, Didier Demolin 1209 Pitch Detection and Voiced/Unvoiced Decision Algorithm Based on Wavelet Transforms Léonard Janer, Juan José Bonet, Eduardo Lleida-Solano 1213 Decomposition of Speech Signals into a Deterministic and a Stochastic Part Yannis Stylianou 1217 Improved Glottal Closure Instant Detector based on Linear Prediction and Standard Pitch Concept Cheol-Woo Jo, Ho-Gyun Bang, W.A. Ainsworth 1221 Analysis of Speech Segments using Variable Spectral/Temporal Resolution Xihong Wang, Stephen A. Zahorian, Stefan Auberg 1225 Time-based Clustering for Phonetic Segmentation Brian Eberman, William Goldenthal 1229 Formant Analysis Using Mixtures of Gaussians Parham Zolfaghari, Tony Robinson 1233 Deriving Articulatory Representations from Speech with Various Excitation Modes Hywel B. Richards, John S. Mason, Melvyn J. Hunt, John S. Bridle 1237 "Blind" Speech Segmentation: Automatic Segmentation of Speech Without Linguistic Knowledge Manish Sharma, Richard J. Mammone 1241 Speech Synthesis Using a Nonlinear Energy Damping Model for the Vocal Folds Vibration Effect Hiroshi Ohmura, Kazuyo Tanaka 1245 Neural Networks Learning with L1 Criteria and Its Efficiency in Linear Prediction of Speech Signals Munehiro Namba, Hiroyuki Kamata, Yoshihisa Ishida 1249 Preprocessing and Neural Classification of English Stop Consonants [b,d,g,p,t,k] A. Esposito, C. E. Ezin, M. Ceccarelli 1253 A Comparison of Modified k-means (MKM) and NN based Real Time Adaptive Clustering Algorithms for Articulatory Space Codebook Formation K.S. Ananthakrishnan 1257 A Novel Approach to the Estimation of Voice Source and Vocal Tract Parameters from Speech Signals Wen Ding, Hideki Kasuya 1261 Syllable Detection in Read and Spontaneous Speech Hartmut R. 
Pfitzinger, Susanne Burger, Sebastian Heid 1265 Maximum Likelihood Learning of Auditory Feature Maps for Stationary Vowels Kuansan Wang, Chin-Hui Lee, Biing-Hwang Juang 1269 Explicit Segmentation of Speech using Gaussian Models Antonio Bonafonte, Albino Nogueiras, Antonio Rodriguez-Garrido 1273 A Comparison of Several Recent Methods of Fundamental Frequency and Voicing Decision Estimation E. Mousset, W.A. Ainsworth, José A. R. Fonollosa 1277 Robust Pitch Estimation with Harmonics Enhancement in Noisy Environments Based on Instantaneous Frequency Toshihiko Abe, Takao Kobayashi, Satoshi Imai 1281 Integrated Polispectrum on Speech Recognition Asunción Moreno, Miquel Rutllán FrP2S1 -- Physics and Simulation of the Vocal Tract II 1285 Analysis of Acoustic Properties of the Nasal Tract Using 3-D FEM Hisayoshi Suzuki, Takayoshi Nakai, Hirosi Sakakibara 1289 Experiments with Analysis By Synthesis of Glottal Airflow Johan Liljencrants Volume 3 SaA1L1 -- Speech Recognition Using HMMs and NNs 1293 An Incremental Speaker-Adaptation Technique for Hybrid HMM-MLP Recognizer Joao P. Neto, Ciro A. Martins, Luís B. Almeida 1297 Phoneme Segmentation of Continuous Speech using Multi-layer Perceptron Youngjoo Suh, Youngjik Lee 1301 Stochastic Perceptual Speech Models with Durational Dependence Jeff Bilmes, Nelson Morgan, Su-Lin Wu, Hervé Bourlard 1305 Boosting the Performance of Connectionist Large Vocabulary Speech Recognition G.D. Cook, A.J. Robinson 1309 HMMs and OWE Neural Network for Continuous Speech Recognition Nicolas Pican, Dominique Fohr, Jean-François Mari 1313 Smoothed Local Adaptation of Connectionist Systems Steve Waterhouse, Dan Kershaw, Tony Robinson SaA1L2 -- Adverse Environments and Multiple Microphones 1317 Robust Speech Recognition with Speaker Localization by a Microphone Array Takeshi Yamada, Satoshi Nakamura, Kiyohiro Shikano 1321 Sound Source Localization in Reverberant Environments using an Outlier Elimination Algorithm Ea-Ee Jan, James L. 
Flanagan 1325 The 1995 Abbot LVCSR System for Multiple Unknown Microphones Dan Kershaw, Tony Robinson, Steve Renals 1329 Experiments of Speech Recognition in a Noisy and Reverberant Environment using a Microphone Array and HMM Adaptation D. Giuliani, M. Omologo, P. Svaizer 1333 Increasing Robustness in GMM Speaker Recognition Systems for Noisy and Reverberant Speech with Low Complexity Microphone Arrays Joaquín González-Rodríguez, Javier Ortega-García, César Martin, Luis Hernández 1337 Robust Automatic Speech Recognition Using a Multi-channel Signal Separation Front-End Kuan-Chieh Yen, Yunxin Zhao SaA1L3 -- Prosodic Synthesis in Dialogue 1341 Prosody Generation in Text-to-Speech Conversion Using Dependency Graphs Anders Lindström, Ivan Bretan, Mats Ljungqvist 1345 Extraction Method of Non-restrictive Modification in Japanese as a Marked Factor of Prosody Hisako Asano, Hisashi Ohara, Yoshifumi Ooyama 1349 Modeling Contrast in the Generation and Synthesis of Spoken Language Scott Prevost 1353 A Left-to-right Processing Model of Pausing in Japanese Based on Limited Syntactic Information Hajime Tsukada 1357 Modeling of Intonation Bearing Emphasis for TTS-Synthesis of Greek Dialogues D. Galanis, V. Darsinos, G. Kokkinakis 1361 Synthesizing Prosody: a Prominence-based Approach Barbara Heuft, Thomas Portele SaA1P1 -- Speech Synthesis 1365 Multilingual Text Analysis for Text-to-Speech Synthesis Richard Sproat 1369 Spoken-style Explanation Generator for Japanese Kanji using a Text-to-speech System Yoshifumi Ooyama, Hisako Asano, Koji Matsuoka 1373 A Method for Estimating Prosodic Symbol from Text for Japanese Text-To-Speech Synthesis Ken-ichi Magata, Tomoki Hamagami, Mitsuo Komura 1377 Statistical Methods in Data-driven Modeling of Spanish Prosody for Text to Speech E. López-Gonzalo, J.M. 
Rodríguez-García 1381 Intonation Processing for TTS Using Stylization and Neural Network Learning Method Jung-Chul Lee, Youngjik Lee, Sang-Hun Kim, Minsoo Hahn 1385 Generating F0 Contours from ToBI Labels using Linear Regression Alan W. Black, Andrew J. Hunt 1389 The Broad Study of Homograph Disambiguity for Mandarin Speech Synthesis Wern-Jun Wang, Shaw-Hwa Hwang, Sin-Horng Chen 1393 The MBROLA project: Towards a Set of High Quality Speech Synthesizers Free of Use for Non Commercial Purposes T. Dutoit, V. Pagel, N. Pierret, F. Bataille, O. Van der Vrecken 1397 Training Data Selection for Voice Conversion Using Speaker Selection and Vector Field Smoothing Makoto Hashimoto, Norio Higuchi 1401 A New Voice Transformation Method Based on Both Linear and Nonlinear Prediction Analysis Ki Seung Lee, Dae Hee Youn, Il Whan Cha 1405 On the Transformation of the Speech Spectrum for Voice Conversion G. Baudoin, Yannis Stylianou 1409 Spectral Analysis of Synthetic Speech and Natural Speech with Noise over the Telephone Line Cristina Delogu, Andrea Paoloni, Susanna Ragazzini, Paola Ridolfi 1413 A New Speech Synthesis System Based on the ARX Speech Production Model Weizhong Zhu, Hideki Kasuya 1417 Speech Synthesis Using the CELP Algorithm Geraldo Lino de Campos, Evandro Bacci Gouvêa 1421 A Mandarin Text-to-Speech System Shaw-Hwa Hwang, Sin-Horng Chen, Yih-Ru Wang 1425 Residual-based Speech Modification Algorithms for Text-to-Speech Synthesis M.D. Edgington, A. Lowry 1429 A Generalized LR Parser for Text-to-speech Synthesis Per Olav Heggtveit 1433 Enhanced Shape-invariant Pitch and Time-scale Modification for Concatenative Speech Synthesis M.P. Pollard, B.M.G. Cheetham, C.C. Goodyear, M.D. Edgington, A. 
Lowry 1437 An Excitation Synchronous Pitch Waveform Extraction Method and its Application to the VCV-concatenation Synthesis of Japanese Spoken Words Yasuhiko Arai, Ryo Mochizuki, Hirofumi Nishimura, Takashi Honda 1441 A New Chinese Text-to-Speech System with High Naturalness Ren-Hua Wang, Qinfeng Liu, Difei Tang 1445 Voice Conversion Based on Topological Feature Maps and Time-variant Filtering Ansgar Rinscheid SaA1P2 -- Instructional Technology for Spoken Language 1449 Language Training System Utilizing Speech Modification Meron Yoram, Keikichi Hirose 1453 Perception of English /r/ and /l/ Speech Contrasts by Native Korean Listeners with Extensive English-language Experience D.G. Jamieson, K. Yu 1457 Automatic Text-independent Pronunciation Scoring of Foreign Language Student Speech Leonardo Neumeyer, Horacio Franco, Mitchel Weintraub, Patti Price 1461 Assessing the Contribution of Instructional Technology in the Teaching of Pronunciation Antônio Simoes 1465 Detection of Foreign Speakers' Pronunciation Errors for Second Language Training - Preliminary Results Maxine Eskenazi 1469 Foreign Accent in Intonation Patterns - A Contrastive Study Applying a Quantitative Model of the F0 Contour Hansjörg Mixdorff 1473 Input Modality Effects in Foreign Accent Duncan J. Markham, Yasuko Nagano-Madsen SaA1S1 -- Multimodal Spoken Language Processing I 1477 For Speech Perception by Humans or Machines, Three Senses are Better than One Lynne E. Bernstein, Christian Benoît 1481 A Few Factors Which Affect the Degree of Incorporating Lip-read Information into Speech Perception Kaoru Sekiyama, Yoh'ichi Tohkura, Michio Umeda 1485 Characterizing Audiovisual Information During Speech E. Vatikiotis-Bateson, K.G. Munhall, Y. Kasahara, F. Garcia, H. Yehia 1489 The Implications of the Tadoma Method of Speechreading for Spoken Language Processing Charlotte M. 
Reed 1493 Seeing Speech in Space and Time: Psychological and Neurological Findings Ruth Campbell SaA2L1 -- Prosody - Phonological/Phonetic Measures 1497 What's in the "Pure" Prosody? Volker Strom, Christina Widera 1501 F0 Declination in Read-aloud and Spontaneous Speech Marc Swerts, Eva Strangert, Mattias Heldner 1505 Prediction of Prosodic Phrase Boundaries Considering Variable Speaking Rate Yeon-jun Kim, Yung-hwan Oh 1509 Prediction of F0 Parameter of Contextualized Utterances in Dialogue Yoichi Yamashita, Riichiro Mizoguchi 1513 The Production and Perception of Potentially Ambiguous Intonation Contours by Speakers of Russian and Japanese V. Makarova, J. Matsui 1517 What is Invariant and What is Optional in the Realization of a FOCUSED Word? A Cross-dialectal Study of Swedish Sentences With Moving Focus Robert Eklund SaA2L2 -- Phonetics and Perception 1521 Quantifying Spectral Characteristics of Fricatives Christine H. Shadle, Sheila J. Mair 1525 Acoustic Characteristics of Ejectives in Ingush Natasha Warner 1529 An Acoustic Profile of Consonant Reduction R.J.J.H. van Son, Louis C. W. Pols 1533 Devoicing in Post-vocalic Canadian-French Obstruants Danièle Archambault, Blagovesta Maneva 1537 Paying Attention to Speaking Rate Alexander L. Francis, Howard C. Nusbaum 1541 The Lack of Invariance Problem and the Goal of Speech Perception Irene Appelbaum SaA2L3 -- Language Acquisition 1545 The Acoustic Structure of Vowels in Mothers' Speech to Infants and Adults Jean E. Andruski, Patricia K. Kuhl 1549 Acoustical Characteristics of Sound Production of Deaf and Normally Hearing Infants Chris J. Clement, Florien J. Koopmans-van Beinum, Louis C. W. Pols 1553 Learning Non-native Vowel Categories John Kingston, Christine Bartels, José Benkí, Deanna Moore, Jeremy Rice, Rachel Thorburn, Neil Macmillan 1557 Word Recognition by Japanese Infants P.A. Halle, Toshisada Deguchi, Yuji Tamekawa, B. 
Boysson-Bardies, Shigeru Kiritani 1561 Investigations of the Word Segmentation Abilities of Infants Peter W. Jusczyk 1565 Developmental Change in Perception of Clause Boundaries by 6- and 10-Month-old Japanese Infants Akiko Hayashi, Yuji Tamekawa, Toshisada Deguchi, Shigeru Kiritani SaA2P1 -- Production and Prosody Posters 1569 A Frequency Domain Method for Parametrization of the Voice Source Paavo Alku, Erkki Vilkman 1573 Glottal Correlates of the Word Stress and the Tense/Lax Opposition in German Krzysztof Marasek 1577 Coarticulatory Stability in American English /r/ Suzanne Boyce, Carol Y. Espy-Wilson 1581 An MRI-based Analysis of the English /r/ and /l/ Articulations Shinobu Masaki, Reiko Akahane-Yamada, Mark K. Tiede, Yasuhiro Shimada, Ichiro Fujimoto 1585 Does Lexical Stress or Metrical Stress Better Predict Word Boundaries in Dutch? David van Kuijk 1589 Optopalatograph (OPG): A New Apparatus for Speech Production Analysis A. A. Wrench, A. D. McIntosh, W. J. Hardcastle 1593 Prediction of Vowel Systems using a Deductive Approach René Carré 1597 Distinctions Between [t] and [tch] using Electropalatography Data Sheila J. Mair, Celia Scully, Christine H. Shadle 1601 Relating Formants and Articulation in Intelligibility Test Words Michiko Hashi, Raymond D. Kent, John R. Westbury, Mary J. Lindstrom 1605 The Role of Coarticulation in the Perception of Vowel Quality in Modern Standard Arabic Imad Znagui, Mohamed Yeou 1609 Updating the Reading EPG Simon Arnfield, Wilf Jones 1612 Lexical Stress Detection on Stress-minimal Word Pairs Goangshiuan S. Ying, Leah H. Jamieson, Ruxin Chen, Carl D. 
Mitchell 1616 An Acoustic Study of the Interaction Between Stressed and Unstressed Syllables in Spoken Mandarin Jing Wang 1620 Automatic Detection of Accent Nuclei at the Head of Words for Speech Recognition Nobuaki Minematsu, Seiichi Nakagawa 1624 Automatic Generation of Prosodic Structure for High Quality Mandarin Speech Synthesis Fu-chiang Chou, Chiu-yu Tseng, Lin-shan Lee 1628 A Study on Japanese Prosodic Pattern and its Modeling in Restricted Speech Tomoki Hamagami, Ken-ichi Magata, Mitsuo Komura 1632 A Phonetic Study of Focus in Intransitive Verb Sentences Steve Hoskins * Variation in Vocal Fold Vibration Associated with Prosodic Conditions Shigeru Kiritani, Hiroshi Imagawa, Seiji Niimi 1636 Goethe for Prosody Stefan Rapp 1640 Prosodic Cues in Syntactically Ambiguous Strings; An Interactive Speech Planning Mechanism K.A. Straub 1644 A Functional Model for Generation of the Local Components of F0 Contours in Chinese Jinfu Ni, Ren-Hua Wang, Deyu Xia 1648 The Acquisition of Voiceless Stops in the Interlanguage of Second Language Learners of English and Spanish Marie Fellbaum SaA2S1 -- Multimodal Spoken Language Processing II 1652 Studies of the McGurk Effect: Implications for Theories of Speech Perception Kerry P. Green 1656 Using the Visual Component in Automatic Speech Recognition N. M. Brooke 1660 Perceptual Organization of Speech in One and Several Modalities: Common Functions, Common Resources Robert E. Remez 1664 Multi-modal Encoding of Speech in Memory: A First Report David B. Pisoni, Helena M. Saldaña, Sonya M. Sheffert SaP1L1 -- User-Machine Interfaces 1668 Evaluating Automatic Speech Recognition as a Component of a Multi-input Device Human-computer Interface B.A. Mellor, C. Baber, C. Tunley 1672 Data Collection for the MASK Kiosk: WOz vs Prototype System A. Life, I. Salter, J.N. Temem, F. Bernard, S. Rosset, S.K. Bennacef, Lori Lamel 1676 An Experimental Japanese/English Interpreting Video Phone System M. Karaorman, T.H. Applebaum, T. Itoh, M. Endo, Y. 
Ohno, M. Hoshimi, T. Kamai, K. Matsui, K. Hata, S. Pearson, J.-C. Junqua 1680 User Participation and Compliance in Speech Automated Telecommunications Applications Sara Basson, Stephen Springer, Cynthia Fong, Hong Leung, Ed Man, Michele Olson, John Pitrelli, Ranvir Singh, Suk Wong 1684 Embedding Speech in Web Interfaces Samuel Bayer 1688 Voice-activated Home Banking System and its Field Trial Toshihiro Isobe, Masatoshi Morishima, Fuminori Yoshitani, Nobuo Koizumi, Ken'ya Murakami SaP1L2 -- TTS Systems and Rules 1692 A Text Analyzer for Korean Text-to-Speech Systems Sangho Lee, Yung-Hwan Oh 1696 Design and Evaluation of a Phonological Phrase Parser for Spanish Text-to-Speech Helen E. Karn 1700 Comparison of Two Tree-Structured Approaches for Grapheme-to-Phoneme Conversion Ove Andersen, Roland Kuhn, Ariane Lazaridès, Paul Dalsgaard, Jürgen Haas, Elmar Nöth 1704 A Recurrent Network that Learns to Pronounce English Text M.J. Adamson, R.I. Damper 1708 Archisegment-based Letter-to-Phone Conversion for Concatenative Speech Synthesis in Portuguese Eleonora Cavalcante Albano, Agnaldo Antonio Moreira 1712 A New Method of Generating Speech Synthesis Units Based on Phonological Knowledge and Clustering Technique Yuki Yoshida, Shin'ya Nakajima, Kazuo Hakoda, Tomohisa Hirokawa SaP1L3 -- Prosody and Labeling 1716 Consistency in Transcription and Labelling of German Intonation with GToBI Martine Grice, Matthias Reyelt, Ralf Benzmüller, Jörg Mayer, Anton Batliner 1720 Syntactic-prosodic Labeling of Large Spontaneous Speech Data-bases Anton Batliner, R. Kompe, A. Kiessling, H. Niemann, E. Nöth 1724 Relationship Between Discourse Structure and Dynamic Speech Rate Florien J. Koopmans-van Beinum, Monique E. van Donzel 1728 Using Prosodic Clues to Decide When to Produce Back-channel Utterances Nigel Ward 1732 Dialog Act Classification with the Help of Prosody Marion Mast, Ralf Kompe, Stefan Harbeck, Andreas Kiessling, Heinrich Niemann, Elmar Nöth, E. G. Schukat-Talamazzini, V. 
Warnke 1736 Using Lexical Stress in Continuous Speech Recognition for Dutch David van Kuijk, Henk van den Heuvel, Louis Boves SaP1P1 -- Speaker/Language Identification and Verification 1740 Automatic Accent Classification of Foreign Accented Australian English Speech Karsten Kumpf, Robin W. King 1744 Discriminative Adaptation for Speaker Verification F. Korkmazskiy, Biing-Hwang Juang 1748 Perceptual Features of Unknown Foreign Languages as Revealed by Multi-dimensional Scaling V. Stockmal, D. Muljani, Z.S. Bond 1752 On-line Incremental Adaptation for Speaker Verification using Maximum Likelihood Estimates of CDHMM Parameters Kin Yu, John S. Mason 1756 Combining Methods to Improve Speaker Verification Decision Dominique Genoud, Frédéric Bimbot, Guillaume Gravier, Gérard Chollet 1760 Incremental Speaker Adaptation with Minimum Error Discriminative Training for Speaker Identification Cesar Martín del Alamo, J. Alvarez, C. de la Torre, F.J. Poyatos, L. Hernández 1764 Frame Level Likelihood Normalization for Text-independent Speaker Identification using Gaussian Mixture Models Konstantin P. Markov, Seiichi Nakagawa 1768 On Using Prosodic Cues in Automatic Language Identification Ann E. Thymé-Gobbel, Sandra E. Hutchins 1772 Speaker Recognition Model using Two-dimensional Mel-Cepstrum and Predictive Neural Network Tadashi Kitamura, Shinsai Takei 1776 Unknown Language Rejection in Language Identification System Hingkeung Kwan, Keikichi Hirose 1780 Spoken Language Identification using Large Vocabulary Speech Recognition James L. Hieronymus, Shubha Kadambe 1784 Accent Identification Carlos Teixeira, Isabel M. Trancoso, António Serralheiro 1788 Comparison of Text-independent Speaker Recognition Methods on Telephone Speech with Acoustic Mismatch Sarel van Vuuren 1792 On the Sources of Inter- and Intra-speaker Variability in the Acoustic Dynamics of Speech Xue Yang, J. Bruce Millar, Iain Macleod 1796 Language Identification with Inaccurate String Matching Kay M. 
Berkling, Etienne Barnard 1800 Robust Prosodic Features for Speaker Identification M.J. Carey, E.S. Parris, H. Lloyd-Thomas, S.J. Bennett 1804 Text Independent Speaker Identification on Noisy Environments by Means of Self Organizing Maps E. Monte, J. Hernando, X. Miró, A. Adolf 1808 Language-identification Using Language-dependent Phonemes and Language-independent Speech Units Paul Dalsgaard, Ove Andersen, Hanne Hesselager, Bojan Petek SaP1S1 -- Large Vocabulary Speech Recognition: The Switchboard Domain * Large Vocabulary Speech Recognition: The Switchboard Domain Ronald Rosenfeld, Hervé Bourlard SaP1S2 -- Emotion in Recognition and Synthesis I * Adding the Affective Dimension: A New Look in Speech Analysis and Synthesis Klaus R. Scherer 1812 Ethological Theory and the Expression of Emotion in the Voice John J. Ohala 1816 Synthesizing Emotions in Speech: Is it Time to Get Excited? Iain R. Murray, John L. Arnott SaP2L1 -- Stochastic Techniques in Robust Speech Recognition 1820 A Study on Task-independent Subword Selection and Modeling for Speech Recognition Chin-Hui Lee, Biing-Hwang Juang, Wu Chou, J.J. Molina-Perez 1824 Simultaneous ANN Feature and HMM Recognizer Design using String-based Minimum Classification Error (MCE) Training Mazin G. Rahim, Chin-Hui Lee 1828 Quantizing Mixture-weights in a Tied-mixture HMM Sunil K. Gupta, Frank K. Soong, Raziel Haimi-Cohen 1832 Variance Compensation within the MLLR Framework for Robust Speech Recognition and Speaker Adaptation M.J.F. Gales, D. Pye, P.C. Woodland 1836 Maximum-likelihood Stochastic Matching Approach to Non-linear Equalization for Robust Speech Recognition A.C. Surendran, Chin-Hui Lee, Mazin G. Rahim 1840 Estimation of Channel Bias for Telephone Speech Recognition Jen-Tzung Chien, Hsiao-Chuan Wang, Lee-Min Lee SaP2L2 -- Prosodic Synthesis in Text to Speech 1844 Synthesis of English Intonation using Explicit Models of Reading and Spontaneous Speech M. E. 
Johnson 1848 Implementation and Evaluation of a Model for Synthesis of Swedish Intonation Merle Horne, Marcus Filipsson 1852 Natural Prosody Generation for Domain Specific Text-to-Speech Systems Nobuyuki Katae, Shinta Kimura 1856 Improving Text-to-Speech Synthesis Mark Tatham, Eric Lewis 1860 Synthesis of Stressed Speech from Isolated Neutral Speech Using HMM-based Models Sahar E. Bou-Ghazale, John H.L. Hansen 1864 Modeling Segment Intonation for Slovene TTS System Ales Dobnikar SaP2L3 -- Dialogue Events 1868 Word Predictability After Hesitations: A Corpus-based Study Elizabeth Shriberg, Andreas Stolcke 1872 Interruptions and Intonation Li-chiung Yang 1876 On not Recognizing Disfluencies in Dialogue Robin J. Lickley, Ellen Gurman Bard 1880 A Theory of Word Frequencies and its Application to Dialogue Move Recognition Phil Garner, Sue Browning, Roger Moore, Martin Russell 1884 Utterance Units and Grounding in Spoken Dialogue David R. Traum, Peter A. Heeman 1888 Coordinating Turn-taking with Gaze David G. Novick, Brian Hansen, Karen Ward SaP2P1 -- Databases and Tools 1892 BABEL: An Eastern European Multi-language Database Peter Roach, Simon Arnfield, W. Barry, J. Baltova, M. Boldea, A. Fourcin, W. Gonet, R. Gubrynowicz, E. Hallum, L. Lamel, K. Marasek, A. Marchal, E. Meister, K. Vicsi 1894 USTC95---A Putonghua Corpus Ren-Hua Wang, Deyu Xia, Jinfu Ni, Bicheng Liu 1898 Telephone Data Collection using the World Wide Web Edward Hurley, Joseph Polifroni, James Glass 1902 The "SIVA" Speech Database for Speaker Verification: Description and Evaluation M. Falcone, A. Gallo 1906 A Multi-level Description of Date Expressions in German Telephone Speech Christoph Draxler 1910 Viterbi Search Visualization Using Vista: A Generic Performance Visualization Tool Robert H. 
Halstead Jr., Ben Serridge, Jean-Manuel Van Thong, William Goldenthal 1914 A Multilingual Phonetic Representation and Analysis System for Different Speech Databases Toomas Altosaar, Matti Karjalainen, Martti Vainio 1918 FRESCO: The French Telephone Speech Data Collection - Part of the European SpeechDat(M) Project D. Langmann, R. Haeb-Umbach, Louis Boves, E. den Os 1922 Predicting the Out-of-Vocabulary Rate and the Required Vocabulary Size for Speech Processing Applications Johannes Müller, Holger Stahl, Manfred Lang 1926 AMULET: Automatic MUltisensor Speech Labelling and Event Tracking: Study of the Spatio-temporal Correlations in Voiceless Plosive Production Nathalie Parlangeau, Alain Marchal 1930 Constructing Multi-level Speech Database for Spontaneous Speech Processing Minsoo Hahn, Sanghun Kim, Jung-Chul Lee, Yong-Ju Lee 1934 Preliminaries to a Romanian Speech Database Marian Boldea, Alin Doroga, Tiberiu Dumitrescu, Maria Pescaru 1938 Labelled Data Bank of Spoken Standard German: The Kiel Corpus of Read/Spontaneous Speech Klaus J. Kohler 1942 SAPPHIRE: An Extensible Speech Analysis and Recognition Tool Based on Tcl/Tk Lee Hetherington, Michael McCandless 1946 Automatic Detection of Topic Boundaries and Keywords in Arbitrary Speech Using Incremental Reference Interval-free Continuous DP Jiro Kiyama, Yoshiaki Itoh, Ryuichi Oka 1950 Very-large-vocabulary Mandarin Voice Message File Retrieval using Speech Queries Bo-Ren Bai, Lee-Feng Chien, Lin-Shan Lee 1954 Gandalf - A Swedish Telephone Speaker Verification Database H. Melin 1958 The DCIEM Map Task Corpus: Spontaneous Dialogue Under Sleep Deprivation and Drug Treatment Ellen Gurman Bard, C. Sotillo, A. H. Anderson, M. M. Taylor 1962 The Nemours Database of Dysarthric Speech Xavier Menéndez-Pidal, James B. Polikoff, Shirley M. Peters, Jennie E. Leonzio, H.T. 
Bunnell 1966 POST: Parallel Object-oriented Speech Toolkit Jean Hennebert, Dijana Petrovska Delacrétaz SaP2S2 -- Emotion in Recognition and Synthesis II 1970 Recognizing Emotion in Speech Frank Dellaert, Thomas Polzin, Alex Waibel 1974 Emotions in Time Domain Synthesis Barbara Heuft, Thomas Portele, Monika Rauth 1978 Word Class Driven Synthesis of Prosodic Annotations Simon Arnfield 1981 Dynamical Modelling of Vowel Sounds as a Synthesis Tool M. Banbrook, S. McLaughlin 1985 Emotional Speech Elicited using Computer Games Tom Johnstone 1989 Automatic Statistical Analysis of the Signal and Prosodic Signs of Emotion in Speech Roddy Cowie, Ellen Douglas-Cowie Volume 4 SuA1L1 -- Robust Speech Processing 1993 Channel and Noise Normalization Using Affine Transformed Cepstrum Xiaoyu Zhang, Richard J. Mammone 1997 Spectral Estimation and Normalisation for Robust Speech Recognition Tom Claes, Fei Xie, Dirk Van Compernolle 2001 Trellis Encoded Vector Quantization for Robust Speech Recognition Wu Chou, Nambi Seshadri, Mazin Rahim 2005 Phone Clustering using the Bhattacharyya Distance Brian Mak, Etienne Barnard 2009 Variability of Lombard Effects Under Different Noise Conditions Atsushi Wakao, Kazuya Takeda, Fumitada Itakura 2013 Lombard Effect Compensation and Noise Suppression for Noisy Lombard Speech Recognition Sang-mun Chi, Yung-Hwan Oh SuA1L2 -- Dialects and Speaking Styles 2017 The Use of Shibboleth Words for Automatically Classifying Speakers by Dialect A.W.F. Huggins, Yogen Patel * The Organization of Dialect Diversity in North America William Labov 2021 Data Collection of Japanese Dialects and its Influence into Speech Recognition Ikuo Kudo, Takao Nakama, Tomoko Watanabe, Reiko Kameyama 2025 Statistical Dialect Classification Based on Mean Phonetic Features David R. 
Miller, James Trischitta 2028 Norwegian Numerals: a Challenge to Automatic Speech Recognition Knut Kvale 2032 Evaluation of the Telefónica I+D Natural Numbers Recognizer over Different Dialects of Spanish from Spain and America C. de la Torre, J. Caminero-Gil, J. Alvarez, C. Martín del Alamo, L. Hernández-Gómez SuA1L3 -- Production and Perception of Prosody 2036 Rhythmic Constraints on English Stress Timing Fred Cummins, Robert F. Port 2040 On the Interaction of Clash, Focus and Phonological Phrasing Irene Vogel, Steve Hoskins 2044 On the Quantal Nature of Speech Timing Gunnar Fant, Anita Kruckenberg 2048 Differential Perception of Tonal Contours Through the Syllable David House 2052 Pitch, Loudness, and Segmental Duration Correlates: Towards a Model for the Phonetic Aspects of Finnish Prosody Martti Vainio, Toomas Altosaar 2056 Prosodic Manipulation System of Speech Material for Perceptual Experiments Nobuaki Minematsu, Seiichi Nakagawa, Keikichi Hirose SuA1P1 -- Topics in ASR and Search 2060 Clustered Language Models with Context-Equivalent States J.P. Ueberla, I. R. Gransden 2063 Modeling of Contextual Effects and its Application to Word Spotting Yuji Yonezawa, Masato Akagi 2067 A New Keyword Spotting Algorithm with Pre-calculated Optimal Thresholds J. Junkawitsch, L. Neubauer, H. Höge, G. 
Ruske 2071 Detection of Ambiguous Portions of Signal Corresponding to OOV Words or Misrecognized Portions of Input Roxane Lacouture, Yves Normandin 2075 Techniques for Approximating a Trigram Language Model Fabio Brugnara, Marcello Federico 2079 Unsupervised and Incremental Speaker Adaptation under Adverse Environmental Conditions Keizaburo Takagi, Koichi Shinoda, Hiroaki Hattori, Takao Watanabe 2083 An Adaptive-Beam Pruning Technique for Continuous Speech Recognition Hugo Van hamme, Filip Van Aelten 2087 Data Based Filter Design for RASTA-like Channel Normalization in ASR Carlos Avendano, Sarel van Vuuren, Hynek Hermansky 2091 A Comparison of Time Conditioned and Word Conditioned Search Techniques for Large Vocabulary Speech Recognition S. Ortmanns, H. Ney, Frank Seide, I. Lindam 2095 Language-model Look-ahead for Large Vocabulary Speech Recognition S. Ortmanns, H. Ney, A. Eiden 2099 A New Search Algorithm in Segmentation Lattices of Speech Signals Jean-Luc Husson, Yves Laprie 2103 LR-Parser-driven Viterbi Search with Hypotheses Merging Mechanism Using Context-dependent Phone Models Tomokazu Yamada, Shigeki Sagayama 2107 Discrete-Utterance Recognition with a Fast Match Based on Total Data Reduction Jan Nouza 2111 On-line Garbage Modeling with Discriminant Analysis for Utterance Verification J. Caminero, C. de la Torre, L. Villarrubia, C. Martín, L. Hernández 2115 Cheating with Imperfect Transcripts Paul Placeway, John Lafferty 2119 Novel Training Method for Classifiers used in Speaker Adaptation Naoto Iwahashi 2123 Large Vocabulary Word Recognition based on a Graph-structured Dictionary Katsuki Minamino 2127 A Word Graph Based N-Best Search in Continuous Speech Recognition Bach-Hiep Tran, Frank Seide, Volker Steinbiss 2131 Viterbi Beam Search with Layered Bigrams David M. 
Goblirsch 2135 A Wave Decoder for Continuous Speech Recognition Eric Buhrke, Wu Chou, Qiru Zhou 2139 Long Term On-line Speaker Adaptation for Large Vocabulary Dictation Eric Thelen 2143 Incremental Generation of Word Graphs Gerhard Sagerer, Heike Rautenstrauch, G. A. Fink, Bernd Hildebrandt, A. Jusek, Franz Kummert 2147 Improvement in N-Best Search for Continuous Speech Recognition Irina Illina, Yifan Gong 2151 Sethos: The UPC Speech Understanding System Antonio Bonafonte, José B. Mariño, Albino Nogueiras 2155 Segmental Search for Continuous Speech Recognition Pietro Laface, Luciano Fissore, A. Maro, Franco Ravera SuA1P2 -- Multimodal Dialogue/HCI 2159 An Investigation into the Generation of Mouth Shapes for a Talking Head A. P. Breen, E. Bowers, W. Welsh 2163 A Text-to-audiovisual-speech Synthesizer for French Bertrand Le Goff, Christian Benoît 2167 Analysis of Head Movements and its Role in Spoken Dialogue Yuri Iwano, Shioya Kageyama, Emi Morikawa, Shu Nakazato, Katsuhiko Shirai 2171 RWC Multimodal Database for Interactions by Integration of Spoken Language and Visual Information Satoru Hayamizu, Osamu Hasegawa, Katunobu Itou, Katuhiko Sakaue, Kazuyo Tanaka, Shigeki Nagaya, Masayuki Nakazawa, T. Endoh, Fumio Togawa, Kenji Sakamoto, Kazuhiko Yamamoto 2175 About the Relationship Between Eyebrow Movements and F0 Variations Christian Cavé, Isabelle Guaïtella, Roxane Bertrand, Serge Santi, Françoise Harlay, Robert Espesser 2179 How Many Words is a Picture Really Worth? Laurel Fais, Kyung-ho Loken-Kim, Tsuyoshi Morimoto 2183 Visual Synthesis of Source Acoustic Speech Through Kohonen Neural Networks A. Laganà, F. Lavagetto, A. Storace 2187 Audio-visual Speech Perception Without Speech Cues Helena M. Saldaña, David B. Pisoni, Jennifer M. Fellowes, Robert E. Remez SuA1S1 -- Multilingual Speech Processing I 2191 Multilingual Speech Recognition at Dragon Systems Jim Barnett, A. Corrada, G. Gao, L. Gillick, Y. Ito, S. Lowe, L. Manganaro, B.
Peskin 2195 Multi-lingual Phoneme Recognition Exploiting Acoustic-phonetic Similarities of Sounds Joachim Köhler 2199 Japanese Speech Databases for Robust Speech Recognition Atsushi Nakamura, Shoichi Matsunaga, Tohru Shimizu, Masahiro Tonomura, Yoshinori Sagisaka 2203 Spoken Language Processing in a Multilingual Context Lori F. Lamel, M. Adda-Decker, Jean Luc Gauvain, G. Adda 2207 Multilingual Human-computer Interactions: From Information Access to Language Learning Victor Zue, Stephanie Seneff, Joseph Polifroni, Helen Meng, James Glass 2211 SpeeData: Multilingual Spoken Data Entry U. Ackermann, B. Angelini, F. Brugnara, M. Federico, D. Giuliani, R. Gretter, G. Lazzari, H. Niemann SuA2L1 -- Acoustics in Synthesis 2215 Pseudo-articulatory Representations in Speech Synthesis and Recognition William H. Edmondson, Jon P. Iles, Dorota J. Iskra 2219 Synthesis of Initial (/s/-) Stop-liquid Clusters using HLsyn David R. Williams 2223 Synthesis of Trill Chilin Shih 2227 Phone-based Speech Synthesis with Neural Network and Articulatory Control W.K. Lo, P.C. Ching 2231 Analysis of Ten Vowel Sounds Across Gender and Regional/Cultural Accent P. Martland, S.P. Whiteside, Steve W. Beet, L. Baghai-Ravary 2235 Speech Morphing by Gradually Changing Spectrum Parameter and Fundamental Frequency Masanobu Abe SuA2L2 -- Pitch and Rate 2239 The Multi-Lag-Window Method for Robust Extended-range F0 Determination Edouard Geoffrois 2243 Nonlinear Estimation of DEGG Signals with Applications to Speech Pitch Detection Kenneth E. Barner 2247 Pitch Analysis Methods for Cross-Speaker Comparison John A. Maidment, M. Luisa Garcia-Lecumberri 2250 Continuous Adaptation of Linear Models with Impulsive Excitation Steve W. Beet, L. Baghai-Ravary 2254 Quantitative Analysis of the Local Speech Rate and its Application to Speech Synthesis Sumio Ohno, Masamichi Fukumiya, Hiroya Fujisaki 2258 A Fast and Reliable Rate of Speech Detector Jan P.
Verhasselt, Jean-Pierre Martens SuA2L3 -- Acoustic Modeling II 2262 Context Modeling and Clustering in Continuous Speech Recognition Jean-Claude Junqua, Lorenzo Vassallo 2266 Hierarchical Partition of the Articulatory State Space for Overlapping-feature Based Speech Recognition Li Deng, Jim Jian-Xiong Wu 2270 A Fuzzy Acoustic-phonetic Decoder for Speech Recognition Olivier Oppizzi, David Fournier, Philippe Gilles, Henri Méloni 2274 Syllable-level Desynchronisation of Phonetic Features for Speech Recognition Katrin Kirchhoff 2277 A Probabilistic Framework for Feature-based Speech Recognition James Glass, Jane Chang, Michael McCandless 2281 Modeling Context-dependent Phonetic Units in a Continuous Speech Recognition System for Mandarin Chinese Jim Jian-Xiong Wu, Li Deng, Jacky Chan SuA2P1 -- General ASR Posters 2285 JANUS-II: Towards Spontaneous Spanish Speech Recognition Puming Zhan, Klaus Ries, Marsal Gavaldà, Donna Gates, Alon Lavie, Alex Waibel 2289 Reduced Semi-continuous Models for Large Vocabulary Continuous Speech Recognition in Dutch Kris Demuynck, Jacques Duchateau, Dirk Van Compernolle 2293 Validating Different Flexible Vocabulary Approaches on the Swiss French PolyPhone and PolyVar Databases Andrei Constantinescu, Olivier Bornet, Gilles Caloz, Gérard Chollet 2297 Use of a Reliability Coefficient in Noise Cancelling by Neural Net and Weighted Matching Algorithms Nestor Becérra Yoma, Fergus R. McInnes, Mervyn A. 
Jack 2301 Likelihood Normalization Using an Ergodic HMM for Continuous Speech Recognition Kazuhiko Ozeki 2305 Dynamic Control of a Production Model Laurence Candille, Henri Méloni 2309 Speech Recognition Using Sub-word Units Dependent On Phonetic Contexts Of Both Training and Recognition Vocabularies Hiroaki Hattori, Eiko Yamada 2313 Hidden Markov Models Merging Acoustic and Articulatory Information to Automatic Speech Recognition Bruno Jacob, Christine Senac 2316 Creation of Unseen Triphones from Diphones and Monophones using a Speech Production Approach Mats Blomberg, Kjell Elenius 2320 Speaker-independent Dictation of Chinese Speech with 32K Vocabulary Bo Xu, Bing Ma, Shuwu Zhang, Fei Qu, Taiyi Huang 2324 Using Accent-specific Pronunciation Modelling for Robust Speech Recognition J.J. Humphries, P.C. Woodland, D. Pearce 2328 Dictionary Learning for Spontaneous Speech Recognition Tilo Sloboda, Alex Waibel 2332 Comparison of Channel Normalisation Techniques for Automatic Speech Recognition Over the Phone Johan de Veth, Louis Boves 2336 Anchor Point Detection for Continuous Speech Recognition in Spanish: The Spotting of Phonetic Events Manuel A. Leandro, Jose M. Pardo 2340 Cepstral Compensation by Polynomial Approximation for Environment-independent Speech Recognition Bhiksha Raj, Evandro B. Gouvêa, Pedro J. Moreno, Richard M. Stern 2344 Effect of Speech Coders on Speech Recognition Performance B.T. Lilly, K.K. Paliwal 2348 Wavelet Transforms For Non-uniform Speech Recognition Systems Léonard Janer, Josep Martí, Climent Nadeu, Eduardo Lleida-Solano 2352 A Binaural Model as a Front-end for Isolated Word Recognition Tsuyoshi Usagawa, Markus Bodden, Klaus Rateitschek 2356 A New Speech Enhancement: Speech Stream Segregation Hiroshi G.
Okuno, Tomohiro Nakatani, Takeshi Kawabata SuA2S1 -- Multilingual Speech Processing II 2360 Head Automata for Speech Translation Hiyan Alshawi 2364 Word Clustering with Parallel Spoken Language Corpora Ye-Yi Wang, John Lafferty, Alex Waibel 2368 Toward Translating Korean Speech Into Other Languages Jae-Woo Yang, Youngjik Lee 2371 VERBMOBIL: The Evolution of a Complex Large Speech-to-Speech Translation System Thomas Bub, Johannes Schwinn 2375 Translation of Conversational Speech with JANUS-II Alon Lavie, Alex Waibel, Lori Levin, Donna Gates, Marsal Gavaldà, Torsten Zeppenfeld, Puming Zhan, Oren Glickman SuP1L1 -- Data-based Synthesis 2379 Non-segmental Analysis and Synthesis Based on a Speech Database Andrew Slater, John Coleman 2383 Microsegment Synthesis - Economic Principles in a Low-cost Solution Ralf Benzmüller, William J. Barry 2387 Whistler: A Trainable Text-to-Speech System X.D. Huang, A. Acero, J. Adcock, H.W. Hon, J. Goldsmith, J. Liu, Mike Plumpe 2391 Generation of Multiple Synthesis Inventories by a Bootstrapping Procedure Thomas Portele, Karl-Heinz Stöber, Horst Meyer, Wolfgang Hess 2395 Modeling Segmental Duration in German Text-to-Speech Synthesis Bernd Möbius, Jan P.H. van Santen 2399 Autolabelling Japanese ToBI Nick Campbell SuP1L2 -- Speaker Identification and Verification 2403 General Phrase Speaker Verification Using Sub-word Background Models and Likelihood-ratio Scoring S. Parthasarathy, A.E. Rosenberg 2407 Unknown-Multiple Signal Source Clustering Problem Using Ergodic HMM and Applied to Speaker Classification J. Murakami, M. Sugiyama, H. Watanabe 2411 GMM and ARVM Cooperation and Competition for Text-independent Speaker Recognition on Telephone Speech J.-L. Le Floch, C. Montacié, M.-J. Caraty 2415 Selective use of the Speech Spectrum and a VQGMM Method for Speaker Identification Qiguang Lin, Ea-Ee Jan, ChiWei Che, Dong-Suk Yuk, James L. 
Flanagan 2419 Speaker Verification through Large Vocabulary Continuous Speech Recognition Michael Newman, Larry Gillick, Yoshiko Ito, Don McAllaster, Barbara Peskin 2423 Predictive Neural Networks in Text Independent Speaker Verification: an Evaluation on the SIVA Database Andrea Paoloni, Susanna Ragazzini, G. Ravaioli SuP1L3 -- Acoustic Phonetics 2427 Durational Characteristics of Hindi Consonant Clusters Nisheeth Shrotriya, Rajesh Verma, S.K. Gupta, S.S. Agrawal 2431 The Use of Wavelet Transforms in Phoneme Recognition Beng T. Tan, Minyue Fu, Andrew Spray, Phillip Dermody 2435 Acoustic Properties of Phonemes in Continuous Speech for Different Speaking Rate Hisao Kuwabara 2439 Prosodic Parameterization of Spoken Japanese Based on a Model of the Generation Process of F0 Contours Hiroya Fujisaki, Sumio Ohno 2443 A Logistic Regression Model for Detecting Prominences Arman Maghbouleh 2446 High-quality Prosodic Modification of Speech Signals Beat Pfister SuP1P1 -- Perception of Vowels and Consonants 2450 On the Syllable Structures of Chinese Relating to Speech Recognition Jialu Zhang 2454 Can a Moraic Nasal Occur Word-initially in Japanese? Takashi Otake, Kiyoko Yoneyama 2458 Perceptual Assimilation of American English Vowels by Japanese Listeners W. Strange, Reiko Akahane-Yamada, B.H. Fitzgerald, R. Kubo 2462 Context and Speaker Effects in the Perceptual Assimilation of German Vowels by American Listeners W. Strange, O.-S. Bohn, S. A. Trent, M.C. McNair, K.C. Bielec 2466 Examination of a Perceptual Non-native Speech Contrast: Pharyngealized/Non-pharyngealized Discrimination by French-speaking Adults Mohamed Zahid 2470 Context-dependent Relevance of Burst and Transitions for Perceived Place in Stops: It's in Production, not Perception Roel Smits 2474 The Perception of Morae in Long Vowels Comparison Among Japanese, Korean and English Speakers Ryoji Baba, Kaori Omuro, Hiromitsu Miyazono, Tsuyoshi Usagawa, Masahiko Higuchi 2478 Juncture Cues to Disfluency Robin J.
Lickley 2482 Effects of Duration and Formant Movement on Vowel Perception James R. Sawusch 2486 Benchmarking Human Performance for Continuous Speech Recognition N. Deshmukh, R.J. Duncan, A. Ganapathiraju, J. Picone 2490 Intelligibility of Speech with Filtered Time Trajectories of Spectral Envelopes Takayuki Arai, Misha Pavel, Hynek Hermansky, Carlos Avendano 2494 Perceptual Use of Vowel and Speaker Information in Breath Sounds D. H. Whalen, Sonya M. Sheffert 2498 The Role of Neighborhood Relative Frequency in Spoken Word Recognition Philippe Mousty, Monique Radeau, Ronald Peereman, Paul Bertelson 2502 Transitional Probability and Phoneme Monitoring James M. McQueen, Mark A. Pitt 2506 Identification of Vowel Features from French Stop Bursts Anne Bonneau 2510 Listening in a Second Language Z.S. Bond, Thomas J. Moore, Beverley Gable 2514 Perception of Lexical Tone Across Languages: Evidence for a Linguistic Mode of Processing Denis Burnham, Elizabeth Francis, Di Webster, Sudaporn Luksaneeyanawin, Chayada Attapaiboon, Francisco Lacerda, Peter Keller 2518 Acoustic Correlates to the Effects of Talker Variability on the Perception of English /r/ and /l/ by Japanese Listeners James S. Magnuson, Reiko Akahane-Yamada SuP2LP -- Closing Ceremony and Plenary Lecture 2522 Natural Communication with Machines - Progress and Challenge James L. Flanagan * Unavailable at time of printing