Volume 1 Contents



ThA1LP -- Opening Ceremony and Plenary Lecture

Chairs: H. Timothy Bunnell, Alfred I. duPont Institute; and Richard A. Foulds, Alfred I. duPont Institute
  1. The Comparative Study of Spoken-Language Processing Anne Cutler


ThA2L1 -- Large Vocabulary

Chair: Michael D. Riley, AT&T Labs - Research
  1. New Developments in the INRS Continuous Speech Recognition System Z. Li, M. Heon, Douglas O'Shaughnessy
  2. On Designing Pronunciation Lexicons for Large Vocabulary, Continuous Speech Recognition Lori Lamel, Gilles Adda
  3. Word Graph Rescoring Using Confidence Measures Pablo Fetter, Frédéric Dandurand, Peter Regel-Brietzmann
  4. A Bottom-up Approach for Handling Unseen Triphones in Large Vocabulary Continuous Speech Recognition X.L. Aubert, Peter Beyerlein, Meinhard Ullrich
  5. Discriminative Optimisation of Large Vocabulary Recognition Systems V. Valtchev, P.C. Woodland, S. J. Young
  6. Japanese Large-vocabulary Continuous-speech Recognition using a Business-newspaper Corpus Tatsuo Matsuoka, Katsutoshi Ohtsuki, Takeshi Mori, Sadaoki Furui, Katsuhiko Shirai
  7. Handling Compound Nouns in a Swedish Speech-understanding System David Carter, Jaan Kaja, Leonardo Neumeyer, Manny Rayner, Fuliang Weng, Mats Wiren
  8. Initial Evaluation of a Preselection Module for a Flexible Large Vocabulary Speech Recognition System in Telephone Environment J. Macias-Guarasa, A. Gallardo, J. Ferreiros, Jose M. Pardo, L. Villarrubia


ThA2L2 -- Multimodal ASR (Face and Lips)

Chair: Eric Petajan, Bell Labs - Lucent Technologies
  1. Asynchronous Integration of Visual Information in an Automatic Speech Recognition System Mamoun Alissali, Paul Deleglise, Alexandrina Rogozan
  2. Audiovisual Speech Recognition using Multiscale Nonlinear Image Decomposition I.A. Matthews, J. Bangham, S.J. Cox
  3. Robust Audiovisual Integration using Semicontinuous Hidden Markov Models Qin Su, Peter L. Silsbee
  4. The Effect of Visual Information on Word Initial Consonant Perception of Dysarthric Speech Richard P. Schumeyer, Kenneth E. Barner
  5. A Multiple Deformable Template Approach for Visual Speech Recognition Devi Chandramohan, Peter L. Silsbee
  6. Speaker Independent Bimodal Phonetic Recognition Experiments P. Cosi, E. Magno Caldognetto, F. Ferrero, M. Dugatto, K. Vagges
  7. Speechreading using Shape and Intensity Information Juergen Luettin, Neil A. Thacker, Steve W. Beet
  8. Speaker Identification by Lipreading Juergen Luettin, Neil A. Thacker, Steve W. Beet


ThA2L3 -- Perception of Words

Chair: Sharon Manuel, Emerson College and Massachusetts Institute of Technology
  1. How Word Onsets Drive Lexical Access and Segmentation: Evidence from Acoustics, Phonology and Processing David W. Gow Jr., Janis Melvold, Sharon Manuel
  2. RAW: A Real-speech Model for Human Word Recognition David van Kuijk, Peter Wittenburg, Ton Dijkstra
  3. How Facilitatory can Lexical Information Be During Word Recognition? Evidence from Moroccan Arabic Mehdi Meftah, Sami Boudelaa
  4. Effects of Frequency on the Auditory Perception of Open- Versus Closed-class Words Alette P. Haveman
  5. Phonotactic and Metrical Influences on Adult Ratings of Spoken Nonsense Words Michael S. Vitevitch, Paul A. Luce, Jan Charles-Luce, David Kemmerer
  6. Lipreading Supplemented by Voice Fundamental Frequency: To What Extent Does the Addition of Voicing Increase Lexical Uniqueness for the Lipreader? Edward T. Auer Jr., Lynne E. Bernstein
  7. Strategies Used in Rhyme-Monitoring S. te Riele, S.G. Nooteboom, H. Quené
  8. How do Dutch Listeners Process Words with Epenthetic Schwa? Wilma van Donselaar, Cecile Kuijpers, Anne Cutler


ThA2P1 -- Phonetics, Transcription, and Analysis

Chair: Jim Hieronymus, Bell Labs - Lucent Technologies
  1. Whole-word Phonetic Distances and the PGPfone Alphabet Patrick Juola, Philip Zimmermann
  2. Automatic Vowel Quality Description using a Variable Mapping to an Eight Cardinal Vowel Reference Set Shuping Ran, J. Bruce Millar, Phil Rose
  3. Automatic Detection and Segmentation of Pronunciation Variants in German Speech Corpora Andreas Kipp, Maria-Barbara Wesenick, Florian Schiel
  4. ANGIE: A New Framework for Speech Analysis Based on Morpho-phonological Modelling Stephanie Seneff, Raymond Lau, Helen Meng
  5. Perceptual Contrast in the Korean and English Vowel System Normalized Byunggon Yang
  6. On Phonetic Characteristics of Pause in the Korean Read Speech Yong-Ju Lee, Sook-hyang Lee
  7. Cross-Language Effects of Lexical Stress in Word Recognition: The Case of Arabic English Bilinguals Sami Boudelaa, Mehdi Meftah
  8. Automatic Generation of German Pronunciation Variants Maria-Barbara Wesenick
  9. Estimating the Quality of Phonetic Transcriptions and Segmentations of Speech Signals Maria-Barbara Wesenick, Andreas Kipp
  10. An Acoustic Analysis of Contemporary Vowels of the Standard Slovenian Language Bojan Petek, Rastislav Sustarsic, Smiljana Komar
  11. Using Decision Trees to Construct Optimal Acoustic Cues Sandrine Robbe, Anne Bonneau, Sylvie Coste, Yves Laprie
  12. Maximum Jaw Displacement in Contrastive Emphasis Donna Erickson, Osamu Fujimura
  13. Subglottal Pressure and Final Lowering in English Rebecca Herman, Mary Beckman, Kiyoshi Honda
  14. Phonological Variation: Epenthesis and Deletion of Schwa in Dutch Cecile Kuijpers, Wilma van Donselaar, Anne Cutler
  15. Can a Moraic Nasal Occur Word-initially in Japanese? Takashi Otake, Kiyoko Yoneyama


ThA2P2 -- Spoken Language Processing for Special Populations

Chair: Valerie Hazan, University College London
  1. Feedback Considerations for Speech Training Systems James J. Mahshie
  2. Clinical Applications of Computer-Based Speech Training for Children with Hearing Impairment Anne-Marie Öster
  3. Enhancing Information-rich Regions of Natural VCV and Sentence Materials Presented in Noise Valerie Hazan, Andrew Simpson
  4. Speech Perceptual Abilities of Children with Specific Reading Difficulty (Dyslexia) Valerie Hazan, Alan Adlard
  5. Bimodal Perception of Spectrum Compressed Speech Larry D. Paarmann, Michael K. Wynne
  6. Effect of Sentential Context on Syllabic Stress Perception by Hearing-impaired Listeners Dragana Barac-Cikoja, Sally Revoile
  7. Applications of Automatic Speech Recognition to Speech and Language Development in Young Children Martin Russell, Catherine Brown, Adrian Skilling, Rob Series, Julie Wallace, Bill Bohnam, Paul Barker
  8. Sub-band Adaptive Speech Enhancement for Hearing Aids D. R. Campbell
  9. Adapting a TTS System to a Reading Machine for the Blind Thomas Portele, Juergen Kraemer


ThA2S1 -- Dialogue Special Session I

Chairs: James R. Glass, MIT Laboratory for Computer Science; and Yasuhisa Niimi, Kyoto Institute of Technology
  1. Modeling of Spoken Dialogue with and without Visual Information Katsuhiko Shirai
  2. Multimodal Discourse Modelling in a Multi-user Multi-domain Environment Stephanie Seneff, David Goddeau, Christine Pao, Joseph Polifroni
  3. Automatic Acquisition of Probabilistic Dialogue Models Kenji Kita, Yoshikazu Fukui, Masaaki Nagata, Tsuyoshi Morimoto
  4. Units of Dialogue Management: An Example Paul Heisterkamp, Scott McGlashan
  5. Error Resolution During Multimodal Human-computer Interaction Sharon Oviatt, Robert VanGent
  6. Improved Spontaneous Dialogue Recognition Using Dialogue and Utterance Triggers by Adaptive Probability Boosting Ramesh R. Sarukkai, Dana H. Ballard
  7. Speech Recognition for Spontaneously Spoken German Dialogues Kai Hübener, Uwe Jost, Henrik Heine
  8. Using Prosodic Information to Constrain Language Models for Spoken Dialogue Paul Taylor, Hiroshi Shimodaira, Stephen Isard, Simon King, Jacqueline Kowtko


ThP1L1 -- Language Modeling I

Chair: Roberto Pieraccini, AT&T Labs - Research
  1. Combination of Word-based and Category-based Language Models T.R. Niesler, P.C. Woodland
  2. A Multi-level Lexical-semantics Based Language Model Design for Guided Integrated Continuous Speech Recognition Francisco J. Valverde-Albacete, Jose M. Pardo
  3. A Category Based Approach for Recognition of Out-of-Vocabulary Words Florian Gallwitz, Elmar Noeth, Heinrich Niemann
  4. Scalable Backoff Language Models Kristie Seymore, Ronald Rosenfeld
  5. Modeling Long Distance Dependence in Language: Topic Mixtures vs. Dynamic Cache Models R. Iyer, Mari Ostendorf
  6. Bayesian Estimation Methods for N-Gram Language Model Adaptation Marcello Federico


ThP1L2 -- Feature Extraction for Speech Recognition I

Chair: Shubha Kadambe, Atlantic Aerospace Electronics Corp.
  1. Feature Dimension Reduction Using Reduced-Rank Maximum Likelihood Estimation for Hidden Markov Models Don X. Sun
  2. Using Multi-Level Segmentation Coefficients to Improve HMM Speech Recognition Kai Hübener
  3. A Comparative Study of Linear Feature Transformation Techniques for Automatic Speech Recognition T. Eisele, R. Haeb-Umbach, D. Langmann
  4. Inclusion of Temporal Information into Features for Speech Recognition Ben Milner
  5. New Cepstral Representation using Wavelet Analysis and Spectral Transformation for Robust Speech Recognition Hubert Wassner, Gérard Chollet
  6. Wavelet Based Feature Extraction for Phoneme Recognition C.J. Long, S. Datta


ThP1L3 -- Speech Production - Measurement and Modeling

Chair: Terrance M. Nearey, University of Alberta
  1. Extraction of Tongue Contours in X-ray Images with Minimal User Interaction Yves Laprie, Marie-Odile Berger
  2. Three-dimensional Measurement of the Vocal Tract by MRI Didier Demolin, Thierry Metens, Alain Soquet
  3. Syllable Affiliation of Final Consonant Clusters Undergoes a Phase Transition Over Speaking Rates Philip Gleason, Betty Tuller, J. A. Scott Kelso
  4. Towards a Biomechanical Model of the Larynx Arthur Lobo, Michael O'Malley
  5. Effects of Auditory Feedback on F0 Trajectory Generation Hideki Kawahara, Hiroko Kato, J. C. Williams


ThP1P1 -- Speech Coding / HMMs and NNs in ASR

Chair: Jean-Luc Gauvain, LIMSI-CNRS
  1. On the Effects of Accent and Language on Low Rate Speech Coders I. S. Burnett, J. J. Parry
  2. VQ Codevector Index Assignment Using Genetic Algorithms for Noisy Channels J.S. Pan, Fergus R. McInnes, Mervyn A. Jack
  3. An Improved Vector Quantization Algorithm for Speech Transmission Over Noisy Channels Gavin C. Cawley
  4. Very Low Delay and High Quality Coding of 20 Hz-15 kHz Speech Signals at 64 kbit/s C. Murgia, G. Feng, A. Le Guyader, C. Quinquis
  5. Application of Speaker Modification Techniques to Phonetic Vocoding Carlos M. Ribeiro, Isabel M. Trancoso
  6. Entropy Coded Vector Quantization with Hidden Markov Models Tadashi Yonezaki, Kiyohiro Shikano
  7. An Application of Recurrent Neural Networks to Low Bit Rate Speech Coding Minoru Kohata
  8. CELP Coding System Based on Mel-Generalized Cepstral Analysis Kazuhito Koishida, Keiichi Tokuda, Takao Kobayashi, Satoshi Imai
  9. Wideband Re-synthesis of Narrowband CELP-coded Speech Using Multiband Excitation Model Cheung-Fat Chan, Wai-Kwong Hui
  10. Recurrent Neural Networks for Phoneme Recognition Takuya Koizumi, Mikio Mori, Shuji Taniguchi, Mitsutoshi Maruya
  11. A Model for the Acoustic Phonetic Structure of Arabic Language using a Single Ergodic Hidden Markov Model M.A. Mokhtar, A. Zein-el-Abddin
  12. Modelling Long Term Variability Information in Mixture Stochastic Trajectory Framework Yifan Gong, Irina Illina, Jean-Paul Haton
  13. Segmental Phonetic Features Recognition by means of Neural-fuzzy Networks and Integration in an N-best Solutions Post-processing T. Moudenc, R. Sokol, G. Mercier
  14. Stochastic Trajectory Model with State-Mixture for Continuous Speech Recognition Irina Illina, Yifan Gong
  15. Recognition of Spelled Names over the Telephone Hermann Hild, Alex Waibel
  16. Optimal Tying of HMM Mixture Densities using Decision Trees Gilles Boulianne, Patrick Kenny
  17. Speech Recognition Using an Enhanced FVQ Based on a Codeword Dependent Distribution Normalization and Codeword Weighting by Fuzzy Objective Function Hwan Jin Choi, Yung Hwan Oh
  18. Using the Self-Organizing Map to Speed up the Probability Density Estimation for Speech Recognition with Mixture Density HMMs Mikko Kurimo, Panu Somervuo


ThP1S1 -- Dialogue Special Session II

Chairs: Patti Price, SRI International; and Akira Kurematsu, University of Electro-Communications
  1. Combining the Detection and Correction of Speech Repairs Peter A. Heeman, Kyung-ho Loken-Kim, James F. Allen
  2. Generating Spontaneous Elliptical Utterance Yuji Sagawa, Wataru Sugimoto, Noboru Ohnishi
  3. Developing the Modelling of Swedish Prosody in Spontaneous Dialogue Gösta Bruce, Marcus Filipsson, Johan Frid, Björn Granström, Kjell Gustafson, Merle Horne, David House, Birgitta Lastow, Paul Touati
  4. Spoken Language Generation in a Multimedia System Shimei Pan, Kathleen R. McKeown
  5. Synthesizing Dialogue Speech of Japanese Based on the Quantitative Analysis of Prosodic Features Keikichi Hirose, Mayumi Sakata, Hiromichi Kawanami
  6. Spoken Dialogue Interface in a Dual Task Situation Shuichi Tanaka, Shu Nakazato, Keiichiro Hoashi, Katsuhiko Shirai


ThP1S2 -- Neural Models of Speech Processing I

Chair: Eric D. Young, Johns Hopkins University
  1. How is Information About Speech Encoded in the Peripheral Auditory System? Eric D. Young
  2. Spectral Shape Analysis in the Central Auditory System Shihab Shamma


ThP2L1 -- Language Modeling II

Chair: Jerome R. Bellegarda, Apple Computer, Inc.
  1. Modeling Disfluencies in Conversational Speech Man-hung Siu, Mari Ostendorf
  2. Evaluation of a Language Model using a Clustered Model Backoff John Miller, Fil Alleva
  3. Language Modeling Using X-grams Antonio Bonafonte, José B. Mariño
  4. Class Phrase Models For Language Modelling Klaus Ries, Finn Dag Buo, Alex Waibel
  5. Introducing Linguistic Constraints into Statistical Language Modeling Petra Geutner
  6. Language Modeling with Stochastic Automata Jianying Hu, William Turin, Michael K. Brown


ThP2L2 -- Feature Extraction for Speech Recognition II

Chair: Shubha Kadambe, Atlantic Aerospace Electronics Corp.
  1. New Fast Wavelet Packet Transform Algorithms for Frame Synchronized Speech Processing Andrzej Drygajlo
  2. Frequency-Warping in Speech S. Umesh, L. Cohen, N. Marinovic, D. Nelson
  3. Extracting Speech Features from Human Speech-like Noise Daisuke Kobayashi, Shoji Kajita, Kazuya Takeda, Fumitada Itakura
  4. Subband-Crosscorrelation Analysis for Robust Speech Recognition Shoji Kajita, Kazuya Takeda, Fumitada Itakura
  5. A New ASR Approach Based on Independent Processing and Recombination of Partial Frequency Bands Hervé Bourlard, Stéphane Dupont
  6. Frequency and Time Filtering of Filter-bank Energies for HMM Speech Recognition Climent Nadeu, José B. Mariño, Javier Hernando, Albino Nogueiras


ThP2L3 -- Vowels

Chair: John Ohala, University of California, Berkeley
  1. Temporal Cues for Vowels and Universals of Vowel Inventories Carrie E. Lang, John J. Ohala
  2. Acoustic Variability in Spontaneous Conversational Speech of American English Talkers Ann K. Syrdal
  3. Cross-language Speech Perception: Swedish, English, and Spanish Speakers' Perception of Front Rounded Vowels Raquel Willerman, Patricia K. Kuhl
  4. Inter-language Vowel Perception and Production by Korean and Japanese Listeners John C.L. Ingram, See-Gyoon Park
  5. Intelligibility and Acoustic Correlates of Japanese Accented English Vowels Diane Kewley-Port, Reiko Akahane-Yamada, Kiyoaki Aikawa
  6. Segmentation Strategies for Spoken Language Recognition: Evidence from Semi-bilingual Japanese Speakers of English Kiyoko Yoneyama


ThP2P1 -- NNs and Stochastic Modeling

Chair: Wu Chou, Bell Labs - Lucent Technologies
  1. Integrating Connectionist, Statistical and Symbolic Approaches for Continuous Spoken Korean Processing Geunbae Lee, Jong-Hyeok Lee, Kyubong Park, Byung-Chang Kim
  2. Towards ASR on Partially Corrupted Speech Hynek Hermansky, Sangita Timberwala, Misha Pavel
  3. Parametric Trajectory Models for Speech Recognition Herbert Gish, Kenney Ng
  4. Use of Gaussian Selection in Large Vocabulary Continuous Speech Recognition Using HMMs K.M. Knill, M.J.F. Gales, S. J. Young
  5. Cross Phone State Clustering using Lexical Stress and Context J. Hogberg, K. Sjolander
  6. Likelihood Ratio Decoding and Confidence Measures for Continuous Speech Recognition Eduardo Lleida, Richard C. Rose
  7. A Study on Continuous Chinese Speech Recognition Based on Stochastic Trajectory Models Xiaohui Ma, Yifan Gong, Yuqing Fu, Jiren Lu, Jean-Paul Haton
  8. A Proposal for a New Algorithm of Reference Interval-free Continuous DP for Real-time Speech or Text Retrieval Yoshiaki Itoh, Jiro Kiyama, Hiroshi Kojima, Susumu Seki, Ryuichi Oka
  9. Language Modeling by String Pattern N-gram for Japanese Speech Recognition Akinori Ito, Masaki Kohda
  10. Statistical Language Modeling using a Variable Context Length Reinhard Kneser
  11. A Comparison of Hybrid HMM Architectures Using Global Discriminative Training Finn Tore Johansen
  12. Improved Probability Estimation with Neural Network Models Wei Wei, Etienne Barnard, Mark Fanty
  13. A Neural Network Using Acoustic Sub-word Units for Continuous Speech Recognition Ha-Jin Yu, Yung-Hwan Oh
  14. On the Error Criteria in Neural Networks as a Tool for Human Classification Modelling Louis F. M. ten Bosch, Roel Smits
  15. A Non-linear Filtering Approach to Stochastic Training of the Articulatory-acoustic Mapping Using the EM Algorithm Gordon Ramsay
  16. A Tool for Automated Design of Language Models Y.P. Yang, J.R. Deller Jr.
  17. Acoustic-phonetic Decoding Based on Elman Predictive Neural Networks F. Freitag, E. Monte
  18. On Improving Discrimination Capability of an RNN Based Recognizer Tan Lee, P.C. Ching
  19. An Evaluation of Statistical Language Modeling for Speech Recognition using a Mixed Category of Both Words and Parts-of-speech Yumi Wakita, Jun Kawai, Hitoshi Iida


ThP2S1 -- Dialogue Special Session III

Chairs: Paul Dalsgaard, Aalborg University; and Hiroya Fujisaki, Science University of Tokyo
  1. A Dialogue Control Strategy Based on the Reliability of Speech Recognition Yasuhisa Niimi, Yutaka Kobayashi
  2. SpeechWear: A Mobile Speech System Alexander I. Rudnicky, Stephen Reed, Eric H. Thayer
  3. WHEELS: A Conversational System in the Automobile Classifieds Domain Helen Meng, Senis Busayapongchai, James Glass, David Goddeau, Lee Hetherington, Edward Hurley, Christine Pao, Joseph Polifroni, Stephanie Seneff, Victor Zue
  4. Effective Human-computer Cooperative Spoken Dialogue: The AGS Demonstrator M.D. Sadek, A. Ferrieux, A. Cozannet, P. Bretier, F. Panaget, J. Simonin
  5. Dialog in the RAILTEL Telephone-based System S.K. Bennacef, L. Devillers, S. Rosset, Lori Lamel
  6. Dialogue Processing in a Conversational Speech Translation System Alon Lavie, Lori Levin, Yan Qu, Alex Waibel, Donna Gates, Marsal Gavaldà, Laura Mayfield, Maite Taboada


ThP2S2 -- Neural Models of Speech Processing II

Chair: Eric D. Young, Johns Hopkins University
  1. Novel Speech Processing Mechanism Derived from Auditory Neocortical Circuit Analysis Boris Aleksandrovsky, James Whitson, Gretchen Andes, Gary Lynch, Richard Granger
  2. Modeling Neurons in the Anteroventral Cochlear Nucleus for Amplitude Modulation (AM) Processing: Application to Speech Sound Ping Tang, Jean Rouat
  3. Noise Suppression and Loudness Normalization in an Auditory Model-based Acoustic Front-end Halewijn Vereecken, Jean-Pierre Martens
  4. A Psychoacoustic Model for the Noise Masking of Voiceless Plosive Bursts Jim Hant, Brian Strope, Abeer Alwan
  5. Training Machine Classifiers to Match the Performance of Human Listeners in a Natural Vowel Classification Task Martin Hunke, Thomas Holton
  6. A Neural Matrix Model for Active Tracking of Frequency-modulated Tones Kiyoaki Aikawa, Hideki Kawahara, Minoru Tsuzaki