Volume 4 Contents



SuA1L1 -- Robust Speech Processing

Chair: Mazin Rahim, AT&T Labs - Research
  1. Channel and Noise Normalization Using Affine Transformed Cepstrum Xiaoyu Zhang, Richard J. Mammone
  2. Spectral Estimation and Normalisation for Robust Speech Recognition Tom Claes, Fei Xie, Dirk Van Compernolle
  3. Trellis Encoded Vector Quantization for Robust Speech Recognition Wu Chou, Nambi Seshadri, Mazin Rahim
  4. Phone Clustering using the Bhattacharyya Distance Brian Mak, Etienne Barnard
  5. Variability of Lombard Effects Under Different Noise Conditions Atsushi Wakao, Kazuya Takeda, Fumitada Itakura
  6. Lombard Effect Compensation and Noise Suppression for Noisy Lombard Speech Recognition Sang-mun Chi, Yung-Hwan Oh


SuA1L2 -- Dialects and Speaking Styles

Chair: Jim Hieronymus, Bell Labs - Lucent Technologies
  1. The Use of Shibboleth Words for Automatically Classifying Speakers by Dialect A.W.F. Huggins, Yogen Patel
  2. The Organization of Dialect Diversity in North America William Labov
  3. Data Collection of Japanese Dialects and its Influence into Speech Recognition Ikuo Kudo, Takao Nakama, Tomoko Watanabe, Reiko Kameyama
  4. Statistical Dialect Classification Based on Mean Phonetic Features David R. Miller, James Trischitta
  5. Norwegian Numerals: a Challenge to Automatic Speech Recognition Knut Kvale
  6. Evaluation of the Telefónica I+D Natural Numbers Recognizer over Different Dialects of Spanish from Spain and America C. de la Torre, J. Caminero-Gil, J. Alvarez, C. Martín del Alamo, L. Hernández-Gómez


SuA1L3 -- Production and Perception of Prosody

Chair: Gunnar Fant, KTH
  1. Rhythmic Constraints on English Stress Timing Fred Cummins, Robert F. Port
  2. On the Interaction of Clash, Focus and Phonological Phrasing Irene Vogel, Steve Hoskins
  3. On the Quantal Nature of Speech Timing Gunnar Fant, Anita Kruckenberg
  4. Differential Perception of Tonal Contours Through the Syllable David House
  5. Pitch, Loudness, and Segmental Duration Correlates: Towards a Model for the Phonetic Aspects of Finnish Prosody Martti Vainio, Toomas Altosaar
  6. Prosodic Manipulation System of Speech Material for Perceptual Experiments Nobuaki Minematsu, Seiichi Nakagawa, Keikichi Hirose


SuA1P1 -- Topics in ASR and Search

Chair: Enrico Bocchieri, AT&T Labs - Research
  1. Clustered Language Models with Context-Equivalent States J.P. Ueberla, I. R. Gransden
  2. Modeling of Contextual Effects and its Application to Word Spotting Yuji Yonezawa, Masato Akagi
  3. A New Keyword Spotting Algorithm with Pre-calculated Optimal Thresholds J. Junkawitsch, L. Neubauer, H. Höge, G. Ruske
  4. Detection of Ambiguous Portions of Signal Corresponding to OOV Words or Misrecognized Portions of Input Roxane Lacouture, Yves Normandin
  5. Techniques for Approximating a Trigram Language Model Fabio Brugnara, Marcello Federico
  6. Unsupervised and Incremental Speaker Adaptation under Adverse Environmental Conditions Keizaburo Takagi, Koichi Shinoda, Hiroaki Hattori, Takao Watanabe
  7. An Adaptive-Beam Pruning Technique for Continuous Speech Recognition Hugo Van hamme, Filip Van Aelten
  8. Data Based Filter Design for RASTA-like Channel Normalization in ASR Carlos Avendano, Sarel van Vuuren, Hynek Hermansky
  9. A Comparison of Time Conditioned and Word Conditioned Search Techniques for Large Vocabulary Speech Recognition S. Ortmanns, H. Ney, Frank Seide, I. Lindam
  10. Language-model Look-ahead for Large Vocabulary Speech Recognition S. Ortmanns, H. Ney, A. Eiden
  11. A New Search Algorithm in Segmentation Lattices of Speech Signals Jean-Luc Husson, Yves Laprie
  12. LR-Parser-driven Viterbi Search with Hypotheses Merging Mechanism Using Context-dependent Phone Models Tomokazu Yamada, Shigeki Sagayama
  13. Discrete-Utterance Recognition with a Fast Match Based on Total Data Reduction Jan Nouza
  14. On-line Garbage Modeling with Discriminant Analysis for Utterance Verification J. Caminero, C. de la Torre, L. Villarrubia, C. Martín, L. Hernández
  15. Cheating with Imperfect Transcripts Paul Placeway, John Lafferty
  16. Novel Training Method for Classifiers used in Speaker Adaptation Naoto Iwahashi
  17. Large Vocabulary Word Recognition based on a Graph-structured Dictionary Katsuki Minamino
  18. A Word Graph Based N-Best Search in Continuous Speech Recognition Bach-Hiep Tran, Frank Seide, Volker Steinbiss
  19. Viterbi Beam Search with Layered Bigrams David M. Goblirsch
  20. A Wave Decoder for Continuous Speech Recognition Eric Buhrke, Wu Chou, Qiru Zhou
  21. Long Term On-line Speaker Adaptation for Large Vocabulary Dictation Eric Thelen
  22. Incremental Generation of Word Graphs Gerhard Sagerer, Heike Rautenstrauch, G. A. Fink, Bernd Hildebrandt, A. Jusek, Franz Kummert
  23. Improvement in N-Best Search for Continuous Speech Recognition Irina Illina, Yifan Gong
  24. Sethos: The UPC Speech Understanding System Antonio Bonafonte, José B. Mariño, Albino Nogueiras
  25. Segmental Search for Continuous Speech Recognition Pietro Laface, Luciano Fissore, A. Maro, Franco Ravera


SuA1P2 -- Multimodal Dialogue/HCI

Chair: Donald Hindle, AT&T Labs - Research
  1. An Investigation into the Generation of Mouth Shapes for a Talking Head A. P. Breen, E. Bowers, W. Welsh
  2. A Text-to-audiovisual-speech Synthesizer for French Bertrand Le Goff, Christian Benoît
  3. Analysis of Head Movements and its Role in Spoken Dialogue Yuri Iwano, Shioya Kageyama, Emi Morikawa, Shu Nakazato, Katsuhiko Shirai
  4. RWC Multimodal Database for Interactions by Integration of Spoken Language and Visual Information Satoru Hayamizu, Osamu Hasegawa, Katunobu Itou, Katuhiko Sakaue, Kazuyo Tanaka, Shigeki Nagaya, Masayuki Nakazawa, T. Endoh, Fumio Togawa, Kenji Sakamoto, Kazuhiko Yamamoto
  5. About the Relationship Between Eyebrow Movements and F0 Variations Christian Cavé, Isabelle Guaïtella, Roxane Bertrand, Serge Santi, Françoise Harlay, Robert Espesser
  6. How Many Words is a Picture Really Worth? Laurel Fais, Kyung-ho Loken-Kim, Tsuyoshi Morimoto
  7. Visual Synthesis of Source Acoustic Speech Through Kohonen Neural Networks A. Laganà, F. Lavagetto, A. Storace
  8. Audio-visual Speech Perception Without Speech Cues Helena M. Saldaña, David B. Pisoni, Jennifer M. Fellowes, Robert E. Remez


SuA1S1 -- Multilingual Speech Processing I

Chair: Alex Waibel, Carnegie Mellon University
  1. Multilingual Speech Recognition at Dragon Systems Jim Barnett, A. Corrada, G. Gao, L. Gillick, Y. Ito, S. Lowe, L. Manganaro, B. Peskin
  2. Multi-lingual Phoneme Recognition Exploiting Acoustic-phonetic Similarities of Sounds Joachim Köhler
  3. Japanese Speech Databases for Robust Speech Recognition Atsushi Nakamura, Shoichi Matsunaga, Tohru Shimizu, Masahiro Tonomura, Yoshinori Sagisaka
  4. Spoken Language Processing in a Multilingual Context Lori F. Lamel, M. Adda-Decker, Jean Luc Gauvain, G. Adda
  5. Multilingual Human-computer Interactions: From Information Access to Language Learning Victor Zue, Stephanie Seneff, Joseph Polifroni, Helen Meng, James Glass
  6. SpeeData: Multilingual Spoken Data Entry U. Ackermann, B. Angelini, F. Brugnara, M. Federico, D. Giuliani, R. Gretter, G. Lazzari, H. Niemann


SuA2L1 -- Acoustics in Synthesis

Chair: Michael Macon, Georgia Institute of Technology
  1. Pseudo-articulatory Representations in Speech Synthesis and Recognition William H. Edmondson, Jon P. Iles, Dorota J. Iskra
  2. Synthesis of Initial (/s/-) Stop-liquid Clusters using HLsyn David R. Williams
  3. Synthesis of Trill Chilin Shih
  4. Phone-based Speech Synthesis with Neural Network and Articulatory Control W.K. Lo, P.C. Ching
  5. Analysis of Ten Vowel Sounds Across Gender and Regional/Cultural Accent P. Martland, S.P. Whiteside, Steve W. Beet, L. Baghai-Ravary
  6. Speech Morphing by Gradually Changing Spectrum Parameter and Fundamental Frequency Masanobu Abe


SuA2L2 -- Pitch and Rate

Chair: David Talkin, Entropic Research Laboratory
  1. The Multi-Lag-Window Method for Robust Extended-range F0 Determination Edouard Geoffrois
  2. Nonlinear Estimation of DEGG Signals with Applications to Speech Pitch Detection Kenneth E. Barner
  3. Pitch Analysis Methods for Cross-Speaker Comparison John A. Maidment, M. Luisa Garcia-Lecumberri
  4. Continuous Adaptation of Linear Models with Impulsive Excitation Steve W. Beet, L. Baghai-Ravary
  5. Quantitative Analysis of the Local Speech Rate and its Application to Speech Synthesis Sumio Ohno, Masamichi Fukumiya, Hiroya Fujisaki
  6. A Fast and Reliable Rate of Speech Detector Jan P. Verhasselt, Jean-Pierre Martens


SuA2L3 -- Acoustic Modeling II

Chair: Li Deng, University of Waterloo
  1. Context Modeling and Clustering in Continuous Speech Recognition Jean-Claude Junqua, Lorenzo Vassallo
  2. Hierarchical Partition of the Articulatory State Space for Overlapping-feature Based Speech Recognition Li Deng, Jim Jian-Xiong Wu
  3. A Fuzzy Acoustic-phonetic Decoder for Speech Recognition Olivier Oppizzi, David Fournier, Philippe Gilles, Henri Méloni
  4. Syllable-level Desynchronisation of Phonetic Features for Speech Recognition Katrin Kirchhoff
  5. A Probabilistic Framework for Feature-based Speech Recognition James Glass, Jane Chang, Michael McCandless
  6. Modeling Context-dependent Phonetic Units in a Continuous Speech Recognition System for Mandarin Chinese Jim Jian-Xiong Wu, Li Deng, Jacky Chan


SuA2P1 -- General ASR Posters

Chair: Lori Lamel, LIMSI-CNRS
  1. JANUS-II: Towards Spontaneous Spanish Speech Recognition Puming Zhan, Klaus Ries, Marsal Gavaldà, Donna Gates, Alon Lavie, Alex Waibel
  2. Reduced Semi-continuous Models for Large Vocabulary Continuous Speech Recognition in Dutch Kris Demuynck, Jacques Duchateau, Dirk Van Compernolle
  3. Validating Different Flexible Vocabulary Approaches on the Swiss French PolyPhone and PolyVar Databases Andrei Constantinescu, Olivier Bornet, Gilles Caloz, Gérard Chollet
  4. Use of a Reliability Coefficient in Noise Cancelling by Neural Net and Weighted Matching Algorithms Nestor Becerra Yoma, Fergus R. McInnes, Mervyn A. Jack
  5. Likelihood Normalization Using an Ergodic HMM for Continuous Speech Recognition Kazuhiko Ozeki
  6. Dynamic Control of a Production Model Laurence Candille, Henri Méloni
  7. Speech Recognition Using Sub-word Units Dependent On Phonetic Contexts Of Both Training and Recognition Vocabularies Hiroaki Hattori, Eiko Yamada
  8. Hidden Markov Models Merging Acoustic and Articulatory Information to Automatic Speech Recognition Bruno Jacob, Christine Senac
  9. Creation of Unseen Triphones from Diphones and Monophones using a Speech Production Approach Mats Blomberg, Kjell Elenius
  10. Speaker-independent Dictation of Chinese Speech with 32K Vocabulary Bo Xu, Bing Ma, Shuwu Zhang, Fei Qu, Taiyi Huang
  11. Using Accent-specific Pronunciation Modelling for Robust Speech Recognition J.J. Humphries, P.C. Woodland, D. Pearce
  12. Dictionary Learning for Spontaneous Speech Recognition Tilo Sloboda, Alex Waibel
  13. Comparison of Channel Normalisation Techniques for Automatic Speech Recognition Over the Phone Johan de Veth, Louis Boves
  14. Anchor Point Detection for Continuous Speech Recognition in Spanish: The Spotting of Phonetic Events Manuel A. Leandro, Jose M. Pardo
  15. Cepstral Compensation by Polynomial Approximation for Environment-independent Speech Recognition Bhiksha Raj, Evandro B. Gouvêa, Pedro J. Moreno, Richard M. Stern
  16. Effect of Speech Coders on Speech Recognition Performance B.T. Lilly, K.K. Paliwal
  17. Wavelet Transforms For Non-uniform Speech Recognition Systems Léonard Janer, Josep Martí, Climent Nadeu, Eduardo Lleida-Solano
  18. A Binaural Model as a Front-end for Isolated Word Recognition Tsuyoshi Usagawa, Markus Bodden, Klaus Rateitschek
  19. A New Speech Enhancement: Speech Stream Segregation Hiroshi G. Okuno, Tomohiro Nakatani, Takeshi Kawabata


SuA2S1 -- Multilingual Speech Processing II

Chair: Alex Waibel, Carnegie Mellon University
  1. Head Automata for Speech Translation Hiyan Alshawi
  2. Word Clustering with Parallel Spoken Language Corpora Ye-Yi Wang, John Lafferty, Alex Waibel
  3. Toward Translating Korean Speech Into Other Languages Jae-Woo Yang, Youngjik Lee
  4. VERBMOBIL: The Evolution of a Complex Large Speech-to-Speech Translation System Thomas Bub, Johannes Schwinn
  5. Translation of Conversational Speech with JANUS-II Alon Lavie, Alex Waibel, Lori Levin, Donna Gates, Marsal Gavaldà, Torsten Zeppenfeld, Puming Zhan, Oren Glickman


SuP1L1 -- Data-based Synthesis

Chair: Yoshinori Sagisaka, ATR Interpreting Telecommunications Research Laboratory
  1. Non-segmental Analysis and Synthesis Based on a Speech Database Andrew Slater, John Coleman
  2. Microsegment Synthesis - Economic Principles in a Low-cost Solution Ralf Benzmüller, William J. Barry
  3. Whistler: A Trainable Text-to-Speech System X.D. Huang, A. Acero, J. Adcock, H.W. Hon, J. Goldsmith, J. Liu, Mike Plumpe
  4. Generation of Multiple Synthesis Inventories by a Bootstrapping Procedure Thomas Portele, Karl-Heinz Stöber, Horst Meyer, Wolfgang Hess
  5. Modeling Segmental Duration in German Text-to-Speech Synthesis Bernd Möbius, Jan P.H. van Santen
  6. Autolabelling Japanese ToBI Nick Campbell


SuP1L2 -- Speaker Identification and Verification

Chair: Doug Reynolds, MIT Lincoln Laboratory
  1. General Phrase Speaker Verification Using Sub-word Background Models and Likelihood-ratio Scoring S. Parthasarathy, A.E. Rosenberg
  2. Unknown-Multiple Signal Source Clustering Problem Using Ergodic HMM and Applied to Speaker Classification J. Murakami, M. Sugiyama, H. Watanabe
  3. GMM and ARVM Cooperation and Competition for Text-independent Speaker Recognition on Telephone Speech J.-L. Le Floch, C. Montacié, M.-J. Caraty
  4. Selective use of the Speech Spectrum and a VQGMM Method for Speaker Identification Qiguang Lin, Ea-Ee Jan, ChiWei Che, Dong-Suk Yuk, James L. Flanagan
  5. Speaker Verification through Large Vocabulary Continuous Speech Recognition Michael Newman, Larry Gillick, Yoshiko Ito, Don McAllaster, Barbara Peskin
  6. Predictive Neural Networks in Text Independent Speaker Verification: an Evaluation on the SIVA Database Andrea Paoloni, Susanna Ragazzini, G. Ravaioli


SuP1L3 -- Acoustic Phonetics

Chair: Nick Campbell, ATR Interpreting Telecommunications Research Laboratory
  1. Durational Characteristics of Hindi Consonant Clusters Nisheeth Shrotriya, Rajesh Verma, S.K. Gupta, S.S. Agrawal
  2. The Use of Wavelet Transforms in Phoneme Recognition Beng T. Tan, Minyue Fu, Andrew Spray, Phillip Dermody
  3. Acoustic Properties of Phonemes in Continuous Speech for Different Speaking Rate Hisao Kuwabara
  4. Prosodic Parameterization of Spoken Japanese Based on a Model of the Generation Process of F0 Contours Hiroya Fujisaki, Sumio Ohno
  5. A Logistic Regression Model for Detecting Prominences Arman Maghbouleh
  6. High-quality Prosodic Modification of Speech Signals Beat Pfister


SuP1P1 -- Perception of Vowels and Consonants

Chair: Doug Whalen, Haskins Laboratories
  1. On the Syllable Structures of Chinese Relating to Speech Recognition Jialu Zhang
  2. Perceptual Assimilation of American English Vowels by Japanese Listeners W. Strange, Reiko Akahane-Yamada, B.H. Fitzgerald, R. Kubo
  3. Context and Speaker Effects in the Perceptual Assimilation of German Vowels by American Listeners W. Strange, O.-S. Bohn, S. A. Trent, M.C. McNair, K.C. Bielec
  4. Examination of a Perceptual Non-native Speech Contrast: Pharyngealized/Non-pharyngealized Discrimination by French-speaking Adults Mohamed Zahid
  5. Context-dependent Relevance of Burst and Transitions for Perceived Place in Stops: It's in Production, not Perception Roel Smits
  6. The Perception of Morae in Long Vowels: Comparison Among Japanese, Korean and English Speakers Ryoji Baba, Kaori Omuro, Hiromitsu Miyazono, Tsuyoshi Usagawa, Masahiko Higuchi
  7. Juncture Cues to Disfluency Robin J. Lickley
  8. Effects of Duration and Formant Movement on Vowel Perception James R. Sawusch
  9. Benchmarking Human Performance for Continuous Speech Recognition N. Deshmukh, R.J. Duncan, A. Ganapathiraju, J. Picone
  10. Intelligibility of Speech with Filtered Time Trajectories of Spectral Envelopes Takayuki Arai, Misha Pavel, Hynek Hermansky, Carlos Avendano
  11. Perceptual Use of Vowel and Speaker Information in Breath Sounds D. H. Whalen, Sonya M. Sheffert
  12. The Role of Neighborhood Relative Frequency in Spoken Word Recognition Philippe Mousty, Monique Radeau, Ronald Peereman, Paul Bertelson
  13. Transitional Probability and Phoneme Monitoring James M. McQueen, Mark A. Pitt
  14. Identification of Vowel Features from French Stop Bursts Anne Bonneau
  15. Listening in a Second Language Z.S. Bond, Thomas J. Moore, Beverley Gable
  16. Acoustic Correlates to the Effects of Talker Variability on the Perception of English /r/ and /l/ by Japanese Listeners James S. Magnuson, Reiko Akahane-Yamada
  17. Perception of Lexical Tone Across Languages: Evidence for a Linguistic Mode of Processing Denis Burnham, Elizabeth Francis, Di Webster, Sudaporn Luksaneeyanawin, Chayada Attapaiboon, Francisco Lacerda, Peter Keller


SuP2LP -- Closing Ceremony and Plenary Lecture

Chairs: H. Timothy Bunnell, Alfred I. duPont Institute; and Richard A. Foulds, Alfred I. duPont Institute
  1. Natural Communication with Machines - Progress and Challenge James L. Flanagan