Volume 3 Contents



SaA1L1 -- Speech Recognition Using HMMs and NNs

Chair: Nelson Morgan, ICSI and University of California, Berkeley


  1. An Incremental Speaker-Adaptation Technique for Hybrid HMM-MLP Recognizer Joao P. Neto, Ciro A. Martins, Luís B. Almeida
  2. Phoneme Segmentation of Continuous Speech using Multi-layer Perceptron Youngjoo Suh, Youngjik Lee
  3. Stochastic Perceptual Speech Models with Durational Dependence Jeff Bilmes, Nelson Morgan, Su-Lin Wu, Hervé Bourlard
  4. Boosting the Performance of Connectionist Large Vocabulary Speech Recognition G.D. Cook, A.J. Robinson
  5. HMMs and OWE Neural Network for Continuous Speech Recognition Nicolas Pican, Dominique Fohr, Jean-François Mari
  6. Smoothed Local Adaptation of Connectionist Systems Steve Waterhouse, Dan Kershaw, Tony Robinson


SaA1L2 -- Adverse Environments and Multiple Microphones

Chair: Tony Robinson, Cambridge University


  1. Robust Speech Recognition with Speaker Localization by a Microphone Array Takeshi Yamada, Satoshi Nakamura, Kiyohiro Shikano
  2. Sound Source Localization in Reverberant Environments using an Outlier Elimination Algorithm Ea-Ee Jan, James L. Flanagan
  3. The 1995 Abbot LVCSR System for Multiple Unknown Microphones Dan Kershaw, Tony Robinson, Steve Renals
  4. Experiments of Speech Recognition in a Noisy and Reverberant Environment using a Microphone Array and HMM Adaptation D. Giuliani, M. Omologo, P. Svaizer
  5. Increasing Robustness in GMM Speaker Recognition Systems for Noisy and Reverberant Speech with Low Complexity Microphone Arrays Joaquín González-Rodríguez, Javier Ortega-García, César Martin, Luis Hernández
  6. Robust Automatic Speech Recognition Using a Multi-channel Signal Separation Front-End Kuan-Chieh Yen, Yunxin Zhao


SaA1L3 -- Prosodic Synthesis in Dialogue

Chair: Mark Steedman, University of Pennsylvania


  1. Prosody Generation in Text-to-Speech Conversion Using Dependency Graphs Anders Lindström, Ivan Bretan, Mats Ljungqvist
  2. Extraction Method of Non-restrictive Modification in Japanese as a Marked Factor of Prosody Hisako Asano, Hisashi Ohara, Yoshifumi Ooyama
  3. Modeling Contrast in the Generation and Synthesis of Spoken Language Scott Prevost
  4. A Left-to-right Processing Model of Pausing in Japanese Based on Limited Syntactic Information Hajime Tsukada
  5. Modeling of Intonation Bearing Emphasis for TTS-Synthesis of Greek Dialogues D. Galanis, V. Darsinos, G. Kokkinakis
  6. Synthesizing Prosody: a Prominence-based Approach Barbara Heuft, Thomas Portele


SaA1P1 -- Speech Synthesis

Chair: Thierry Dutoit, Faculté Polytechnique de Mons


  1. Multilingual Text Analysis for Text-to-Speech Synthesis Richard Sproat
  2. Spoken-style Explanation Generator for Japanese Kanji using a Text-to-speech System Yoshifumi Ooyama, Hisako Asano, Koji Matsuoka
  3. A Method for Estimating Prosodic Symbol from Text for Japanese Text-To-Speech Synthesis Ken-ichi Magata, Tomoki Hamagami, Mitsuo Komura
  4. Statistical Methods in Data-driven Modeling of Spanish Prosody for Text to Speech E. López-Gonzalo, J.M. Rodríguez-García
  5. Intonation Processing for TTS Using Stylization and Neural Network Learning Method Jung-Chul Lee, Youngjik Lee, Sang-Hun Kim, Minsoo Hahn
  6. Generating F0 Contours from ToBI Labels using Linear Regression Alan W. Black, Andrew J. Hunt
  7. The Broad Study of Homograph Disambiguity for Mandarin Speech Synthesis Wern-Jun Wang, Shaw-Hwa Hwang, Sin-Horng Chen
  8. The MBROLA project: Towards a Set of High Quality Speech Synthesizers Free of Use for Non Commercial Purposes T. Dutoit, V. Pagel, N. Pierret, F. Bataille, O. Van der Vrecken
  9. Training Data Selection for Voice Conversion Using Speaker Selection and Vector Field Smoothing Makoto Hashimoto, Norio Higuchi
  10. A New Voice Transformation Method Based on Both Linear and Nonlinear Prediction Analysis Ki Seung Lee, Dae Hee Youn, Il Whan Cha
  11. On the Transformation of the Speech Spectrum for Voice Conversion G. Baudoin, Yannis Stylianou
  12. Spectral Analysis of Synthetic Speech and Natural Speech with Noise over the Telephone Line Cristina Delogu, Andrea Paoloni, Susanna Ragazzini, Paola Ridolfi
  13. A New Speech Synthesis System Based on the ARX Speech Production Model Weizhong Zhu, Hideki Kasuya
  14. Speech Synthesis Using the CELP Algorithm Geraldo Lino de Campos, Evandro Bacci Gouvêa
  15. A Mandarin Text-to-Speech System Shaw-Hwa Hwang, Sin-Horng Chen, Yih-Ru Wang
  16. Residual-based Speech Modification Algorithms for Text-to-Speech Synthesis M.D. Edgington, A. Lowry
  17. A Generalized LR Parser for Text-to-speech Synthesis Per Olav Heggtveit
  18. Enhanced Shape-invariant Pitch and Time-scale Modification for Concatenative Speech Synthesis M.P. Pollard, B.M.G. Cheetham, C.C. Goodyear, M.D. Edgington, A. Lowry
  19. An Excitation Synchronous Pitch Waveform Extraction Method and its Application to the VCV-concatenation Synthesis of Japanese Spoken Words Yasuhiko Arai, Ryo Mochizuki, Hirofumi Nishimura, Takashi Honda
  20. A New Chinese Text-to-Speech System with High Naturalness Ren-Hua Wang, Qinfeng Liu, Difei Tang
  21. Voice Conversion Based on Topological Feature Maps and Time-variant Filtering Ansgar Rinscheid


SaA1P2 -- Instructional Technology for Spoken Language

Chair: Reiko A. Yamada, ATR Human Information Processing Research Laboratories


  1. Language Training System Utilizing Speech Modification Meron Yoram, Keikichi Hirose
  2. Perception of English /r/ and /l/ Speech Contrasts by Native Korean Listeners with Extensive English-language Experience D.G. Jamieson, K. Yu
  3. Automatic Text-independent Pronunciation Scoring of Foreign Language Student Speech Leonardo Neumeyer, Horacio Franco, Mitchel Weintraub, Patti Price
  4. Assessing the Contribution of Instructional Technology in the Teaching of Pronunciation Antônio Simoes
  5. Detection of Foreign Speakers' Pronunciation Errors for Second Language Training - Preliminary Results Maxine Eskenazi
  6. Foreign Accent in Intonation Patterns - A Contrastive Study Applying a Quantitative Model of the F0 Contour Hansjörg Mixdorff
  7. Input Modality Effects in Foreign Accent Duncan J. Markham, Yasuko Nagano-Madsen


SaA1S1 -- Multimodal Spoken Language Processing I

Chairs: Lynne E. Bernstein, House Ear Institute; and Christian Benoît, ICP-Grenoble


  1. For Speech Perception by Humans or Machines, Three Senses are Better than One Lynne E. Bernstein, Christian Benoît
  2. A Few Factors Which Affect the Degree of Incorporating Lip-read Information into Speech Perception Kaoru Sekiyama, Yoh'ichi Tohkura, Michio Umeda
  3. Characterizing Audiovisual Information During Speech E. Vatikiotis-Bateson, K.G. Munhall, Y. Kasahara, F. Garcia, H. Yehia
  4. The Implications of the Tadoma Method of Speechreading for Spoken Language Processing Charlotte M. Reed
  5. Seeing Speech in Space and Time: Psychological and Neurological Findings Ruth Campbell


SaA2L1 -- Prosody - Phonological/Phonetic Measures

Chair: Paul Taylor, University of Edinburgh


  1. What's in the "Pure" Prosody? Volker Strom, Christina Widera
  2. F0 Declination in Read-aloud and Spontaneous Speech Marc Swerts, Eva Strangert, Mattias Heldner
  3. Prediction of Prosodic Phrase Boundaries Considering Variable Speaking Rate Yeon-jun Kim, Yung-hwan Oh
  4. Prediction of F0 Parameter of Contextualized Utterances in Dialogue Yoichi Yamashita, Riichiro Mizoguchi
  5. The Production and Perception of Potentially Ambiguous Intonation Contours by Speakers of Russian and Japanese V. Makarova, J. Matsui
  6. What is Invariant and What is Optional in the Realization of a FOCUSED Word? A Cross-dialectal Study of Swedish Sentences With Moving Focus Robert Eklund


SaA2L2 -- Phonetics and Perception

Chair: Christine Shadle, University of Southampton


  1. Quantifying Spectral Characteristics of Fricatives Christine H. Shadle, Sheila J. Mair
  2. Acoustic Characteristics of Ejectives in Ingush Natasha Warner
  3. An Acoustic Profile of Consonant Reduction R.J.J.H. van Son, Louis C. W. Pols
  4. Devoicing in Post-vocalic Canadian-French Obstruents Danièle Archambault, Blagovesta Maneva
  5. Paying Attention to Speaking Rate Alexander L. Francis, Howard C. Nusbaum
  6. The Lack of Invariance Problem and the Goal of Speech Perception Irene Appelbaum


SaA2L3 -- Language Acquisition

Chair: Harriet S. Magen, Rhode Island College


  1. The Acoustic Structure of Vowels in Mothers' Speech to Infants and Adults Jean E. Andruski, Patricia K. Kuhl
  2. Acoustical Characteristics of Sound Production of Deaf and Normally Hearing Infants Chris J. Clement, Florien J. Koopmans-van Beinum, Louis C. W. Pols
  3. Learning Non-native Vowel Categories John Kingston, Christine Bartels, José Benkí, Deanna Moore, Jeremy Rice, Rachel Thorburn, Neil Macmillan
  4. Word Recognition by Japanese Infants P.A. Halle, Toshisada Deguchi, Yuji Tamekawa, B. Boysson-Bardies, Shigeru Kiritani
  5. Investigations of the Word Segmentation Abilities of Infants Peter W. Jusczyk
  6. Developmental Change in Perception of Clause Boundaries by 6- and 10-Month-old Japanese Infants Akiko Hayashi, Yuji Tamekawa, Toshisada Deguchi, Shigeru Kiritani


SaA2P1 -- Production and Prosody Posters

Chair: Carol Espy-Wilson, Boston University


  1. A Frequency Domain Method for Parametrization of the Voice Source Paavo Alku, Erkki Vilkman
  2. Glottal Correlates of the Word Stress and the Tense/Lax Opposition in German Krzysztof Marasek
  3. Coarticulatory Stability in American English /r/ Suzanne Boyce, Carol Y. Espy-Wilson
  4. An MRI-based Analysis of the English /r/ and /l/ Articulations Shinobu Masaki, Reiko Akahane-Yamada, Mark K. Tiede, Yasuhiro Shimada, Ichiro Fujimoto
  5. Does Lexical Stress or Metrical Stress Better Predict Word Boundaries in Dutch? David van Kuijk
  6. Optopalatograph (OPG): A New Apparatus for Speech Production Analysis A. A. Wrench, A. D. McIntosh, W. J. Hardcastle
  7. Prediction of Vowel Systems using a Deductive Approach René Carré
  8. Distinctions Between [t] and [tch] using Electropalatography Data Sheila J. Mair, Celia Scully, Christine H. Shadle
  9. Relating Formants and Articulation in Intelligibility Test Words Michiko Hashi, Raymond D. Kent, John R. Westbury, Mary J. Lindstrom
  10. The Role of Coarticulation in the Perception of Vowel Quality in Modern Standard Arabic Imad Znagui, Mohamed Yeou
  11. Updating the Reading EPG Simon Arnfield, Wilf Jones
  12. Lexical Stress Detection on Stress-minimal Word Pairs Goangshiuan S. Ying, Leah H. Jamieson, Ruxin Chen, Carl D. Mitchell
  13. An Acoustic Study of the Interaction Between Stressed and Unstressed Syllables in Spoken Mandarin Jing Wang
  14. Automatic Detection of Accent Nuclei at the Head of Words for Speech Recognition Nobuaki Minematsu, Seiichi Nakagawa
  15. Automatic Generation of Prosodic Structure for High Quality Mandarin Speech Synthesis Fu-chiang Chou, Chiu-yu Tseng, Lin-shan Lee
  16. A Study on Japanese Prosodic Pattern and its Modeling in Restricted Speech Tomoki Hamagami, Ken-ichi Magata, Mitsuo Komura
  17. A Phonetic Study of Focus in Intransitive Verb Sentences Steve Hoskins
  18. Variation in Vocal Fold Vibration Associated with Prosodic Conditions Shigeru Kiritani, Hiroshi Imagawa, Seiji Niimi
  19. Goethe for Prosody Stefan Rapp
  20. Prosodic Cues in Syntactically Ambiguous Strings; An Interactive Speech Planning Mechanism K.A. Straub
  21. A Functional Model for Generation of the Local Components of F0 Contours in Chinese Jinfu Ni, Ren-Hua Wang, Deyu Xia
  22. The Acquisition of Voiceless Stops in the Interlanguage of Second Language Learners of English and Spanish Marie Fellbaum
  23. Jaw Contribution to Timing Control of "Guttural" Consonants Production Ahmed M. Elgendy


SaA2S1 -- Multimodal Spoken Language Processing II

Chairs: Lynne E. Bernstein, House Ear Institute; and Christian Benoît, ICP-Grenoble


  1. Studies of the McGurk Effect: Implications for Theories of Speech Perception Kerry P. Green
  2. Using the Visual Component in Automatic Speech Recognition N. M. Brooke
  3. Perceptual Organization of Speech in One and Several Modalities: Common Functions, Common Resources Robert E. Remez
  4. Multi-modal Encoding of Speech in Memory: A First Report David B. Pisoni, Helena M. Saldaña, Sonya M. Sheffert


SaA2S2 -- Emotion in Recognition and Synthesis (Poster Preview)

Chair: Klaus R. Scherer, University of Geneva


  1. Word Class Driven Synthesis of Prosodic Annotations Simon Arnfield
  2. Dynamical Modelling of Vowel Sounds as a Synthesis Tool M. Banbrook, S. McLaughlin
  3. Emotional Speech Elicited using Computer Games Tom Johnstone
  4. Automatic Statistical Analysis of the Signal and Prosodic Signs of Emotion in Speech Roddy Cowie, Ellen Douglas-Cowie
  5. Recognizing Emotion in Speech Frank Dellaert, Thomas Polzin, Alex Waibel
  6. Emotions in Time Domain Synthesis Barbara Heuft, Thomas Portele, Monika Rauth


SaP1L1 -- User-Machine Interfaces

Chair: Candy Kamm, AT&T Labs - Research


  1. Evaluating Automatic Speech Recognition as a Component of a Multi-input Device Human-computer Interface B.A. Mellor, C. Baber, C. Tunley
  2. Data Collection for the MASK Kiosk: WOz vs Prototype System A. Life, I. Salter, J.N. Temem, F. Bernard, S. Rosset, S.K. Bennacef, Lori Lamel
  3. An Experimental Japanese/English Interpreting Video Phone System M. Karaorman, T.H. Applebaum, T. Itoh, M. Endo, Y. Ohno, M. Hoshimi, T. Kamai, K. Matsui, K. Hata, S. Pearson, J.-C. Junqua
  4. User Participation and Compliance in Speech Automated Telecommunications Applications Sara Basson, Stephen Springer, Cynthia Fong, Hong Leung, Ed Man, Michele Olson, John Pitrelli, Ranvir Singh, Suk Wong
  5. Embedding Speech in Web Interfaces Samuel Bayer
  6. Voice-activated Home Banking System and its Field Trial Toshihiro Isobe, Masatoshi Morishima, Fuminori Yoshitani, Nobuo Koizumi, Ken'ya Murakami


SaP1L2 -- TTS Systems and Rules

Chair: Juergen Schroeter, AT&T Labs - Research


  1. A Text Analyzer for Korean Text-to-Speech Systems Sangho Lee, Yung-Hwan Oh
  2. Design and Evaluation of a Phonological Phrase Parser for Spanish Text-to-Speech Helen E. Karn
  3. Comparison of Two Tree-Structured Approaches for Grapheme-to-Phoneme Conversion Ove Andersen, Roland Kuhn, Ariane Lazaridès, Paul Dalsgaard, Jürgen Haas, Elmar Nöth
  4. A Recurrent Network that Learns to Pronounce English Text M.J. Adamson, R.I. Damper
  5. Archisegment-based Letter-to-Phone Conversion for Concatenative Speech Synthesis in Portuguese Eleonora Cavalcante Albano, Agnaldo Antonio Moreira
  6. A New Method of Generating Speech Synthesis Units Based on Phonological Knowledge and Clustering Technique Yuki Yoshida, Shin'ya Nakajima, Kazuo Hakoda, Tomohisa Hirokawa


SaP1L3 -- Prosody and Labeling

Chair: Louis Boves, Nijmegen University


  1. Consistency in Transcription and Labelling of German Intonation with GToBI Martine Grice, Matthias Reyelt, Ralf Benzmüller, Jörg Mayer, Anton Batliner
  2. Syntactic-prosodic Labeling of Large Spontaneous Speech Data-bases Anton Batliner, R. Kompe, A. Kiessling, H. Niemann, E. Nöth
  3. Relationship Between Discourse Structure and Dynamic Speech Rate Florien J. Koopmans-van Beinum, Monique E. van Donzel
  4. Using Prosodic Clues to Decide When to Produce Back-channel Utterances Nigel Ward
  5. Dialog Act Classification with the Help of Prosody Marion Mast, Ralf Kompe, Stefan Harbeck, Andreas Kiessling, Heinrich Niemann, Elmar Nöth, E. G. Schukat-Talamazzini, V. Warnke
  6. Using Lexical Stress in Continuous Speech Recognition for Dutch David van Kuijk, Henk van den Heuvel, Louis Boves


SaP1P1 -- Speaker/Language Identification and Verification

Chair: Sadaoki Furui, NTT Human Interface Lab


  1. Automatic Accent Classification of Foreign Accented Australian English Speech Karsten Kumpf, Robin W. King
  2. Discriminative Adaptation for Speaker Verification F. Korkmazskiy, Biing-Hwang Juang
  3. Perceptual Features of Unknown Foreign Languages as Revealed by Multi-dimensional Scaling V. Stockmal, D. Muljani, Z.S. Bond
  4. On-line Incremental Adaptation for Speaker Verification using Maximum Likelihood Estimates of CDHMM Parameters Kin Yu, John S. Mason
  5. Combining Methods to Improve Speaker Verification Decision Dominique Genoud, Frédéric Bimbot, Guillaume Gravier, Gérard Chollet
  6. Incremental Speaker Adaptation with Minimum Error Discriminative Training for Speaker Identification Cesar Martín del Alamo, J. Alvarez, C. de la Torre, F.J. Poyatos, L. Hernández
  7. Frame Level Likelihood Normalization for Text-independent Speaker Identification using Gaussian Mixture Models Konstantin P. Markov, Seiichi Nakagawa
  8. On Using Prosodic Cues in Automatic Language Identification Ann E. Thymé-Gobbel, Sandra E. Hutchins
  9. Speaker Recognition Model using Two-dimensional Mel-Cepstrum and Predictive Neural Network Tadashi Kitamura, Shinsai Takei
  10. Unknown Language Rejection in Language Identification System Hingkeung Kwan, Keikichi Hirose
  11. Spoken Language Identification using Large Vocabulary Speech Recognition James L. Hieronymus, Shubha Kadambe
  12. Accent Identification Carlos Teixeira, Isabel M. Trancoso, António Serralheiro
  13. Comparison of Text-independent Speaker Recognition Methods on Telephone Speech with Acoustic Mismatch Sarel van Vuuren
  14. On the Sources of Inter- and Intra-speaker Variability in the Acoustic Dynamics of Speech Xue Yang, J. Bruce Millar, Iain Macleod
  15. Language Identification with Inaccurate String Matching Kay M. Berkling, Etienne Barnard
  16. Robust Prosodic Features for Speaker Identification M.J. Carey, E.S. Parris, H. Lloyd-Thomas, S.J. Bennett
  17. Text Independent Speaker Identification on Noisy Environments by Means of Self Organizing Maps E. Monte, J. Hernando, X. Miró, A. Adolf
  18. Language-identification Using Language-dependent Phonemes and Language-independent Speech Units Paul Dalsgaard, Ove Andersen, Hanne Hesselager, Bojan Petek


SaP1S1 -- Large Vocabulary Speech Recognition: The Switchboard Domain I

Chairs: Ronald Rosenfeld, Carnegie Mellon University; and Hervé Bourlard, Faculté Polytechnique de Mons


  1. Introduction to SWB Jordan Cohen
  2. Disfluencies in SWB Elizabeth Shriberg
  3. Error Analysis and Disfluency Modeling Ronald Rosenfeld
  4. Fast Sparse Data Training/Portability Andreas Stolcke
  5. Phrase Structure Language Models Salim Roukos
  6. Language Modeling Issues for Spanish Herbert Gish
  7. SRI Speaking Mode Experiments Andreas Stolcke


SaP1S2 -- Emotion in Recognition and Synthesis I

Chair: Klaus R. Scherer, University of Geneva


  1. Adding the Affective Dimension: A New Look in Speech Analysis and Synthesis Klaus R. Scherer
  2. Ethological Theory and the Expression of Emotion in the Voice John J. Ohala
  3. Synthesizing Emotions in Speech: Is it Time to Get Excited? Iain R. Murray, John L. Arnott


SaP2L1 -- Stochastic Techniques in Robust Speech Recognition

Chair: Richard Rose, AT&T Labs - Research


  1. A Study on Task-independent Subword Selection and Modeling for Speech Recognition Chin-Hui Lee, Biing-Hwang Juang, Wu Chou, J.J. Molina-Perez
  2. Simultaneous ANN Feature and HMM Recognizer Design using String-based Minimum Classification Error (MCE) Training Mazin G. Rahim, Chin-Hui Lee
  3. Quantizing Mixture-weights in a Tied-mixture HMM Sunil K. Gupta, Frank K. Soong, Raziel Haimi-Cohen
  4. Variance Compensation within the MLLR Framework for Robust Speech Recognition and Speaker Adaptation M.J.F. Gales, D. Pye, P.C. Woodland
  5. Maximum-likelihood Stochastic Matching Approach to Non-linear Equalization for Robust Speech Recognition A.C. Surendran, Chin-Hui Lee, Mazin G. Rahim
  6. Estimation of Channel Bias for Telephone Speech Recognition Jen-Tzung Chien, Hsiao-Chuan Wang, Lee-Min Lee


SaP2L2 -- Prosodic Synthesis in Text to Speech

Chair: Bernd Moebius, Bell Labs - Lucent Technologies


  1. Synthesis of English Intonation using Explicit Models of Reading and Spontaneous Speech M. E. Johnson
  2. Generating Intonation by Superposing Gestures Yann Morlec, Gérard Bailly, Véronique Aubergé
  3. Implementation and Evaluation of a Model for Synthesis of Swedish Intonation Merle Horne, Marcus Filipsson
  4. Natural Prosody Generation for Domain Specific Text-to-Speech Systems Nobuyuki Katae, Shinta Kimura
  5. Improving Text-to-Speech Synthesis Mark Tatham, Eric Lewis
  6. Synthesis of Stressed Speech from Isolated Neutral Speech Using HMM-based Models Sahar E. Bou-Ghazale, John H.L. Hansen
  7. Modeling Segment Intonation for Slovene TTS System Ales Dobnikar


SaP2L3 -- Dialogue Events

Chair: David G. Novick, European Institute of Cognitive Sciences and Engineering


  1. Word Predictability After Hesitations: A Corpus-based Study Elizabeth Shriberg, Andreas Stolcke
  2. Interruptions and Intonation Li-chiung Yang
  3. On not Recognizing Disfluencies in Dialogue Robin J. Lickley, Ellen Gurman Bard
  4. A Theory of Word Frequencies and its Application to Dialogue Move Recognition Phil Garner, Sue Browning, Roger Moore, Martin Russell
  5. Utterance Units and Grounding in Spoken Dialogue David R. Traum, Peter A. Heeman
  6. Coordinating Turn-taking with Gaze David G. Novick, Brian Hansen, Karen Ward


SaP2P1 -- Databases and Tools

Chair: Bruce M. Buntschuh, AT&T Labs - Research


  1. BABEL: An Eastern European Multi-language Database Peter Roach, Simon Arnfield, W. Barry, J. Baltova, M. Boldea, A. Fourcin, W. Gonet, R. Gubrynowicz, E. Hallum, L. Lamel, K. Marasek, A. Marchal, E. Meister, K. Vicsi
  2. USTC95 - A Putonghua Corpus Ren-Hua Wang, Deyu Xia, Jinfu Ni, Bicheng Liu
  3. Telephone Data Collection using the World Wide Web Edward Hurley, Joseph Polifroni, James Glass
  4. The "SIVA" Speech Database for Speaker Verification: Description and Evaluation M. Falcone, A. Gallo
  5. A Multi-level Description of Date Expressions in German Telephone Speech Christoph Draxler
  6. Viterbi Search Visualization Using Vista: A Generic Performance Visualization Tool Robert H. Halstead Jr., Ben Serridge, Jean-Manuel Van Thong, William Goldenthal
  7. A Multilingual Phonetic Representation and Analysis System for Different Speech Databases Toomas Altosaar, Matti Karjalainen, Martti Vainio
  8. FRESCO: The French Telephone Speech Data Collection - Part of the European SpeechDat(M) Project D. Langmann, R. Haeb-Umbach, Louis Boves, E. den Os
  9. Predicting the Out-of-Vocabulary Rate and the Required Vocabulary Size for Speech Processing Applications Johannes Müller, Holger Stahl, Manfred Lang
  10. AMULET: Automatic MUltisensor Speech Labelling and Event Tracking: Study of the Spatio-temporal Correlations in Voiceless Plosive Production Nathalie Parlangeau, Alain Marchal
  11. Constructing Multi-level Speech Database for Spontaneous Speech Processing Minsoo Hahn, Sanghun Kim, Jung-Chul Lee, Yong-Ju Lee
  12. Preliminaries to a Romanian Speech Database Marian Boldea, Alin Doroga, Tiberiu Dumitrescu, Maria Pescaru
  13. Labelled Data Bank of Spoken Standard German: The Kiel Corpus of Read/Spontaneous Speech Klaus J. Kohler
  14. SAPPHIRE: An Extensible Speech Analysis and Recognition Tool Based on Tcl/Tk Lee Hetherington, Michael McCandless
  15. Automatic Detection of Topic Boundaries and Keywords in Arbitrary Speech Using Incremental Reference Interval-free Continuous DP Jiro Kiyama, Yoshiaki Itoh, Ryuichi Oka
  16. Very-large-vocabulary Mandarin Voice Message File Retrieval using Speech Queries Bo-Ren Bai, Lee-Feng Chien, Lin-Shan Lee
  17. Gandalf - A Swedish Telephone Speaker Verification Database H. Melin
  18. The DCIEM Map Task Corpus: Spontaneous Dialogue Under Sleep Deprivation and Drug Treatment Ellen Gurman Bard, C. Sotillo, A. H. Anderson, M. M. Taylor
  19. The Nemours Database of Dysarthric Speech Xavier Menéndez-Pidal, James B. Polikoff, Shirley M. Peters, Jennie E. Leonzio, H.T. Bunnell
  20. POST: Parallel Object-oriented Speech Toolkit Jean Hennebert, Dijana Petrovska Delacrétaz


SaP2S1 -- Large Vocabulary Speech Recognition: The Switchboard Domain II

Chairs: Ronald Rosenfeld, Carnegie Mellon University; and Hervé Bourlard, Faculté Polytechnique de Mons


  1. Insights into Spoken Language Gleaned from Phonetic Transcription of the Switchboard Corpus Steven Greenberg
  2. Automatic Learning of Word Pronunciation from Data Eric Fosler
  3. Modeling Systematic Variations in Pronunciation Bill Byrne
  4. Speech Data Modeling Nelson Morgan
  5. Linguistic Dependency Modeling Andreas Stolcke
  6. Summary, Observations, and Plans for the Future Fred Jelinek


SaP2S2 -- Emotion in Recognition and Synthesis II

Chair: Klaus R. Scherer, University of Geneva


  1. Discussion Period Klaus R. Scherer