DEVELOPING AAC SYSTEMS THAT MODEL INTELLIGENT PARTNER INTERACTIONS: METHODOLOGICAL CONSIDERATIONS

Peter B. Vanderheyden1, Christopher A. Pennington1, Denise M. Peischl1, Wendy M. McKnitt1, Kathleen F. McCoy1, Patrick W. Demasco1, Hans van Balkom2, and Harry Kamphuis2

1Applied Science and Engineering Laboratories, University of Delaware/A.I. duPont Institute
2IRV, Institute for Rehabilitation Research, Hoensbroek, Netherlands

ABSTRACT

Augmented conversations are interactive. In a pilot study, we analyzed the transcripts of students with cerebral palsy describing pictures to their therapists. Described herein are some of the patterns observed during these interactions, and how they may reflect more general features of augmented communication. We suggest that some of these patterns should be modelled within future intelligent AAC systems.

BACKGROUND

Some computer-based augmentative and alternative communication (AAC) systems are able to use knowledge of syntax, semantics, and vocabulary to facilitate the production of complete and correct sentences. To date, a number of studies have discussed aspects of the conversational nature of augmented interactions, but little attention has been paid to how these aspects might be integrated into the AAC system itself.

Conversation is often a cooperative, bi-directional, and multimodal process of constructing and exchanging information. In the context of AAC, a conversational partner often becomes actively involved in constructing the augmented speaker's message (1). The partner may ask questions, repeat part of the augmented speaker's utterance, or simply nod and smile in agreement. This feedback may in turn affect the message being produced by the augmented speaker. Computer-based AAC systems, by contrast, currently "see" only the words the user selects; the less information the user provides as input, the lower the likelihood of accurate output.
Studies with manual AAC systems suggest, however, that other modes of communication may be preferred. For example, children with cerebral palsy chose to use vocalizations, gestures, or eye gaze as modes of communication far more often than their manual symbol boards in interactions with their mothers or their speech therapists (2). These alternate modes of communication may be critical to fully understanding augmented interactions. Already, gesture recognition is being considered for future AAC systems (3). In order to develop truly intelligent AAC systems, we must understand and address such characteristics of conversational interactions between an AAC user and a partner.

STATEMENT OF THE PROBLEM

The goal is to design an augmentative communication system that provides common interactional features of a conversation between a person using a communication aid and a conversational partner. A first step has been taken toward defining criteria for such a system by analyzing the data from a pilot study and identifying a number of patterns that occurred during the dyadic interactions.

APPROACH

Pilot Study. The pilot data were collected by transcribing videos originally recorded by van Balkom (4). Adolescent students with cerebral palsy described pictures in a children's book to their primary speech therapists, using their own manual symbol charts. Four such adolescent-therapist dyads were videotaped and analyzed. Each student was instructed to describe the pictures as if telling a story to younger children. The therapist was instructed to repeat each word as it was selected by the student, paraphrase the sentence when it was completed, and then ask the student for confirmation that the paraphrased interpretation was correct. A single camera was used to videotape both the student and the therapist. Students took between 11 minutes and one hour to retell their stories.

Transcription system.
In an effort to capture as much of the multimodal content of the interactions as possible, the following vocalizations and gestures by both students and therapists were recorded:

o vocal productions (both words and non-words)
o hand and arm gestures
  - pointing at the communication board
  - pointing at the storybook
  - pointing elsewhere
  - unsuccessful attempt to turn the page
  - successful attempt to turn the page
  - miscellaneous gesture
o facial expressions
  - smile
  - miscellaneous facial expression
o head gestures
  - head nod
  - head shake
  - looking elsewhere (not at board, storybook, or therapist)
  - miscellaneous head gesture

Eye gaze was not recorded, because experimental conditions did not allow accurate judgment of eye gaze direction.

DISCUSSION

Initial analysis of the data has indicated several interesting features and patterns of interaction that will be investigated further. For the sake of discussion, these observations are grouped into several categories, none of which should be considered exhaustive.

Co-construction. One intriguing behavior observed was that the therapist often repeated a sequence of the student's selections before a sentence was completed. The form of this repetition can be described as a function of two dimensions: degree of incrementality, and degree of interpretation.

======================================================================
Table 1: Incrementality vs. Interpretation

                     Student   Non-Incremental   Incremental
Non-Interpretative   boy       boy               boy
                     girl      girl              boy girl
                     walk      walk              boy girl walk
Interpretative       boy       a boy             a boy
                     girl      a girl            a boy and a girl
                     walk      walking           a boy and a girl are walking
======================================================================

To illustrate (Table 1), the student might select the symbols `boy', `girl', and `walk'. The therapist would echo "boy" after the first word. With a non-incremental/non-interpretative strategy, the therapist might also simply echo "girl" after the second word.
With an incremental/non-interpretative strategy, the therapist might say "boy girl". With an incremental/interpretative strategy, the therapist might say "boy and girl", or "the boy and the girl", or "they". These strategies were not observed consistently: a single dyad might contain several strategies, or combinations of strategies. Interestingly, these strategies were observed despite the instruction that therapists repeat the symbols only as they were selected, and interpret them only at the end of the sentence. This suggests that therapists found it natural to respond this way in the communicative situation.

Word Find. When the student was unable to find a desired word on the communication board, or did not know a word, several strategies were used. The student might select a similar or related word, point at the storybook, or use gestures to express the idea. For example, to express the word for sweeping, students moved their hands in a sweeping motion. Strategies employed by the therapist included guessing at the elusive word, examining and describing the picture in the storybook, or asking the student to spell the word if it was known but did not appear on the communication board.

Conversational Repair. The student indicated that the therapist's interpretation was incorrect in a variety of ways, including head shakes, uttering "no", or pointing at "no" on the communication board. Occasionally students recognized and corrected their own errors after hearing the therapist say the word aloud. Therapists indicated an error or misunderstanding with a head shake or by saying "no" or "I don't understand". When a student omitted an important word, the therapist sometimes paraphrased the sentence by substituting an indeterminate filler for the missing word. For example, a therapist paraphrased `girl', `make', `in', and `pan' as: "Girl makes something in the pan".
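To make these observations concrete, the following minimal sketch models the four co-construction strategies of Table 1 and the indeterminate-filler paraphrase just described. The function names and the toy interpretation rules are our own illustrative assumptions, not part of the pilot study; a real intelligent AAC system would need far richer linguistic knowledge.

```python
# Illustrative sketch only: names and the toy grammar are assumptions.

def echo(selections, incremental=False, interpretative=False):
    """What a partner (or AAC system) might say after the student's
    latest selection, under the four strategies of Table 1."""
    words = selections if incremental else selections[-1:]
    return interpret(words) if interpretative else " ".join(words)

def interpret(words):
    """Toy interpretation: add articles to nouns, join them with
    'and', and inflect a trailing verb as a progressive clause."""
    NOUNS, VERBS = {"boy", "girl"}, {"walk"}
    nouns = ["a " + w for w in words if w in NOUNS]
    verbs = [w + "ing" for w in words if w in VERBS]
    phrase = " and ".join(nouns)
    if verbs:
        phrase += (" are " if len(nouns) > 1 else " is ") + verbs[0]
    return phrase

def fill_missing(frame, selected):
    """Paraphrase a sentence frame, substituting the indeterminate
    filler 'something' for a content word the student omitted."""
    return " ".join(w if w in selected else "something" for w in frame)

print(echo(["boy", "girl"]))                    # "girl"
print(echo(["boy", "girl"], incremental=True))  # "boy girl"
print(echo(["boy", "girl", "walk"],
           incremental=True, interpretative=True))
# "a boy and a girl are walking"
print(fill_missing(["girl", "makes", "food", "in", "the", "pan"],
                   {"girl", "makes", "in", "the", "pan"}))
# "girl makes something in the pan"
```

In such a sketch, modelling co-construction amounts to choosing, turn by turn, which echo strategy to apply and when to insert a filler, a choice the therapists in the pilot study made fluidly and inconsistently.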
In response, the student selected the appropriate word to take the place of "something".

Confirmation. Generally, students confirmed the therapist's paraphrase gesturally or vocally, but occasionally selected `yes' on the communication board. Therapists used verbal acknowledgments, head nodding, and reiterations of the student's sentence to indicate their understanding and agreement. Therapists inherently offered students the chance to confirm or object to their interpretation of each word when they echoed the word aloud immediately after the student selected it, as they had been instructed to do. Therapists also frequently requested confirmation from the students explicitly, asking, for example, "Is this what you mean?" before or after paraphrasing a sentence.

Other. Therapists often gave encouragements, such as "You're doing an excellent job!" or "That's good", and directives, such as "Ok, we can go on" or "Let's start over here". In addition, therapists asked "Is there anything else?" or "Are you done with this picture?" for almost every picture. It was not always clear, however, whether the therapist was prompting the student to say more or simply inquiring whether to turn the page. Some additional commentary by the therapist occurred due to external distractions.

FUTURE CONSIDERATIONS

Preliminary analysis has suggested a number of refinements to be considered in subsequent studies.

Data Collection. A second camera would allow direct observation of the symbols selected on the communication board. In addition, better sound and lighting quality would aid transcription. The storybook should be positioned so that the therapist cannot see the pictures as the student is describing them; the knowledge available to the therapist would then be more similar to the knowledge available to an intelligent AAC system.
In the pilot study, the storybook was often used as an "extension" of the communication board through pointing gestures. In addition, the therapist several times appeared to use knowledge of the storybook to add information to the descriptions that the student had not communicated.

Design Criteria. In the pilot study, the therapist played a dual role as interpreter and listener. To separate these two roles, a third person, unfamiliar with the student, could act as the listener. The therapist would perform the role of an imaginary AAC device, freely interpreting the student's communication and conveying it to the listener. The therapist would also not be explicitly instructed to echo each selection and paraphrase at the end of a sentence; instead, we are interested in the interactive dialogue that would occur naturally in this situation. Finally, future studies may include adult subjects and a variety of AAC devices (both electronic and manual), and data may be collected in a more familiar environment (e.g., the home). This will help ensure the observation of natural interaction patterns.

CONCLUSIONS

Observations from this pilot study reflect the potential for discovering common dialogue features and patterns in the interactive communication that takes place between AAC users and their partners. Further studies will allow us to define these patterns more accurately and completely, forming the basis of a model that can be used in the development of future intelligent AAC systems.

REFERENCES

(1) Kraat, A. (1987). Communication Interaction Between Aided and Natural Speakers: A State of the Art Report. Madison, WI: IPCAS.

(2) Heim, M. (1990). Communicative skills of nonspeaking CP-children: A study on interaction. Paper presented at the 4th Biennial ISAAC International Conference on Augmentative and Alternative Communication, Stockholm, Sweden, August 12-16, 1990.

(3) Roy, D. M., Panayi, M., Harwin, W. S., & Fawcus, R.
(1993). Advanced input methods for people with cerebral palsy: A vision of the future. Proceedings of the 16th Annual RESNA Conference (pp. 99-101). Las Vegas, USA: RESNA.

(4) van Balkom, H., Kamphuis, H., Demasco, P., & Foulds, R. (in preparation). Language technology in AAC: Automatic translation of graphic symbols into text and/or synthesized speech.

ACKNOWLEDGEMENTS

This work has been supported by a Rehabilitation Engineering Center Grant from the National Institute on Disability and Rehabilitation Research (#H133E30010). Additional support has been provided by the Nemours Foundation. The authors would like to thank the student collaborators and the staff at the HMS School for Children with Cerebral Palsy, Philadelphia, for their interest and participation.

Peter Vanderheyden
Applied Science and Engineering Laboratories
A. I. duPont Institute
P.O. Box 269
Wilmington, DE 19899 USA
Internet: vanderhe@asel.udel.edu