A Domain Independent Semantic Parser For Compansion

Kathleen McCoy, Patrick Demasco, Mark Jones, Christopher Pennington & Charles Rowe
Applied Science and Engineering Laboratories
University of Delaware / Alfred I. duPont Institute
Wilmington, Delaware, USA

Introduction

This work is part of an augmentative communication project being conducted at the Applied Science and Engineering Laboratories of the University of Delaware and the A.I. duPont Institute. The goal of the project is to increase the communication rate of physically disabled individuals through natural language processing techniques. We wish to take as input a compressed message (i.e., one containing mainly the content words of the desired utterance) from the disabled individual and yet pass a syntactically and semantically well-formed utterance to a speech synthesizer or text preparation system. At the same time, we wish to place as little burden on the user as possible (Demasco et al., 89). Thus, we are not interested in a simple coding system (cf. (Baker, 82)) in which sentences are stored and simply indexed by their content words.

The system is broken into several phases. The first phase, the semantic parser (see also (Small & Rieger, 82)), is responsible for determining the semantic role played by each input word. It must determine which word is the verb, what role each noun phrase plays with respect to the verb (e.g., actor, theme), and what modification relationships are present. The resulting semantic representation is then passed to the translation component, which replaces the semantic terms with their language-specific instantiations. The final phase of the processing is a sentence generator, which forms a syntactically correct sentence. This paper is concerned primarily with the semantic parser.

Knowledge and Processing

In (McCoy et al., 89) we reported on a semantic parser that relied primarily upon the domain of discourse. As the lexicon grew, this solution proved unsatisfactory: it was difficult to constrain the domain of discourse, and the knowledge management for an arbitrary number of possible domains proved unwieldy. A new approach is being developed which identifies the roles that words can play in a domain-independent manner.

The problem faced by the parser is the ambiguity that results from the compressed input, which contains no syntactic information. Consider the input "John study red house". The first problem faced by the system is determining the function of each word in the sentence. Each individual word can have different semantic classifications, and thus its function in the sentence may be ambiguous. For example, the word "study" has two meanings: an action, as in "John studies", or a location, as in "John is in his study".

In order to recognize all possible word meanings (and to constrain further processing) we classified all words into five categories according to the different semantic functions a word can play: Actions (verbs), Objects, Descriptive Lexicon (adjectives), Modifiers to the Descriptive Lexicon (adverbs), and Prepositions. Within each category, the words are further classified so that finer semantic distinctions can be made. If each individual word is looked at in isolation, it usually has many interpretations. However, if the input words are taken together, many of the interpretations can be eliminated.
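As a rough illustration of the kind of lexicon this classification implies, the following Common Lisp fragment sketches one way word senses could be stored and queried. The function names, category keywords, and sense labels here are invented for exposition; they are not taken from the actual implementation.

    ;;; Illustrative sketch only -- not the system's actual code.  Each word
    ;;; may carry several semantic classifications (category . type); the
    ;;; labels used here are assumptions made for the example.
    (defparameter *lexicon* (make-hash-table :test #'equal))

    (defun add-word (word &rest senses)
      "Associate WORD with a list of (CATEGORY . TYPE) senses."
      (setf (gethash word *lexicon*) senses))

    (defun senses (word &optional category)
      "Return all senses of WORD, or only those in CATEGORY."
      (let ((all (gethash word *lexicon*)))
        (if category
            (remove-if-not (lambda (s) (eq (car s) category)) all)
            all)))

    ;; "study" is ambiguous between an action and an object (a location).
    (add-word "john"  '(:object . :human))
    (add-word "study" '(:action . :study-act) '(:object . :location))
    (add-word "red"   '(:descriptive . :color))
    (add-word "house" '(:object . :phy-obj))

    ;; Only "study" has a sense in the action (verb) category:
    ;;   (senses "study" :action)  =>  ((:ACTION . :STUDY-ACT))
    ;;   (senses "house" :action)  =>  NIL

Looking up each input word against all five categories in this way yields the full set of candidate senses that the later processing stages must prune.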
In the above example, while "study" may be either an object or a verb, no other word in the sentence occurs in the verb hierarchy; thus "study" is taken as the verb. Once the verb of the sentence has been determined (1), additional knowledge sources can be applied to further reduce ambiguity. The main verb predicts much of the structure of the sentence. It dictates, in a top-down fashion, which semantic roles are mandatory and which roles should never appear, as well as type information concerning the possible fillers of each role. For example, the verb "go" cannot have a THEME case in the semantic structure. Furthermore, it cannot have a FROM-LOC case unless it also has a TO-LOC. This information is captured with frame predictions associated with each verb in the verb knowledge base.

Based on the particular verb, a set of semantic structures (called frames) is created. Each frame encodes one possible interpretation of the current sentence and contains typed variables which may be filled with words from the input. Each variable type corresponds to a type of possible input word (i.e., the types are taken from one of the system knowledge bases). For example, the frame associated with the word "study" indicates that the AGENT must be human and that it can take a THEME (some abstract or physical object) and a LOCATION (a physical place). The AGENT is required, while THEME and LOCATION are optional. Once a frame has been chosen, bottom-up processing attempts to fill out the frame by fitting the input words into the appropriate slots. A frame is considered satisfactory if all of its obligatory roles can be filled.

A final source of ambiguity must be dealt with. So far we have considered words of the input modifying the verb by playing a role with respect to it. We must also consider the possibility of one input word serving to modify the meaning of another input word. Because a given word can only modify certain types of other words, we attach each word that can be a modifier to the types of words it can possibly modify. For example, the word "red" is attached to the type PHY-OBJ (PHYsical-OBJect) in the object knowledge base. Therefore, "red" can modify the word "house", since "house" is a PHY-OBJ, but it cannot modify the word "idea", since "idea" is not. Using the modifier information, the system attempts to associate any modifier words with the words of the input that they modify. In this way, the "red" from the input is attached as a modifier to the word "house".

Using the above processing and knowledge sources, the parser comes up with two possible interpretations of the input: "John studies at the red house" (here the red house is taken as a location) and "John studies the red house" (here the red house is taken as the thing that John is studying). Notice that both interpretations account for all input words and fill all obligatory roles associated with the chosen verb.
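To make the idea of frame prediction and bottom-up slot filling more concrete, the following Common Lisp fragment sketches one possible representation of the frame predicted by "study" and the filling test applied to it. The slot layout, type predicates, and word lists are assumptions made purely for illustration and are not taken from the implementation.

    ;;; Illustrative sketch only -- slot layout, type predicates, and word
    ;;; lists are assumptions for this example, not the system's code.
    (defun human-p (w)  (member w '("john") :test #'string-equal))
    (defun object-p (w) (member w '("house" "weather" "paper") :test #'string-equal))
    (defun place-p (w)  (member w '("house" "university" "boston") :test #'string-equal))

    (defstruct frame
      verb
      (slots nil))   ; each slot is (role required-p type-test filler)

    (defun study-frame ()
      "Frame predicted by \"study\": AGENT is obligatory; THEME and LOCATION are optional."
      (make-frame :verb "study"
                  :slots (list (list :agent    t   #'human-p  nil)
                               (list :theme    nil #'object-p nil)
                               (list :location nil #'place-p  nil))))

    (defun fill-slot (frame role word)
      "Put WORD into ROLE if the slot is empty and WORD passes its type test."
      (let ((slot (assoc role (frame-slots frame))))
        (when (and slot (null (fourth slot)) (funcall (third slot) word))
          (setf (fourth slot) word))))

    (defun frame-satisfied-p (frame)
      "A frame is satisfactory when every obligatory role has been filled."
      (every (lambda (slot) (or (not (second slot)) (fourth slot)))
             (frame-slots frame)))

    ;; "red" has already been attached to "house" (a PHY-OBJ) as a modifier,
    ;; so the pair acts as a single filler.  Because "house" passes both the
    ;; THEME and the LOCATION type tests, two satisfactory frames result,
    ;; corresponding to the two interpretations above:
    ;;   (let ((f (study-frame)))
    ;;     (fill-slot f :agent "john")
    ;;     (fill-slot f :theme "house")      ; or :location
    ;;     (frame-satisfied-p f))            ; => T

In the actual system the frames, type tests, and fillers are of course drawn from the knowledge bases described above; the sketch shows only the control structure of top-down prediction followed by bottom-up filling.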
Current Functionality

The current system is able to recognize different uses of a verb depending on the modifiers present in the input. Consider "John look Boston" and "John look tired". The final outputs are, respectively, "John looks at Boston" and "John looks tired". Notice that "look" can be used in two very different ways; the system determines which use is intended by considering the rest of the input. If there is an OBJect or LOCation to look at, the first use is recognized. If this is not the case, but a modifier that may modify the agent occurs in the input, the second use is recognized.

As an example of how the entire input is used to constrain possible interpretations, consider "John study weather university". In isolation, "university" could be either a LOCATION or the thing being studied (a THEME). However, since "weather" can only play the THEME role with respect to "study", the system correctly generates "John studies weather at the university".

In addition, the present system can infer the verb intended by the user, choosing between "have" and "be" as verb candidates depending on the given input elements. Given the input "John tired", the final output is "John is tired"; given the input "John paper", the final output is "John has the paper". The system will also infer the actor (subject) of the intended sentence. In particular, if no agent is given, the system infers the user to be the agent (and thus generates a first-person pronoun for that slot). Given the single-word input "hungry", the system determines the intended verb and the intended agent and generates the sentence "I am hungry".

Conclusion

A prototype implementation of the system has been completed and is currently being evaluated. The lexicon currently contains several hundred words of various types. The implementation is in Common Lisp and runs on several different hardware platforms. The evaluation currently includes a paper experiment to determine whether or not the current system mirrors the actions of a human. Future plans for evaluation include testing the prototype system with potential users.

Acknowledgments

This work is supported by Grant Number H133E80015 from the National Institute on Disability and Rehabilitation Research. Additional support has been provided by the Nemours Foundation.

References

B. Baker. Minspeak. Byte, pages 186ff, September 1982.

P. Demasco, K. McCoy, Y. Gong, C. Pennington, and C. Rowe. Towards more intelligent AAC interfaces: The use of natural language processing. In Proceedings of the 12th Annual RESNA Conference, pages 141-142, New Orleans, Louisiana, June 1989.

K. McCoy, P. Demasco, Y. Gong, C. Pennington, and C. Rowe. A semantic parser for understanding ill-formed input. In Proceedings of the 12th Annual RESNA Conference, pages 145-146, New Orleans, Louisiana, June 1989.

S. Small and C. Rieger. Parsing and comprehending with word experts (a theory and its realization). In Wendy G. Lehnert and Martin H. Ringle, editors, Strategies for Natural Language Processing, 1982.

Endnotes

(1) In the case that the system identifies more than one word that can fill the role of the main verb, it will spawn an independent process for each verb candidate.

Contact

Kathleen McCoy
Dept. of Computer and Information Sciences
University of Delaware
Newark, DE 19716
Email: mccoy@udel.edu