The Language Representation Database Project

We are creating a central database that provides access to various lexical information sources (e.g. WordNet, the Brown Corpus, a phonetic module). By querying the database, we will be able to extract customized lexicons for use with multiple applications.

Table of Contents

Background
Purpose
Design
Diagram
Results
Publications
Acknowledgements

Contributors

Pat Demasco, Kathy McCoy, Chris Pennington, Wendy Zickus

Lisa Michaud -- michaud@asel.udel.edu

Last modified: Fri Jan 30 15:47:28 EST 1998

Background

Interest in lexical semantics has grown considerably over the past three decades, yet there remains some skepticism that a general and practical way of representing word meaning can be found. The development of computerized on-line lexicons has prompted further research into the properties and attributes these lexicons should encompass.

What should an on-line lexicon contain? If a large lexicon carries too many attributes, access time will be intolerable for many practical purposes; if it carries too few, its usefulness is limited. A second question is how to organize and group the words within the lexicon. Do we simply load words into a large database alphabetically, organize them into synonym sets (as was done in WordNet), or organize certain classes of words into groups (as Beth Levin has done with her English verb classes)?

Development for this project has so far focused on surveying some of the lexical resources and database packages currently available (e.g. WordNet, MobyWords, the Brown Corpus, Postgres(tm), POET(tm)) as background for the design of the Language Representation Database.

Purpose

The purpose of this project is to create a centralized, multi-component, object-oriented database from which customized lexicons can be extracted for the various ongoing projects at the Applied Science and Engineering Laboratories.

Design

Currently we are exploring the use of POET(tm) as our database software package. Interfaces need to be created to access the various lexical sources we will be using, such as WordNet, the Brown Corpus, and a phonetic module. Depending on the dictionary needs of a specific project, a query will be sent to the Language Representation Database and a lexicon fulfilling those needs will be extracted. The design of the Language Representation Database still needs further specification at this point.
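The flow described above (source interfaces feeding one central database, with a project-specific query pulling out a smaller lexicon) can be illustrated with a minimal sketch. It is not the project's design, which is still being specified, and it does not use POET(tm). Every name in it (`LexicalSource`, `DictSource`, `LexiconQuery`, `CentralDatabase`, `extract_lexicon`) is hypothetical, and the two in-memory sources hold invented toy entries rather than data from WordNet or the Brown Corpus.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Protocol


class LexicalSource(Protocol):
    """Hypothetical interface each lexical source would sit behind."""

    def lookup(self, word: str) -> dict:
        """Return whatever attributes this source holds for `word`."""
        ...


@dataclass
class DictSource:
    """Toy in-memory source standing in for a real one (assumption)."""
    entries: dict

    def lookup(self, word: str) -> dict:
        return self.entries.get(word, {})


@dataclass
class LexiconQuery:
    """One project's dictionary needs: which entries, which attributes."""
    attributes: list
    keep: Callable[[dict], bool]


class CentralDatabase:
    """Merges every source's view of a word into a single entry."""

    def __init__(self, sources: Iterable[LexicalSource]):
        self.sources = list(sources)

    def entry(self, word: str) -> dict:
        merged = {}
        for source in self.sources:
            merged.update(source.lookup(word))
        return merged

    def extract_lexicon(self, words: Iterable[str], query: LexiconQuery) -> dict:
        lexicon = {}
        for word in words:
            entry = self.entry(word)
            if query.keep(entry):
                # Keep only the attributes the requesting project asked for.
                lexicon[word] = {a: entry[a] for a in query.attributes if a in entry}
        return lexicon


# Toy data only: these entries are invented for illustration.
semantic = DictSource({"cut": {"pos": "verb", "synonyms": ["slice"]},
                       "dog": {"pos": "noun", "synonyms": ["hound"]}})
phonetic = DictSource({"cut": {"syllables": 1}, "dog": {"syllables": 1}})

db = CentralDatabase([semantic, phonetic])
verbs_only = LexiconQuery(attributes=["synonyms", "syllables"],
                          keep=lambda e: e.get("pos") == "verb")
lexicon = db.extract_lexicon(["cut", "dog", "xyzzy"], verbs_only)
print(lexicon)
```

In this sketch the extracted lexicon carries only the attributes the query names, which is one way to keep a project's lexicon small even when the central database stores many attributes per word.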

Diagram

Results

Our future work will focus first on making the interfaces between the database and the various lexical sources operational. We can then turn to implementing the extraction of smaller, customized lexicons.

Publications

Zickus, W. M. (1995). A software engineering approach to developing an object-oriented lexical access database and semantic reasoning module. Technical Report 95-13, Department of Computer and Information Sciences, University of Delaware, Newark, DE.
[abstract], [text (182K)], [postscript (467K)]

Zickus, W. M., McCoy, K. F., Demasco, P. W., & Pennington, C. A. (1995). A lexical database for intelligent AAC systems. In A. Langton (Ed.), Proceedings of the RESNA '95 Annual Conference (pp. 124-126). Arlington, VA: RESNA Press.
[abstract], [text (14K)], [postscript (86K)]

Zickus, W. M. (1994). A comparative analysis of Beth Levin's English verb class alternations and WordNet's senses for the verb classes HIT, TOUCH, BREAK, and CUT. In Proceedings of The Post-Coling94 International Workshop on Directions of Lexical Research (pp. 66-74). Beijing, China: Tsinghua University.
[abstract], [text (30K)], [postscript (98K)]

Acknowledgements

This work has been supported by a Rehabilitation Engineering Center grant from the National Institute on Disability and Rehabilitation Research. Additional support has been provided by the Nemours Foundation.