Iconic Interfaces in Intelligent AAC Systems

Bruce R. Baker, Semantic Compaction Systems
Eric H. Nyberg 3rd, Carnegie Mellon University/Semantic Compaction
Systems
Kathleen F. McCoy, ASEL, University of Delaware/A.I. DuPont Institute
Patrick W. Demasco, ASEL, University of Delaware/A.I. DuPont Institute

(c) 1990 RESNA Press. Reprinted with permission.

Abstract

The problems in developing a system with the complete linguistic
ability of a human speaker are overwhelming given the current state
of technology and linguistic theory. However, some of the issues
involved can be finessed by providing the system with certain kinds of
linguistic information. Using multi-meaning sequenced icons as an
input medium allows the system to get much information out of few
actuations. This paper discusses the use of such an interface in an
intelligent AAC system.

Multi-Meaning Sequenced Icons

The ideal interface represents input selections transparently and
has a relatively small number of input choices or keys. An interface
intended for individuals who are cognitively or physically impaired
must generate language with a low number of actuations. Letters
have serious problems as an input medium because they require six or
more actuations per content word, and AAC operators usually select
only one key every 5 to 8 seconds. In addition, congenitally speech
impaired people often have weak reading and spelling skills. As an
alternative to letters, single-meaning pictures have serious
limitations, because hundreds of pictures are required to instantiate
even the simplest vocabulary (with only one lexical item per picture).

A patented technique called semantic compaction approaches the problem
of sentence generation by interpreting icons to have different
meanings in different contexts (Baker, 82), (Baker, 84). This
technique exploits the natural polysemy inherent in illustrations of
real-life objects. For example, an apple is not only an apple, it is
also red, a fruit, and round. When used in a sentence describing a
favorite color, an apple icon can be interpreted as indicating the
color red. In general, this approach allows operators to access a
much larger number of concepts with the same number of input keys when
multi-meaning icons are used in place of letters, words or
single-meaning pictures.

For both cognitively impaired and cognitively intact individuals,
semantic compaction has been thoroughly explored using simple
transducer programs to retrieve pre-stored language items when
particular sequences of icons are actuated. For example, on a system
designed for a cognitively intact operator, an average of 2 actuations
is required to retrieve a word (3 or 4 actuations for a template
sentence). Such a system supports a substantial (greater than 50%)
reduction in the physical effort required for communication, using an
input layout with fewer than 100 keys. The cognitive requirements
are substantially reduced as well.  This is a major improvement, but
it is not enough. If the sentence to be generated has 8 words (not an
unreasonable sentence length), then this type of system will still
require 16 actuations to create it. If the operator needs 8 seconds
for each actuation (which is the case for a large population of
operators), then an 8 word sentence will still take roughly 128
seconds, or over two minutes.
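
The arithmetic above can be reproduced with a short sketch. The
function name and its parameters are illustrative assumptions, not
part of any system described here; the figures themselves (8 words,
2 actuations per word, 5 to 8 seconds per actuation) are taken from
the text.

```python
def sentence_time_seconds(words, actuations_per_word, seconds_per_actuation):
    """Estimate the time needed to produce a sentence by icon sequences."""
    return words * actuations_per_word * seconds_per_actuation


# An 8-word sentence at 2 actuations per word is 16 actuations.
# Operators are described as needing 5 to 8 seconds per actuation.
for seconds in (5, 8):
    total = sentence_time_seconds(8, 2, seconds)
    print(f"{seconds} s per actuation: {total} s per sentence")
```

Even at the faster end of the range, the sentence takes well over a
minute, which is why retrieval by icon sequence alone is described as
not enough.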

Symbol Parsing

Existing AAC devices allow the operator to actuate certain choices or
sequences of choices on an input device to retrieve pre-stored
language items (single words or template sentences). While this is
certainly useful, it does not help the cognitively impaired operator
who does not have intact syntactic knowledge. The system will output
only the words that are retrieved by the operator, in exactly the
order that they are accessed, whether or not the resulting utterance
is meaningful. This places most of the cognitive burden of
communication on the operator, who must provide syntactic, stylistic
and pragmatic information to produce an utterance. Systems that accept
sequences of symbols or icons and produce words or sentences in the
same order can be called symbol parsing or transduction systems.
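
A minimal sketch of such a transduction system follows. The icon
names, the sequences, and the table of pre-stored items are all
invented for illustration and do not describe any actual device; the
point is only that the output order mirrors the input order.

```python
# Hypothetical table mapping icon sequences to pre-stored language items.
PRESTORED = {
    ("person", "name"): "John",
    ("shoe", "verb"): "go",
    ("house", "money"): "store",
    ("clock", "back"): "yesterday",
}


def transduce(actuations, table=PRESTORED):
    """Retrieve pre-stored items by longest matching icon sequence.

    Items are emitted in exactly the order they are retrieved; no
    syntactic or semantic processing is applied.
    """
    max_len = max(len(seq) for seq in table)
    output = []
    i = 0
    while i < len(actuations):
        for n in range(max_len, 0, -1):
            seq = tuple(actuations[i:i + n])
            if seq in table:
                output.append(table[seq])
                i += n
                break
        else:
            i += 1  # No sequence starts here: skip this actuation.
    return " ".join(output)


print(transduce(["person", "name", "shoe", "verb",
                 "house", "money", "clock", "back"]))
print(transduce(["clock", "back", "house", "money",
                 "shoe", "verb", "person", "name"]))
```

The second call retrieves the same four items in reverse order, and
the transducer outputs them in that order even though the result is
not a meaningful sentence.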

Intelligent Word Parsing

It is becoming possible to develop more intelligent systems which take
advantage of artificial intelligence and linguistic techniques to
reduce the syntactic load and other cognitive burdens placed on the
operator (Demasco et al., 89). The ideal system could require only the
content words of an intended utterance and yet produce a syntactically
and semantically well formed sentence. Such an intelligent parser has
several tasks. Consider the input: John go store yesterday.  The
system must have semantic information about the individual words to
determine that go is a verb, John is an agent of the action (since go
must take an animate agent), store is the object (since physical
locations are things that can be gone to), etc. (McCoy et al., 89a).

Once the semantic roles have been deduced, a natural language
generation system could be used to fill in necessary syntactic
information (e.g., determiners, prepositions, necessary verb
inflections) to produce a sentence such as: John went to the store
yesterday (McCoy et al., 89b).
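
The two steps, deducing semantic roles and then generating a
well-formed sentence, can be illustrated with a toy sketch. It is not
the parser or generator of the cited work: the lexicon, its feature
names, and both functions are assumptions made up for this one
example, and the sketch handles little beyond it.

```python
# Hypothetical lexicon with just enough semantic information for the example.
LEXICON = {
    "John": {"category": "noun", "animate": True},
    "go": {"category": "verb", "past": "went", "present_3sg": "goes",
           "preposition": "to"},
    "store": {"category": "noun", "location": True},
    "yesterday": {"category": "time", "tense": "past"},
}


def parse(words, lexicon=LEXICON):
    """Assign semantic roles from lexical features, not from word order."""
    frame = {"verb": None, "agent": None, "goal": None, "time": None}
    for word in words:
        entry = lexicon[word]
        if entry["category"] == "verb":
            frame["verb"] = word
        elif entry["category"] == "time":
            frame["time"] = word
        elif entry.get("animate") and frame["agent"] is None:
            frame["agent"] = word  # "go" must take an animate agent
        elif entry.get("location"):
            frame["goal"] = word   # locations can be gone to
    return frame


def generate(frame, lexicon=LEXICON):
    """Fill in the verb inflection, preposition, and determiner."""
    verb = lexicon[frame["verb"]]
    is_past = (frame["time"] is not None
               and lexicon[frame["time"]].get("tense") == "past")
    parts = [frame["agent"], verb["past"] if is_past else verb["present_3sg"]]
    if frame["goal"]:
        parts += [verb["preposition"], "the", frame["goal"]]
    if frame["time"]:
        parts.append(frame["time"])
    return " ".join(parts) + "."


print(generate(parse("John go store yesterday".split())))
```

Because roles are assigned from the lexical features, the same frame
results if the operator selects the content words in a different
order.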

An intelligent word parser could lift much of the cognitive load
from the operator since s/he would no longer be concerned with
syntactic information. The system would be constrained to produce
semantically and syntactically well-formed sentences.

Of course, the production of such utterances is not without a
cost. The system requires large amounts of knowledge and powerful
inferencing techniques in order to produce the utterances.

Icons and Intelligent Parsing

By combining a scaled-down version of an intelligent parser with
multi-meaning sequenced icons, it is possible to make a usable,
intelligent AAC system a reality. A well-chosen geographic layout
can provide the system with syntactic and semantic information based
on the locations of the icons that are selected, thus reducing the
knowledge and inferencing required by the intelligent parser.

While this layout strategy builds on the basic idea behind the
Fitzgerald key (Musselwhite & St. Louis, 82), it differs in several
important ways.

o The symbols in the proposed system are multi-meaning.

o The user need not select all of the words in the sentence to be
generated -- the word parser is responsible for automatically adding
function words that are required for purely syntactic reasons.

o The arrangement helps the system itself, by supplying syntactic and
semantic information (although it is likely to have benefits for
language-impaired operators as well).
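
As a rough sketch of how location can carry role information, the
following tags each selection with the role of the board region it
came from. The column numbers, the region boundaries, and the function
name are hypothetical and do not describe an actual board.

```python
# Hypothetical board: each column index belongs to one role region.
COLUMN_ROLES = {
    0: "agent", 1: "agent", 2: "agent",
    3: "patient", 4: "patient", 5: "patient",
}


def tag_selections(selections, column_roles=COLUMN_ROLES):
    """Group selected icons by the role implied by their board location."""
    tagged = {}
    for column, icon in selections:
        role = column_roles[column]
        tagged.setdefault(role, []).append(icon)
    return tagged


# The same icon picture yields a different role in a different region.
print(tag_selections([(0, "apple"), (4, "apple")]))
```

With the role supplied by the location, the parser does not have to
infer it from semantic knowledge alone.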

A large vocabulary can be represented by combining icons. The icons
in three 8-icon columns can combine to produce 300 agents. Similarly,
the same 24 icons can combine to produce 300 patients if they are
situated in a different location to distinguish them from the agent
icons. Two columns of 8 icons representing modifiers or function words
can be placed at crucial parts of the board. The central notion behind
this layout is that the icons can be sequenced to access a large
number of unique strings. It is the multi-meaning nature of icons, and
their ability to form sequences that indicate a single notion, which
gives them their combinatorial expressive power. If we were using
single-meaning pictures, we would only be able to represent 24 agents
and 24 patients with the same layout. It is important to note that the
usefulness of an intelligent system will be defeated if it makes use
of a burdensome interface that places a high physical and/or cognitive
load on the operator while placing severe restrictions on the depth of
his/her vocabulary. For this reason, multi-meaning icons seem to be
the most appropriate interface media for intelligent AAC systems.
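
The text does not state how the figure of 300 agents from 24 icons is
reached. One counting that happens to give exactly 300, offered here
as an assumption rather than as the authors' stated method, is every
single icon plus every unordered pair of two different icons. The
function name is illustrative.

```python
from math import comb


def count_codes(icons):
    """Count single icons plus unordered pairs of two different icons."""
    return icons + comb(icons, 2)


print(count_codes(24))  # 24 singles + 276 pairs
```

Under the same assumption, single-meaning pictures correspond to
counting the singles only, which gives the 24 agents mentioned above.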

Conclusion

We have described a method that synthesizes iconic interfaces with
intelligent parsing. The result is a system which combines the
advantages of both with minimal constraints. By keeping the
vocabulary at approximately 1000 words, it is possible to implement
this system on current generation hardware. The result would be an
intelligent communication system useful to a large number of people
with disabilities.

Acknowledgments

This work is partially supported by Grant #H133E80015 from the
National Institute on Disability and Rehabilitation
Research. Additional support has been provided by the Nemours
Foundation.

References

B. Baker. Minspeak. Byte, 186ff, September, 1982.

B. Baker. Semantic compaction for sub-sentence vocabulary units
compared to other encoding and prediction systems. In Proceedings of
the 10th Conference on Rehabilitation Technology, pages 118-120,
RESNA, San Jose, CA, 1984.

P. Demasco, K. McCoy, Y. Gong, C. Pennington, and C. Rowe. Towards
more intelligent AAC interfaces: the use of natural language
processing. In Proceedings of the 12th Annual Conference, pages
141-142, RESNA, New Orleans, Louisiana, June 1989.

K. McCoy, P. Demasco, Y. Gong, C. Pennington, and C. Rowe. A Semantic
Parser for Understanding Ill-Formed Input. In Proceedings of the 12th
Annual Conference, pages 145-146, RESNA, New Orleans, Louisiana, June
1989.

K. McCoy, P. Demasco, Y. Gong, C. Pennington, and C. Rowe. Toward A
Communication Device Which Generates Sentences. In Proceedings of the
12th Annual Conference, pages 141-142, RESNA, New Orleans, Louisiana,
June 1989.

C. Musselwhite and K. St. Louis. Communication Programming for the
Severely Handicapped: Vocal and Non-Vocal Strategies. College Hill
Press, San Diego, CA, 1982.


Contact

Bruce Baker
Semantic Compaction Systems
801 McNeilly Road
Pittsburgh, PA 15226