Study on Voice Controlled Computing
Introduction
This paper summarizes a research project studying
some of the issues involved in controlling a computer by speaking to it
in a human language. I did this study during my senior year as a computer
science major at Bob Jones University, under the direction of
Mr. Dan Wooster.
In the original
plan for the project, the software I was going to develop was to be
integrated with the operating system. The computer would accept input
from either the keyboard or a speech recognition unit. It would then
determine whether the input was to be fed directly into an application
or translated into a command for the operating system to perform.
The figure below illustrates this plan.
Partway through the semester, it became obvious that
the project was far too big to finish in one semester at the undergraduate
level. So I scaled it back and decided to simply write a Linux shell
that would take English sentences and execute the intended operating system
command. That is a much more reasonable short-term goal, but it still
involves a great deal of complexity. First, the shell must make sense
of the sentence; then it must match that meaning to a shell command or
application.
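The two steps just described can be sketched roughly as follows. This is only an illustration in Python, not the PROLOG code written for the project: the filler-word list, the verb-to-command table, and the function names `understand` and `to_command` are all assumptions made for the example. The sketch builds the command as an argument list and does not execute it.

```python
# Hypothetical two-step pipeline: sentence -> meaning -> command.
# The vocabulary and the command table are illustrative assumptions,
# not the lexicon or mappings used in the actual project.

# Words that carry no meaning for this toy example.
FILLER = {"the", "a", "an", "my", "file", "files"}

# Step 2 lookup table: imperative verb -> base shell command (assumed).
COMMANDS = {
    "list": ["ls"],
    "show": ["cat"],
    "delete": ["rm"],
}


def understand(sentence):
    """Step 1: reduce an imperative sentence to (verb, objects)."""
    cleaned = sentence.lower().rstrip(".!")
    words = [w for w in cleaned.split() if w not in FILLER]
    if not words:
        return None
    return words[0], words[1:]


def to_command(meaning):
    """Step 2: match the meaning to a command; None if nothing matches."""
    if meaning is None:
        return None
    verb, objects = meaning
    if verb not in COMMANDS:
        return None
    return COMMANDS[verb] + objects
```

For instance, `to_command(understand("Delete the file report.txt."))` produces the argument list for `rm report.txt`, while a sentence whose verb is not in the table yields `None`. A real shell would need a far richer notion of meaning than a verb followed by its objects; that gap is where the complexity mentioned above comes from.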
That kind of "understanding" in a piece of software
pushes us into the realm of artificial intelligence. To write these "smart"
pieces of the project, I chose to use PROLOG, following Dr.
Ray C. Dougherty's work, which he presents in Natural Language Computing:
An English Generative Grammar in Prolog. Dr. Dougherty first
explains the use of PROLOG, which I attempt to summarize later in this
report. But his background is in linguistics, and that is the area
in which his book is most helpful. He discusses the creation of a grammar
in PROLOG, and I have modeled a simple grammar for this project
on his. Following Dr. Dougherty's method, I wrote a separate
lexicon of the words that will be recognized by the shell. The lexicon
I include here is very small and useful only for demonstration purposes.
Large lexicons should not be written as PROLOG facts, for reasons I explain
later.
The grammar I wrote successfully tells whether a
sentence is grammatical, provided it has one of the forms the grammar covers.
But it does not label the parts of the sentence structure to help a semantic
analysis unit attach meaning to the sentence. One of Dr. Dougherty's example
programs does output the structure, marking it with many parentheses.
That may be the best one can hope for.
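To make the difference between recognizing a sentence and labeling its structure concrete, here is a small Python sketch (again, not the PROLOG grammar from this project). The grammar rules, the five-word lexicon, and all function names are assumptions chosen for illustration. The `parse` function returns a parenthesized structure, and `recognize` throws that structure away and reports only yes or no.

```python
# Tiny illustrative lexicon: word -> category (assumed words).
LEXICON = {
    "the": "det", "a": "det",
    "user": "n", "file": "n", "shell": "n",
    "opens": "v", "deletes": "v",
}


def parse_np(words, i):
    """NP -> det n. Return (tree, next_index), or None on failure."""
    if (i + 1 < len(words)
            and LEXICON.get(words[i]) == "det"
            and LEXICON.get(words[i + 1]) == "n"):
        return f"(np (det {words[i]}) (n {words[i + 1]}))", i + 2
    return None


def parse_vp(words, i):
    """VP -> v np | v. Return (tree, next_index), or None on failure."""
    if i < len(words) and LEXICON.get(words[i]) == "v":
        verb = f"(v {words[i]})"
        obj = parse_np(words, i + 1)
        if obj:
            return f"(vp {verb} {obj[0]})", obj[1]
        return f"(vp {verb})", i + 1
    return None


def parse(sentence):
    """S -> np vp. Return the bracketed structure, or None."""
    words = sentence.lower().split()
    np = parse_np(words, 0)
    if not np:
        return None
    vp = parse_vp(words, np[1])
    # Reject the sentence if any words are left over after the VP.
    if not vp or vp[1] != len(words):
        return None
    return f"(s {np[0]} {vp[0]})"


def recognize(sentence):
    """Recognizer only: is the sentence grammatical under these rules?"""
    return parse(sentence) is not None
```

Calling `parse("the user opens a file")` returns `(s (np (det the) (n user)) (vp (v opens) (np (det a) (n file))))`. A semantic analysis step could read the labels `np`, `vp`, and `v` from that output, which a bare true-or-false answer does not provide.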
For more on PROLOG, this study, and suggestions for future research, see
the outline of this report. For more information
from a linguist's perspective, read:
Dougherty, Ray C. Natural Language Computing: An English Generative
Grammar in Prolog. 1994, Hillsdale, New Jersey, Lawrence
Erlbaum Associates.