In this discussion we looked at recursive definitions for arithmetic expressions, compared top-down and bottom-up definitions, and tried to write an expression evaluator.
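In Section T.2 we had the following definition of arithmetic expressions, terms, and factors, given here in the form that the evaluator and proof below follow:
An expression is either a term, or a term followed by a plus sign and an expression.
A term is either a factor, or a factor followed by a times sign and a term.
A factor is either an atom, or an expression enclosed in parentheses.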
This is a top-down recursive definition, unlike our recursive definitions for naturals or strings. The top-down definition translates much more easily to a recursive algorithm to determine whether a string is an expression, or to compute some recursively defined property of the expression, such as its value, which is built up from the evalAtom values of its atoms.
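As a rough illustration of the first claim, here is a minimal sketch of a recognizer that follows the top-down definition clause by clause. It assumes the grammar that the evaluator in this section uses (an expression is a term or a term plus an expression, and so on). The class name ExpressionRecognizer, the whitespace-separated input, the digits-only atoms, and the use of * for the times sign are all assumptions made for the example, not part of the discussion.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Hypothetical recognizer: decides whether a token sequence is an expression.
class ExpressionRecognizer {
    private final Deque<String> tokens;

    private ExpressionRecognizer(String input) {
        String trimmed = input.trim();
        tokens = trimmed.isEmpty()
                ? new ArrayDeque<>()
                : new ArrayDeque<>(Arrays.asList(trimmed.split("\\s+")));
    }

    // True iff the whole input is exactly one expression.
    static boolean isExpression(String input) {
        ExpressionRecognizer r = new ExpressionRecognizer(input);
        return r.expression() && r.tokens.isEmpty();
    }

    // expression: a term, or a term followed by + and an expression
    private boolean expression() {
        if (!term()) return false;
        if ("+".equals(tokens.peek())) {
            tokens.pop();
            return expression();
        }
        return true;
    }

    // term: a factor, or a factor followed by * and a term
    private boolean term() {
        if (!factor()) return false;
        if ("*".equals(tokens.peek())) {
            tokens.pop();
            return term();
        }
        return true;
    }

    // factor: an atom, or an expression in parentheses
    private boolean factor() {
        String next = tokens.poll();            // null when the input is exhausted
        if (next == null) return false;
        if (next.matches("\\d+")) return true;  // assumed atom syntax: digits only
        return next.equals("(") && expression() && ")".equals(tokens.poll());
    }
}
```

Each private method mirrors one clause of the definition, which is the sense in which the top-down form translates directly into code. For instance, ExpressionRecognizer.isExpression("( 2 + 3 ) * 4") accepts, while the inputs "2 +" and "2 3" are rejected.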
You're asked to build an expression evaluator using the following methods:
getToken() removes and returns the next Token from the input. A Token is an atom or one of the characters in the set {+, ×, (, )}. If there is no next token or the next token is invalid, getToken throws an exception, which you need not handle.
eof() returns true if and only if there are no tokens remaining in the input.
peek() returns the Token object that is next in the input, without removing it from the input. It throws an exception if there is no such token. This method was not given to you in the actual discussion, and it should have been, because you need it or something similar to solve the problem.
isAtom, isPlus, isTimes, isLparen, and isRparen are methods in the Token class that return true if the Token is an atom or is the given character, respectively.
evalAtom is a method in the Token class that returns a double giving that token's value if it is an atom. It throws an exception if the token is a character rather than an atom; your code should prevent this from happening.
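To make these method descriptions concrete, here is one possible sketch of a Token class and an input source that behave as described. Everything beyond the listed method names is an assumption for illustration: the TokenStream class, the isValid helper, whitespace-separated input, decimal-number atoms, the use of * for the times sign, and the choice of IllegalStateException.

```java
// Hypothetical Token: wraps the text of an atom or of one of + * ( ).
class Token {
    private final String text;

    Token(String text) { this.text = text; }

    boolean isAtom()   { return text.matches("\\d+(\\.\\d+)?"); } // assumed atom syntax
    boolean isPlus()   { return text.equals("+"); }
    boolean isTimes()  { return text.equals("*"); }               // '*' stands in for the times sign
    boolean isLparen() { return text.equals("("); }
    boolean isRparen() { return text.equals(")"); }

    // Helper that is not in the list above: true for any legal token.
    boolean isValid() {
        return isAtom() || isPlus() || isTimes() || isLparen() || isRparen();
    }

    double evalAtom() {
        if (!isAtom()) throw new IllegalStateException("not an atom: " + text);
        return Double.parseDouble(text);
    }
}

// Hypothetical input source offering eof, peek, and getToken.
class TokenStream {
    private final String[] pieces;
    private int position = 0;

    TokenStream(String input) {
        String trimmed = input.trim();
        pieces = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    // True iff no tokens remain.
    boolean eof() { return position >= pieces.length; }

    // Returns the next token without removing it. In this sketch it also
    // rejects an invalid next token, which the description leaves open.
    Token peek() {
        if (eof()) throw new IllegalStateException("no next token");
        Token t = new Token(pieces[position]);
        if (!t.isValid()) throw new IllegalStateException("invalid token: " + pieces[position]);
        return t;
    }

    // Removes and returns the next token.
    Token getToken() {
        Token t = peek();
        position++;
        return t;
    }
}
```

With this sketch, new TokenStream("2 + ( 3.5 )") yields an atom, a plus sign, a left parenthesis, another atom, and a right parenthesis, after which eof() is true and a further getToken() throws.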
Your task is to write a method evalExpression that returns the double value of an expression as defined above, and throws an exception (probably from getToken) if the expression is not valid. The input comes from whatever source getToken is getting it from, and you should assume that the expression ends at the end of the input (in the text given in discussion I said you could stop at the first complete expression, but this is too easy if the expression starts with an atom). You will want to define methods evalTerm and evalFactor. Of course, without the peek method you didn't have the tools to solve this problem, but I hope the experience of working at it and the solution will be illuminating.
double evalExpression() {
    // evaluates and removes an expression from the input
    double temp = evalTerm();
    if (!eof() && peek().isPlus()) {
        Token discard = getToken();  // consume the plus sign
        return temp + evalExpression();
    }
    else return temp;
}
double evalTerm() {
    // evaluates and removes the next term from the input
    double temp = evalFactor();
    if (!eof() && peek().isTimes()) {
        Token discard = getToken();  // consume the times sign
        return temp * evalTerm();    // a term is a product, so multiply
    }
    else return temp;
}
double evalFactor() {
    // evaluates and removes the next factor from the input
    Token next = getToken();
    if (next.isAtom()) return next.evalAtom();
    if (next.isLparen()) {
        double temp = evalExpression();
        if (getToken().isRparen()) return temp;
    }
    // reached on a missing right parenthesis or an unexpected token;
    // unchecked, so that no throws clause is needed
    throw new RuntimeException("invalid expression");
}
Assuming that the method returns a value, it is easy to show that the value is correct by induction on the call tree. The call tree has a root node for the initial call to evalExpression, other nodes for every further call to one of the three methods, and leaves for every call to evalAtom. Define P(v) to mean "the call at node v of the tree returns the correct value for the expression, term, factor, or atom read during that call". If v is a leaf, we know that evalAtom returns the correct value of the atom by definition. Otherwise we have three cases depending on which method is called at v:
If the method is evalFactor, then there are two subcases. If the factor is an atom, we know by the IH that evalAtom returns the correct value and thus that the value we return, by the definition of value, is correct. If the factor is an expression in parentheses, the IH says that the following expression is correctly read and evaluated by evalExpression, and we can see that we read the two parentheses and return the value of the expression as we should. These are the only two ways that evalFactor can return a value.
If the method is evalTerm, there are again two cases. If the term is a single factor, the IH says that we read its value, and that is the value of the term. If there is another term following a times sign, we read and discard the times sign, evaluate both the first factor and the following term correctly by the IH, and return the product of those two values as we should.
If the method is evalExpression, we either read and return the value of a single term, as we should, or read and evaluate the first term, read and discard the plus sign, read and evaluate the following expression, and return the sum of the two values as we should.
This shows partial correctness of the methods -- if they terminate they give the right answer. We'd also like to prove termination on all valid input. Here we prove by induction on all expressions, terms, factors, and atoms that each of these things is read and evaluated by its appropriate method:
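An atom is a single token, read by one call to getToken and evaluated by evalAtom.
A factor is read and evaluated by evalFactor, either directly as an atom or by a call to evalExpression on the smaller expression inside its parentheses.
A term is read and evaluated by evalTerm by a call to evalFactor and, if a times sign follows, a call to evalTerm on the smaller term that remains.
An expression is read and evaluated by evalExpression by calls to evalTerm and evalFactor and, if a plus sign follows, a call to evalExpression on the smaller expression that remains.
In each case the IH says that the calls on the smaller pieces terminate, so the call on the whole piece terminates as well.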
There remains the question of what this code does if given input that is not a single expression followed by the eof condition. In fact it reads the largest valid expression it can find at the start of the input. The evalExpression and evalTerm methods go on past their first term or factor if and only if they see the plus sign or times sign they need. If there were two atoms in a row in the input, for example, the methods would never evaluate the second one: they would only peek at it to see that it was not a plus or times sign, and otherwise essentially treat it as the end of the input.
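This behavior can be observed with a self-contained version of the three methods. What follows is only a sketch under assumptions of my own: the TrailingInputDemo class and its evalWholeInput method are hypothetical, tokens are whitespace-separated strings rather than Token objects, atoms are decimal numbers, * stands in for the times sign, and a queue's peek and pop play the roles of peek and getToken.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Hypothetical harness: the three methods of this section over string tokens.
class TrailingInputDemo {
    private final Deque<String> tokens;

    TrailingInputDemo(String input) {
        String trimmed = input.trim();
        tokens = trimmed.isEmpty()
                ? new ArrayDeque<>()
                : new ArrayDeque<>(Arrays.asList(trimmed.split("\\s+")));
    }

    boolean eof() { return tokens.isEmpty(); }

    double evalExpression() {
        double temp = evalTerm();
        if ("+".equals(tokens.peek())) {   // peek() is null at end of input
            tokens.pop();                  // consume the plus sign
            return temp + evalExpression();
        }
        return temp;
    }

    double evalTerm() {
        double temp = evalFactor();
        if ("*".equals(tokens.peek())) {
            tokens.pop();                  // consume the times sign
            return temp * evalTerm();
        }
        return temp;
    }

    double evalFactor() {
        if (eof()) throw new IllegalStateException("unexpected end of input");
        String next = tokens.pop();
        if (next.matches("\\d+(\\.\\d+)?")) return Double.parseDouble(next);
        if (next.equals("(")) {
            double temp = evalExpression();
            if (!eof() && tokens.pop().equals(")")) return temp;
        }
        throw new IllegalStateException("invalid expression");
    }

    // Stricter variant: rejects input with tokens left over after the expression.
    static double evalWholeInput(String input) {
        TrailingInputDemo d = new TrailingInputDemo(input);
        double value = d.evalExpression();
        if (!d.eof()) throw new IllegalStateException("unexpected token: " + d.tokens.peek());
        return value;
    }
}
```

On the input "2 3", evalExpression returns 2.0 and leaves the second atom unread, so eof() is still false afterwards. The evalWholeInput wrapper shows one way to enforce the assumption that the expression ends at the end of the input: it throws on "2 3" but returns 14.0 on "2 + 3 * 4".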
Last modified 9 November 2007