# Solutions to Second Midterm Exam

### Directions:

• Answer the problems on the exam pages.
• There are four problems, some with multiple parts, for 100 total points plus 10 extra credit. Actual scale was A = 90, C = 65.
• Some useful definitions precede the questions below.
• No books, notes, calculators, or collaboration.
• In case of a numerical answer, an arithmetic expression like "217 - 4" need not be reduced to a single integer.

Correction in orange added 23 March 2015.

```
Q1: 10 points
Q2: 40+10 points
Q3: 15 points
Q4: 35 points
Total: 100+10 points
```

Question text is in black, solutions in blue.

Corrections in green made 2 April 2015.

#### Definitions:

N is the set of naturals (non-negative integers), {0, 1, 2, 3,...}

Question 2 uses two binary relations R and S, each a subset of N × N. They are defined by the following recursive pseudo-Java methods: R(x, y) is true (meaning (x, y) ∈ R) if and only if `r(x, y)` returns `true`, and similarly for S. The code for S was changed during the exam -- this is the corrected version.

``````

public static boolean r (natural x, natural y) {
    if (x == 0) return true;
    if (x == 1) {
        if (y == 0) return false;
        if (y == 1) return true;
        return r(x, y - 2);
    }
    if (y <= 1) return r(x - 2, y);
    return r(x - 2, y - 2);
}

public static boolean s (natural x, natural y) {
    if (x % 2 == y % 2) return true;
    if (x % 2 == 0) return true;
    return false;
}

``````

Questions 3 and 4 use the following recursively defined family of labeled directed graphs. Graph G0 consists of a single node A with no edges. To obtain graph Gn+1 from graph Gn, we add a new row of n + 2 nodes at the bottom, and add 3n + 3 edges, each labeled with the cost (weight) n + 1, to connect this new row to the bottom row of Gn. Here are ASCII pictures of the graphs G1, G2, and G3. There are three kinds of directed edges -- going down and to the left, going horizontally right, and going up and to the left.

``````
G1:

     A
    / ^
  1/   \
  /     \1
 V   1   \
B ------> C

G2:

          A
         / ^
       1/   \
       /     \1
      V   1   \
     B ------> C
    / ^       / ^
  2/   \    2/   \
  /     \2  /     \2
 V   2   \ V   2   \
D ------> E ------> F

G3:

               A
              / ^
            1/   \
            /     \1
           V   1   \
          B ------> C
         / ^       / ^
       2/   \    2/   \
       /     \2  /     \2
      V   2   \ V   2   \
     D ------> E ------> F
    / ^       / ^       / ^
  3/   \    3/   \    3/   \
  /     \3  /     \3  /     \3
 V   3   \ V   3   \ V   3   \
G ------> H ------> I ------> J

``````
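The recursive definition can also be written out as an explicit edge list. The following is a minimal Java sketch, not part of the exam: the class `TriangleGraph`, its `Edge` type, and the helper `name` are hypothetical names, and nodes are lettered row by row as in the pictures (so the lettering only makes sense for small n).

```java
import java.util.ArrayList;
import java.util.List;

final class TriangleGraph {
    // A directed edge with its cost (illustrative helper type).
    static final class Edge {
        final char from, to;
        final int cost;
        Edge(char from, char to, int cost) { this.from = from; this.to = to; this.cost = cost; }
        @Override public String toString() { return from + "->" + to + " (" + cost + ")"; }
    }

    // Row k (k = 0, 1, 2, ...) holds k + 1 nodes; letters are assigned row by row.
    // Names run past 'Z' once there are more than 26 nodes, so keep n small.
    static char name(int row, int pos) {
        return (char) ('A' + row * (row + 1) / 2 + pos);
    }

    // All edges of Gn: each new row adds 3 * row edges, each of cost row.
    static List<Edge> edges(int n) {
        List<Edge> result = new ArrayList<>();
        for (int row = 1; row <= n; row++) {
            for (int i = 0; i < row; i++) {
                result.add(new Edge(name(row - 1, i), name(row, i), row));     // down and to the left
                result.add(new Edge(name(row, i), name(row, i + 1), row));     // horizontally right
                result.add(new Edge(name(row, i + 1), name(row - 1, i), row)); // up and to the left
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(edges(2));
    }
}
```

For G2 this lists nine edges, the three of cost 1 among A, B, C and the six of cost 2 that attach the row D, E, F.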

• Question 1 (10): Identify each of the following five concepts, giving enough detail to make it clear that you are familiar with them (2 points each):

• (a, 2) a connected undirected graph

A connected undirected graph is one where there is a path from any vertex to any other vertex.

• (b, 2) the depth of a rooted tree

The depth of a rooted tree is the length (number of edges) of the longest directed path from the root to a leaf.

• (c, 2) the Induction Rule for binary strings

If we prove P(λ) and ∀w: ∀a: P(w) → P(wa), where w's type is "string" and a's type is "letter", then we may conclude ∀w: P(w).

• (d, 2) the prefix string for a boolean expression

The prefix string of a boolean expression consists of the highest-level operator, followed by the prefix string of the left subexpression (if any), followed by the prefix string of the right subexpression (if any).

• (e, 2) the reversal of a string

The reversal of a string w is another string with the same length as w, consisting of the letters of w in the reverse order, that is, with the last character of w first and the first character of w last.

• Question 2 (40+10): Here R and S are the two binary relations on N defined by the pseudo-Java methods `r` and `s` above.

• (a, 5) Calculate the truth values of R(2, 3), R(2, 4), R(3, 4), and R(3, 2) from the code.

R(2, 3) = R(0, 1) = true, R(2, 4) = R(0, 2) = true, R(3, 4) = R(1, 2) = R(1, 0) = false, R(3, 2) = R(1, 0) = false.
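These chains of calls can be reproduced mechanically. Here is a small runnable Java sketch of `r`, assuming `int` as a stand-in for the pseudo-Java type `natural`; the class name `RTrace`, the extra `trace` parameter, and the helper `explain` are my own additions for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class RTrace {
    // Same case analysis as the pseudo-Java r, with int standing in for "natural".
    // Each call appends "R(x, y)" to trace so the chain of recursive calls is visible.
    static boolean r(int x, int y, List<String> trace) {
        trace.add("R(" + x + ", " + y + ")");
        if (x == 0) return true;
        if (x == 1) {
            if (y == 0) return false;
            if (y == 1) return true;
            return r(x, y - 2, trace);
        }
        if (y <= 1) return r(x - 2, y, trace);
        return r(x - 2, y - 2, trace);
    }

    // Joins the chain of calls and the final answer into one line.
    static String explain(int x, int y) {
        List<String> trace = new ArrayList<>();
        boolean result = r(x, y, trace);
        return String.join(" = ", trace) + " = " + result;
    }

    public static void main(String[] args) {
        int[][] inputs = {{2, 3}, {2, 4}, {3, 4}, {3, 2}};
        for (int[] in : inputs) {
            System.out.println(explain(in[0], in[1]));
        }
    }
}
```

Running `main` prints one line per input, for example `R(3, 4) = R(1, 2) = R(1, 0) = false`.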

• (b, 5) Describe R in English, that is, say for exactly which naturals R(x, y) is true. For this part of the problem you do not need to justify your answer if you are correct.

R(x, y) is true if x is even or if both x and y are odd. R(x, y) is false if x is odd and y is even. We can justify these claims informally by noting that every recursive call reduces either x or y, or both, by two and thus preserves the parity of x and y. The cases that do not cause further recursive calls are R(0, y) (true for any y), R(1, 0) (false), and R(1, 1) (true). But to prove anything about the behavior of R on general input, we will need an inductive proof.

• (c, 15) Prove that R is a reflexive relation, which means proving ∀n: R(n, n). Use induction on naturals, possibly strong induction or separate inductions on the odds and evens. Use only the code of `r`, not any relationship between the relations R and S.

Let P(n) be the statement "R(n, n)". (Many of you chose incorrect P(n)'s such as "∀n: R(n, n)", which is a proposition that does not depend on n at all.)

Base Case: P(0) is the statement "R(0, 0)", which is true by examining the code.

Second Base Case: P(1) is "R(1, 1)", which is also true by examining the code.

Why do we want two base cases? If we use strong induction, our strong inductive hypothesis is "∀i: (i ≤ n) → R(i, i)". If n > 0, we prove P(n + 1) by observing that R(n + 1, n + 1) calls R(n - 1, n - 1), which is true by the inductive hypothesis. But this argument does not prove R(1, 1) since there is no call to R(-1, -1), so we need to argue separately that R(n + 1, n + 1) is true when n = 0.

If we use odd-even induction, our inductive step is P(n) → P(n + 2), which follows because R(n + 2, n + 2) calls R(n, n) for any natural n. We then need two base cases, P(0) for the evens and P(1) for the odds.

Many people set up the strong induction correctly, but when it was time to prove P(n + 1) they then gave the informal argument about a chain of calls going on until x = 0 or x = 1, rather than using the inductive hypothesis. Many of these arguments were incomplete as well as non-inductive, because the chain of recursive calls can finish in a number of different ways, depending on the relative size of the original x and y.

• (d, 15) Prove that if two naturals x and y are both even, then R(x, y) ↔ S(x, y) is true. (Hint: Let P(x) be the statement "For all even y, R(x, y) ↔ S(x, y) is true". Prove P(x) by induction on the evens, using a separate induction on y in each case.)

Following my hint, let P(x) be as stated. We prove P(x) for all even x by induction on the evens.

Base Case: P(0) says "For all even y, R(0, y) ↔ S(0, y) is true". To prove this, we note that R(0, y) and S(0, y) are true for any y, by examination of the code of the two methods. (Many people incorrectly interpreted P(0) as "R(0, 0) ↔ S(0, 0)".)

Inductive Case: We let x be an arbitrary even natural, assume P(x), which says that R(x, y) ↔ S(x, y) for all even y, and prove P(x + 2), which says that R(x + 2, z) ↔ S(x + 2, z) for all even z.

The S predicate is true whenever its first argument is even, by examination of its code. We must thus show that R(x + 2, z) is true where z is an arbitrary even natural, given our assumption. The code's computation of R(x + 2, z) depends on whether z = 0. R(x + 2, 0) is equal to R(x, 0), and if z > 0, R(x + 2, z) is equal to R(x, z - 2). In either case, the recursive call has first argument x and second argument an even natural, so by the inductive hypothesis it agrees with S on those arguments, and S is true there because x is even. Hence the call returns true.

• (e, 10XC) Complete the proof that R and S are actually the same relation, that is, that R(x, y) ↔ S(x, y) for all naturals x and y.

There are three other cases that we can prove by induction on either the odds or the evens.

If x is even and y is odd, we know that S(x, y) is true. We let P(x) be "for all odd y, R(x, y) is true". We proved P(0) for this P in the course of our base case for part (d) above. For the inductive case, we let z be an arbitrary odd natural and prove that R(x + 2, z) is true. If z = 1, R(x + 2, 1) calls R(x, 1) which is true by the IH. Otherwise R(x + 2, z) calls R(x, z - 2) which is true by the IH since z - 2 must also be odd.

If x is odd and y is even, we know that S(x, y) is false. We let P(x) be "for all even y, R(x, y) is false" and prove P(x) by induction on the odds. P(1) says that R(1, y) is false for any even y. We prove this fact by induction on even y -- the base is R(1, 0), which is false by inspection of the code. For the inductive case, we merely note that R(1, y + 2) calls R(1, y), which is false by the inductive hypothesis.

Now we return to the main proof for the case where x is odd and y is even. We let x be an arbitrary odd natural, assume P(x), and prove P(x + 2), which says that R(x + 2, z) is false for arbitrary even z. R(x + 2, z) calls either R(x, z) (if z = 0) or R(x, z - 2) (otherwise). In either case the recursive call returns false by the IH, since both z and z - 2 are even.

The final case is where x and y are both odd. S(x, y) is true in this case by examination of the code. So we let P(x) be "R(x, y) is true for all odd y" and prove P(x) by induction on the odd naturals. P(1) requires a separate induction, whose base is R(1, 1) (true by inspection) and whose induction is R(1, y) → R(1, y + 2) (true by the recursive call).

It remains to prove the inductive case, where x is an arbitrary odd natural, we assume P(x), and we need to prove P(x + 2). For an arbitrary odd z, R(x + 2, z) calls either R(x, z) (if z = 1) or R(x, z - 2) (otherwise), and in either case this call returns true by the IH.
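A finite check is no substitute for the inductions above, but it is a cheap way to catch a wrong conjecture before trying to prove it. The sketch below is an assumption-laden illustration: `int` stands in for `natural`, and the class `CompareRS` with its helpers `disagreements` and `reflexiveBelow` are hypothetical names. It compares `r` and `s` on every pair below a bound and also tests R(n, n).

```java
final class CompareRS {
    // Direct transcription of the pseudo-Java r.
    static boolean r(int x, int y) {
        if (x == 0) return true;
        if (x == 1) {
            if (y == 0) return false;
            if (y == 1) return true;
            return r(x, y - 2);
        }
        if (y <= 1) return r(x - 2, y);
        return r(x - 2, y - 2);
    }

    // Direct transcription of the corrected pseudo-Java s.
    static boolean s(int x, int y) {
        if (x % 2 == y % 2) return true;
        if (x % 2 == 0) return true;
        return false;
    }

    // Number of pairs (x, y) with 0 <= x, y < bound on which r and s differ.
    static int disagreements(int bound) {
        int count = 0;
        for (int x = 0; x < bound; x++) {
            for (int y = 0; y < bound; y++) {
                if (r(x, y) != s(x, y)) count++;
            }
        }
        return count;
    }

    // True if R(n, n) holds for every n below bound (the claim of part (c)).
    static boolean reflexiveBelow(int bound) {
        for (int n = 0; n < bound; n++) {
            if (!r(n, n)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("disagreements below 60: " + disagreements(60));
        System.out.println("reflexive below 60: " + reflexiveBelow(60));
    }
}
```

A count of zero is consistent with R and S being the same relation, but only the four inductive cases establish it for all naturals.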

• Question 3 (15): This question uses the family of labeled directed graphs Gi defined above.

Let T(n) be the total length (cost, weight) of all the directed edges in the graph Gn. Using the recursive definition of Gn, prove (by ordinary induction on n) that T(n) = n(n + 1)(2n + 1)/2.

Let P(n) be "T(n) = n(n + 1)(2n + 1)/2".

P(0) says that T(0) = 0(0 + 1)(2(0) + 1)/2 = 0, which is true because the definition of G0 says that it has no edges.

We assume that T(n) = n(n + 1)(2n + 1)/2 and consider T(n + 1). Gn+1 is made from Gn by the addition of 3n + 3 new edges, each of length n + 1, so T(n + 1) = T(n) + (3n + 3)(n + 1).

Thus all we need to show is that n(n + 1)(2n + 1)/2 + 3(n + 1)² = (n + 1)(n + 2)(2n + 3)/2. Cancelling one factor of n + 1 (which is not 0 because n is a natural), this equation becomes n(2n + 1)/2 + 3(n + 1) = (n + 2)(2n + 3)/2. Doubling both sides, this becomes n(2n + 1) + 6(n + 1) = (n + 2)(2n + 3) or 2n² + n + 6n + 6 = 2n² + 7n + 6, which is true.
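The recurrence and the closed form are easy to compare numerically. This is a minimal Java sketch for that comparison only; the class `TotalCost` and its method names are hypothetical, and agreement on finitely many n does not replace the induction.

```java
final class TotalCost {
    // T(0) = 0 and T(k + 1) = T(k) + (3k + 3)(k + 1), as in the inductive step.
    static long byRecurrence(int n) {
        long total = 0;
        for (int k = 0; k < n; k++) {
            total += (3L * k + 3) * (k + 1);
        }
        return total;
    }

    // The claimed closed form n(n + 1)(2n + 1)/2; n(n + 1) is even, so the division is exact.
    static long closedForm(int n) {
        return (long) n * (n + 1) * (2L * n + 1) / 2;
    }

    // True if the two computations agree for every n from 0 through limit.
    static boolean agreeUpTo(int limit) {
        for (int n = 0; n <= limit; n++) {
            if (byRecurrence(n) != closedForm(n)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 3; n++) {
            System.out.println("T(" + n + ") = " + byRecurrence(n));
        }
        System.out.println("agree up to 1000: " + agreeUpTo(1000));
    }
}
```

For the pictured graphs this gives T(1) = 3, T(2) = 15, and T(3) = 42.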

• Question 4 (35): These questions use the labeled directed graph G2, and the names of its nodes, given above.

• (a, 10) Trace the depth-first search of the directed graph G2 starting from node B, with no goal node and with a closed list to prevent re-exploring a node. Draw the resulting DFS tree, indicating the non-tree edges and identifying them as back, forward, or cross edges. When two nodes are put on the stack at the same time, the first one off should be the one with the lower-cost edge. (This is the only way we consider the edge labels in this part of the problem.)

Trace: We begin with B on the stack. It is popped and C and D go on, with C on the top. C is popped and A and E go on, with A on top so the stack is (A, E, D). A is popped, and B is not pushed because it is on the closed list. E is popped, and F is pushed (B is not pushed as it is on the closed list). F is popped (C is not pushed as it is on the closed list), and then D is popped (E is not pushed as it is on the closed list). The stack is now empty and the search ends.

Tree: B is the root, with two children C and D. C has two children A and E, and E has a single child F. A, D, and F are leaves. There are four non-tree edges -- back edges A to B, E to B, and F to C, and a cross edge D to E. (The last edge is a cross edge because E is neither an ancestor nor a descendant of D in the tree.)
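The trace and the edge classification can be replayed in code. What follows is a sketch under stated assumptions rather than the course's reference implementation: the class `DfsG2`, the hard-coded `NEIGHBORS` table for G2 (each list already in increasing cost order), and the helper names are all mine, and the classification assumes every node is reachable from the root.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DfsG2 {
    // Out-neighbors of each node of G2, listed in increasing order of edge cost.
    static final Map<Character, String> NEIGHBORS = Map.of(
        'A', "B", 'B', "CD", 'C', "AE", 'D', "E", 'E', "BF", 'F', "C");

    // Stack-based DFS. Returns node -> tree parent in pop order; the root is its own parent.
    static Map<Character, Character> search(char root) {
        Map<Character, Character> parent = new LinkedHashMap<>(); // also serves as the closed list
        Deque<char[]> stack = new ArrayDeque<>();                 // entries are {node, pushedBy}
        stack.push(new char[] {root, root});
        while (!stack.isEmpty()) {
            char[] top = stack.pop();
            if (parent.containsKey(top[0])) continue;             // closed since it was pushed
            parent.put(top[0], top[1]);
            String next = NEIGHBORS.get(top[0]);
            // Push in reverse so the lowest-cost neighbor ends up on top of the stack.
            for (int i = next.length() - 1; i >= 0; i--) {
                char v = next.charAt(i);
                if (!parent.containsKey(v)) stack.push(new char[] {v, top[0]});
            }
        }
        return parent;
    }

    // True if a equals b or a lies on the tree path from the root to b.
    static boolean isAncestor(Map<Character, Character> parent, char a, char b) {
        char cur = b;
        while (true) {
            if (cur == a) return true;
            char up = parent.get(cur);
            if (up == cur) return false;                          // reached the root
            cur = up;
        }
    }

    // Labels every non-tree edge as back, forward, or cross.
    static List<String> nonTreeEdges(Map<Character, Character> parent) {
        List<String> result = new ArrayList<>();
        for (char u : parent.keySet()) {
            for (char v : NEIGHBORS.get(u).toCharArray()) {
                if (parent.get(v) == u) continue;                 // tree edge
                String kind = isAncestor(parent, v, u) ? "back"
                            : isAncestor(parent, u, v) ? "forward" : "cross";
                result.add(kind + " " + u + "->" + v);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Character, Character> parent = search('B');
        System.out.println(parent.keySet());
        System.out.println(nonTreeEdges(parent));
    }
}
```

Starting from B the nodes are closed in the order B, C, A, E, F, D, and the four non-tree edges come out as three back edges and one cross edge, matching the tree described above.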

• (b, 5) Describe what happens if we carry out the search from part (a) without using a closed list.

We push B, pop it, push C and D, pop C, push A and E, pop A, and push B. We are now in the same position as when we first pushed B, except that E and D are under it. We will pop B again, and three steps later we will again push B with E, D, E, D under it. This will continue forever, popping B, C, and A in rotation. (In practice our stack will overflow rather than continue to grow forever.)
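To watch the loop without running forever, we can cap the number of pops. This is a hedged sketch: the class `DfsNoClosedList`, the `NEIGHBORS` table for G2, and the `maxPops` cutoff are illustrative assumptions, not part of the algorithm as stated on the exam.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;

final class DfsNoClosedList {
    // Out-neighbors of each node of G2, listed in increasing order of edge cost.
    static final Map<Character, String> NEIGHBORS = Map.of(
        'A', "B", 'B', "CD", 'C', "AE", 'D', "E", 'E', "BF", 'F', "C");

    // Performs at most maxPops pops with no closed list.
    // Returns "<popped nodes> | <stack contents, top first>".
    static String firstPops(char root, int maxPops) {
        Deque<Character> stack = new ArrayDeque<>();
        stack.push(root);
        StringBuilder popped = new StringBuilder();
        for (int step = 0; step < maxPops && !stack.isEmpty(); step++) {
            char u = stack.pop();
            popped.append(u);
            String next = NEIGHBORS.get(u);
            // Push every neighbor, lowest-cost last so that it is popped first.
            for (int i = next.length() - 1; i >= 0; i--) {
                stack.push(next.charAt(i));
            }
        }
        StringBuilder rest = new StringBuilder();
        for (char c : stack) rest.append(c);   // ArrayDeque iterates from the top of the stack
        return popped + " | " + rest;
    }

    public static void main(String[] args) {
        System.out.println(firstPops('B', 3));
        System.out.println(firstPops('B', 6));
        System.out.println(firstPops('B', 9));
    }
}
```

After three pops the stack is B, E, D from the top, and after six it is B, E, D, E, D, so every trip around the cycle adds another E, D pair.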

• (c, 10) Trace a uniform-cost search of the labeled directed graph G2, starting from node B and terminating when the shortest-path distance from B to node F has been determined by the algorithm.

We begin by pushing B0, by which I mean a copy of B with priority 0. We pop B0 and push C1 and D2. We pop C1 and push A2 and E3. We may now pop either A2 or D2. (In a heap implementation of a priority queue there would be no particular reason to expect one or the other to be popped.) Whenever we pop A2 we push nothing because B is on the closed list. We pop D2 and push E4. (It's important that we do this -- the algorithm does not know that E3 is already in the priority queue.) We pop E3 and push F5. We pop E4 and discard it because E is on the closed list. Finally we pop F5, telling us that the distance from B to F is 5.
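The same search can be sketched with `java.util.PriorityQueue`. The class `UcsG2`, the `EDGES` table for G2, and the `pops` log are illustrative names of my own. The comparator breaks priority ties alphabetically, which is an assumption added only to make the run repeatable, so this sketch always pops A2 before D2.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.Set;

final class UcsG2 {
    // Outgoing edges of G2 as {target, cost} pairs.
    static final Map<Character, int[][]> EDGES = Map.of(
        'A', new int[][] {{'B', 1}},
        'B', new int[][] {{'C', 1}, {'D', 2}},
        'C', new int[][] {{'A', 1}, {'E', 2}},
        'D', new int[][] {{'E', 2}},
        'E', new int[][] {{'B', 2}, {'F', 2}},
        'F', new int[][] {{'C', 2}});

    // Returns the least cost of a directed path from start to goal, or -1 if there is none.
    // Every popped entry is appended to pops in the form "E3" (node, then priority).
    static int distance(char start, char goal, List<String> pops) {
        // Queue entries are {node, priority}; equal priorities are ordered by node name.
        PriorityQueue<int[]> queue = new PriorityQueue<>(
            (a, b) -> a[1] != b[1] ? Integer.compare(a[1], b[1]) : Integer.compare(a[0], b[0]));
        Set<Integer> closed = new HashSet<>();
        queue.add(new int[] {start, 0});
        while (!queue.isEmpty()) {
            int[] entry = queue.poll();
            int node = entry[0];
            pops.add("" + (char) node + entry[1]);
            if (!closed.add(node)) continue;          // stale duplicate, such as E4
            if (node == goal) return entry[1];
            for (int[] edge : EDGES.get((char) node)) {
                if (!closed.contains(edge[0])) {
                    queue.add(new int[] {edge[0], entry[1] + edge[1]});
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String> pops = new ArrayList<>();
        System.out.println(distance('B', 'F', pops) + " " + pops);
    }
}
```

Note that the queue is allowed to hold two copies of E; the closed list is what discards the second one when it is popped.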

• (d, 5) In order to carry out an A* search of G2 with goal node F, we want to define a heuristic function h where h(x) is the minimum number of edges in any directed path from x to F. Why is this an admissible heuristic? Compute (without proof) the values of h(A), h(B), h(C), h(D), and h(E).

To be admissible, a heuristic must always be non-negative and for every node x, h(x) must be less than or equal to the shortest-path distance from x to F. Here h(x) is at most the number of edges on a least-cost path from x to F, and that number is at most the path's cost because each edge has weight at least 1.

h(E) is 1, h(D) and h(C) are 2, h(B) is 3, and h(A) is 4.

• (e, 5) Given an arbitrary directed graph (with positive integer edge labels) and an arbitrary goal node, we want to calculate the value of the function h(x) defined above, for every node x in the graph. What would be the best way to do this, using the search techniques we have learned?

A BFS from x will determine the value of h(x) because it finds the path from x to F with the least number of edges. (A uniform-cost search with all edge weights 1 is equivalent to this BFS.)

But we can do better, by carrying out a single BFS of the reversal of the graph (a graph with the same vertices but with each edge replaced by one with the same endpoints in the other order). This one search will find the smallest number of edges in any path from F to each x in the reversal graph, which is the smallest number of edges in any path from x to F in the original graph.

A DFS will simply not work, because it is not guaranteed to find a path with the minimal number of edges.
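As an illustration of the single-search idea, here is a minimal Java sketch that builds the reversal of G2 and runs one BFS from the goal. The class `ReverseBfs`, the method `edgeCountsTo`, and the hard-coded `NEIGHBORS` table are hypothetical; edge costs are ignored because h counts edges only.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.TreeMap;

final class ReverseBfs {
    // Out-neighbors of each node of G2 (costs are irrelevant when counting edges).
    static final Map<Character, String> NEIGHBORS = Map.of(
        'A', "B", 'B', "CD", 'C', "AE", 'D', "E", 'E', "BF", 'F', "C");

    // Returns h(x) for every node x that can reach goal, keyed in alphabetical order.
    static Map<Character, Integer> edgeCountsTo(char goal) {
        // Build the reversal: for each edge u -> v, record u as a predecessor of v.
        Map<Character, List<Character>> reversed = new HashMap<>();
        for (Map.Entry<Character, String> e : NEIGHBORS.entrySet()) {
            for (char v : e.getValue().toCharArray()) {
                reversed.computeIfAbsent(v, k -> new ArrayList<>()).add(e.getKey());
            }
        }
        // Ordinary BFS from the goal over the reversed edges.
        Map<Character, Integer> h = new TreeMap<>();
        Queue<Character> queue = new ArrayDeque<>();
        h.put(goal, 0);
        queue.add(goal);
        while (!queue.isEmpty()) {
            char v = queue.remove();
            for (char u : reversed.getOrDefault(v, List.of())) {
                if (!h.containsKey(u)) {          // first visit gives the fewest edges
                    h.put(u, h.get(v) + 1);
                    queue.add(u);
                }
            }
        }
        return h;
    }

    public static void main(String[] args) {
        System.out.println(edgeCountsTo('F'));
    }
}
```

With goal F on G2 this reproduces the values from part (d), plus h(F) = 0, using one search instead of one search per node.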