# Solutions to Third Midterm Exam

#### 20 November 2007

Question text is in black, solutions in blue. The scale was A = 90, C = 54.

```
Q1: 10 points
Q2: 15 points
Q3: 20 points
Q4: 15 points
Q5: 15 points
Q6: 25 points
Total: 100 points
```

• Question 1 (10): Recall that a simple path in a directed graph is one that never visits any node more than once. Prove that for any natural n, any simple path with n edges visits exactly n+1 nodes. (Hint: Use ordinary induction.)

We let P(n) be the statement "any simple path of n edges has exactly n+1 nodes". For the base case, P(0) says "any simple path of 0 edges has exactly one node". This is true because the only simple path of 0 edges from a node goes to that node itself, and so that node is the only one visited.

For the inductive case, we assume P(n) and prove P(n+1), which says that any simple path of n+1 edges visits exactly n+2 nodes. Let α be an arbitrary simple path of n+1 edges, from some node s to some node t. By the definition of paths, α consists of some path β with n edges from s to some node x, followed by the edge (x,t). Since β is part of α, it is also a simple path, so by the IH β visits exactly n+1 nodes. Thus α visits those n+1 nodes plus the node t, and since α is a simple path, t cannot be equal to any of the n+1 nodes visited by β. So α visits exactly n+2 nodes, and since α was arbitrary we have proved P(n+1) from the assumption of P(n). Since we have completed the induction, we have proved P(n) for all naturals n.

• Question 2 (15): Recall that a rooted directed tree is defined to be either (1) a single node which is both a leaf and the root, or (2) a root node with edges to the roots of each of one or more subtrees. Describe (in English or pseudo-Java) a recursive algorithm that takes a node of a rooted directed tree as input and returns the number of leaves in the subtree under that node. You may use any reasonable syntax for looking at all the children of a node: one way to do it is to assume that you have methods `node firstChild()`, `node nextChild()`, and `boolean isAnotherChild()` in the `Node` class.

In English, we return 1 in the case of a single node, since it is a leaf. If the root node has children, we recursively compute the number of leaves under each child and add these numbers together (since a leaf under the root must be under exactly one of the root's children).

In pseudo-Java:

``````
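natural numLeaves(node v) {
    // returns the number of leaves in the subtree under v
    if (v.isLeaf()) return 1;           // a single node is itself one leaf
    node w = v.firstChild();
    natural temp = numLeaves(w);        // leaves under the first child
    while (v.isAnotherChild())          // add the leaves under each remaining child
        temp += numLeaves(v.nextChild());
    return temp;
}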
``````
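For readers who want to run the algorithm, here is a minimal Java sketch. It assumes a hypothetical `TreeNode` class that keeps its children in a list, rather than the `firstChild()`/`nextChild()` interface suggested in the question, and it uses `int` in place of `natural`. The small tree built in `exampleTree()` is invented purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class LeafCountSketch {
    // Hypothetical node type: children are kept in an ordered list.
    static class TreeNode {
        final String label;
        final List<TreeNode> children = new ArrayList<>();

        TreeNode(String label, TreeNode... kids) {
            this.label = label;
            for (TreeNode k : kids) children.add(k);
        }
    }

    // Returns the number of leaves in the subtree rooted at v.
    static int numLeaves(TreeNode v) {
        if (v.children.isEmpty()) return 1;      // base case: v is a leaf
        int total = 0;
        for (TreeNode w : v.children) {
            total += numLeaves(w);               // each leaf lies under exactly one child
        }
        return total;
    }

    // Illustrative tree r(x, y(p, q)): its leaves are x, p and q.
    static TreeNode exampleTree() {
        TreeNode y = new TreeNode("y", new TreeNode("p"), new TreeNode("q"));
        return new TreeNode("r", new TreeNode("x"), y);
    }

    public static void main(String[] args) {
        System.out.println(numLeaves(exampleTree()));
    }
}
```

The recursion has the same shape as the pseudo-Java: one base case for a leaf, and a sum over the children otherwise.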

Questions 3 and 4 deal with a particular kind of rooted directed tree called a foo-tree. Foo-trees are defined recursively:

• (i) A single leaf is a foo-tree.
• (ii) If the root of a tree has one or more children, the tree is a foo-tree if and only if the last subtree is a foo-tree and all the other subtrees consist of single leaves.
• (iii) These are the only foo-trees, so that if a predicate P(v) is true whenever v is a leaf, and true whenever v is the root of a tree as described in (ii) with P(w) true for v's last child w, then P(v) is true whenever v is the root of a foo-tree.

Here is an example of a foo-tree with eight nodes:

```
         (a)
        / | \
       /  |  \
      /   |   \
     /    |    \
    /     |     \
   /      |      \
(b)      (c)      (d)
                 / | \
                /  |  \
               /   |   \
              /    |    \
             /     |     \
            /      |      \
         (e)      (f)      (g)
                            |
                            |
                            |
                            |
                            |
                            |
                           (h)
```

• Question 3 (20): Prove by induction on all foo-trees T that the breadth-first search of T and the depth-first search of T visit the nodes of T in exactly the same order. (You should assume that the children of each node of T come in an order that is specified as part of the definition of T.)

Base case: T is a single node v. The BFS and DFS of T both visit just v and so visit the same nodes in the same order.

Now assume that T is constructed by clause (ii) of the definition, so that every child of T's root except for the last one is a leaf, and the last child is the root of a foo-tree U. Assume as the IH that the BFS and DFS of U visit the nodes of U in the same order. Consider the BFS and DFS of T, each of which begins at the root of T. The BFS first visits the children of the root in order and puts them on the queue. It then takes the children before the last child off the queue and does nothing further, because they are leaves. At this point the only node on the queue is the root of U, and the BFS then proceeds identically to the BFS of U. The DFS of T also visits the children of the root before the last child in order, because it visits each one, finds that it has no children, and proceeds to the next. When it reaches the root of U, it proceeds identically to the DFS of U. Thus both BFS and DFS visit the root, then the non-last children in order, then the nodes of U. They each visit the nodes of U in the same order by the IH, so overall they visit the nodes of T in the same order. By clause (iii) of the definition of foo-trees, we have proved our property for all foo-trees.
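As a sanity check on the claim (not a substitute for the proof), the following Java sketch builds the eight-node example foo-tree and computes both visiting orders. The `TreeNode` class and the method names are hypothetical. "Visit order" is taken to mean the order in which BFS removes nodes from its queue and the order in which a recursive DFS first reaches each node.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

class SearchOrderSketch {
    // Hypothetical node type; children are stored in their specified order.
    static class TreeNode {
        final String label;
        final List<TreeNode> children;

        TreeNode(String label, TreeNode... kids) {
            this.label = label;
            this.children = Arrays.asList(kids);
        }
    }

    // Labels in the order a queue-based BFS takes nodes off the queue.
    static List<String> bfsOrder(TreeNode root) {
        List<String> order = new ArrayList<>();
        Deque<TreeNode> queue = new ArrayDeque<>();
        queue.add(root);
        while (!queue.isEmpty()) {
            TreeNode v = queue.remove();
            order.add(v.label);
            queue.addAll(v.children);
        }
        return order;
    }

    // Labels in the order a recursive DFS first reaches each node.
    static List<String> dfsOrder(TreeNode root) {
        List<String> order = new ArrayList<>();
        dfs(root, order);
        return order;
    }

    private static void dfs(TreeNode v, List<String> order) {
        order.add(v.label);
        for (TreeNode w : v.children) dfs(w, order);
    }

    // The eight-node example: a(b, c, d(e, f, g(h))).
    static TreeNode exampleFooTree() {
        TreeNode g = new TreeNode("g", new TreeNode("h"));
        TreeNode d = new TreeNode("d", new TreeNode("e"), new TreeNode("f"), g);
        return new TreeNode("a", new TreeNode("b"), new TreeNode("c"), d);
    }

    public static void main(String[] args) {
        TreeNode t = exampleFooTree();
        System.out.println(bfsOrder(t));   // both lines list a through h in alphabetical order
        System.out.println(dfsOrder(t));
    }
}
```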

• Question 4 (15): Prove that if the breadth-first search of T and the depth-first search of T visit the nodes of T in exactly the same order, then T must be a foo-tree. (Hint: Any non-foo tree must have a node where condition (ii) fails to be true. Thus the non-foo-trees can be defined recursively as well -- a tree is non-foo if and only if the root (a) has a child that is not a leaf and is not last, or (b) has a last child that is the root of a non-foo tree. Using this definition, you can prove by induction on non-foo-trees that each of them has nodes that are visited in a different order by the BFS and DFS.)

We prove the contrapositive of the given statement -- that if a rooted directed tree T is not a foo-tree, then the BFS and DFS of T do not visit the nodes of T in the same order. We do this by induction on all non-foo trees T. The base case is when T's root has a child v that is not last and not a leaf. In this case the BFS visits v's next sibling right after v, while the DFS visits v's first child after v, so the order of the two searches is different.

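The recursive case is when the last child of T's root is the root of a non-foo tree U. If some other child of the root is not a leaf, the base case already applies, so we may assume that every other child is a leaf. The IH tells us that the BFS and DFS of U visit the nodes of U in different orders. Each search of T visits the root and the leaf children and then proceeds exactly like the corresponding search of U, so the BFS and DFS of T must also visit the nodes of T in different orders.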
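The recursive definition of foo-trees translates directly into a recursive test. The sketch below is illustrative only: `TreeNode`, `isFooTree`, and the four-node tree in `nonFooExample()` are names and data invented here. The example tree falls under case (a), since its root has a non-last child `p` that is not a leaf.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

class FooTreeCheckSketch {
    // Hypothetical node type; children are stored in their specified order.
    static class TreeNode {
        final String label;
        final List<TreeNode> children;

        TreeNode(String label, TreeNode... kids) {
            this.label = label;
            this.children = Arrays.asList(kids);
        }
    }

    // Clause (i): a leaf is a foo-tree. Clause (ii): otherwise every
    // non-last child must be a leaf and the last subtree must be a foo-tree.
    static boolean isFooTree(TreeNode v) {
        int n = v.children.size();
        if (n == 0) return true;
        for (int i = 0; i < n - 1; i++) {
            if (!v.children.get(i).children.isEmpty()) return false;
        }
        return isFooTree(v.children.get(n - 1));
    }

    // Order in which a queue-based BFS takes nodes off the queue.
    static List<String> bfsOrder(TreeNode root) {
        List<String> order = new ArrayList<>();
        Deque<TreeNode> queue = new ArrayDeque<>();
        queue.add(root);
        while (!queue.isEmpty()) {
            TreeNode v = queue.remove();
            order.add(v.label);
            queue.addAll(v.children);
        }
        return order;
    }

    // Order in which a recursive DFS first reaches each node.
    static List<String> dfsOrder(TreeNode v) {
        List<String> order = new ArrayList<>();
        order.add(v.label);
        for (TreeNode w : v.children) order.addAll(dfsOrder(w));
        return order;
    }

    // Illustrative non-foo tree r(p(q), s): p is not last and not a leaf.
    static TreeNode nonFooExample() {
        TreeNode p = new TreeNode("p", new TreeNode("q"));
        return new TreeNode("r", p, new TreeNode("s"));
    }

    public static void main(String[] args) {
        TreeNode t = nonFooExample();
        System.out.println(isFooTree(t));  // false
        System.out.println(bfsOrder(t));   // [r, p, s, q]
        System.out.println(dfsOrder(t));   // [r, p, q, s]
    }
}
```

On this tree the BFS visits `s` right after `p`, while the DFS visits `q` right after `p`, which is exactly the disagreement used in the base case.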

Questions 5 and 6 deal with the following labeled undirected graph G. The node set is {a,b,c,d} and there are four edges: (a,c) with weight 4, (a,d) with weight 1, (b,c) with weight 1, and (b,d) with weight 1.

• Question 5 (15): Write down the single-step distance matrix A for this graph G. Use min-plus matrix multiplication (Floyd's Algorithm) to compute the shortest-path distance matrix for G. (Hint: Because G is undirected, the matrix A and all of its powers are symmetric -- the (i,j) entry is always equal to the (j,i) entry. You can use this fact to simplify your calculation.)

Many people did not remember the complete definition of the single-step distance matrix. The entries on the diagonal are 0, because that is the single-step distance from each node to itself. The entries for edges that don't exist are ∞, because there is no finite single-step distance between the endpoints of a non-edge. This makes A equal to:

```
A = ( 0  inf  4   1 )
    (inf  0   1   1 )
    ( 4   1   0  inf)
    ( 1   1  inf  0 )

A^2(a,a) = min (0+0, inf+inf, 4+4, 1+1) = 0
A^2(a,b) = min (0+inf, inf+0, 4+1, 1+1) = 2
A^2(a,c) = min (0+4, inf+1, 4+0, 1+inf) = 4
A^2(a,d) = min (0+1, inf+1, 4+inf, 1+0) = 1
A^2(b,b) = A^2(c,c) = A^2(d,d) = 0, similar to A^2(a,a)
A^2(b,c) = min (inf+4, 0+1, 1+0, 1+inf) = 1
A^2(b,d) = min (inf+1, 0+1, 1+inf, 1+0) = 1
A^2(c,d) = min (4+1, 1+1, 0+inf, inf+0) = 2

Since A^2 is symmetric, it is:

( 0   2   4   1 )
( 2   0   1   1 )
( 4   1   0   2 )
( 1   1   2   0 )

We find A^3 by min-plus multiplying A by A^2 -- since n-1 = 3, A^3 is the final
shortest-path distance matrix.  The only entries that are different between
A^2 and A^3 are (a,c) and (c,a):

A^3(a,c) = min (0+4, inf+1, 4+0, 1+2) = 3

A^3 =      ( 0   2   3   1 )
           ( 2   0   1   1 )
           ( 3   1   0   2 )
           ( 1   1   2   0 )
```
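The hand calculation can be checked mechanically. The following Java sketch is one possible implementation of min-plus multiplication, with nodes a, b, c, d indexed 0 to 3. The constant `INF` is an assumption of this sketch: a large finite integer standing in for ∞, chosen so that `INF + INF` does not overflow.

```java
import java.util.Arrays;

class MinPlusSketch {
    // Stand-in for infinity; halved so that INF + INF still fits in an int.
    static final int INF = Integer.MAX_VALUE / 2;

    // Min-plus product: z[i][j] = min over k of x[i][k] + y[k][j].
    static int[][] minPlus(int[][] x, int[][] y) {
        int n = x.length;
        int[][] z = new int[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                int best = INF;   // starting at INF keeps "infinite" sums capped
                for (int k = 0; k < n; k++) {
                    best = Math.min(best, x[i][k] + y[k][j]);
                }
                z[i][j] = best;
            }
        }
        return z;
    }

    // Single-step distance matrix A for G, rows and columns in order a, b, c, d.
    static int[][] singleStep() {
        return new int[][] {
            {0,   INF, 4,   1},
            {INF, 0,   1,   1},
            {4,   1,   0,   INF},
            {1,   1,   INF, 0}
        };
    }

    // Computes A^(n-1) by repeatedly multiplying by A.
    static int[][] shortestPaths(int[][] a) {
        int[][] power = a;
        for (int step = 2; step <= a.length - 1; step++) {
            power = minPlus(a, power);
        }
        return power;
    }

    public static void main(String[] args) {
        for (int[] row : shortestPaths(singleStep())) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

With four nodes the loop runs twice, producing A^2 and then A^3, which matches the order of the calculation above.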

• Question 6 (25):
• (a,15) Describe in detail the results of a uniform-cost search of G starting from node a. Draw the resulting search tree and indicate the shortest paths from a to each other node.

The queue starts with node a, with priority 0. We process (a,0), removing it from the queue and adding (c,4) and (d,1). The next node off the queue is (d,1), which we process to get (b,2). (We recognize a as already visited.) Then (b,2) is next off the queue, and we process it to get (c,3). Finally (c,3) comes off the queue before (c,4), and all nodes have left the queue.

In the search tree, each node appears at the end of the edge by which its chosen entry entered the queue. So we have the edges from a to d, from d to b, and from b to c. The edge from a to c is a non-tree edge because the entry for that edge was still on the queue when the last node was visited. The tree thus looks like:

```
        (a)
         | \
         |  \
         v   |
        (d)  |
         |   |
         |   | non-tree edge from a to c
         v   |
        (b)  |
         |   |
         |  /
         v v
        (c)
```

The shortest path from a to each node is the path using tree edges -- the empty path of length 0 to a, the path of one edge and distance 1 to d, the path of two edges and distance 2 from a through d to b, and finally the three-edge path of distance 3 from a through d and b to c.
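The search described above can be sketched in Java with `java.util.PriorityQueue`. This is one possible implementation under stated assumptions, not the course's reference code: nodes a, b, c, d are indexed 0 to 3, the `EDGES` array and method names are invented here, and a node is treated as visited when its first entry leaves the queue, so later entries for it (such as (c,4)) are discarded.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

class UniformCostSketch {
    // Undirected weighted edges of G as {endpoint, endpoint, weight}; a=0, b=1, c=2, d=3.
    static final int[][] EDGES = { {0, 2, 4}, {0, 3, 1}, {1, 2, 1}, {1, 3, 1} };

    // Returns the distance from start to every node and fills parent[]
    // with each node's parent in the search tree (-1 for the start node).
    static int[] uniformCost(int start, int[] parent) {
        int[] dist = new int[parent.length];
        Arrays.fill(dist, -1);                 // -1 marks "not yet visited"
        Arrays.fill(parent, -1);
        // Queue entries are {node, priority, tree parent}; lowest priority first.
        PriorityQueue<int[]> queue =
            new PriorityQueue<>((p, q) -> Integer.compare(p[1], q[1]));
        queue.add(new int[] {start, 0, -1});
        while (!queue.isEmpty()) {
            int[] e = queue.remove();
            int v = e[0];
            if (dist[v] >= 0) continue;        // stale entry for an already visited node
            dist[v] = e[1];
            parent[v] = e[2];
            for (int[] edge : EDGES) {
                // w is the other endpoint if this edge touches v, else -1.
                int w = edge[0] == v ? edge[1] : edge[1] == v ? edge[0] : -1;
                if (w >= 0 && dist[w] < 0) {
                    queue.add(new int[] {w, e[1] + edge[2], v});
                }
            }
        }
        return dist;
    }

    public static void main(String[] args) {
        int[] parent = new int[4];
        int[] dist = uniformCost(0, parent);
        System.out.println(Arrays.toString(dist));    // [0, 2, 3, 1]
        System.out.println(Arrays.toString(parent));  // [-1, 3, 1, 0]
    }
}
```

The `parent` array encodes the tree edges a to d, d to b, and b to c. The edge from a to c never becomes a tree edge because its entry is still on the queue when c is visited.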

• (b,10) Carry out a breadth-first search of G from vertex c, ignoring the weights of edges. Let h(v), for each node v, be the level of v in the tree of this BFS. Indicate how this function h could be used in an A* search to find the shortest path from a to c. (It's important that the weight of each edge is at least 1.)

The BFS from c begins with just c on the queue. It puts a and b on the queue, then removes a and processes it to find d, the last node to be found. The BFS tree looks like this, with a non-tree edge between b and d (assuming that the neighbors of c are placed on the queue in order, with a before b):

```
        (c)
       /   \
      /     \
     v       v
    (a)     (b)
     |     /
     |    /
     |   /
     |  /
     v v
    (d)
```

Thus we have h(a) = 1, h(b) = 1, h(c) = 0, and h(d) = 2.
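The levels can also be computed by a short program. The sketch below assumes an adjacency-list encoding of G with nodes a, b, c, d indexed 0 to 3 and neighbors listed in alphabetical order. The array `ADJ` and the method `levels` are names chosen for this illustration.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

class BfsLevelSketch {
    // Unweighted adjacency lists of G: a-{c,d}, b-{c,d}, c-{a,b}, d-{a,b}.
    static final int[][] ADJ = { {2, 3}, {2, 3}, {0, 1}, {0, 1} };

    // Returns level[v] = number of edges on a shortest path from start to v.
    static int[] levels(int start) {
        int[] level = new int[ADJ.length];
        Arrays.fill(level, -1);            // -1 marks "not yet found"
        Deque<Integer> queue = new ArrayDeque<>();
        level[start] = 0;
        queue.add(start);
        while (!queue.isEmpty()) {
            int v = queue.remove();
            for (int w : ADJ[v]) {
                if (level[w] < 0) {        // first time w is found
                    level[w] = level[v] + 1;
                    queue.add(w);
                }
            }
        }
        return level;
    }

    public static void main(String[] args) {
        // BFS from c (index 2); prints h for a, b, c, d in that order.
        System.out.println(Arrays.toString(levels(2)));   // [1, 1, 0, 2]
    }
}
```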

If g(v) is the true distance in G (with weighted edges) from v to c, then h is an admissible heuristic for g because 0 ≤ h(v) ≤ g(v) for any node v. (This is where it comes in that the weights in G are all at least 1 -- if they were not we might have g(v) < h(v).) We could thus use h in an A* search of G from a, using c as a goal node. Write k(v) for the distance of the path found so far from a to v, so that the entry for v has priority k(v) + h(v). The initial entry for a would be (a,0+1), because k(a) = 0 and h(a) = 1. We take this entry off the queue and add (c,4+0) and (d,1+2). We take (d,3) off the queue and add (b,2+1). We take (b,3) off before (c,4), and add (c,3+0). When we take (c,3) off the queue, it represents the three-step path of distance 3. In this case the A* search did not save us any time relative to the uniform-cost search.
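The A* trace can be reproduced with a small variation on a priority-queue search. The following Java sketch is an illustration under stated assumptions: nodes a, b, c, d are indexed 0 to 3, the heuristic values are hard-coded in a hypothetical array `H`, and each queue entry stores the path distance so far, with entries ordered by that distance plus the heuristic value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

class AStarSketch {
    // Undirected weighted edges of G as {endpoint, endpoint, weight}; a=0, b=1, c=2, d=3.
    static final int[][] EDGES = { {0, 2, 4}, {0, 3, 1}, {1, 2, 1}, {1, 3, 1} };
    // Heuristic h from the BFS levels: h(a)=1, h(b)=1, h(c)=0, h(d)=2.
    static final int[] H = {1, 1, 0, 2};

    // Returns the distance from start to goal, or -1 if the goal is unreachable.
    // Nodes are appended to 'expanded' in the order they leave the queue.
    static int aStar(int start, int goal, List<Integer> expanded) {
        boolean[] done = new boolean[H.length];
        // Entries are {node, distance so far}; ordered by distance so far + h(node).
        PriorityQueue<int[]> queue = new PriorityQueue<>(
            (p, q) -> Integer.compare(p[1] + H[p[0]], q[1] + H[q[0]]));
        queue.add(new int[] {start, 0});
        while (!queue.isEmpty()) {
            int[] e = queue.remove();
            int v = e[0];
            if (done[v]) continue;             // stale entry for a finished node
            done[v] = true;
            expanded.add(v);
            if (v == goal) return e[1];
            for (int[] edge : EDGES) {
                // w is the other endpoint if this edge touches v, else -1.
                int w = edge[0] == v ? edge[1] : edge[1] == v ? edge[0] : -1;
                if (w >= 0 && !done[w]) {
                    queue.add(new int[] {w, e[1] + edge[2]});
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<Integer> expanded = new ArrayList<>();
        System.out.println(aStar(0, 2, expanded));   // 3
        System.out.println(expanded);                // [0, 3, 1, 2], i.e. a, d, b, c
    }
}
```

All four nodes are taken off the queue before the goal is reached, which is consistent with the remark that A* saves no work on this graph.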