CMPSCI 187: Programming With Data Structures
============================================

Today's topics
--------------

-   administrivia
-   recursion: more on helpers, when to use it, and why it can be slow
-   the queue ADT

Administrivia
=============

Mid-semester today
------------------

Last day to withdraw or elect P/F. All forms must be delivered to the
registrar in Whitmore by 5pm.

A06 posted
----------

Recursion and the Tower of Hanoi await you. Due next Thursday, 12 March.

Recursion
=========

Another look at `addToTail` and `reverse`
-----------------------------------------

-   If the list is empty, make a new node and set `head` to it.
-   If the list has a node, remove it (keeping a temp pointer to it),
    add our new element to the tail of the remaining list, then put the
    temp node back at the head of the list.

``` {.java}
public void addToTail(T elem) {
  if (head == null) {
    LLNode<T> node = new LLNode<T>(elem);
    head = node;
  }
  else {
    LLNode<T> temp = head;
    head = head.getNext();
    addToTail(elem);
    temp.setNext(head);
    head = temp;
  }
}
```

(show operation on board to add an element to a 2-element list)

Here's another recursive approach, using a helper function and getting
rid of the *shared state* of `head` between the functions. The broader
the scope of the state of any one variable, the harder it is to reason
about, so arguably the approach below is cleaner.

``` {.java}
public void addToTail(T elem) {
  LLNode<T> newTail = new LLNode<T>(elem);
  head = addToTailHelper(newTail, head);
}

public LLNode<T> addToTailHelper(LLNode<T> newTail, LLNode<T> lst) {
  if (lst == null) {
    return newTail;
  }
  else {
    LLNode<T> first = lst;
    LLNode<T> rest = addToTailHelper(newTail, lst.getNext());
    first.setNext(rest);
    return first;
  }
}
```

The helper function takes the new node and a pointer into the list, and
returns a node. If it's at the end of the list, it just returns the new
node. If it's not at the end, it adds the new node to the end of rest of
the list and returns that list.
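For reference, the pattern above can be packaged as a self-contained, runnable sketch. This `LLNode` is a minimal stand-in for DJW's class (assuming the usual `getInfo`/`getNext`/`setNext` accessors), not their exact code:

``` {.java}
public class TailList<T> {
  // Minimal stand-in for DJW's LLNode.
  static class LLNode<E> {
    private final E info;
    private LLNode<E> next;
    LLNode(E info) { this.info = info; }
    E getInfo() { return info; }
    LLNode<E> getNext() { return next; }
    void setNext(LLNode<E> next) { this.next = next; }
  }

  private LLNode<T> head;

  public void addToTail(T elem) {
    head = addToTailHelper(new LLNode<T>(elem), head);
  }

  private LLNode<T> addToTailHelper(LLNode<T> newTail, LLNode<T> lst) {
    if (lst == null) return newTail;              // end of list: attach here
    lst.setNext(addToTailHelper(newTail, lst.getNext()));
    return lst;
  }

  public String toString() {
    StringBuilder sb = new StringBuilder();
    for (LLNode<T> n = head; n != null; n = n.getNext()) {
      sb.append(n.getInfo()).append(" ");
    }
    return sb.toString().trim();
  }
}
```

Note that the helper never touches `head` at all; the only assignment to `head` is in the one-line public method.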

Here's the old `reverse()`:

``` {.java}
public void reverse() {
  if (head == null) return;
  if (head.getNext() == null) return;
  LLNode<T> temp = head;
  head = head.getNext();
  temp.setNext(null);
  reverse();
  addToTail(temp.getInfo());
}
```

Can you rewrite it similarly to the above (i.e., using a helper method
and not depending upon the use of `head`)? You'll also want to use
`addToTailHelper`.

``` {.java}
public void reverse() {
  head = reverseHelper(head);
}

private LLNode<T> reverseHelper(LLNode<T> lst) {
  if (lst == null) {
    return null;
  }
  LLNode<T> first = lst;
  LLNode<T> rest = lst.getNext();
  first.setNext(null);
  return addToTailHelper(first, reverseHelper(rest));
}
```

What's happening here? We have one argument: the list remaining to
reverse. We start with the entire list remaining to reverse.

The base case is when we have no list remaining to reverse: we just
return null.

We make progress each recursive call, since the list remaining to
reverse is one element shorter.

The method is correct, if we assume it works on shorter lists -- we
remove the first element of the list, then place it on the end of the
reversed list.

More comments on stacking
-------------------------

We showed you could rewrite an explicit recursion:

``` {.java}
public void revPrint(LLNode<T> node) {
  if (node != null) {
    revPrint(node.getLink());
    System.out.println(" " + node.getInfo());
  }
}
public void revPrint() {
  revPrint(head);
}
```

using a stack:

``` {.java}
public void printReversed() {
  USI<T> stack = new LinkedStack<T>();
  LLNode<T> t = head;
  while (t != null) {
    stack.push(t.getInfo());
    t = t.getLink();
  }
  while (!stack.isEmpty()) {
    System.out.println(" " + stack.top());
    stack.pop();
  }
}
```

In theory, you can always replace recursion with an explicit stack, but
it's not always this easy -- not every recursion reduces to the trivial
push-them-all, pop-them-all pattern shown here. You'll see this again in
CMPSCI 250.

Whether to use recursion
------------------------

Recursion has costs and benefits, like any technique. Whether or not to
use it depends upon the problem and the costs and benefits of other
techniques.

Recursive methods use up space on the call stack. Method calls are often
slower than direct iteration.

Iterative versions are often longer, sometimes don't map as clearly to
the problem and its textual solution, and can be harder to debug or
prove correct. The three questions are enormously powerful when
verifying correctness and completeness of a solution. E.g., could you
have written Hanoi iteratively? And how would you know it was right?

Recursion can also be slower in the O() sense, depending upon the
details.

The ideal solution is to write a recursive solution and have the
compiler translate it to an iterative one. Many languages (but not Java,
sad sad) do this automatically for tail-recursive functions.
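Here is what "tail-recursive" means concretely, as a sketch. In the first method the recursive call is the very last action, so a tail-call-optimizing compiler could reuse the current stack frame; the second method is the loop such a compiler would effectively produce. (Java, again, does neither automatically.)

``` {.java}
public class TailRec {
  // Tail-recursive: nothing remains to do after the recursive call,
  // so the frame could be reused by a tail-call-optimizing compiler.
  static long factorialTail(long n, long acc) {
    if (n <= 1) return acc;
    return factorialTail(n - 1, n * acc);
  }

  // The equivalent iteration the compiler would produce.
  static long factorialLoop(long n) {
    long acc = 1;
    while (n > 1) {
      acc *= n;
      n--;
    }
    return acc;
  }
}
```

Contrast this with `addToTailHelper`, which still has work to do (`setNext`) after its recursive call returns, and so is *not* tail-recursive.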

O(too big) recursive algorithms
-------------------------------

Monday you looked at both iterative and recursive Fibonacci methods. The
iterative method was O(n), but the recursive method was much slower.
Why?

(Draw stack and call tree on board.)

Many values are computed more than once! Recursive `fib()` is
exponential in `n`! (Not quite 2\^n, since the call tree is not full --
it's actually proportional to fib(n) itself, about 1.618\^n.)

Estimating O() behavior of recursive functions can require that you
consider the call tree, not just the method bodies.
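You can see the call tree's size directly by instrumenting the naive method with a counter. This is a sketch for illustration (the `calls` field is mine, not part of the assignment's code):

``` {.java}
public class FibCount {
  static long calls = 0;

  // Naive recursive Fibonacci: every invocation bumps the counter,
  // so `calls` ends up equal to the number of nodes in the call tree.
  static long fib(int n) {
    calls++;
    if (n <= 1) return n;
    return fib(n - 1) + fib(n - 2);
  }
}
```

For `fib(10)` the counter reaches 177 even though there are only 11 distinct subproblems -- exactly the repeated work the call tree on the board shows.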

Another example. Suppose I want to raise two to the *n*th power. I can
define it recursively as f(0) = 1; f(n) = f(n-1) + f(n-1).

(Draw call tree on board.)
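A sketch of both versions, with a call counter added (mine, for illustration) to make the blow-up visible. The two methods compute the same value; only the shape of the recursion differs:

``` {.java}
public class PowTwo {
  static long naiveCalls = 0;

  // f(n) = f(n-1) + f(n-1): two recursive calls per level,
  // so the call tree is a full binary tree with 2^(n+1) - 1 nodes.
  static long naive(int n) {
    naiveCalls++;
    if (n == 0) return 1;
    return naive(n - 1) + naive(n - 1);
  }

  // f(n) = 2 * f(n-1): one recursive call per level, O(n) total.
  static long linear(int n) {
    if (n == 0) return 1;
    return 2 * linear(n - 1);
  }
}
```

The fix is purely algebraic: `f(n-1) + f(n-1)` and `2 * f(n-1)` are the same number, but the first recomputes the subproblem and the second reuses it.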

Clicker question: Recursive exponentiation
------------------------------------------

The naive recursion makes two calls at every level, so it performs on
the order of 2\^n additions -- but of course a simple loop is linear in
n.

More slow recursion
-------------------

DJW give another example, foreshadowing a type of problem you'll see in
CMPSCI 240. If we have a group of *g* people, how many different subsets
of that group have exactly *m* people?

(draw example on board)

Here is a correct but *very slow* solution:

``` {.java}
int com(int group, int members) {
  if (members == 1) return group;
  if (members == group) return 1;
  return com(group-1, members-1) +
         com(group-1, members);
}
```

C(g, 1) = g, as there are g one-member subsets

C(g, g) = 1, as there is one g-member subset

In CMPSCI 240 you'll learn why C(g, m) = C(g-1, m-1) + C(g-1, m).
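As with `fib()`, the slowness comes from recomputing the same C(g, m) values over and over. One standard fix (a sketch, not DJW's code) is to cache each result the first time it's computed, a technique called *memoization*:

``` {.java}
public class Choose {
  // Same recurrence and base cases as com(), but each C(g, m) is
  // computed once and cached, collapsing the call tree to O(g * m) work.
  static long com(int group, int members, long[][] memo) {
    if (members == 1) return group;
    if (members == group) return 1;
    if (memo[group][members] != 0) return memo[group][members];
    memo[group][members] = com(group - 1, members - 1, memo)
                         + com(group - 1, members, memo);
    return memo[group][members];
  }

  static long com(int group, int members) {
    return com(group, members, new long[group + 1][members + 1]);
  }
}
```

The cached entries are exactly the rows of Pascal's Triangle above C(g, m).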

This is the identity behind Pascal's Triangle (on board). It's called
Yang Hui's triangle in China -- Yang's predecessor Jia Xian described it
about three hundred years before Pascal. Indian and Greek mathematicians
alluded to it even earlier.

Introducing the Queue ADT
=========================

Queues
------

A *queue* is like a stack -- it is an ADT that holds a collection of
elements. Like a stack, you can insert and remove elements. Unlike a
stack, you can only remove the *oldest* (earliest-inserted) element in
the queue.

Stacks have a last-in-first-out (LIFO) behavior.

Queues have a first-in-first-out (FIFO) behavior.

We put elements into the "back" of the queue by "enqueueing" them, and
remove them from the front of the queue by "dequeueing" them. (This is
DJW's terminology, we'll see later that the Java API uses slightly
different terms.)
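A minimal version of the ADT and a linked implementation might look like the following sketch. The interface and method names here follow DJW's terminology but are my own phrasing, not their exact signatures:

``` {.java}
import java.util.NoSuchElementException;

// A minimal FIFO queue interface in DJW's style.
interface QueueInterface<T> {
  void enqueue(T element);  // insert at the back
  T dequeue();              // remove and return the front (oldest) element
  boolean isEmpty();
}

class LinkedQueue<T> implements QueueInterface<T> {
  private static class Node<E> {
    E info;
    Node<E> next;
    Node(E info) { this.info = info; }
  }

  private Node<T> front, back;  // dequeue at front, enqueue at back

  public void enqueue(T element) {
    Node<T> node = new Node<T>(element);
    if (back == null) { front = back = node; }
    else { back.next = node; back = node; }
  }

  public T dequeue() {
    if (front == null) throw new NoSuchElementException("empty queue");
    T info = front.info;
    front = front.next;
    if (front == null) back = null;
    return info;
  }

  public boolean isEmpty() { return front == null; }
}
```

Keeping references to both ends is what makes both operations O(1); with only a `front` pointer, `enqueue` would have to walk the whole list.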

Queue is another word for "waiting line" -- this is like the line at the
supermarket or bank.

DJW's queues do not have the equivalent of their stack's `peek()` --
this time they don't seem to care that `dequeue()` is both an observer
and a transformer. Awesome.

A sample use of queues
----------------------

Remember D05's string reverser, which was built on a stack?

``` {.java}
  final Stack<String> stack = new LinkedListStack<String>();
  final Scanner conIn = new Scanner(System.in);
  try {
    String string = conIn.nextLine();
    while (!string.trim().equals(".")) {
      stack.push(string);
      string = conIn.nextLine();
    }
  } finally {
    conIn.close();
  }
  while (!stack.isEmpty()) {
    System.out.println(stack.pop());
  }
```

Just by changing the stack to a queue, what happens? The strings are now
printed in the same order they came in.
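Here is the same loop shape restructured as a testable method, using the standard library's `java.util.ArrayDeque` as the FIFO queue for concreteness (DJW's queue class would behave the same way). The method name is mine:

``` {.java}
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class EchoLines {
  // Enqueue every line, then dequeue them all: FIFO order is preserved.
  static List<String> echo(List<String> lines) {
    Queue<String> queue = new ArrayDeque<String>();
    for (String line : lines) {
      queue.add(line);           // "enqueue" in the Java API is add/offer
    }
    List<String> out = new ArrayList<String>();
    while (!queue.isEmpty()) {
      out.add(queue.remove());   // "dequeue" in the Java API is remove/poll
    }
    return out;
  }
}
```

Swap the `Queue` for a stack and the output order reverses -- nothing else in the loop changes.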

Another use: searching graphs
-----------------------------

*Graphs* are an abstraction we'll explore in more detail (and that
you'll see again and again in later CS classes). Not a plot of a
function, but a relationship between vertices and edges.

For example, you can make a graph of the NE states, where the vertices
are the states, and an edge exists between them iff they are adjacent
(on board).

The DJW `blob` example from the last two classes can be trivially
transformed into a graph -- but so can many other problems. We'll focus
on blob for now, but keep in mind everything we do is general to
anything that can be described as a graph.

Our `markBlobs()` and `visit()` methods used recursion to find all the
filled squares in a connected cluster. Once we searched all neighbors of
a square, we were done with it and returned to its predecessor. This
resulted in a *backtracking* search.

We could do this with an explicit stack instead of recursion, and the
behavior would be the same. We'd start by visiting the first node. To
visit a node, we'd mark it, then push each of its neighbors onto the
stack. Then we'd loop until the stack was empty.

Pseudocode:

    stack.push(first node)
    while (stack not empty)
      node = stack.pop()
      mark node as visited
      for each neighbor:
        if unvisited
          stack.push(neighbor)

At any given time, the stack holds the nodes we are waiting to search
next.

This search is *greedy* in the sense that it pursues a given path
completely before backtracking -- we go deeper rather than broader
whenever possible. Hence the name *depth-first search*. (show on board)

What if we use the same algorithm, but with a queue instead of a stack?
That is, what if it looked like:

    queue.enqueue(first node)
    while (queue not empty)
      node = queue.dequeue()
      mark node as visited
      for each neighbor:
        if unvisited
          queue.enqueue(neighbor)

We'd visit the first square, then all of its neighbors, then all of
their neighbors, and so on. (show on board)

In other words, we'd visit all the squares at *distance* 1 from the
starting square before any square at distance 2, then all at distance 2
before any at distance 3, etc.

This is called a *breadth-first* search, because we search broadly
through the graph instead of going deeply in one direction.
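The two pseudocode loops above differ only in the worklist, so they can share one skeleton. Here is a runnable sketch using `java.util.ArrayDeque` as both stack and queue; the adjacency-list representation and method names are my own, not DJW's. (The `visited[node]` check on removal is needed because a node can be added to the worklist twice before it is visited.)

``` {.java}
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class GraphSearch {
  // One search skeleton: addFirst gives stack behavior (depth-first),
  // addLast gives queue behavior (breadth-first). Returns visit order.
  static List<Integer> search(List<List<Integer>> adj, int start,
                              boolean depthFirst) {
    boolean[] visited = new boolean[adj.size()];
    Deque<Integer> work = new ArrayDeque<Integer>();
    List<Integer> order = new ArrayList<Integer>();
    work.add(start);
    while (!work.isEmpty()) {
      int node = work.removeFirst();
      if (visited[node]) continue;   // may have been added more than once
      visited[node] = true;
      order.add(node);
      for (int neighbor : adj.get(node)) {
        if (!visited[neighbor]) {
          if (depthFirst) work.addFirst(neighbor);
          else work.addLast(neighbor);
        }
      }
    }
    return order;
  }
}
```

On the graph 0--1--2 with an extra edge 0--3, breadth-first from 0 visits both of 0's neighbors before reaching 2, while depth-first runs down one branch before backtracking.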
