Lecture 16: Graphs

Announcements

Quiz Monday.

Assignment 08 posted. Minor correction early this morning.

Pipe down during lecture please.

Graphs

Remember our discussion of trees, and how we talked about trees being a “kind” of graph? Graphs are really useful: a multipurpose data structure that lets us represent many different kinds of things.

Notation

Recall G = (V, E): a graph G is a pair made up of a set of vertices V and a set of edges E.

Note about directed vs undirected graphs.

Note about annotations / weights.

Vocabulary

Linear lists and trees are two ways to make objects out of nodes and connections from one node to another. There are many other ways to do it.

A graph consists of a set of nodes called vertices (one of them is a vertex) and a set of edges that connect the nodes. In an undirected graph, each edge connects two distinct vertices. In a directed graph, each edge goes from one vertex to another vertex.

Two vertices are adjacent (or neighbors) if there is an undirected edge from one to the other.

A path (in either kind of graph) is a sequence of edges where each edge goes to the vertex that the next edge comes from. A simple path is one that never reuses a vertex. In a tree, there is exactly one simple path from any one vertex to any other vertex.

A complete graph is one with every possible edge among its vertices – in other words, every vertex is adjacent to every other vertex.

A connected component is a maximal set of vertices in which every vertex has a path to every other vertex (though the vertices need not all be adjacent to one another).

A single graph might have two or more connected components! (on board)

Examples

maze

8-puzzle

tic-tac-toe

google map

In-class exercise

Imagine you wanted to represent the first two years’ worth of COMPSCI courses (121, 187, 220, 230, 240, 250) for majors (and their prerequisites) as a graph. What would it look like?

Graph abstraction and algorithms

Each of these examples can, if you squint at it correctly, be viewed as a graph. There is some “space” (finite or otherwise) of discrete points, and some points are connected to others.

This corresponds exactly to a graph. And what’s interesting here is that there are many algorithms that operate generally on graphs, regardless of the underlying problem. So we can write them (once) and solve many kinds of problems. Most common are things like:

search: start at a particular vertex, and report true if there is a path to another vertex in the graph, or false otherwise
path search: (also shortest-path search) find the shortest path from one vertex to another vertex (this might mean the lowest number of edges or, if edges have a “weight”, the lowest sum of edge costs)
minimax search in an adversarial game: given a state, look for the “winning-est” move
all-pairs shortest path (which it turns out can be solved more efficiently than just doing each pairwise shortest-path search)

and many, many more.

Total vs partial knowledge of a graph

You can know (or compute) the entire graph “ahead of time” when it’s both small and knowable, for example, our earlier maze example. That is, you can create an ADT that allows you to set and access nodes and edges (and associated annotations) and instantiate it.

For some problems, the graph is too large to keep track of (e.g., the game tree of chess is commonly estimated at around 10^123 nodes). But obviously we have computer programs that can play chess. How do they do it? They generate a “partial view” of the state space, where you can find the “successors” of a particular state (on board) and their successors, and so on, up until you’re “out of time” to think more or out of space to store more, and then do the best you can with this partial view.

How might these ADTs look in practice?

ADT for graphs

We need to be able to add and query lists (or sets) of vertices and edges. Of course, edges are just links between two vertices, so we needn’t have a separate data type. What might this look like in the simple case, where we don’t worry about annotations? Something like:

public interface UndirectedGraph<V> {
  void addVertex(V v);
  boolean hasVertex(V v);
  Set<V> vertices();

  void addEdge(V u, V v);
  boolean hasEdge(V u, V v);  
  Set<V> neighborsOf(V v);
}
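If you want something concrete to experiment with before we get to the real implementations, here is a minimal sketch of one way to back this interface with a map from each vertex to its set of neighbors. The class name MapBackedGraph is made up for illustration, and the choice that addEdge silently adds missing endpoints is an assumption, not something the interface requires.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// The interface from the notes, repeated so this sketch compiles on its own.
interface UndirectedGraph<V> {
  void addVertex(V v);
  boolean hasVertex(V v);
  Set<V> vertices();

  void addEdge(V u, V v);
  boolean hasEdge(V u, V v);
  Set<V> neighborsOf(V v);
}

// Hypothetical class name; one possible backing store among several.
class MapBackedGraph<V> implements UndirectedGraph<V> {
  // Each vertex maps to the set of vertices it shares an edge with.
  private final Map<V, Set<V>> neighbors = new HashMap<>();

  public void addVertex(V v) {
    neighbors.putIfAbsent(v, new HashSet<>());
  }

  public boolean hasVertex(V v) {
    return neighbors.containsKey(v);
  }

  public Set<V> vertices() {
    return Collections.unmodifiableSet(neighbors.keySet());
  }

  public void addEdge(V u, V v) {
    // Assumption: adding an edge also adds any endpoint not yet present.
    addVertex(u);
    addVertex(v);
    // Undirected, so record the edge in both directions.
    neighbors.get(u).add(v);
    neighbors.get(v).add(u);
  }

  public boolean hasEdge(V u, V v) {
    return neighbors.containsKey(u) && neighbors.get(u).contains(v);
  }

  public Set<V> neighborsOf(V v) {
    // An unknown vertex is treated as having no neighbors (an assumption).
    return Collections.unmodifiableSet(
        neighbors.getOrDefault(v, Collections.emptySet()));
  }
}
```

With this in place, `new MapBackedGraph<Integer>()` followed by a few addEdge calls gives you a graph you can hand to code written against UndirectedGraph<Integer>.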

What about if we are just concerned with a partial view of the graph? Maybe something like this:

public interface PartialUndirectedGraph<V> {
  List<V> neighborsOf(V v);
}

The implementation of a partial view would have to know quite a bit about the underlying problem in order to generate the neighbors, but on the other hand, you don’t need to generate everything, just the bits of the graph you care about.
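To make the idea of a partial view concrete, here is a small sketch under some loud assumptions: the “problem” is an open grid with no walls, cells are numbered row by row, and the names GridGraph and BoundedExplorer are invented for this example. Nothing is stored up front; neighbors are computed when asked for, and the explorer stops after a fixed number of hops, which stands in for running “out of time”.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// The one-method interface from the notes, repeated so this compiles on its own.
interface PartialUndirectedGraph<V> {
  List<V> neighborsOf(V v);
}

// Hypothetical example: an open width-by-height grid, cells numbered row by row.
class GridGraph implements PartialUndirectedGraph<Integer> {
  private final int width;
  private final int height;

  GridGraph(int width, int height) {
    this.width = width;
    this.height = height;
  }

  @Override
  public List<Integer> neighborsOf(Integer cell) {
    int row = cell / width;
    int col = cell % width;
    List<Integer> result = new ArrayList<>();
    if (row > 0) result.add(cell - width);           // cell above
    if (row < height - 1) result.add(cell + width);  // cell below
    if (col > 0) result.add(cell - 1);               // cell to the left
    if (col < width - 1) result.add(cell + 1);       // cell to the right
    return result;
  }
}

class BoundedExplorer {
  // Collects every vertex reachable from start using at most maxHops edges.
  static <V> Set<V> explore(PartialUndirectedGraph<V> graph, V start, int maxHops) {
    Set<V> seen = new HashSet<>();
    seen.add(start);
    List<V> frontier = new ArrayList<>();
    frontier.add(start);
    for (int hop = 0; hop < maxHops; hop++) {
      List<V> next = new ArrayList<>();
      for (V v : frontier) {
        for (V n : graph.neighborsOf(v)) {
          // Set.add returns false for a vertex we have already seen.
          if (seen.add(n)) next.add(n);
        }
      }
      frontier = next;
    }
    return seen;
  }
}
```

Calling `BoundedExplorer.explore(new GridGraph(1000, 1000), 0, 2)` only ever touches the handful of cells near the corner, even though the grid has a million cells in total.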

Searching a graph

How might we go about trying to solve a search problem? That is, suppose we had a graph that had been instantiated a particular way. We’re given a start vertex, and we want to see if there’s a path in that graph to the end vertex. As a human, we can just look at a small graph and decide, but larger graphs eventually can’t just be glanced at. What’s a methodical way to check?

Well, we can look at the first node, then its neighbors. If one is the goal, we’re done. If not, we look at each of their neighbors, and so on. Here’s our first attempt (we’ll assume the start node is not the goal here):

boolean isPath(UndirectedGraph<Integer> graph, Integer start, Integer goal) {
  Integer current = start;
  Set<Integer> remaining = new HashSet<>();
  while (true) {
    if (current.equals(goal)) return true;
    remaining.addAll(graph.neighborsOf(current));
    List<Integer> list = new ArrayList<>(remaining);
    if (list.isEmpty()) return false;
    current = list.get(0);
    remaining.remove(current);
  }
}

Does this work? Let’s see on the board. Uh-oh! Looks like we need to keep track of where we’ve been, too. Maybe something like:

boolean isPath(UndirectedGraph<Integer> graph, Integer start, Integer goal) {
  Integer current = start;
  Set<Integer> visited = new HashSet<Integer>();
  Set<Integer> toBeVisited = new HashSet<Integer>();
  while (true) {
    if (current.equals(goal)) return true;

    visited.add(current);
    toBeVisited.addAll(graph.neighborsOf(current));
    toBeVisited.removeAll(visited);

    if (toBeVisited.isEmpty()) return false;
    List<Integer> list = new ArrayList<>(toBeVisited);

    current = list.get(0);
    toBeVisited.remove(current);
  }
}

Of course, the order in which vertices are visited is undefined here. If we were to rewrite the code a little more, and make the toBeVisited structure a queue (first in, first out), then we’d end up with a “breadth-first” search. That’s a search where we visit all the vertices one “hop” away from the start before we then visit the vertices two hops away, and so on. (on board) If the structure were instead a stack (last in, first out) then the same code would produce a “depth-first” search, which looks deeply in one direction before backtracking. (on board) You’ll see this in much more detail in 187 (and later).
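Here is a rough sketch of the queue version. To keep it self-contained, a plain Map from each vertex to its neighbor set stands in for graph.neighborsOf; that map and the class name BreadthFirstSketch are assumptions for illustration, not part of the course code.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

class BreadthFirstSketch {
  // adjacency maps each vertex to its neighbors (stand-in for graph.neighborsOf).
  static boolean isPath(Map<Integer, Set<Integer>> adjacency, Integer start, Integer goal) {
    Set<Integer> visited = new HashSet<>();
    Queue<Integer> toBeVisited = new ArrayDeque<>();
    visited.add(start);
    toBeVisited.add(start);

    while (!toBeVisited.isEmpty()) {
      // A queue hands back the oldest entry first (first in, first out),
      // so vertices come out in order of their hop count from start.
      Integer current = toBeVisited.remove();
      if (current.equals(goal)) return true;

      Set<Integer> neighbors = adjacency.getOrDefault(current, Collections.emptySet());
      for (Integer next : neighbors) {
        // Set.add returns false if next was already seen, so each vertex is queued once.
        if (visited.add(next)) toBeVisited.add(next);
      }
    }
    return false; // ran out of vertices without reaching the goal
  }
}
```

Swapping the queue for a stack (for instance, using push and pop on an ArrayDeque) gives the depth-first variant; the answer to “is there a path?” stays the same, only the visiting order changes.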

OK! So that’s not too bad, right? We’ll return to graph searching in a bit, but I want to turn now to how to implement the graph abstraction.

Implementing the abstraction

There are two basic ways to implement the graph abstraction. One is based upon arrays and is known as the “adjacency matrix” representation; the other is based upon lists and is known as the “adjacency list” representation.

(on board)
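As a preview, here is a small sketch of what the two representations might look like for an undirected graph whose vertices are numbered 0 to n-1. The class and method names are invented for this example, and the versions we build in class may well differ.

```java
import java.util.ArrayList;
import java.util.List;

class RepresentationSketch {
  // Adjacency matrix: matrix[u][v] is true when an edge joins u and v.
  static boolean[][] toMatrix(int vertexCount, int[][] edges) {
    boolean[][] matrix = new boolean[vertexCount][vertexCount];
    for (int[] e : edges) {
      matrix[e[0]][e[1]] = true;
      matrix[e[1]][e[0]] = true; // undirected, so record both directions
    }
    return matrix;
  }

  // Adjacency list: lists.get(u) holds the neighbors of u.
  static List<List<Integer>> toLists(int vertexCount, int[][] edges) {
    List<List<Integer>> lists = new ArrayList<>();
    for (int i = 0; i < vertexCount; i++) {
      lists.add(new ArrayList<>());
    }
    for (int[] e : edges) {
      lists.get(e[0]).add(e[1]);
      lists.get(e[1]).add(e[0]);
    }
    return lists;
  }
}
```

The matrix takes space proportional to the square of the number of vertices no matter how few edges there are, but checks for an edge with a single array lookup. The lists take space proportional to the number of vertices plus edges, but checking for an edge means scanning one vertex’s list.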

We’ll start implementing these next class.