16: Graphs

Announcements

Quiz Monday.

A08 hints

All data structures should be instantiated in the constructor.

Data structures should only be modified in addDocument(); the other methods will query the data structures and return answers.

(Graphical description of the Maps on the board.)

I’m going to do a coding project plagiarism sweep soon, probably late next week. If you have a guilty conscience, confess ahead of time to perhaps be granted some leniency.

Graphs

Remember our discussion of trees, and how we talked about trees being a “kind” of graph? Graphs are really useful: a multipurpose data structure that lets us represent many different things.

Notation

Recall G = (V, E), where V is the set of vertices and E is the set of edges.

Note about directed vs undirected graphs.

Note about annotations / weights.

Vocabulary

Linear lists and trees are two ways to make objects out of nodes and connections from one node to another. There are many other ways to do it.

A graph consists of a set of nodes called vertices (one of them is a vertex) and a set of edges that connect the nodes. In an undirected graph, each edge connects two distinct vertices. In a directed graph, each edge goes from one vertex to another vertex.

Two vertices are adjacent (or neighbors) if there is an undirected edge from one to the other.

A path (in either kind of graph) is a sequence of edges where each edge goes to the vertex that the next edge comes from. A simple path is one that never reuses a vertex. In a tree, there is exactly one simple path from any one vertex to any other vertex.

A complete graph is one with every possible edge among its vertices – in other words, every vertex is adjacent to every other vertex.
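To make the vocabulary concrete, here is a minimal sketch that checks adjacency and completeness on a tiny undirected graph. The adjacency-map representation and the names VocabularySketch, addEdge, adjacent, and isComplete are assumptions made for illustration; the sketch also assumes no vertex has an edge to itself and that every vertex appears in at least one edge.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class VocabularySketch {
  // Hypothetical helper: store an undirected edge in both directions.
  static void addEdge(Map<Integer, Set<Integer>> adj, int u, int v) {
    adj.computeIfAbsent(u, k -> new HashSet<>()).add(v);
    adj.computeIfAbsent(v, k -> new HashSet<>()).add(u);
  }

  // Two vertices are adjacent if an edge joins them directly.
  static boolean adjacent(Map<Integer, Set<Integer>> adj, int u, int v) {
    return adj.getOrDefault(u, Collections.<Integer>emptySet()).contains(v);
  }

  // Complete: every vertex is adjacent to all n - 1 other vertices.
  static boolean isComplete(Map<Integer, Set<Integer>> adj) {
    int n = adj.size();
    for (Set<Integer> neighbors : adj.values()) {
      if (neighbors.size() != n - 1) return false;
    }
    return true;
  }

  public static void main(String[] args) {
    Map<Integer, Set<Integer>> adj = new HashMap<>();
    addEdge(adj, 1, 2);
    addEdge(adj, 2, 3);
    System.out.println(adjacent(adj, 1, 3));  // false: 1 and 3 are joined only by a path
    System.out.println(isComplete(adj));      // false: the edge between 1 and 3 is missing
    addEdge(adj, 1, 3);
    System.out.println(isComplete(adj));      // true: a triangle has every possible edge
  }
}
```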

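A connected component is a maximal set of vertices in which every vertex has a path to every other vertex (though not every pair of vertices need be adjacent). “Maximal” means that no further vertex could be added to the set without breaking this property.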

A single graph might have two or more connected components! (on board)

Examples

google map

maze

tic-tac-toe

8-puzzle

In-class exercise

Imagine you wanted to represent the first two years’ worth of COMPSCI courses for majors (121, 187, 220, 230, 240, 250), along with their prerequisites, as a graph. What would it look like?

Graph abstraction and algorithms

Each of the previous examples can, if you squint at it correctly, be viewed as a graph. There is some “space” (finite or otherwise) of discrete points, and some points are connected to others.

This corresponds exactly to a graph. And what’s interesting here is that there are many algorithms that operate generally on graphs, regardless of the underlying problem. So we can write them (once) and solve many kinds of problems. Most common are things like:

search: start at a particular vertex, report true if there is a path to another vertex in the graph, or false otherwise
path search: (also shortest-path search) find the shortest path from one vertex to another vertex (this might be the lowest number of edges or, if edges have a “weight”, the lowest sum of edge costs)
minimax search in an adversarial game: given a state, look for the “winning-est” move
all-pairs shortest path (which it turns out can be solved more efficiently than just doing each pairwise shortest-path search)

and many, many more.

Total vs partial knowledge of a graph

You can know (or compute) the entire graph “ahead of time” when it’s both small and knowable, for example, our earlier maze example. That is, you can create an ADT that allows you to set and access nodes and edges (and associated annotations) and instantiate it.

For some problems, the graph is too large to keep track of (e.g., the game tree of chess is commonly estimated at around 10^123 nodes). But obviously we have computer programs that can play chess. How do you do it? You generate a “partial view” of the state space, where you can find the “successors” of a particular state (on board) and their successors, and so on, up until you’re “out of time” to think more or out of space to store more, and do the best you can with this partial view.
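As a rough illustration of generating successors only as far as a budget allows, here is a sketch for tic-tac-toe (a much smaller game than chess). Everything in it is an assumption made for the example: the 9-character string encoding of a board, the names SuccessorSketch, successors, and explore, and the simplification that wins are ignored, so play continues until the board is full.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SuccessorSketch {
  // A board is a 9-character string of 'X', 'O', or '.', read row by row.
  // X moves first, so X is to move whenever the two mark counts are equal.
  static List<String> successors(String board) {
    long xs = board.chars().filter(ch -> ch == 'X').count();
    long os = board.chars().filter(ch -> ch == 'O').count();
    char mover = (xs == os) ? 'X' : 'O';
    List<String> next = new ArrayList<>();
    for (int i = 0; i < board.length(); i++) {
      if (board.charAt(i) == '.') {
        next.add(board.substring(0, i) + mover + board.substring(i + 1));
      }
    }
    return next;
  }

  // Generate every state within `depth` moves of `start`, and nothing deeper.
  static Set<String> explore(String start, int depth) {
    Set<String> seen = new HashSet<>();   // each distinct state is kept once
    seen.add(start);
    List<String> layer = new ArrayList<>();
    layer.add(start);
    for (int d = 0; d < depth; d++) {
      List<String> nextLayer = new ArrayList<>();
      for (String state : layer) {
        for (String s : successors(state)) {
          if (seen.add(s)) nextLayer.add(s);
        }
      }
      layer = nextLayer;
    }
    return seen;
  }

  public static void main(String[] args) {
    // 1 empty board + 9 boards after one move + 72 boards after two moves.
    System.out.println(explore(".........", 2).size());  // prints 82
  }
}
```

The depth limit plays the role of being “out of time”: the rest of the state space is never built.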

How might these ADTs look in practice?

ADT for graphs

We need to be able to add and query lists (or sets) of vertices and edges. Of course, edges are just links between two vertices, so we needn’t have a separate data type. What might this look like in the simple case, where we don’t worry about annotations? Something like:

public interface UndirectedGraph<V> {
  void addVertex(V v);
  boolean hasVertex(V v);
  Set<V> vertices();

  void addEdge(V u, V v);
  boolean hasEdge(V u, V v);  
  Set<V> neighborsOf(V v);
}
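One possible way to back this interface is a map from each vertex to the set of its neighbors. The sketch below is only one option; the class name AdjacencyMapGraph is hypothetical, and two behaviors are assumptions rather than anything the interface requires: addEdge() adds missing endpoints, and neighborsOf() returns an empty set for an unknown vertex.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// The interface from the notes, repeated (without `public`) so this compiles as one file.
interface UndirectedGraph<V> {
  void addVertex(V v);
  boolean hasVertex(V v);
  Set<V> vertices();

  void addEdge(V u, V v);
  boolean hasEdge(V u, V v);
  Set<V> neighborsOf(V v);
}

// Hypothetical implementation: each vertex maps to the set of its neighbors.
class AdjacencyMapGraph<V> implements UndirectedGraph<V> {
  private final Map<V, Set<V>> adjacency = new HashMap<>();

  public void addVertex(V v) {
    adjacency.putIfAbsent(v, new HashSet<>());
  }

  public boolean hasVertex(V v) {
    return adjacency.containsKey(v);
  }

  public Set<V> vertices() {
    return Collections.unmodifiableSet(adjacency.keySet());
  }

  // Assumption: adding an edge also adds any endpoint not yet in the graph.
  public void addEdge(V u, V v) {
    addVertex(u);
    addVertex(v);
    adjacency.get(u).add(v);
    adjacency.get(v).add(u);   // undirected: store both directions
  }

  public boolean hasEdge(V u, V v) {
    return hasVertex(u) && adjacency.get(u).contains(v);
  }

  // Assumption: an unknown vertex yields an empty set rather than an exception.
  public Set<V> neighborsOf(V v) {
    Set<V> neighbors = adjacency.get(v);
    if (neighbors == null) return Collections.<V>emptySet();
    return Collections.unmodifiableSet(neighbors);
  }
}
```

With this sketch, after g.addEdge("a", "b") both g.hasEdge("a", "b") and g.hasEdge("b", "a") are true, which is what makes the graph undirected.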

What about if we are just concerned with a partial view of the graph? Maybe something like this:

public interface PartialUndirectedGraph<V> {
  List<V> neighborsOf(V v);
}

The implementation of a partial view would have to know quite a bit about the underlying problem in order to generate the neighbors, but on the other hand, you don’t need to generate everything, just the bits of the graph you care about.
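For instance, a maze can be exposed as a partial view that computes neighbors on demand from the grid. This is a sketch under several assumptions: the class name MazeGraph, the use of '#' for walls, a rectangular grid, and the encoding of the cell at (row, col) as the integer row * width + col are all made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

// The partial-view interface from the notes, repeated (without `public`) so this compiles as one file.
interface PartialUndirectedGraph<V> {
  List<V> neighborsOf(V v);
}

// Hypothetical example: a grid maze whose open cells are the vertices.
// The cell at (row, col) is encoded as the integer row * width + col.
class MazeGraph implements PartialUndirectedGraph<Integer> {
  private final String[] rows;   // '#' marks a wall; anything else is open
  private final int width;       // assumes every row has the same length

  MazeGraph(String[] rows) {
    this.rows = rows;
    this.width = rows[0].length();
  }

  private boolean isOpen(int row, int col) {
    return row >= 0 && row < rows.length
        && col >= 0 && col < width
        && rows[row].charAt(col) != '#';
  }

  // Neighbors are computed when asked for; no edge is ever stored.
  public List<Integer> neighborsOf(Integer cell) {
    int row = cell / width;
    int col = cell % width;
    int[][] steps = {{-1, 0}, {1, 0}, {0, -1}, {0, 1}};   // up, down, left, right
    List<Integer> result = new ArrayList<>();
    for (int[] step : steps) {
      int r = row + step[0];
      int c = col + step[1];
      if (isOpen(r, c)) result.add(r * width + c);
    }
    return result;
  }
}
```

All of the problem knowledge (what a wall is, which moves are legal) lives inside neighborsOf(); a search algorithm that calls it never needs to know it is looking at a maze.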

Searching a graph

How might we go about trying to solve a search problem? That is, suppose we had a graph that had been instantiated a particular way. We’re given a start vertex, and we want to see if there’s a path in that graph to the end vertex. As a human, we can just look at a small graph and decide, but larger graphs eventually can’t just be glanced at. What’s a methodical way to check?

Well, we can look at the first node, then its neighbors. If one is the goal, we’re done. If not, we look at each of their neighbors, and so on. Here’s our first attempt (we’ll assume the start node is not the goal here):

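// Needs java.util.ArrayList, java.util.HashSet, java.util.List, and java.util.Set.
boolean isPath(UndirectedGraph<Integer> graph, Integer start, Integer goal) {
  Integer current = start;
  Set<Integer> remaining = new HashSet<>();  // vertices we still have to look at
  while (true) {
    if (current.equals(goal)) return true;
    remaining.addAll(graph.neighborsOf(current));
    // Copy to a list so we can pick some vertex to look at next.
    List<Integer> list = new ArrayList<>(remaining);
    if (list.isEmpty()) break;
    current = list.get(0);
    remaining.remove(current);
  }
  return false;  // nothing left to look at, so no path was found
}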

Does this work? Let’s see on the board. Uh-oh! Looks like we need to keep track of where we’ve been, too.
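One way to keep track of where we’ve been is to add a second set of vertices we have already discovered, and never queue a vertex twice. The version below is a sketch of that idea, not necessarily the fix we will develop in class. The names SearchSketch, visited, and frontier are assumptions, and the graph is passed as a plain adjacency map (rather than an UndirectedGraph) only so that the block compiles on its own.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class SearchSketch {
  // Assumption: the graph is an adjacency map here so that the block stands alone.
  static boolean isPath(Map<Integer, Set<Integer>> adj, Integer start, Integer goal) {
    Set<Integer> visited = new HashSet<>();        // vertices already discovered
    Deque<Integer> frontier = new ArrayDeque<>();  // discovered but not yet examined
    visited.add(start);
    frontier.add(start);
    while (!frontier.isEmpty()) {
      Integer current = frontier.remove();
      if (current.equals(goal)) return true;
      for (Integer next : adj.getOrDefault(current, Collections.<Integer>emptySet())) {
        // Set.add returns false for a vertex we have seen, so nothing is queued twice.
        if (visited.add(next)) frontier.add(next);
      }
    }
    return false;  // everything reachable from start was examined without finding goal
  }

  public static void main(String[] args) {
    Map<Integer, Set<Integer>> adj = new HashMap<>();
    // A triangle 1-2-3 plus a separate edge 4-5.
    int[][] edges = {{1, 2}, {2, 3}, {3, 1}, {4, 5}};
    for (int[] e : edges) {
      adj.computeIfAbsent(e[0], k -> new HashSet<>()).add(e[1]);
      adj.computeIfAbsent(e[1], k -> new HashSet<>()).add(e[0]);
    }
    System.out.println(isPath(adj, 1, 3));  // true
    System.out.println(isPath(adj, 1, 4));  // false; the first attempt would loop forever here
  }
}
```

Because each vertex enters the frontier at most once, the loop must end once every vertex reachable from start has been examined.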