Announcements
Gaming in class? C’mon people. Save it for later, or at least sit in the back row.
A10 is not due until Monday – but there are no office hours over the weekend, so getting into it tonight might be a good idea.
Next Monday is a holiday! And, next Wednesday is a UMass Monday.
Graphs
Implementing the abstraction
A few lectures ago, we defined an UndirectedGraph interface. Today we’re going to talk about how you’d implement that interface.
There are two basic ways to implement the graph abstraction. One is based upon arrays and is known as the “adjacency matrix” representation; the other is based upon lists and is known as the “adjacency list” representation.
(on board)
Here’s a naive implementation:
public class AdjacencyMatrixUndirectedGraph<V> implements UndirectedGraph<V> {
    private List<V> vertices;
    private final boolean[][] edges;

    public AdjacencyMatrixUndirectedGraph(int maxVertices) {
        vertices = new ArrayList<>();
        edges = new boolean[maxVertices][maxVertices];
    }

    @Override
    public void addVertex(V v) {
        // what if the vertex is already in the graph?
        vertices.add(v);
    }

    @Override
    public boolean hasVertex(V v) {
        return vertices.contains(v);
    }

    @Override
    public Set<V> vertices() {
        return new HashSet<>(vertices);
    }

    @Override
    public void addEdge(V u, V v) {
        // order of edges?
        // u,v in graph?
        edges[vertices.indexOf(u)][vertices.indexOf(v)] = true;
    }

    @Override
    public boolean hasEdge(V u, V v) {
        // order of edges?
        // u,v in graph?
        return edges[vertices.indexOf(u)][vertices.indexOf(v)];
    }

    @Override
    public Set<V> neighborsOf(V v) {
        // order of edges?
        // v in graph?
        Set<V> neighbors = new HashSet<>();
        int index = vertices.indexOf(v);
        for (int i = 0; i < vertices.size(); i++) {
            if (edges[index][i]) {
                neighbors.add(vertices.get(i));
            }
        }
        return neighbors;
    }
}
Note that upon reflection, there are some problems here (repeated vertices! order of vertices in edges! are vertices even in the graph?). Some of this we can fix in code (by having, say, a canonical ordering, or being sure to set both spots in the matrix); some of this implies we need to add to our API (methods that take arbitrary vertices as parameters should throw an exception).
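Here’s a sketch of what those fixes might look like. The class name MatrixFix and the choice of IllegalArgumentException are assumptions for illustration (and, to keep it self-contained, it doesn’t implement the interface); only the methods that change are interesting:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical "fixed" fragment of the adjacency-matrix class above.
public class MatrixFix<V> {
    private final List<V> vertices = new ArrayList<>();
    private final boolean[][] edges;

    public MatrixFix(int maxVertices) {
        edges = new boolean[maxVertices][maxVertices];
    }

    public void addVertex(V v) {
        if (!vertices.contains(v)) { // ignore duplicates rather than re-adding
            vertices.add(v);
        }
    }

    // Validate both endpoints, then set both cells so the matrix stays
    // symmetric -- hasEdge(u, v) and hasEdge(v, u) must agree.
    public void addEdge(V u, V v) {
        int i = vertices.indexOf(u);
        int j = vertices.indexOf(v);
        if (i == -1 || j == -1) {
            throw new IllegalArgumentException("vertex not in graph");
        }
        edges[i][j] = true;
        edges[j][i] = true;
    }

    public boolean hasEdge(V u, V v) {
        int i = vertices.indexOf(u);
        int j = vertices.indexOf(v);
        return i != -1 && j != -1 && edges[i][j];
    }
}
```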
In class exercise
What is the running time of hasEdge?
How much space does the above implementation require, in terms of vertices or edges?
Remember, the main advantage of adjacency matrices is that they’re lightning fast in terms of checking whether an edge is in the graph; it’s not just constant time, it’s constant time with a very low constant. Except our crappy implementation above requires a call to List.indexOf first, so it’s actually linear in the number of vertices. But a highly-optimized adjacency matrix representation of a graph would not do this (it would instead use just ints for vertices) and would be “supah-fast”.
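A sketch of that idea, assuming vertices are just the ints 0 through n-1 (the class name IntMatrixGraph is made up for illustration):

```java
// Hypothetical int-vertex adjacency matrix: hasEdge is two array indexes,
// with no List.indexOf scan -- constant time with a tiny constant.
public class IntMatrixGraph {
    private final boolean[][] edges;

    public IntMatrixGraph(int n) {
        edges = new boolean[n][n]; // vertices are implicitly 0..n-1
    }

    public void addEdge(int u, int v) {
        edges[u][v] = true;
        edges[v][u] = true; // undirected: keep the matrix symmetric
    }

    public boolean hasEdge(int u, int v) {
        return edges[u][v]; // O(1)
    }
}
```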
The main downside to adjacency matrices is that they consume a lot of space: the implementation above uses maxVertices^2 space, that is, space quadratic in the number of vertices. In the worst case, a graph actually needs this much space – a graph where most pairs of vertices are connected is called a “dense” graph. But if most vertices are not connected to most other vertices, that is, if we have a “sparse” graph, a more space-efficient implementation is the adjacency list.
Let’s write one now using our by-now old friend the Map:
public class AdjacencyListUndirectedGraph<V> implements UndirectedGraph<V> {
    private Map<V, List<V>> adjacencyList;

    public AdjacencyListUndirectedGraph() {
        adjacencyList = new HashMap<>();
    }

    @Override
    public void addVertex(V v) {
        // duplicate vertex?
        adjacencyList.put(v, new ArrayList<>());
    }

    @Override
    public boolean hasVertex(V v) {
        return adjacencyList.containsKey(v);
    }

    @Override
    public Set<V> vertices() {
        // modification?
        return adjacencyList.keySet();
    }

    @Override
    public void addEdge(V u, V v) {
        // order?
        // u, v in adjacencyList?
        adjacencyList.get(u).add(v);
    }

    @Override
    public boolean hasEdge(V u, V v) {
        return adjacencyList.get(u).contains(v);
    }

    @Override
    public Set<V> neighborsOf(V v) {
        return new HashSet<>(adjacencyList.get(v));
    }
}
Again some problems here, including that we need to be careful of returning Sets that share structure with the graph. The caller might mutate the Set, and thus change the graph! If that’s not what we want (and it usually isn’t), then we should return copies of the structures that represent parts of the graph, not the original structures themselves.
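For instance, a defensive-copy vertices() might look like the sketch below (CopySafeGraph is a made-up name, and only the relevant methods are shown):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical fragment showing defensive copying of the vertex set.
public class CopySafeGraph<V> {
    private final Map<V, List<V>> adjacencyList = new HashMap<>();

    public void addVertex(V v) {
        adjacencyList.putIfAbsent(v, new ArrayList<>());
    }

    // Return a brand-new HashSet, not the map's live keySet() view.
    // Mutating the returned set no longer mutates the graph.
    public Set<V> vertices() {
        return new HashSet<>(adjacencyList.keySet());
    }

    public boolean hasVertex(V v) {
        return adjacencyList.containsKey(v);
    }
}
```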
In class exercise 2
What is the running time of hasEdge?
How much space does the above implementation require, in terms of vertices and/or edges?
Is this “slower” than an adjacency matrix? Yes. In particular, any time we need to iterate over a list (contains), we are, in the worst case, linear in the number of vertices. But we only need exactly as much space as is required to store each edge and vertex. In the worst case this is quadratic in the number of vertices, so we’re no better off than an adjacency matrix. But in a sparse graph, we come out ahead space-wise. And saying a graph is sparse is roughly equivalent to saying that each vertex has a small constant number of edges, so contains is usually OK in this case. (You’ll explore this more in 311.)
“But Marc,” you might be thinking, “why not make it a Map<V, Set<V>> and get the best of both worlds?” You can! And you will (mostly!). But while hash lookups are constant time, they’re not quite as small a constant as array lookups. If you’re really, really worried about speed, and space is not an issue, you may end up using the adjacency matrix representation anyway. But enough about that – the details of graph representation in data structures go deep, and this isn’t a class about that.
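A sketch of that hybrid (AdjacencySetGraph is a hypothetical name; it skips the error handling discussed above):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical Map<V, Set<V>> representation: hash-set membership makes
// hasEdge expected constant time, while space stays proportional to the
// number of vertices plus edges.
public class AdjacencySetGraph<V> {
    private final Map<V, Set<V>> adj = new HashMap<>();

    public void addVertex(V v) {
        adj.putIfAbsent(v, new HashSet<>());
    }

    public void addEdge(V u, V v) {
        adj.get(u).add(v);
        adj.get(v).add(u); // store both directions: undirected
    }

    public boolean hasEdge(V u, V v) {
        return adj.containsKey(u) && adj.get(u).contains(v); // expected O(1)
    }
}
```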
Specialized linear ADTs
The “standard” linear ADTs (in Java) are the array and the (generic) List. Arrays are a simple type, with very fast random access but the limitation of fixed sizes. Lists are more flexible, and their underlying implementation is generally written in terms of (resizing) arrays or (sometimes) in terms of a linked list.
But as we’ve mentioned, there are other linear data structures that one might use; they are similar to lists but restrict themselves in various ways. We’re going to revisit them so you’re ready when you see them again (“for the first time”) in 187. We’ll start with behavior, then do implementations.
Stacks
Stacks are a last-in, first-out data structure. They are like lists, but instead of allowing random access (via get, remove, add), they restrict the user to adding items at one end (the “top”) and removing from that same position. These operations are usually called push (add an item), pop (remove and return the top item), and peek (look at, but do not remove, the top item).
Modern Java suggests we use the Deque interface, which is short for double-ended queue, and its addFirst, removeFirst, and peekFirst methods (Deque also supports push, pop, and peek directly). In either case, though, the behavior is the same: LIFO.
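Concretely, the sequence of operations traced below can be run with an ArrayDeque (the usual Deque implementation); the declaration of s is an assumption, since the trace doesn’t show it:

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class StackDemo {
    // Runs the stack sequence from the trace and returns what peek() sees.
    public static String run() {
        Deque<String> s = new ArrayDeque<>();
        s.push("a");
        s.push("b");
        s.pop();         // removes and returns "b"
        s.push("c");
        s.push("d");
        return s.peek(); // "d" -- the top item, not removed
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints d
    }
}
```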
s.push("a");
s.push("b");
s.pop();
s.push("c");
s.push("d");
s.peek();
(top on right)
- After the first operation, the stack contains [“a”].
- After the second, the stack contains [“a”, “b”].
- pop removes and returns “b”; the stack contains [“a”].
- [“a”, “c”]
- [“a”, “c”, “d”]
- peek returns “d”; the stack is unchanged.
In class exercise 1
s.push(1);
s.push(2);
s.push(3);
s.peek();
s.pop();
s.push(1);
s.pop();
s.pop();
What are the contents of the stack after this code executes?
Queues
Queues are a first-in, first-out data structure. Java has a Queue interface you can use, or you can (also) use Deque, as described in its documentation. In a Queue, we typically talk about add (always at one end), remove (always from the other), and sometimes peek (just like a stack: returns but does not remove the next element that remove would return).
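As with the stack, the queue trace below can be run directly; the declaration of q is an assumption (ArrayDeque also implements Queue):

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class QueueDemo {
    // Runs the queue sequence from the trace and returns what peek() sees.
    public static String run() {
        Queue<String> q = new ArrayDeque<>();
        q.add("a");
        q.add("b");
        q.remove();      // removes and returns "a" -- first in, first out
        q.add("c");
        q.add("d");
        return q.peek(); // "b" -- the front of the queue, not removed
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints b
    }
}
```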
q.add("a");
q.add("b");
q.remove();
q.add("c");
q.add("d");
q.peek();
(front on left, rear on right)
- [“a”]
- [“a”, “b”]
- remove removes and returns “a”; the queue contains [“b”]
- [“b”, “c”]
- [“b”, “c”, “d”]
- peek returns “b”; the queue is unchanged
In class exercise 2
q.add(1);
q.add(2);
q.add(3);
q.peek();
q.remove();
q.add(1);
q.remove();
q.remove();
What are the contents of the queue after this code executes? (rear on right)
A side note: over/underflow
Stacks and queues can underflow. If you call pop or remove on an empty stack/queue, this will generate an exception.
Some stacks and queues are bounded, which means they have an explicit capacity. If you try to push or add to a stack/queue that is already at capacity, then you will overflow the structure and generate an exception.
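Both cases can be demonstrated with classes from the standard library. This sketch uses an ArrayDeque for underflow and the bounded ArrayBlockingQueue from java.util.concurrent for overflow:

```java
import java.util.ArrayDeque;
import java.util.NoSuchElementException;
import java.util.concurrent.ArrayBlockingQueue;

public class OverUnderflow {
    // Underflow: pop() on an empty deque throws NoSuchElementException.
    public static boolean underflows() {
        try {
            new ArrayDeque<Integer>().pop();
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    // Overflow: add() to a bounded queue at capacity throws
    // IllegalStateException.
    public static boolean overflows() {
        ArrayBlockingQueue<Integer> bounded = new ArrayBlockingQueue<>(1);
        bounded.add(1); // fills the single slot
        try {
            bounded.add(2);
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(underflows() && overflows()); // prints true
    }
}
```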
Priority queues
A priority queue is like a queue, but it returns the next “smallest” (in Java) thing, rather than the first-in thing, when remove or peek is called.
It’s important to note that the exact order of the items stored in the priority queue is not visible to the user; you can only see the “next” / “top” item (the one that will be returned by peek or remove). Internally, priority queues are implemented as “heaps”, which are a tree-based structure similar to, but different from, the binary search trees we talked about briefly earlier this semester. Heaps allow for efficient (log n) insertion and removal of the smallest item.
How do we define “smallest”? The usual way: by either depending upon the “natural” ordering of the elements stored in the PriorityQueue<E> (that is, they must implement Comparable) or by passing in a Comparator when constructing the PriorityQueue.
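A short sketch of both options (the class name PQOrder is made up; Strings implement Comparable, and Comparator.reverseOrder() flips the natural order):

```java
import java.util.Comparator;
import java.util.PriorityQueue;

public class PQOrder {
    // Natural ordering: the alphabetically smallest String is "next".
    public static String minFirst() {
        PriorityQueue<String> pq = new PriorityQueue<>();
        pq.add("b");
        pq.add("a");
        pq.add("c");
        return pq.peek(); // "a"
    }

    // Comparator ordering: reverse the natural order, so the largest is "next".
    public static String maxFirst() {
        PriorityQueue<String> pq =
            new PriorityQueue<>(Comparator.reverseOrder());
        pq.add("b");
        pq.add("a");
        pq.add("c");
        return pq.peek(); // "c"
    }

    public static void main(String[] args) {
        System.out.println(minFirst() + " " + maxFirst()); // prints a c
    }
}
```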
Suppose then we do the following with a priority queue:
pq.add("b");
pq.add("a");
pq.remove();
pq.add("c");
pq.add("d");
pq.peek();
- [“b”]
- [“a”, “b”]
- removes and returns “a”, contents are [“b”]
- [“b”, “c”]
- [“b”, “c”, “d”] ; note we don’t know whether “c” or “d” comes first; all we know is “b” is up next to be removed
- returns “b”
In class exercise 3
pq.add(3);
pq.add(2);
pq.add(1);
pq.peek();
pq.remove();
pq.add(1);
pq.remove();
pq.remove();
What is returned by each peek and remove, and what are the contents of the priority queue after this code executes?