08: More linked lists; Generics

Welcome

Announcements

Quiz Monday!

Moodle! It is our authoritative source of grades! Please make sure things are correct there!

Building StringLinkedList

We’ll need to keep a reference to the underlying list of nodes as an instance variable, and it turns out that’s the only instance variable we need. By convention, this is called head, and it’s initialized to null (an empty list) when we start:

private StringNode head;

public StringLinkedList() {
    head = null;
}

Let’s add the “simple” methods from last lecture, starting with size. What do we need to do? There’s no underlying array, so we can’t just examine the length attribute and return it. Instead, we need to traverse the list. In other words, (on board) we will start at the head element (if it exists), and follow the next references until we reach the end, signaled by a null value. Don’t forget that head may itself be null (when the list is empty), so your code should account for this:

public int size() {
    int size = 0;
    StringNode current = head;
    while (current != null) {
        size++;
        current = current.getNext();
    }
    return size;
}

Unlike arrays, where we can just jump to the element we want, a linked list (almost) always requires that we traverse its nodes until we get to the one we want.

Note that we don’t have to traverse the whole list for size; we could maintain a private size instance variable and just return it instead if we wanted to (and then, just like in StringArrayList, we need to remember to update it whenever adding or removing). Let’s do so instead of the above:

    int size;
    StringNode head;

    public StringLinkedList() {
        size = 0;
        head = null;
    }

    @Override
    public int size() {
        return size;
    }

Now let’s do get. The exceptions are the same as before, but now we must traverse the list to find the ith element:

public String get(int i) throws IndexOutOfBoundsException {
    if (i < 0 || i >= size) {
        throw new IndexOutOfBoundsException();
    }
    // the bounds check above guarantees node i exists, so we can simply
    // walk forward i links from the head
    StringNode current = head;
    for (int j = 0; j < i; j++) {
        current = current.getNext();
    }
    return current.getContents();
}

Now let’s look at add. If we want to add at the end of the list we can again traverse it to get to the end, and adjust the last element’s next reference appropriately. Note that we have to stop our traversal just before we get to the end, not after we go past it, which also means we have to special-case the head (on board first):

public void add(String s) {
    size++;
    StringNode node = new StringNode(s);

    // empty list? just set head to this new node
    if (head == null) {
        head = node;
        return;
    }

    // otherwise traverse to end, then append
    StringNode current = head;
    while (current.getNext() != null) {
        current = current.getNext();
    }
    current.setNext(node);
}

Similarly, if we want to add somewhere in the middle of the list, we have to stop just before we get there, and do surgery on both the previous and current (added) item to make the links in the list line up, again with a special case for the first spot:

@Override
public void add(int i, String s) throws IndexOutOfBoundsException {
    // i == size is allowed: it means "insert at the very end"
    if (i < 0 || i > size) {
        throw new IndexOutOfBoundsException();
    }

    size++;
    StringNode node = new StringNode(s);

    // adding to the front of the list
    if (i == 0) {
        // make the current head follow this node
        node.setNext(head);
        // then make this node the new head
        head = node;
        return;
    }

    // otherwise, traverse the list to find the *node-before* the new node
    // we want to insert
    StringNode nodeBefore = head;
    for (int j = 0; j < i-1; j++) {
        nodeBefore = nodeBefore.getNext();
    }

    // set the next node of the new node to the node-before's next node
    node.setNext(nodeBefore.getNext());

    // then set the node-before's next node to the new node
    nodeBefore.setNext(node);
}

In-class exercise

Suppose x is a node in the middle of a large linked list. What will be the effect of x.setNext(x.getNext().getNext())?

Suppose x is a node at the end of a large linked list. What will be the effect of x.setNext(x.getNext().getNext())?

Finishing up StringLinkedList

Finally, the remove method, which again has to do some surgery on the previous node (if it exists). Let’s do some examples on the board first (end of list; front of list; middle of list).

Then here’s the code for remove():

public String remove(int i) throws IndexOutOfBoundsException {
    if (i < 0 || i >= size) {
        throw new IndexOutOfBoundsException();
    }

    size--;
    String result;

    // removing the first node is a special case, since we need to manipulate
    // the head reference
    if (i == 0) {
        result = head.getValue();
        head = head.getNext();
        return result;
    }

    // otherwise, just like in add(), we find the node-before the node we want 
    // to remove
    StringNode nodeBefore = head;
    for (int j = 0; j < i-1; j++) {
        nodeBefore = nodeBefore.getNext();
    }

    // here we give the node to delete a name
    StringNode nodeToDelete = nodeBefore.getNext();

    // just to make it clear where result comes from
    result = nodeToDelete.getValue();

    // then we set the node-before's next pointer to the next pointer of the node to delete
    nodeBefore.setNext(nodeToDelete.getNext());

    // and return the result
    return result;
}

And we’ll run it in our toy program, switching StringArrayList to StringLinkedList. Notice the behavior doesn’t change. (StringListInterface sli = new StringLinkedList(); is the only change.)

Implications

Now we’ve seen two different ways to implement a simple abstract data type (the List): using arrays and using references. All other abstract data types can be implemented in terms of one or both of these mechanisms.

We’ve also seen that both implementations produce the same results, though if you think about it, one or the other is more efficient (or at least different) in some ways.

For example, the array-based list uses up to twice the memory of its current size(). The linked list uses only what it needs, plus space for the link. It actually turns out they’re about equivalent.

In-class exercise

Would you expect a call to get(someLargeIndex) in ArrayList or LinkedList to be faster?

For random access, the array-based implementation is better, since the get method is really a thin wrapper around array indexing, which is quite fast.

For addition or removal, it kind of depends. Adding something to the head of the linked list is really fast (make a node, set its next to head, set head to it), whereas adding a node to the front of an array-based list requires moving every single element in the array, and possibly enlarging the array, too.

On the other hand, adding an element to the end of the linked list is slow, since you have to traverse the entire list first. You could also keep a reference to the end of the list (called a tail pointer), but then you also have to update it and handle it in every method that modifies the list. On the third hand, only the implementor of the linked list needs to do this, not the user – as in our demo, both have the same interface, and neither exposes its inner workings in terms of results (though runtime and memory usage might differ).
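Here is a minimal sketch of the tail-pointer idea. The class name TailedStringList, its nested Node type, and the removeFirst method are all made up for this illustration; they are not part of the StringLinkedList we built above. The point is only to show the bookkeeping: add no longer traverses, but every method that changes the list has to keep tail correct too.

```java
// Hypothetical sketch: a singly linked list of Strings that also tracks its tail.
class TailedStringList {
    // Tiny private node type so the example is self-contained.
    private static class Node {
        final String contents;
        Node next;

        Node(String contents) {
            this.contents = contents;
        }
    }

    private Node head; // first node, or null when the list is empty
    private Node tail; // last node, or null when the list is empty
    private int size;

    // Append without traversing: hop straight to the tail.
    public void add(String s) {
        Node node = new Node(s);
        if (head == null) {
            head = node;      // empty list: new node is both head and tail
        } else {
            tail.next = node; // link the old last node to the new one
        }
        tail = node;
        size++;
    }

    // Removing the first element must also reset tail if the list becomes empty.
    public String removeFirst() {
        if (head == null) {
            throw new IndexOutOfBoundsException();
        }
        String result = head.contents;
        head = head.next;
        if (head == null) {
            tail = null;
        }
        size--;
        return result;
    }

    public int size() {
        return size;
    }

    public static void main(String[] args) {
        TailedStringList list = new TailedStringList();
        list.add("a");
        list.add("b");
        System.out.println(list.removeFirst()); // prints a
        System.out.println(list.size());        // prints 1
    }
}
```

Forgetting the tail = null line in removeFirst is exactly the kind of bug the extra reference invites: the list would look empty, but the next add would link the new node onto a stale tail.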

We won’t go into this comprehensive level of detail for (most) data structure implementation again in this course, though we will definitely revisit the idea of using an array or references to build more complicated data structures, like say this tree (on board). But that’s about as far as we’ll go, with diagrams rather than code. We’ll be more interested in the interface of the abstract data types and what behaviors they provide, rather than the fine details of the implementations.

Generics

Methods take parameters: rather than hardcoding all data and values into a method, we can make some parts of the data variable and parameterizable so that the method can be reused with different data and values. This makes a lot of sense: Let’s say we want to write a method that adds five to its argument:

int add5(int i) {
    return i + 5;
}

That’s useful. But someday we want to also write a method that adds six:

int add6(int i) {
    return i + 6;
}

OK. Then add7, etc. Getting silly. We don’t want to write a different method each time, since (1) there’s an infinite number of them! and (2) the operation of adding is mechanically the same each time. That is, there’s a generalized algorithm we write, once, and then can use many times.

int add(int i, int j) {
    return i + j;
}

The insight behind “generics” is that we can do the same thing with types – we can parameterize classes (and methods) with a type, too, and use it on different types of things. Our crappy StringListInterface, for example, while it lets us hold any list of Strings we want, was still limited to String data. But it turns out that instead of writing:

public interface StringListInterface {
    public void add(String s);
    public void add(int i, String s);
    public String remove(int i);
    public String get(int i);
    public int size();
}

you can parameterize the interface on a type using angle brackets:

public interface ListInterface<E> {
    public void add(E e);
    public void add(int i, E e);
    public E remove(int i);
    public E get(int i);
    public int size();
}

The ListInterface now sports a generic type name in angle brackets. We’ve defined a family of possible types here; note that each method that used to operate on strings now operates on this mysterious E.

We can also type parameterize a class:

public class Node<E> {
    private final E contents;
    private Node<E> next;
// more
}

…and the two together let us write generic code, that operates on generic types, based upon the type parameter.
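To make that concrete, here is one possible way to fill in the “// more” and build a small generic list on top of it. Everything beyond the two fields shown above is an assumption for this sketch: the constructor, the getter and setter names, and the SimpleLinkedList class are hypothetical, chosen to mirror StringLinkedList.

```java
// Hypothetical completion of the generic node: accessors mirror StringNode.
class Node<E> {
    private final E contents;
    private Node<E> next;

    Node(E contents) {
        this.contents = contents;
    }

    E getContents() {
        return contents;
    }

    Node<E> getNext() {
        return next;
    }

    void setNext(Node<E> next) {
        this.next = next;
    }
}

// Hypothetical generic list: the same traversal code as before, with E in place of String.
class SimpleLinkedList<E> {
    private Node<E> head;
    private int size;

    // Append to the end, special-casing the empty list.
    public void add(E e) {
        Node<E> node = new Node<>(e);
        size++;
        if (head == null) {
            head = node;
            return;
        }
        Node<E> current = head;
        while (current.getNext() != null) {
            current = current.getNext();
        }
        current.setNext(node);
    }

    // Walk forward i links from the head.
    public E get(int i) {
        if (i < 0 || i >= size) {
            throw new IndexOutOfBoundsException();
        }
        Node<E> current = head;
        for (int j = 0; j < i; j++) {
            current = current.getNext();
        }
        return current.getContents();
    }

    public int size() {
        return size;
    }

    public static void main(String[] args) {
        SimpleLinkedList<String> words = new SimpleLinkedList<>();
        words.add("hello");
        SimpleLinkedList<Integer> numbers = new SimpleLinkedList<>();
        numbers.add(42);
        String w = words.get(0); // no cast needed
        int n = numbers.get(0);
        System.out.println(w + " " + n); // prints hello 42
    }
}
```

Notice that the list logic is written once, and the same class serves both Strings and Integers.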

Type parameters

The E is a type parameter – it says that the programmer who declares a variable of type ListInterface must also choose a particular type that the declared ListInterface will handle. ListInterfaces of different type parameters are of different types. For example, you cannot assign one to another unless they have the same type parameter, any more than you can assign a boolean to a String:

boolean x = "banana"; // not allowed, fails at compile time

List<String> x;
List<Integer> y;

... // some code ...

x = y; // not allowed, fails at compile time

In-class exercise


Will the following code compile (that is, are the types valid)? (x4)

More on parameters

Type parameters are usually written as a single uppercase letter, and often that letter is an abbreviation. E stands for Element of a collection; we’ll also see K for Key and V for Value later in the course.

Type parameters, when instantiated (that is, when a generic is declared), must be a non-primitive type. But, Java does something called auto-boxing, so you can generally mix primitives and non-primitives freely using the associated wrapper types, like Integer. (Integer and friends also have many useful static methods.)
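A small sketch of what auto-boxing looks like in practice, using the standard java.util.List and ArrayList. The class name AutoboxingDemo and its sum method are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class showing boxing and unboxing with List<Integer>.
class AutoboxingDemo {
    // Adds up a list of Integers; each element is unboxed to an int in the loop.
    static int sum(List<Integer> numbers) {
        int total = 0;
        for (int n : numbers) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) {
        // List<int> would not compile; the type argument must be a reference type.
        List<Integer> numbers = new ArrayList<>();
        numbers.add(3);                     // the int 3 is boxed into an Integer
        numbers.add(Integer.parseInt("4")); // one of the static helpers on Integer
        int first = numbers.get(0);         // the Integer is unboxed back to an int
        System.out.println(first + " " + sum(numbers)); // prints 3 7
    }
}
```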

The final fact for today about type parameters: Usually we think of them as being declared on classes (and indeed, that’s usually where they are declared). But if you write a particular method that would benefit from type parameterization, you can do so:

public class Util {
    public static <K, V> boolean compare(Pair<K, V> p1, Pair<K, V> p2) {
        return p1.getKey().equals(p2.getKey()) &&
               p1.getValue().equals(p2.getValue());
    }
}

Note the type parameters come immediately before the return type.
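To see a generic method being called, here is a sketch that supplies a minimal Pair class. The Pair shown here is an assumption made only so the example compiles on its own (the snippet above does not define it), and GenericMethodDemo is likewise a made-up name.

```java
// Hypothetical minimal Pair so the example is self-contained.
class Pair<K, V> {
    private final K key;
    private final V value;

    Pair(K key, V value) {
        this.key = key;
        this.value = value;
    }

    K getKey() {
        return key;
    }

    V getValue() {
        return value;
    }
}

class Util {
    // The type parameters <K, V> belong to the method, not to the class Util.
    public static <K, V> boolean compare(Pair<K, V> p1, Pair<K, V> p2) {
        return p1.getKey().equals(p2.getKey()) &&
               p1.getValue().equals(p2.getValue());
    }
}

class GenericMethodDemo {
    public static void main(String[] args) {
        Pair<Integer, String> p1 = new Pair<>(1, "apple");
        Pair<Integer, String> p2 = new Pair<>(1, "apple");

        // The type arguments can be spelled out explicitly...
        boolean same = Util.<Integer, String>compare(p1, p2);
        // ...but usually the compiler infers them from the arguments.
        boolean alsoSame = Util.compare(p1, p2);

        System.out.println(same + " " + alsoSame); // prints true true
    }
}
```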

Why generics matter

They matter for the reasons listed above (generic re-usable code)! But also, in Java 5, the entire Collections library was re-written to use generics. Before then, all container types (List, etc.) only held things of type Object, and you, the programmer, had to laboriously cast them each time you used them:

List list = new ArrayList();
list.add("hello");
String s = (String) list.get(0);

Not only was this a pain, if you made a mistake:

List l = new ArrayList();
l.add("Zero");

String s = (String) l.get(0);

//...

Integer i = (Integer) l.get(0); // throws ClassCastException at run-time!
System.out.print(i);

you’d find out at run-time, not at compile-time. And while I know you hate compiler errors right now, you’ll learn to love them when writing big programs — every error that the compiler catches is one you can fix at your leisure, while run-time errors are erratic, not always reproducible, and generally result in a (much bigger) headache for you.
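For contrast, here is roughly what the same code looks like with a generic list. The class and method names (TypedListDemo, firstWord) are made up for this sketch; the two commented-out lines are the mistakes the compiler would now reject.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: the generic version of the raw-list example.
class TypedListDemo {
    static String firstWord() {
        List<String> list = new ArrayList<>();
        list.add("Zero");

        // list.add(0);             // would not compile: an int is not a String
        // Integer i = list.get(0); // would not compile: String is not an Integer

        return list.get(0); // no cast needed; the compiler knows this is a String
    }

    public static void main(String[] args) {
        System.out.println(firstWord()); // prints Zero
    }
}
```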

Using Lists

(Note: we probably won’t get to this in class. It’s not critical but you’re welcome to read over it.)

For the rest of class today we’re going to practice using the generic List to solve problems. You’re welcome to use your computer if you have it, and you’re welcome to work together.

Define a class IceCreamShop. While it might have many instance variables, focus on two: employees and flavors. Each of these is further defined in a class Employee and Flavor which you can assume already exist. Write out the class and constructor for IceCreamShop that defines empty lists for each of these instance variables. You can assume the relevant import statements are present.

class IceCreamShop {
    List<Employee> employees;
    List<Flavor> flavors;
    public IceCreamShop() {
        employees = new ArrayList<Employee>();
        flavors = new ArrayList<Flavor>();
    }
}

Write a method of IceCreamShop with the signature void hire(Employee e) to add an employee to payroll.

void hire(Employee e) {
    employees.add(e);
}

Did you check to make sure they weren’t already on payroll?

void hire(Employee e) {
    if (employees.contains(e)) {
        // do what?
    }
    employees.add(e);
}

This is an example of an underspecified problem. When I underspecify things, you can ignore the problem. Or at least, the underspecified bits. :) But when you are asked to do something “for real,” you should at least consider cases like this. Are they worth dealing with? Did your instructor overlook something? Did your boss? Your customer? Maybe you can ignore them and maybe not; some of that is a judgment call.

How about if you want to hire a whole bunch of new employees? Ignore the re-hire problem, and write void hireAll(List<Employee> l)

Did you write a for loop? Or did you check the List interface for something better?

void hireAll(List<Employee> l) {
    employees.addAll(l);
}

Suppose you want to add items to a list (say, a storeNumbers attribute of type List<Integer>) and you want to keep it sorted. How would you write the public void addStore(int newNumber) method?

Note we don’t need to pass in an Integer object; Java autoboxing handles it for us.

Option 1: Insert the number in order:

public void addStore(int newNumber) {
    if (storeNumbers.isEmpty()) { // turns out you don't actually need this...exercise for the reader: why?
        storeNumbers.add(newNumber);
        return;
    }
    for (int i = 0; i < storeNumbers.size(); i++) {
        if (storeNumbers.get(i).compareTo(newNumber) >= 0) {
            storeNumbers.add(i, newNumber);
            return;
        }
    }
    storeNumbers.add(newNumber);
}

You’ll notice I didn’t use an iterator above. Why not? Because it’s not generally a good idea to …

public void addStore(int newNumber) {
    if (storeNumbers.isEmpty()) {
        storeNumbers.add(newNumber);
        return;
    }
    int i = 0;
    for (Integer storeNumber: storeNumbers) {
        if (storeNumber.compareTo(newNumber) >= 0) {
            storeNumbers.add(i, newNumber);
            return;
        }
        i++;
    }
    storeNumbers.add(newNumber);
}

This could throw a ConcurrentModificationException.

WTF? It turns out that some (most) implementations of collections are very particular about allowing you to modify them while you are iterating. Creating an iterator, then modifying the collection, then trying to iterate is generally not allowed. See, for example, the ArrayList docs: http://docs.oracle.com/javase/8/docs/api/java/util/ArrayList.html and note that “…if the list is structurally modified at any time after the iterator is created, in any way except through the iterator’s own remove or add methods, the iterator will throw a ConcurrentModificationException.”

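The exception will only be thrown if the iterator (the top of the for loop) is reached again after the list is modified. The code above returns immediately after calling add, so it happens to escape, but that is fragile: take out the return, or keep iterating for any other reason, and the exception appears.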
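If you do want to insert while iterating, the documentation quoted above points at the way out: go through the iterator’s own add method. Here is a sketch using java.util.ListIterator. The wrapper class SortedStoreNumbers and its getStoreNumbers accessor are made up so the example stands on its own; this is one option, not the only reasonable one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ListIterator;

// Hypothetical holder for the storeNumbers list from the exercise.
class SortedStoreNumbers {
    private final List<Integer> storeNumbers = new ArrayList<>();

    public void addStore(int newNumber) {
        ListIterator<Integer> it = storeNumbers.listIterator();
        while (it.hasNext()) {
            if (it.next() >= newNumber) {
                // we just stepped past the first element that is not smaller,
                // so step back to insert in front of it
                it.previous();
                break;
            }
        }
        // insert at the iterator's current position; if the loop ran off the
        // end (or the list was empty), this appends
        it.add(newNumber);
    }

    public List<Integer> getStoreNumbers() {
        return storeNumbers;
    }
}
```

Because the modification goes through the iterator, the iterator stays in sync with the list and no ConcurrentModificationException is thrown.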