06: Array-backed lists

Welcome

Announcements

B[U]ILT’s (Black, Indigenous, and Latinx in Tech) first meeting is TODAY 2/7 4:15PM CS Building Room 303.

Programming assignment 03 is due tomorrow night. Only about 1/5 of the class has submitted at least once to Gradescope. Pro tip: you can get (small) partial credit just for submitting the project file as-is. Might as well do this early!

Also note that PA04 will be shorter in terms of lines of code, but definitely conceptually a little more taxing than PA3. PA3 was a set of exercises; PA4 is a single, concrete problem you need to solve. The solution is spelled out in English in the writeup, but translating it to code will take some time and thought (much like the bus simulator). Students often report that A02 was “the most annoying at the time” assignment, and that A04 was “the most challenging”. So start early and know that if you can handle A04, you will probably be able to handle everything else we plan to do this semester.

Quiz Monday. Anything we’ve done so far is fair game; sample questions will go up Friday most likely.

A note/reminder about grading: course staff are (or should be) generally more forgiving about things you do under time pressure or where you cannot easily check your work (in class exercises, quizzes, exam), and less so otherwise (programming assignments, homeworks). Don’t submit code on the homeworks that “looks mostly right” and expect to get more than a small amount of partial credit. Fire up Eclipse and test it!

If you have a grading question or regrade request: Start with the TAs (via regrade request or private Piazza post). If you are unsatisfied with their answer, then ask me, but the first thing I’m going to do is check with them to see what their reasoning was.

Review: arrays

In our review of 121 material so far, we’ve exclusively used a single container type, the array. Container types are types that “hold” other types, and the array is probably the most basic: a fixed-size sequence of values (that are themselves either primitive types or references to objects) that can be read or written in approximately “constant time”. We call these values “cells” or “elements” in the array.

Constant time means that no matter how big the array is, (to a first approximation) it takes the same amount of time to access (read or write) an element. We expect each of these statements to execute in about the same amount of time:

array[0] = 5;
System.out.println(array[1]);

array[1000000] = 12;
System.out.println(array[123456789]);

(modulo some caching effects, which are COMPSCI 230/335 material).

But arrays have some downsides, as well. For example, they’re fixed size: you need to know how many elements you want in advance. You can cheat here by allocating a giant array, but that’s wasteful for small inputs, and potentially won’t work anyway if your data is of size (giant array + 1).
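To make the "fixed size" problem concrete, here is a minimal sketch of what growing an array by hand looks like. The class name GrowDemo and the helper appendTo are made up for illustration; they are not part of any library.

```java
import java.util.Arrays;

class GrowDemo {
    // Hypothetical helper: returns a new array one cell longer than `old`,
    // with `value` stored in the last cell. The original array is unchanged.
    static int[] appendTo(int[] old, int value) {
        int[] bigger = new int[old.length + 1];
        for (int i = 0; i < old.length; i++) {
            bigger[i] = old[i];   // copy every existing element
        }
        bigger[old.length] = value;
        return bigger;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3};
        // data[3] = 4;  // would throw ArrayIndexOutOfBoundsException
        data = appendTo(data, 4);
        System.out.println(Arrays.toString(data)); // prints [1, 2, 3, 4]
    }
}
```

Every append copies the whole array, and the caller has to remember to reassign the variable. That bookkeeping is the busywork a List hides.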

Instead, we might use a higher-level abstraction; that is, a more general “container type” than an array. This week we’ll describe the List:

  • which is like (but not the same as) an array, and
  • which can be implemented in terms of an array.
  • we’ll do an array-based implementation of the List in lecture (though this is the only data structure where we’ll do this in detail)
  • we’ll briefly discuss but not fully implement a linked-list, an alternative to the array-based list, and talk about its pros and cons

We’ll show how our List compares to the Java API’s List, and this will lead into generics and container types, two topics that will come up again and again in material this semester.

Operations on arrays

To recap, what can you do with an array? You can declare it, allocate it, read or write individual elements, and determine its length at runtime.

Declare an array of Strings called strings.

Allocate a String array of size 50 and assign it to strings.

Set the zeroth element of strings to “Hi”.

Print the zeroth element of strings.

Print the length of strings.

String[] strings;
strings = new String[50];
strings[0] = "Hi";
System.out.println(strings[0]);
System.out.println(strings.length);

That’s it for builtins of the array. If you want to do much else, you gotta build it yourself. (Note that there is a java.util.Arrays that has some helpful methods you can call on arrays, in particular the static Arrays.toString method is helpful when caveman debugging arrays of primitive types; otherwise the debugger can be helpful.)
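A quick sketch of why Arrays.toString is handy for caveman debugging (the class name ArraysToStringDemo is just for illustration):

```java
import java.util.Arrays;

class ArraysToStringDemo {
    public static void main(String[] args) {
        int[] values = {3, 6, 12, 24};
        // Printing the array directly shows a type tag and a hash code,
        // something like [I@1b6d3586, which is rarely what you want.
        System.out.println(values);
        // Arrays.toString formats the contents instead.
        System.out.println(Arrays.toString(values)); // prints [3, 6, 12, 24]
    }
}
```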

So let’s talk about the List abstract data type.

Lists

Note I said “abstract data type.” First we’ll talk about the properties and assumptions we might expect from a List, in the abstract. Then we’ll do an actual, concrete implementation of the data type and see how it measures up.

“List” is a very overloaded term; we’ll simplify this by choosing a specific set of assumptions, that implicitly define an abstraction:

  • lists are unbounded, that is they don’t have a fixed size (if implemented with arrays, the arrays dynamically resize)
  • duplicate elements are allowed (when searching for an element, one of several equal elements is as good as any other)
  • lists can contain null elements (I hope you like NullPointerExceptions! though note some implementations might forbid null elements)
  • lists support an add operation, either to the end of the list or to a specific place (sometimes called an “insert”)
  • lists support a remove operation, either of a specific value or of an element at a particular index
  • lists support a get operation to return the element at a specific index
  • lists support a size operation to determine how many elements are currently in the list
  • lists can be in sorted order, but by default are not (you can imagine a SortedList that enforces this property)
  • and many more, but we’ll get to them later when we look at the full API that Java supplies.

Declare a List of Strings called strings.

Allocate a String List and assign it to strings.

Insert the string “Hi” at the front of the list strings.

Append the string “Bye” at the end of the list strings.

Print the zeroth element of strings.

Print the length of strings.

List<String> strings;
strings = new ArrayList<>();
strings.add(0, "Hi");
strings.add("Bye");
System.out.println(strings.get(0));
System.out.println(strings.size());

Some things to notice:

The type of strings is List<String>. It’s a variable of type List; we could also make it of type ArrayList or LinkedList, but what we care about is that it satisfies the List interface (see the javadocs).

Further, it’s a parameterized type. Much like methods can take arguments, so can types! Usually, we see this with container classes (like Lists), where the argument (in <>s) is the type of thing it’s holding. More on this later.

Sorta-like arrays, Lists have an add operation that either appends, or inserts at a specified point – we illustrate both here. They also have a set operation, but it can only replace existing items, not insert new ones. And a get, which is much like array lookups.
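Here is a small sketch of the difference between set and add, using java.util.ArrayList. The class name SetVersusAdd and the helper build are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

class SetVersusAdd {
    // Builds a two-element list, then overwrites index 0.
    static List<String> build() {
        List<String> strings = new ArrayList<>();
        strings.add(0, "Hi");
        strings.add("Bye");
        strings.set(0, "Hello");   // replaces "Hi"; size stays 2
        return strings;
    }

    public static void main(String[] args) {
        List<String> strings = build();
        System.out.println(strings);         // prints [Hello, Bye]
        try {
            strings.set(2, "Oops");          // index 2 does not exist yet
        } catch (IndexOutOfBoundsException e) {
            System.out.println("set cannot grow the list");
        }
        strings.add(2, "Later");             // add at index == size() is an append
        System.out.println(strings.size());  // prints 3
    }
}
```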

Continuing on

Lists are kinda like arrays – in that they’re linear container types that can hold references to zero or more values of the same type.

They’re kinda different, in that instead of special syntactic support (the [] operator), they are plain-old objects where we use their methods to “do things.” And, their semantics differ somewhat: they aren’t of fixed size, and we can add or delete items at arbitrary points, and anything “in the way” gets shifted out of the way.
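To see the "shifted out of the way" behavior, here is a short sketch with java.util.ArrayList (the class name ShiftDemo and method demo are illustrative only):

```java
import java.util.ArrayList;
import java.util.List;

class ShiftDemo {
    static List<String> demo() {
        List<String> letters = new ArrayList<>();
        letters.add("a");
        letters.add("c");
        letters.add(1, "b");   // "c" shifts from index 1 to index 2
        letters.remove(0);     // "b" and "c" each shift down by one
        return letters;
    }

    public static void main(String[] args) {
        List<String> letters = demo();
        System.out.println(letters);        // prints [b, c]
        System.out.println(letters.get(0)); // prints b
    }
}
```

Note that an element's index is not permanent: after the remove, "b" lives at index 0, not index 1.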

For example, recall this question that some of you might have seen on the quiz:

A sequence is said to be a doubling sequence if each value is exactly twice the previous value. For example, {3, 6, 12, 24} is a doubling sequence, but {1, 2, 4, 6} is not. Write a method boolean isDoublingSequence(int[] a) that returns true if and only if the array represents such a sequence. Assume the array contains at least two values.

Your answer probably looked something like:

    boolean isDoublingSequence(int[] a) {
        for (int i = 0; i < a.length - 1; i++) {
            if (a[i] * 2  != a[i + 1]) {
                return false;
            }
        }
        return true;
    }

What if, instead, we did this for a list? That is, a List<Integer> as the argument? What would it look like? Almost the same! But we’ll need to adjust the code to access a List rather than an array. Note we can use “refactoring” to intelligently rename the variable:

    boolean isDoublingSequence(List<Integer> list) {
        for (int i = 0; i < list.size() - 1; i++) {
            if (list.get(i) * 2  != list.get(i + 1)) {
                return false;
            }
        }
        return true;
    }

Normally we wouldn’t copy and paste the array version and edit it like this; we’d just write the method directly using the List methods.
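If you want to try the List version yourself, here is a runnable sketch. The wrapper class DoublingDemo and the use of Arrays.asList to build test lists are choices made for this example.

```java
import java.util.Arrays;
import java.util.List;

class DoublingDemo {
    static boolean isDoublingSequence(List<Integer> list) {
        for (int i = 0; i < list.size() - 1; i++) {
            // list.get(i) * 2 is an int, so the Integer on the right is
            // unboxed and the comparison is by value, not by reference.
            if (list.get(i) * 2 != list.get(i + 1)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isDoublingSequence(Arrays.asList(3, 6, 12, 24))); // prints true
        System.out.println(isDoublingSequence(Arrays.asList(1, 2, 4, 6)));   // prints false
    }
}
```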

In-class exercise

For example, if you wanted to sum all of the numbers in a List, you’d write almost the same code as you would for an array, including the for-each loop:

    int mystery (List<Integer> list) {
        int s = 0;
        for (int x : list) {
            s += x;
        }
        return s;
    }

Why bother with lists at all? Because the abstraction is nicer, for many algorithms.

That is, because for many real-world problems:

  • We don’t know the size of the intermediate linear collection in advance. Each time we solve one of these problems, we could take the time to write a solution with arrays, and explicitly track the size, and make more space when we need to, but that’s lots of busywork.
  • Or, we can compute the size in advance, but doing so is costly.
  • Or, we need to be able to move elements around easily. Much like the first point, we could explicitly move them around as needed, but it’s a lot nicer just to say theList.add(0, "the new front of the list!") to add an element to the front of the list.
  • and so on.
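As a sketch of the first point in the list above, here is a made-up problem where the size of the result isn't known in advance: collecting the even values from an array. The class name UnknownSizeDemo and method evens are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class UnknownSizeDemo {
    // We cannot know how many even values there are
    // without looking at every element first.
    static List<Integer> evens(int[] values) {
        List<Integer> result = new ArrayList<>();
        for (int v : values) {
            if (v % 2 == 0) {
                result.add(v);   // the list grows as needed
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(evens(new int[] {1, 2, 3, 4, 6})); // prints [2, 4, 6]
    }
}
```

With a plain array we would either make two passes (count, then copy) or over-allocate and track the used portion ourselves.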

A ListInterface

As much as possible, it’s a good idea to have the computer help us check our assumptions as we go. One way to do this is to leverage the type system. We’ll declare a StringListInterface for a list of Strings as described above, then we’ll implement it using arrays.

Normally, you never do this! You use the built-in List interface and ArrayList implementation. But we’re going to do it here to peel back the curtain a little and show you that there’s no magic, just straightforward application of computer science you already know. You can do this. (And you will do this, in 187!)

public interface StringListInterface {
    public void add(String s);
    public void add(int i, String s);
    public String remove(int i);
    public String get(int i);
    public int size();
}

What happens if a user of this interface does something “bad,” like attempting to remove an element that’s not present, or to add an element past the end of the list? The type doesn’t help us. We can add (optional) unchecked exception declarations to hint at this:

public interface StringListInterface {
    public void add(String s);
    public void add(int i, String s) throws IndexOutOfBoundsException;
    public String remove(int i) throws IndexOutOfBoundsException;
    public String get(int i) throws IndexOutOfBoundsException;
    public int size();
}

...though note that these just give a programmer using your StringListInterface a heads-up. The documentation comments for the class and the method define the actual contract that the class (or interface) offers. If there are no documentation comments, then only the type signature gives the contract. Usually it’s not enough.
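Here is a small sketch showing that a throws clause naming an unchecked exception is only a hint: the compiler neither requires it nor forces callers to catch anything. The class UncheckedDemo and its methods are invented for illustration.

```java
class UncheckedDemo {
    // No throws clause, yet this compiles:
    // IndexOutOfBoundsException is an unchecked exception.
    static String first(String[] items) {
        if (items.length == 0) {
            throw new IndexOutOfBoundsException("no elements");
        }
        return items[0];
    }

    // The throws clause here is documentation only;
    // callers are not forced to catch the exception.
    static String firstDeclared(String[] items) throws IndexOutOfBoundsException {
        return first(items);
    }

    public static void main(String[] args) {
        System.out.println(firstDeclared(new String[] {"Hi"})); // prints Hi
        try {
            firstDeclared(new String[0]);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("caught: " + e.getMessage()); // prints caught: no elements
        }
    }
}
```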

In-class exercise

    public int size() {
        return 12;
    }

Strictly speaking, does this method obey its contract?

    /**
     * @return the number of elements stored in this list
     */
    public int size() {
        return numberOfElements;
    }

What about now?

Note that the compiler can’t enforce all parts of this contract; it’s up to the programmer implementing the StringListInterface to do the right thing.

Also note that a full StringListInterface would have many more methods (set, analogous to array assignment, removeAll, equals, and so on). I don’t want to write all of them in lecture! (Though it should be straightforward to write most of them.)

Writing the StringArrayList

(If you want to implement an interface, use the Eclipse “new class” wizard to note you want to do so, and it will write skeletons of each method for you.)

Let’s think about what we need in our implementation. What instance variables do we need? Certainly an array of Strings to hold the list elements. Anything else? The number of elements actually stored in the array. Remember, one of the reasons we’re writing a List is that arrays are of fixed size, but a List can grow and shrink arbitrarily. (We’ll see how soon.)

So let’s declare an array String[] array and an int size to hold these values.

public class StringArrayList implements StringListInterface {
    String[] array;
    int size;
}

(On board) Conceptually, strings will be added to, gotten, and removed from this array; it’s the implementation’s job to make sure they go in the right place, and that if a user of StringArrayList tries to access an element of the List (that is, of the underlying array) that is invalid, an exception is raised.

Let’s start with a constructor that starts with a small empty array:

public StringArrayList() {
  array = new String[10];
  size = 0;
}

Now let’s do the simple methods:

public int size() {
    return size;
}

…and then let’s run out of time :)
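For reference, here is one way the remaining methods might be filled in. This is a sketch, not the version we'll write in lecture: the doubling growth policy and the private helpers grow and checkIndex are assumptions made for this example, and the interface is repeated (without the optional throws hints) so the block compiles on its own.

```java
interface StringListInterface {
    void add(String s);
    void add(int i, String s);
    String remove(int i);
    String get(int i);
    int size();
}

class StringArrayList implements StringListInterface {
    private String[] array = new String[10];
    private int size = 0;

    public int size() {
        return size;
    }

    public String get(int i) {
        checkIndex(i, size - 1);
        return array[i];
    }

    public void add(String s) {
        add(size, s);   // appending is inserting at the end
    }

    public void add(int i, String s) {
        checkIndex(i, size);   // i == size is allowed: that is an append
        if (size == array.length) {
            grow();
        }
        // shift elements at i..size-1 one cell to the right, back to front
        for (int j = size; j > i; j--) {
            array[j] = array[j - 1];
        }
        array[i] = s;
        size++;
    }

    public String remove(int i) {
        checkIndex(i, size - 1);
        String removed = array[i];
        // shift elements at i+1..size-1 one cell to the left
        for (int j = i; j < size - 1; j++) {
            array[j] = array[j + 1];
        }
        array[size - 1] = null;   // clear the now-unused cell
        size--;
        return removed;
    }

    // Assumed growth policy: double the capacity (other policies work too).
    private void grow() {
        String[] bigger = new String[array.length * 2];
        for (int j = 0; j < size; j++) {
            bigger[j] = array[j];
        }
        array = bigger;
    }

    // Hypothetical helper: valid indices are 0..max inclusive.
    private void checkIndex(int i, int max) {
        if (i < 0 || i > max) {
            throw new IndexOutOfBoundsException("index " + i + ", size " + size);
        }
    }
}
```

The key idea is that size (how many elements the user has stored) and array.length (how many cells we have allocated) are different numbers, and the implementation's job is to keep the first from ever exceeding the second.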