06: Array-backed lists
Welcome
Announcements
Programming assignment 03 is due tomorrow night. Only about 1/3 of the class has submitted at least once to Gradescope. Pro tip: you can get (small) partial credit just for submitting the project file as-is. Might as well do this early!
Also note that PA04 will be shorter, but perhaps conceptually a little more taxing than PA3. PA3 was a set of exercises; PA4 is a single, concrete problem you need to solve. The solution is spelled out in English in the writeup, but translating it to code will take some time and thought (much like the bus simulator).
Quiz Monday. Anything we’ve done so far is fair game; sample questions will go up before the end of the day tomorrow.
A note/reminder about grading: course staff are generally more forgiving about things you do under time pressure or where you cannot easily check your work (in-class exercises, quizzes, exams), and less so otherwise (programming assignments, homeworks). Don’t submit code on the homeworks that “looks mostly right” and expect to get more than a small amount of partial credit. Fire up Eclipse and test it!
If you have a grading question or regrade request: Start with the TAs (via private Piazza post). If you are unsatisfied with their answer, then ask me, but the first thing I’m going to do is check with them to see what their reasoning was.
Lists
So, as we mentioned last class, lists are kinda like arrays – in that they’re linear container types that can hold references to zero or more values of the same type.
They’re kinda different, in that instead of special syntactic support (the [] operator), they are plain-old objects where we use their methods to “do things.” And their semantics differ somewhat: they aren’t of fixed size, and we can add or delete items at arbitrary points, and anything “in the way” gets shifted out of the way.
For example, recall this question that some of you might have seen on the quiz:
A sequence is said to be a doubling sequence if each value is exactly twice the previous value. For example, {3, 6, 12, 24} is a doubling sequence, but {1, 2, 4, 6} is not. Write a method
boolean isDoublingSequence(int[] a)
that returns true if and only if the array represents such a sequence. Assume the array contains at least two values.
Your answer probably looked something like:
boolean isDoublingSequence(int[] a) {
for (int i = 0; i < a.length - 1; i++) {
if (a[i] * 2 != a[i + 1]) {
return false;
}
}
return true;
}
What if, instead, we did this for a list? That is, a List<Integer> as the argument? What would it look like? Almost the same! But we’ll need to adjust the code to access a List rather than an array. Note we can use “refactoring” to intelligently rename the variable:
boolean isDoublingSequence(List<Integer> list) {
for (int i = 0; i < list.size() - 1; i++) {
if (list.get(i) * 2 != list.get(i + 1)) {
return false;
}
}
return true;
}
Normally we don’t copy/paste code like this, but instead we just write it using the List methods.
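As a quick check, here is a sketch of calling the List version on the two sequences from the quiz question. The class name DoublingDemo is made up for illustration, and List.of assumes Java 9 or later.

```java
import java.util.List;

class DoublingDemo {
    // Same logic as the List version above, made static so it can be called directly.
    static boolean isDoublingSequence(List<Integer> list) {
        for (int i = 0; i < list.size() - 1; i++) {
            // get(i) * 2 is an int, so != compares numeric values, not object references
            if (list.get(i) * 2 != list.get(i + 1)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isDoublingSequence(List.of(3, 6, 12, 24))); // true
        System.out.println(isDoublingSequence(List.of(1, 2, 4, 6)));   // false
    }
}
```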
In-class exercise
For example, if you wanted to sum all of the numbers in a List, you’d write almost the same code as you would for an array, including the for-each loop:
int mystery (List<Integer> list) {
int s = 0;
for (int x : list) {
s += x;
}
return s;
}
Why bother with lists at all? Because the abstraction is nicer, for many algorithms.
That is, because for many real-world problems:
- We don’t know the size of the intermediate linear collection in advance. Each time we solve one of these problems, we could take the time to write a solution with arrays, and explicitly track the size, and make more space when we need to, but that’s lots of busywork.
- Or, we can compute the size in advance, but doing so is costly.
- Or, we need to be able to move elements around easily. Much like the first point, we could explicitly move them around as needed, but it’s a lot nicer just to say theList.add(0, "the new front of the list!") to add an element to the front of the list.
- and so on.
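To make the points above concrete, here is a small sketch using the built-in ArrayList. We don't know in advance how many words the sentence has, and we insert at the front each time. The class and method names (FrontInsertDemo, reversedWords) are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

class FrontInsertDemo {
    // Collects the words of a sentence in reverse order by always inserting at index 0.
    static List<String> reversedWords(String sentence) {
        List<String> words = new ArrayList<>(); // no size needed up front
        for (String w : sentence.split(" ")) {
            words.add(0, w); // everything already in the list shifts one place right
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(reversedWords("lists grow as needed")); // [needed, as, grow, lists]
    }
}
```

Doing the same with a bare array would mean tracking the count, resizing, and shifting by hand.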
A ListInterface
As much as possible, it’s a good idea to have the computer help us check our assumptions as we go. One way to do this is to leverage the type system. We’ll declare a StringListInterface for a list of Strings as described above, then we’ll implement it using arrays.
Normally, you never do this! You use the built-in List interface and ArrayList implementation. But we’re going to do it here to peel back the curtain a little and show you that there’s no magic, just straightforward application of computer science you already know. You can do this. (And you will do this, in 187!)
public interface StringListInterface {
public void add(String s);
public void add(int i, String s);
public String remove(int i);
public String get(int i);
public int size();
}
What happens if a user of this interface does something “bad,” like attempting to remove an element that’s not present, or to add an element past the end of the list? The type doesn’t help us. We can add (optional) unchecked exception declarations to hint at this:
public interface StringListInterface {
public void add(String s);
public void add(int i, String s) throws IndexOutOfBoundsException;
public String remove(int i) throws IndexOutOfBoundsException;
public String get(int i) throws IndexOutOfBoundsException;
public int size();
}
...though note that these just give a programmer using your StringListInterface a heads-up. The documentation comments for the class and the method define the actual contract that the class (or interface) offers. If there are no documentation comments, then only the type signature gives the contract. Usually that’s not enough.
In-class exercise
public int size() {
return 12;
}
Strictly speaking, does this method obey its contract?
/**
* @return the number of elements stored in this list
*/
public int size() {
return numberOfElements;
}
What about now?
Note that the compiler can’t enforce all parts of this contract; it’s up to the programmer implementing the StringListInterface to do the right thing.
Also note that a full StringListInterface would have many more methods (set, analogous to array assignment, removeAll, equals, and so on). I don’t want to write all of them in lecture! (Though it should be straightforward to write most of them.)
Writing the StringArrayList
(If you want to implement an interface, use the Eclipse “new class” wizard to note you want to do so, and it will write skeletons of each method for you.)
Let’s think about what we need in our implementation. What instance variables do we need? Certainly an array of Strings to hold the list elements. Anything else? The number of elements actually stored in the array. Remember, one of the reasons we’re writing a List is that arrays are of fixed size, but a List can grow and shrink arbitrarily. (We’ll see how soon.)
So let’s declare an array String[] array and an int size to hold these values.
public class StringArrayList implements StringListInterface {
String[] array;
int size;
}
(On board) Conceptually, strings will be added to, gotten, and removed from this array; it’s the implementation’s job to make sure they go in the right place, and that if a user of StringArrayList tries to access an element of the List (that is, of the underlying array) that is invalid, an exception is raised.
Let’s start with a constructor that starts with a small empty array:
public StringArrayList() {
array = new String[10];
size = 0;
}
Now let’s do the simple methods:
public int size() {
return size;
}
@Override
public String get(int i) throws IndexOutOfBoundsException {
return array[i];
}
But remember, while the array might be of size 10, there might be fewer than 10 (even no!) strings stored in the array. So a correct definition would instead read:
public String get(int i) throws IndexOutOfBoundsException {
if (i >= size || i < 0) {
throw new IndexOutOfBoundsException();
}
return array[i];
}
This is important to understand: the List acts like a list of elements with a size equal to the number that have been added (less the number removed). Even though there’s an underlying array of a different size, the user of the List interface cannot see it! The details are said to be encapsulated. This is a very powerful concept that lets you use data structures (and generally any API) by reading their contract; you don’t need to fully understand every detail of the implementation (though it can be helpful to do so!).
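The built-in ArrayList behaves the same way, which we can check with a short sketch. Passing 10 to the constructor reserves room for ten elements, but the list still reports size 0 and refuses to hand out index 0. The helper getThrows and the class name are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

class CapacityVsSizeDemo {
    // Returns true if list.get(i) throws IndexOutOfBoundsException.
    static boolean getThrows(List<String> list, int i) {
        try {
            list.get(i);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>(10); // room for 10, but nothing stored yet
        System.out.println(list.size());         // 0
        System.out.println(getThrows(list, 0));  // true
        list.add("hello");
        System.out.println(getThrows(list, 0));  // false
        System.out.println(getThrows(list, 1));  // true
    }
}
```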
Now let’s turn to some of the more complicated methods, like add:
public void add(String s) {
array[size] = s;
size++;
}
This sorta works, but what happens once we add the eleventh element? We’ll overflow the array bounds, which we don’t want to do – our list is supposed to be unbounded. Instead, we’ll check to see if the array needs to be expanded, and do so:
public void add(String s) {
if (size == array.length) {
enlarge();
}
array[size] = s;
size++;
}
In-class exercise
public void add(String s) {
if (size == array.length) {
enlarge();
}
size++;
array[size] = s;
}
What should enlarge do? It should allocate a new, larger array, copy the current array into it, then set the array instance variable to point to this new array.
void enlarge() {
String[] larger = new String[array.length * 2];
for (int i = 0; i < array.length; i++) {
larger[i] = array[i];
}
array = larger;
}
Why double, and not, say, just + 10? The full answer is beyond the scope of this course, but in short: when you don’t know anything else, doubling keeps the total amount of copying proportional to the number of elements added, whereas growing by a fixed amount makes the copying grow much faster than that as the list gets long. If you do know other things, you might expose ways to grow (or shrink) the underlying array, but that has its own problems (like: now users of your code are tied to your specific implementation, even if a better one comes along later).
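To get a feel for the difference, here is a rough sketch that just counts how many element copies enlarge-style growth would perform while appending n items, starting from capacity 10. It simulates the bookkeeping only and allocates no arrays. The names GrowthCostDemo and copiesFor are invented for this example.

```java
class GrowthCostDemo {
    // Counts element copies made while appending n items, starting from capacity 10.
    // If doubling is true the capacity doubles when full; otherwise it grows by 10.
    static long copiesFor(int n, boolean doubling) {
        int capacity = 10;
        long copies = 0;
        for (int size = 0; size < n; size++) {
            if (size == capacity) {
                copies += size; // every stored element is copied into the new array
                capacity = doubling ? capacity * 2 : capacity + 10;
            }
        }
        return copies;
    }

    public static void main(String[] args) {
        System.out.println("doubling, 100 adds: " + copiesFor(100, true));  // 150
        System.out.println("plus 10,  100 adds: " + copiesFor(100, false)); // 450
        System.out.println("doubling, 100000 adds: " + copiesFor(100_000, true));
        System.out.println("plus 10,  100000 adds: " + copiesFor(100_000, false));
    }
}
```

For 100 adds the gap is small, but it widens quickly as n grows: with doubling the copy count stays below twice the number of adds, while with + 10 it grows roughly with the square of n.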
What about if we want to add in a particular place, rather than just at the end of the array? We need to move each element out of the way. (On board) we have to move the last element forward one, then the previous element into the last element’s place, and so on, to “make space” for the item we’re inserting. We also need to make sure the index is valid, and that there’s space. In code:
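public void add(int i, String s) throws IndexOutOfBoundsException {
// i == size is allowed: that appends at the end of the list
if (i > size || i < 0) {
throw new IndexOutOfBoundsException();
}
// make room before shifting, so array[size] is a valid cell
if (size == array.length) {
enlarge();
}
// shift elements i..size-1 one cell to the right, starting from the back
for (int j = size; j > i; j--) {
array[j] = array[j-1];
}
array[i] = s;
// increment exactly once, after the new element is in place
size++;
}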
Notice that we never worry about “checking” what’s already in an array cell. If its index is size or greater, then it’s not in use right now.
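The shifting step can be tried in isolation on a plain array. This sketch assumes the caller has already checked the index and made sure there is a free cell; insertAt and ShiftRightDemo are names made up for illustration.

```java
import java.util.Arrays;

class ShiftRightDemo {
    // Inserts s at index i in an array whose first `size` cells are in use.
    // Assumes 0 <= i <= size and size < array.length (room for one more).
    static void insertAt(String[] array, int size, int i, String s) {
        for (int j = size; j > i; j--) {
            array[j] = array[j - 1]; // start at the back so nothing is overwritten
        }
        array[i] = s;
    }

    public static void main(String[] args) {
        String[] a = {"a", "b", "d", null}; // three cells in use, one spare
        insertAt(a, 3, 2, "c");
        System.out.println(Arrays.toString(a)); // [a, b, c, d]
    }
}
```

Going from the back matters: shifting front-to-back would copy the same element into every later cell.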
Finally, let’s write the code to remove an element at index i. Similar to the above, we’ll need to “move” any elements into the space we leave behind (on board). And by convention, return the value we removed.
public String remove(int i) throws IndexOutOfBoundsException {
// check the index before touching the array
if (i >= size || i < 0) {
throw new IndexOutOfBoundsException();
}
final String removed = array[i];
// stop at size - 1 so array[j+1] stays within the cells in use
for (int j = i; j < size - 1; j++) {
array[j] = array[j+1];
}
// optional: clear the now-unused last cell
array[size-1] = null;
size--;
return removed;
}
Setting the unused space to null lets the garbage collector free the memory, but that reference will also be overwritten the next time we add to the list, so it’s not strictly necessary.
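Here is the left shift on its own, as a sketch on a plain array. It assumes the index has already been checked and leaves decrementing the size to the caller; removeAt and ShiftLeftDemo are hypothetical names.

```java
import java.util.Arrays;

class ShiftLeftDemo {
    // Removes and returns the element at index i from an array whose first
    // `size` cells are in use. Assumes 0 <= i < size.
    static String removeAt(String[] array, int size, int i) {
        String removed = array[i];
        for (int j = i; j < size - 1; j++) {
            array[j] = array[j + 1]; // each later element moves one cell left
        }
        array[size - 1] = null; // clear the cell that is no longer in use
        return removed;
    }

    public static void main(String[] args) {
        String[] a = {"a", "b", "c", "d"};
        System.out.println(removeAt(a, 4, 1));  // b
        System.out.println(Arrays.toString(a)); // [a, c, d, null]
    }
}
```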
As I mentioned earlier, there are many other things you could do. For example, you could write a removeAll method that completely empties the List.
At least two solutions are possible.
In-class exercise
One is to repeatedly call remove() until the list is empty; another is to directly manipulate the underlying instance variables.
public void removeAll1() {
while (size() > 0) {
remove(0);
}
}
public void removeAll2() {
size = 0;
array = new String[10]; // this line is optional; do you see why?
}
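Putting the pieces together, one possible shape for the whole class is sketched below. It is not the official solution: the name SketchStringList, the checkIndex helper, and the choice to have add(String) delegate to add(int, String) are all assumptions made here to keep the sketch short, and it skips the interface so that it compiles on its own.

```java
class SketchStringList {
    private static final int INITIAL_CAPACITY = 10;
    private String[] array = new String[INITIAL_CAPACITY];
    private int size = 0;

    int size() {
        return size;
    }

    String get(int i) {
        checkIndex(i, size - 1);
        return array[i];
    }

    void add(String s) {
        add(size, s); // appending is inserting at index size
    }

    void add(int i, String s) {
        checkIndex(i, size); // index size (one past the last element) is allowed
        if (size == array.length) {
            enlarge();
        }
        for (int j = size; j > i; j--) {
            array[j] = array[j - 1]; // shift right, starting from the back
        }
        array[i] = s;
        size++;
    }

    String remove(int i) {
        checkIndex(i, size - 1);
        String removed = array[i];
        for (int j = i; j < size - 1; j++) {
            array[j] = array[j + 1]; // shift left to close the gap
        }
        array[size - 1] = null; // optional: drop the stale reference
        size--;
        return removed;
    }

    void removeAll() {
        array = new String[INITIAL_CAPACITY];
        size = 0;
    }

    // Throws unless 0 <= i <= max.
    private void checkIndex(int i, int max) {
        if (i < 0 || i > max) {
            throw new IndexOutOfBoundsException("index " + i + ", size " + size);
        }
    }

    private void enlarge() {
        String[] larger = new String[array.length * 2];
        for (int j = 0; j < array.length; j++) {
            larger[j] = array[j];
        }
        array = larger;
    }
}
```

Adding more than ten strings exercises enlarge, and calling get(0) after removeAll should throw, since the size is back to zero even though the array still has ten cells.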