Week 03: Java review, continued

Welcome

Announcements

UCAs are now available, and the LRC is now open. Take advantage of the many resources available to help you!

Programming assignment 03 is due this week. I noticed that last week, fewer than half of the class had submitted to Gradescope before the regular deadline. Pro tip: you can get (small) partial credit just for submitting the project file as-is. Might as well do this early!

Also note that next week’s PA04 will be shorter in terms of lines of code, but definitely conceptually a little more taxing than PA3. PA3 is a set of exercises; PA4 is a single, concrete problem you need to solve. The solution is spelled out in English in the writeup, but translating it to code will take some time and thought (like the bus simulator, but more thought, fewer if statements). Students often report that A02 was “the most annoying at the time” assignment, and that A04 was “the most challenging at the time”. So start early and know that if you can handle A04, you will probably be able to handle everything else we plan to do this semester.

Lab this Monday; self-assessment next Monday. Anything we’ve done so far is fair game; sample questions will go up Friday most likely.

If you have a grading question or regrade request: Start with the TAs (via regrade request). If you are unsatisfied with their answer, then ask me, but the first thing I’m going to do is check with them to see what their reasoning was.

Classes and objects

Java is said to be “object-oriented”, which means that the language strongly encourages the use of objects as a paradigm for problem solving.

What’s an object? A collection of related state and behavior. In Java, this means objects are typically variables and associated methods. The “blueprint” for objects is a class. Classes are also the place where we stick stuff that’s not necessarily part of an object. (This falls out of Java’s design: everything is part of a class.)

Instance and class variables

Local variables only exist within a method. Instance variables, in contrast, are declared in the class and hold the state that is part of each object. For each instance of a class (that is, for each object), there is a separate copy of the instance variable around.

public class Dog {
  private String name;
  
  public Dog(String n) {
    name = n;
  }

  public String getName() {
    return name;
  }
  // ... more code ...
}

Two different dogs can have different names:

  Dog dog1 = new Dog("Spot");
  Dog dog2 = new Dog("Ribsy");
  System.out.println(dog1.getName()); // prints "Spot"
  System.out.println(dog2.getName()); // prints "Ribsy"

Generally, things you do to one instance don’t directly affect others – instance variables are separate across instances!

There are four access levels for instance variables: public, private, protected, and package. Public, protected, and private are denoted by the keywords public, protected, and private; package is denoted by including no keyword at all. A public variable can be accessed (or “is visible”) from anywhere. A private variable can be accessed only by code within exactly this class (not its subclasses). Package variables can be accessed only by classes in the same package as this class. Protected variables can be accessed as package variables, and also by subclasses of this class.

The good news is that generally you don’t have to worry too much about it! Generally, code you write will keep mutable variables private, and (sometimes) make constants and immutable variables public.

Packages? Mutable? What? Packages I’ll talk about later. Mutability is kinda the opposite of immutability. Immutable variables are, in Java, variables that can be assigned to exactly once, and then are constant. (Constants are immutable variables that are initialized at the program start, and don’t depend upon user input; immutable variables are more general, and can depend upon the program, like the ham and spam values in HamSpam.)

Generally, we mark an immutable variable with the final keyword. This helps to avoid errors – the compiler will now alert us if we try to change it (and in fact will treat it as an error).
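Here is a minimal sketch of how final behaves on an instance variable. The Cat class and its fields are made up for illustration; they are not part of any assignment.

```java
public class FinalDemo {
    // Cat is a hypothetical class with one immutable and one mutable field.
    static class Cat {
        private final String name; // assigned exactly once, in the constructor
        private int age;           // mutable: can be reassigned freely

        Cat(String name, int age) {
            this.name = name;
            this.age = age;
        }

        void haveBirthday() {
            age += 1;         // fine: age is not final
            // name = "Tom";  // if uncommented, the compiler rejects this line
        }

        String getName() { return name; }
        int getAge() { return age; }
    }

    public static void main(String[] args) {
        Cat c = new Cat("Felix", 3);
        c.haveBirthday();
        System.out.println(c.getName() + " is " + c.getAge()); // prints "Felix is 4"
    }
}
```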

Finally, we should mention static variables. Static variables are weird – they are not associated with an instance. Instead, there is only one copy of the variable, associated with the class, not the instance. So imagine we want to track the species of a Dog. All Dogs are of the same species, so we don’t need an instance variable:

public class Dog {
  public static final String SPECIES = "Canis familiaris";
  private String name;
  // ... more code ...
}

(Here, I’ve also marked it as public, so anyone can access it; and final, since it’s never going to change.)

Now, we can ask the dog class(!!) for this information:

System.out.println(Dog.SPECIES); // prints "Canis familiaris"

Mutable static variables, that is, non-final static variables, are the closest thing to “global variables” that Java has. They are typically considered a “code smell” – that is, an indication you’re not doing something “the Java way” or even the right way. If you find yourself madly adding “static” keywords to resolve compiler errors, you should stop and ask course staff for help – it almost certainly means you’ve misinterpreted something. We’ll be happy to get you on the right track.
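To see why a mutable static variable acts like a global, here is a small sketch. The Counter class is hypothetical; notice that every object shares the one created variable, while each object gets its own clicks.

```java
public class StaticDemo {
    // Counter is a made-up class for illustration.
    static class Counter {
        private static int created = 0; // one copy, shared by the whole class
        private int clicks = 0;         // one copy per object

        Counter() {
            created += 1; // every constructor call touches the same variable
        }

        void click() { clicks += 1; }
        int getClicks() { return clicks; }
        static int getCreated() { return created; }
    }

    public static void main(String[] args) {
        Counter a = new Counter();
        Counter b = new Counter();
        a.click();
        a.click();
        b.click();
        System.out.println(a.getClicks());        // prints 2
        System.out.println(b.getClicks());        // prints 1
        System.out.println(Counter.getCreated()); // prints 2
    }
}
```

Any code anywhere that creates a Counter changes what getCreated() returns, which is exactly the kind of action at a distance that makes mutable statics hard to reason about.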

Methods

So variables are state. Methods, on the other hand, are behavior.

In Java, methods are always attached to a class. A method consists of a declaration and a body, which is just a sequence of statements. Let’s look at a declaration more closely:

public static void main(String[] args) {
  ...
}

WTF is going on here!?!?! is usually the reaction we get in 121 when people first start learning Java. But now, you probably know enough to understand most of it. Let’s tackle it inside-out.

String[] args is the parameter to this method. One or more parameters are passed in, and they look like (and in many respects behave as) variable declarations. The difference is that their values are provided by the calling method (or in the very special case of main, by the JVM). Here, main gets an array of Strings, which are exactly what is passed on the command line to the java interpreter, e.g.,

public class PrintArgs {
  public static void main(String[] args) {
    int i = 0;
    for (String a : args) {
      System.out.println(i + ": " + a);
      i += 1;
    }
  }
}

> javac PrintArgs.java
> java PrintArgs Hello Marc!
0: Hello
1: Marc!

Next is the name of the method, which is by convention “camelCased,” starting with a lowercase letter. Next is the method’s return type, or void if it does not return anything.

Important note: methods that return void typically do something, like print a value, or delete an item from a list, or the like. They affect state, or are stateful. Sometimes but not always, methods that return a value don’t do something (they are more like mathematical functions). The only way to be sure is to read the documentation (or the method code itself)! But a method’s public API should describe what it does.

Next come zero or more method modifiers: either abstract or one or more of static, final, and synchronized. static methods are associated with a class, but not a particular instance of that class (in other words, not with an object). You’ve seen some of these in prior classes, most likely, such as Math.min. We’ll talk more about the other modifiers later as they come up.

Finally there is at most one of public, protected, or private, which are member access modifiers. Which objects can invoke this method is determined by this modifier. public and private are probably familiar to you (any object and only objects of this class, respectively). protected and no modifier (“default access” or “package access”) have special meanings we’ll skip for now.
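To tie the pieces of a declaration together, here is a short sketch with three methods: one that returns void and changes state, one that returns a value and changes nothing, and one that is static. The Tally class is invented for this example.

```java
public class MethodDemo {
    // Tally is a made-up class for illustration.
    static class Tally {
        private int total = 0;

        // Returns void: called for its effect on the object's state.
        public void add(int amount) {
            total += amount;
        }

        // Returns a value and changes nothing.
        public int getTotal() {
            return total;
        }

        // static: belongs to the class, so no Tally object is needed to call it.
        public static int larger(int a, int b) {
            return Math.max(a, b);
        }
    }

    public static void main(String[] args) {
        Tally t = new Tally();
        t.add(5);
        t.add(7);
        System.out.println(t.getTotal());       // prints 12
        System.out.println(Tally.larger(3, 9)); // prints 9
    }
}
```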

Classes in the JVM

(Note that like our discussion of the call stack, this is a simplified version to give you a mental model of how things work; the implementation in the JVM differs somewhat.)

A single copy of the class (note: the class! not an object of the class!) lives in memory, annotated with its name, each method, and space for each static variable. There is also some other stuff: a pointer to the class’s superclass, a list of interfaces it implements, and the like.

Objects that are instances of the class will be able to refer to these methods and variables here.

Instantiating objects

Objects are defined by a class. But an object does not exist until you instantiate it, that is, “make a new one” using the class as a template for the new object. For example:

Bus bus44 = new Bus();

There’s a class called Bus, and we’ve instantiated an object of type Bus. The object is named by the variable bus44.

Thinking back to our first class, what’s going on in memory when this happens?

Unlike primitive types, which are stored directly in the variable’s allocated space, the variable “holding” an object isn’t really the object’s value. It’s a pointer or reference to the place where the object really lives.

This is really important to understand, so I’ll say it again: what Java stores in primitive type variables and object variables is different, conceptually: The former stores the value itself, the latter stores as its value the memory address of the object. Weird but true!

Let’s do an example. What happens here?

String message = new String("Hi!");
String wassup = message;

First, a new String object is allocated on the heap and initialized. Then a new variable of type String, named message, is created. Its value is set to the address of the String object we just created.

Note that this String object implicitly refers to a String class somewhere else. When you call methods on an object, it uses this reference to find the method – only one copy of the code for a method exists at a time – all objects of a class share it.

Next, another new variable, again of type String, named wassup, is created. The value of message (that is, the address stored in it) is looked up, and is then assigned to wassup. Both variables now “point to” or “refer to” the same object, which is a String object containing the data "Hi!".

Having two variables refer to the same object is called “aliasing”; it is a common source of program bugs. Always think carefully about the value stored in a variable, not just the name of the variable.
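A short sketch makes the difference visible. It uses an int array for the object case, since arrays are objects whose contents can be changed; the variable names are just for illustration.

```java
public class AliasDemo {
    public static void main(String[] args) {
        // Primitives: the variable holds the value, so assignment copies the value.
        int x = 3;
        int y = x;
        y = 7;
        System.out.println(x); // prints 3

        // Objects: the variable holds a reference, so assignment copies the reference.
        int[] a = {1, 2, 3};
        int[] b = a;           // b is now an alias for the same array
        b[0] = 99;
        System.out.println(a[0]); // prints 99
    }
}
```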

So this leads to an important thing: == vs .equals(). With primitive types (int and so on), you only have one option: ==. What does it do? It looks up the values stored in the variables, and returns true if and only if they are the same. x = 3; y = 3; x==y;

But what happens when we use == on variables that refer to objects? Exactly the same thing! Which might or might not be what we mean. Following from above, let’s add String hello = new String("Hi!");, and ask, does message == wassup? Yes, because they refer to the same object.

Does message == hello? No. Even though the two String objects represent the same value (“Hi!”), they are stored in separate objects at different addresses. So the variables store the value of two different addresses.

But oftentimes we don’t actually want to know if two variables point to the same objects. Instead, we want to know if the two objects’ values are equivalent. In the case of Strings, this means that they hold the same text; you can imagine that for more complicated objects, we might need a more complicated comparison of the two objects’ instance variables. There is a method .equals() that a class can implement to provide a class-specific test of equality. (If you don’t implement it, the Object class’s equals() method, which defaults to ==, is used.)

So we can write message.equals(hello) to check if the two objects store equivalent Strings, rather than the two variables storing the same address as a value.
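Putting the three variables from this discussion into one runnable sketch:

```java
public class EqualsDemo {
    public static void main(String[] args) {
        String message = new String("Hi!");
        String wassup = message;           // alias: same object as message
        String hello = new String("Hi!");  // a second, separate object

        System.out.println(message == wassup);     // prints true
        System.out.println(message == hello);      // prints false
        System.out.println(message.equals(hello)); // prints true
    }
}
```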

Example

Suppose we define a Bus class. Note we can use a Code plugin to write a semantically meaningful equals method for us (for now – later in the semester we’ll see what all this means and how to do it ourselves, then go back to letting Code write the boilerplate).

public class Bus {
	private int number;
	
	public Bus(int n) {
		number = n;
	}
	
	public static void main(String[] args) {
		Bus busA = new Bus(44);
		Bus busB = new Bus(44);
		
		Bus busC = busA;
		
		System.out.println(busA == busB);
		System.out.println(busA.equals(busB));
		
		busC.setNumber(13);
		System.out.println(busA == busC);
		System.out.println(busA.equals(busC));
		
		busC = new Bus(13);
	}

	public int getNumber() {
		return number;
	}

	public void setNumber(int number) {
		this.number = number;
	}

	@Override
	public int hashCode() {
		final int prime = 31;
		int result = 1;
		result = prime * result + number;
		return result;
	}

	@Override
	public boolean equals(Object obj) {
		if (this == obj)
			return true;
		if (obj == null)
			return false;
		if (getClass() != obj.getClass())
			return false;
		Bus other = (Bus) obj;
		if (number != other.number)
			return false;
		return true;
	}
}

Using objects and methods

If the flow of control is executing within an object, and we access a variable, say x, what happens?

First the local scope is checked, inside out, for example:

class Bar {
  int[] x = new int[10];

  private void foo(int j) {
    for (int i = 0; i<x.length; i++) {
      x[i] = j;
    }
  }
}

x is not declared in the for loop, nor in the method body, nor as a parameter, so the next place that’s checked is the object itself, where it’s found. (If it weren’t found there, the superclasses would be checked next, and so on.) This all happens at compile time; this lookup can’t fail in source code that compiles correctly.

Relatedly, when methods are called on an object, the type of the object is examined, and the method is looked up in the corresponding class. If it’s not found, the class’s superclass is checked, and so on. Again, this is checked at compile-time. This is how the default “equals” method works; it’s implemented in Object, which is by default the superclass of all classes that don’t otherwise declare a superclass. In other words, it “inherits” the method from its superclass.
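As a quick illustration of an inherited equals, here is a sketch with a hypothetical Point class that declares no equals method of its own, so calls to equals end up in Object’s version.

```java
public class InheritDemo {
    // Point is a made-up class; it implicitly extends Object and does not
    // override equals, so Object's identity-based equals is used.
    static class Point {
        private final int x;
        private final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    public static void main(String[] args) {
        Point p = new Point(1, 2);
        Point q = new Point(1, 2);
        System.out.println(p.equals(q)); // prints false: different objects
        System.out.println(p.equals(p)); // prints true: same object
    }
}
```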

We’ll do (slightly) more on inheritance, but best practices over the years have moved toward a relatively flat class hierarchy. Usually you won’t see (or use) deep inheritance, with some exceptions for older codebases and large libraries (like, say, the Java Platform API).

Packages, namespaces, and the CLASSPATH

Namespaces

Many programming languages, including Java, incorporate the idea of a “namespace”. A namespace is a way to provide context to a particular name.

For a real-world analogy, you might think of a person’s name, say, “John”. There might be more than one in this class, so we add some context (a surname, or a student ID, or an address, or all of the above) to disambiguate which we mean.

This is very similar to the idea of a variable’s scope in Java, but slightly different, as it’s how we precisely name and identify classes.

For example, we all write System.out.println() all the time, and we all know that System is probably a class, since it starts with a capital letter. Where does it come from, though?

Packages

It’s part of the java.lang package. Java organizes classes into packages, which are named by a hierarchical sequence of tokens (words), separated by dots. The built-in parts of the Java standard library all are part of the “java.” package, though it’s further subdivided.

For example, the aforementioned java.lang package defines the classes that are fundamental to the design of the language itself: things like System and String are defined here.

https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/lang/package-summary.html

By default, things in this namespace are automatically “imported” into the local namespace. That is, you don’t have to type java.lang.String to declare a String (though you can); String suffices.

Interestingly, it’s not against the rules to define your own System class. But it’s like if you have both an instance variable named x and a local variable named x. By default, Java assumes you want the “more local” one.

int x = 5;                     // instance variable

void test() {
  int x = 3;                   // local variable; shadows the instance variable
  System.out.println(x);       // prints 3
  System.out.println(this.x);  // prints 5
}

To get the “outer” one, you need to prefix it with this., which tells Java you want the current instance variable with the same name.

Similarly, with a class, you need to fully specify the class if you want access to it. Inside your custom System class, if you want to access the “normal” System.out.println, you’ll need to refer to it by its full name, java.lang.System.out.println.

java and javax are reserved by the JVM for built-in and extension classes, but much like anyone can register a domain name on the Internet, anyone can declare a package namespace. There’s a loose convention that you should use your reverse domain name as a prefix (for example, if our department released a package for autograding, we might put it in the edu.umass.cs.autograder package). But many modern Java packages declare a top-level namespace – you see this in most assignments for this class, where we just define a similar top-level namespace. In practice, projects that do similar things don’t usually have the same name, and/or agree to avoid namespace collisions.

Importing packages

Sometimes you want to use something not in the current namespace. Then you need to “import” it.

For example, if I want to print a random number between one and six:


public class Dice {
  public static void main(String[] args) {
    Random r = new Random();
    System.out.println(r.nextInt(6) + 1);
  }
}

it won’t compile, because Random is not in the namespace. But I can use an import statement: import java.util.Random; to add it to the namespace.
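With the import in place, the complete file would look something like this:

```java
import java.util.Random;

public class Dice {
  public static void main(String[] args) {
    Random r = new Random();
    // nextInt(6) returns a value from 0 to 5, so adding 1 gives 1 to 6
    System.out.println(r.nextInt(6) + 1);
  }
}
```

The alternative is to skip the import and write the full name each time, as in java.util.Random r = new java.util.Random();, which works but gets tedious quickly.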

Code can do this for you in one of its “quick fixes” but beware: there’s sometimes more than one class with the same name! If you import the wrong one, it likely won’t have the behavior you expect!

Finding packages

Where does Java look for packages? By default the JVM has access to a set of “built-in” packages, that form the Java Platform API:

https://docs.oracle.com/en/java/javase/11/docs/api/index.html

Again, mostly in the java and javax namespace, but also some others.

But where do the compiled classes, that is, the virtual machine code for them, actually live? On my machine, for Java8 a big chunk of them live in /Library/Java/JavaVirtualMachines/jdk1.8.0_102.jdk/Contents/Home/jre/lib in the JARs there. (The Java11 install I’m actually using has a more complicated system for loading JARs, so I’m going to show you the simpler one here.)

JARs are essentially ZIP files of compiled Java classes with some extra stuff (a Java-specific manifest, describing their contents, that the JVM knows how to read). Let’s take a look in rt.jar. Hey look! Our friends System and String!

You may have noticed that the file, say String.class (which is the compiled representation of String.java) lives in a directory java/lang/. That looks a lot like java.lang., doesn’t it?

Not a coincidence! The JVM requires that packages map to (that is, directly correspond to) directories with the same name(s), and that classes map to .class files within those directories.

But there’s still a piece of the puzzle missing. How does the JVM know where to look for these directories? How did it know, for example, that /Library/Java/JavaVirtualMachines/jdk1.8.0_102.jdk/Contents/Home/jre/lib/rt.jar was a place to search?

CLASSPATH and friends

There are three mechanisms the JVM uses. One is under your control: the “CLASSPATH”. The other two you can’t (easily) change – there is a “bootstrap CLASSPATH” and an “extensions directory”.

The latter two are configured when your JVM is installed, and they contain classes that are part of the JRE, JDK platform, and vendor-distributed extensions.

But the CLASSPATH you do control. If you run from within Code, you can add JARs to the CLASSPATH by adding them to the list of “Referenced libraries” in “Java Projects” under the explorer window; likewise, you can add directories by adding them to the “Java Source Path”.

Putting your own classes into packages

If you want to put a class into a package (say you want to move our Dice class into the nerdy.gaming package), you need to explicitly declare the package at the top of the file, and move the file into an appropriate directory, in order for it to compile and be recognized by the JVM. You’ll see a compiler error until you make the package and the directory structure match.

Notice that now the top of the file has a nerdy.gaming package declaration; it lists just the package, not the classname (unlike imports, which list a full class name). Also notice that the package is rooted at a directory that’s in our CLASSPATH; in this case, the src/ directory in the project.

Another example: PA03

Sometimes this is more clear for people if I give an example based upon files they have. So let’s do that now, using this week’s programming assignment.

You can tell, by looking at the top of each .java file, that there are only two packages in this project: list.exercises and string.exercises.

Let’s look at ExtendedArrayList.java. You can see its enclosing directories are exercises and list – in other words, part of the path to it, on your disk’s filesystem, is list/exercises/ExtendedArrayList.java. This is not a coincidence! Just like .java files have to have the same name as the class they define, the directory structure and the package structure have to “match” (with dots . being used in the package, and either slash or backslash in most filesystems).

So ExtendedArrayList (the class) is in the list.exercises package, and correspondingly, ExtendedArrayList.java (the file) is in list/exercises. This is also true for ListExercises.

How does Java know where to look for classes and packages? It looks on the classpath. On my computer, for example, I have this project in /Users/liberato/working-with-strings-and-lists-student/ Inside there is the project, including the src/ directory, which contains list, which contains exercises, etc.

For this project, /Users/liberato/working-with-strings-and-lists-student/src is “on the classpath”, that is, it’s a place Java looks for packages. So it looks in there for list/exercises/ListExercises.java when it’s trying to find the class ListExercises in the package list.exercises.

But there’s more – you can have more than one directory on the classpath. For our projects, we typically have three directories on the classpath – src, support, and test. All directories on the classpath are searched when Java looks for classes. So the tests (in, say, ListExercisesTest) are in the same package as the things they test, even though they’re in a different directory – because we look in test for packages, too, and there’s a list/exercises/ directory there, too.

Review: arrays

In our review of 121 material so far, we’ve exclusively used a single container type, the array. Container types are types that “hold” other types, and the array is probably the most basic: a fixed-size sequence of values (that are themselves either primitive types or references to objects) that can be read or written in approximately “constant time”. We call these values “cells” or “elements” in the array.

Constant time means that no matter how big the array is, (to a first approximation) it takes the same amount of time to access (read or write) an element. We expect each of these statements to execute in about the same amount of time:

array[0] = 5;
System.out.println(array[1]);

array[1000000] = 12;
System.out.println(array[12345678]);

(modulo some caching effects, which are COMPSCI 230/335 material).

But arrays have some downsides, as well. For example, they’re fixed size: you need to know how many elements you want in advance. You can cheat here by allocating a giant array, but that’s wasteful for small inputs, and potentially won’t work anyway if your data is of size (giant array + 1).

Instead, we might use a higher-level abstraction; that is, a more general “container type” than an array. In a bit, we’ll describe the List:

which is like (but not the same as) an array, and
which can be implemented in terms of an array.
we’ll do an array-based implementation of the List in lecture (though we definitely won’t do this for all implementations of all data structures – that’s 187!)
we’ll briefly discuss but not fully implement a linked-list, an alternative to the array-based list, and talk about its pros and cons

We’ll show how our List compares to the Java API’s List, and this will lead into generics and container types, two topics that will come up again and again in material this semester.

Operations on arrays

To recap, what can you do with an array? You can declare it, allocate it, read or write individual elements, and determine its length at runtime.

Declare an array of Strings called strings.

Allocate a String array of size 50 and assign it to strings.

Set the zeroth element of strings to “Hi”.

Print the zeroth element of strings.

Print the length of strings.

String[] strings;
strings = new String[50];
strings[0] = "Hi";
System.out.println(strings[0]);
System.out.println(strings.length);

That’s it for builtins of the array. If you want to do much else, you gotta build it yourself. (Note that there is a java.util.Arrays that has some helpful methods you can call on arrays, in particular the static Arrays.toString method is helpful when caveman debugging arrays of primitive types; otherwise the debugger can be helpful.)
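For instance, a quick sketch of why Arrays.toString is handy when printing an array for debugging (the array contents here are arbitrary):

```java
import java.util.Arrays;

public class ArrayPrintDemo {
    public static void main(String[] args) {
        int[] values = {3, 1, 4};

        // Printing the array directly shows its type and a hash code,
        // something like [I@1b6d3586, not its contents.
        System.out.println(values);

        // Arrays.toString builds a readable String from the elements.
        System.out.println(Arrays.toString(values)); // prints [3, 1, 4]
    }
}
```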

So let’s talk about the List abstract data type.

Properties of lists

Note I said “abstract data type.” First we’ll talk about the properties and assumptions we might expect from a List, in the abstract. Then we’ll do an actual, concrete implementation of the data type and see how it measures up.

“List” is a very overloaded term; we’ll simplify this by choosing a specific set of assumptions, that implicitly define an abstraction:

lists are unbounded, that is they don’t have a fixed size (if implemented with arrays, the arrays dynamically resize)
duplicate elements are allowed (when searching for an element, one of several equal elements is as good as any other)
lists can contain null elements (I hope you like NullPointerExceptions! though note some implementations might forbid null elements)
lists support an add operation, either to the end of the list or to a specific place (sometimes called an “insert”)
lists support a remove operation, either of a specific value or of an element at a particular index
lists support a get operation to return the element at a specific index
lists support a size operation to determine how many elements are currently in the list
lists can be in sorted order, but by default are not (you can imagine a SortedList that enforces this property)
and many more, but we’ll get to them later when we look at the full API that Java supplies.

Declare a List of Strings called strings.

Allocate a String List and assign it to strings.

Insert the string “Hi” at the front of the list strings.

Append the string “Bye” at the end of the list strings.

Print the zeroth element of strings.

Print the length of strings.

List<String> strings;
strings = new ArrayList<>();
strings.add(0, "Hi");
strings.add("Bye");
System.out.println(strings.get(0));
System.out.println(strings.size());

Some things to notice:

The type of strings is List<String>. It’s a variable of type List; we could also make it of type ArrayList or LinkedList, but what we care about is that it satisfies the List interface (see the javadocs).

Further, it’s a parameterized type. Much like methods can take arguments, so can types! Usually, we see this with container classes (like Lists), where the argument (in <>s) is the type of thing it’s holding. More on this later.

Sorta-like arrays, Lists have an add operation that either appends, or inserts at a specified point – we illustrate both here. They also have a set operation, but it can only replace existing items, not insert new ones. And a get, which is much like array lookups.
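The difference between add and set can be sketched as follows, using the standard ArrayList:

```java
import java.util.ArrayList;
import java.util.List;

public class SetDemo {
    public static void main(String[] args) {
        List<String> strings = new ArrayList<>();
        strings.add("Hi");
        strings.add("Bye");

        strings.set(1, "See ya");           // replaces the existing "Bye"
        System.out.println(strings.get(1)); // prints See ya

        try {
            strings.set(2, "too far");      // there is no element at index 2 to replace
        } catch (IndexOutOfBoundsException e) {
            System.out.println("set cannot grow the list");
        }
        System.out.println(strings.size()); // prints 2
    }
}
```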

Comparing lists and arrays

Lists are kinda like arrays – in that they’re linear container types that can hold references to zero or more values of the same type.

They’re kinda different, in that instead of special syntactic support (the [] operator), they are plain-old objects where we use their methods to “do things.” And, their semantics differ somewhat: they aren’t of fixed size, and we can add or delete items at arbitrary points, and anything “in the way” gets shifted out of the way.
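A small sketch of that shifting behavior, again with ArrayList (the letters are arbitrary sample data):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ShiftDemo {
    public static void main(String[] args) {
        List<String> letters = new ArrayList<>(Arrays.asList("a", "b", "c"));

        letters.add(1, "x");          // "b" and "c" shift one place to the right
        System.out.println(letters);  // prints [a, x, b, c]

        letters.remove(0);            // everything after index 0 shifts left
        System.out.println(letters);  // prints [x, b, c]
        System.out.println(letters.get(1)); // prints b
    }
}
```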

For example, recall this question that some of you might have seen on the self-assessment:

A sequence is said to be a Fibonaccish if each value is exactly the sum of the prior two values. For example, {1, 2, 3, 5, 8, 13} and {3, 4, 7, 11, 18} are Fibonaccish sequences, but {1, 1, 2, 4, 8} is not.

Write a method boolean isFibonaccish(int[] a) that returns true if and only if the array a represents such a sequence. Assume the array a contains at least three values.

Your answer probably looked something like:

	boolean isFibonaccish(int[] a) {
		for (int i = 2; i < a.length; i++) {
			if (a[i - 2] + a[i - 1]  != a[i]) {
				return false;
			}
		}
		return true;
	}

What if, instead, we did this for a list? That is, a List<Integer> as the argument? What would it look like? Almost the same! But we’ll need to adjust the code to access a List rather than an array.

	boolean isFibonaccish(List<Integer> l) {
		for (int i = 2; i < l.size(); i++) {
			if (l.get(i - 2) + l.get(i - 1)  != l.get(i)) {
				return false;
			}
		}
		return true;
	}

Normally we don’t copy/paste code like this, but instead we just write it using the List methods.

Another example

For example, if you wanted to sum all of the numbers in a List, you’d write almost the same code as you would for an array, including the for-each loop:

	int mystery (List<Integer> list) {
		int s = 0;
		for (int x : list) {
			s += x;
		}
		return s;
	}

Why lists?

Why bother with lists at all? Because the abstraction is nicer, for many algorithms.

That is, because for many real-world problems:

We don’t know the size of the intermediate linear collection in advance. Each time we solve one of these problems, we could take the time to write a solution with arrays, and explicitly track the size, and make more space when we need to, but that’s lots of busywork.
Or, we can compute the size in advance, but doing so is costly.
Or, we need to be able to move elements around easily. Much like the first point, we could explicitly move them around as needed, but it’s a lot nicer just to say theList.add(0, "the new front of the list!") to add an element to the front of the list.
and so on.
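To make the “busywork” in the first point concrete, here is a sketch that collects the first n square numbers twice: once with an array whose size we track and grow by hand, and once with a List. The method names and the doubling strategy are choices made for this example, not the only way to do it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class GrowDemo {
    // Array version: track the count ourselves and make more room when full.
    static int[] collectWithArray(int n) {
        int[] data = new int[2];
        int count = 0;
        for (int i = 0; i < n; i++) {
            if (count == data.length) {
                data = Arrays.copyOf(data, data.length * 2); // allocate a bigger array, copy over
            }
            data[count] = i * i;
            count += 1;
        }
        return Arrays.copyOf(data, count); // trim the unused cells
    }

    // List version: the list handles the bookkeeping for us.
    static List<Integer> collectWithList(int n) {
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            data.add(i * i);
        }
        return data;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(collectWithArray(5))); // prints [0, 1, 4, 9, 16]
        System.out.println(collectWithList(5));                   // prints [0, 1, 4, 9, 16]
    }
}
```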

As you’re seeing in A03, lists, like Strings, have many additional useful methods to use. We’ll be seeing more of these methods (part of the “Collections API” in Java) as the semester progresses.

A `ListInterface`

How does List actually work? Like, what’s going on behind the scenes? We’re going to dive into it a little bit here in 186 today and next week, so that you have an intuition for how a List is implemented. If you choose to go to 187, you’ll do a more thorough treatment of lists and several other, more complicated data structures.

OK, here we go!

As much as possible, it’s a good idea to have the computer help us check our assumptions as we go. One way to do this is to leverage the type system. We’ll declare a StringList for a list of Strings as described above, then we’ll implement it using arrays.

Now, normally, you never do this! You use the built-in List interface and ArrayList implementation. But we’re going to do it here to peel back the curtain a little and show you that there’s no magic, just straightforward application of computer science you already know. You can do this. (And you will do this, in 187!)

public interface StringList {
	public void add(String s);
	public void add(int i, String s);
	public String get(int i);
	public String remove(int i);
	public int size();
}

What happens if a user of this interface does something “bad,” like attempting to remove an element that’s not present, or to add an element past the end of the list? The type doesn’t help us. We can add (optional) unchecked exception declarations to hint at this:

public interface StringList {
	public void add(String s);
	public void add(int i, String s) throws IndexOutOfBoundsException;
	public String get(int i) throws IndexOutOfBoundsException;
	public String remove(int i) throws IndexOutOfBoundsException;
	public int size();
}

Note, though, that these declarations just give a programmer using your StringList a heads-up. The documentation comments for the class and the method define the actual contract that the class (or interface) offers. If there are no documentation comments, then only the type signature gives the contract. Usually it’s not enough.

Example

	public int size() {
		return 12;
	}

Strictly speaking, does this method obey its contract?

	/**
	 * @return the number of elements stored in this list
	 */
	public int size() {
		return numberOfElements;
	}

What about now?

Note that the compiler can’t enforce all parts of this contract; it’s up to the programmer implementing the StringList to do the right thing.
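Here is a minimal sketch of that gap. The names `Sized`, `AlwaysTwelve`, and `ContractDemo` are hypothetical stand-ins, not part of StringList; the point is only that the compiler accepts an implementation that breaks the documented contract, and it takes a runtime check (such as a unit test) to notice.

```java
interface Sized {
    /**
     * @return the number of elements stored in this collection
     */
    int size();
}

// Compiles without complaint: the type signature is satisfied,
// even though the documented contract is not.
class AlwaysTwelve implements Sized {
    public int size() {
        return 12;
    }
}

class ContractDemo {
    // A hand-written check standing in for a unit test:
    // a freshly created, empty collection should report size 0.
    static boolean emptyReportsZero(Sized freshlyCreated) {
        return freshlyCreated.size() == 0;
    }

    public static void main(String[] args) {
        System.out.println(emptyReportsZero(new AlwaysTwelve())); // prints false
    }
}
```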

Also note that a full StringList would have many more methods (set, analogous to array assignment, removeAll, equals, and so on). I don’t want to write all of them in lecture! (Though it should be straightforward to write most of them.)

Writing the `StringArrayList`

Let’s think about what we need in our implementation. What instance variables do we need? Certainly an array of Strings to hold the list elements. Anything else? The number of elements actually stored in the array. Remember, one of the reasons we’re writing a List is that arrays are of fixed size, but a List can grow and shrink arbitrarily. (We’ll see how soon.)

So let’s declare an array String[] array and an int size to hold these values.

public class StringArrayList implements StringList {
	String[] array;
	int size;
}

(On board) Conceptually, strings will be added to, gotten, and removed from this array; it’s the implementation’s job to make sure they go in the right place, and that if a user of StringArrayList tries to access an element of the List (that is, of the underlying array) that is invalid, an exception is raised.

Let’s start with a constructor that starts with a small empty array:

public StringArrayList() {
  array = new String[10];
  size = 0;
}

Now let’s do the simple methods:

public int size() {
  return size;
}


@Override
public String get(int i) throws IndexOutOfBoundsException {
  return array[i];
}

But remember, while the array might be of size 10, there might be fewer than 10 (even no!) strings stored in the array. So a correct definition would instead read:

public String get(int i) throws IndexOutOfBoundsException {
  if (i >= size || i < 0) {
    throw new IndexOutOfBoundsException();
  }

  return array[i];
}

This is important to understand: the List acts like a list of elements with a size equal to the number that have been added (less the number removed). Even though there’s an underlying array of a different size, the user of the List interface cannot see it! The details are said to be encapsulated. This is a very powerful concept that lets you use data structures (and generally any API) by reading their contract – you don’t need to fully understand every detail of the implementation (though it can be helpful to do so!).
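As a quick sanity check of that idea, here is a sketch using a stripped-down stand-in class (`TinyStringList`, a hypothetical name) with the same two instance variables and the same bounds check as above. The underlying array has room for 10 strings, but asking for element 0 of an empty list still raises an exception.

```java
class BoundsDemo {
    // Hypothetical stand-in: same fields and bounds check as in the notes above.
    static class TinyStringList {
        private String[] array = new String[10];
        private int size = 0;

        int size() {
            return size;
        }

        String get(int i) {
            if (i >= size || i < 0) {
                throw new IndexOutOfBoundsException();
            }
            return array[i];
        }
    }

    // Returns true if get(0) on a brand-new list throws, as the contract says it should.
    static boolean getThrowsOnEmptyList() {
        TinyStringList list = new TinyStringList();
        try {
            list.get(0); // array[0] exists (it holds null), but the list has no element 0
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(getThrowsOnEmptyList()); // prints true
    }
}
```

Without the check, `get(0)` would quietly return `null`, leaking a detail of the array that the user of the list should never see.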

Now let’s turn to some of the more complicated methods, like add:

public void add(String s) {
  array[size] = s;
  size++;
}

This sorta works, but what happens once we add the eleventh element? We’ll overflow the array bounds, which we don’t want to do – our list is supposed to be unbounded. Instead, we’ll check to see if the array needs to be expanded, and do so:

public void add(String s) {
  if (size == array.length) {
    enlarge();
  }
  array[size] = s;
  size++;
}

What should enlarge do? It should allocate a new, larger array, copy the current array into it, then set the array instance variable to point to this new array.

void enlarge() {
  String[] larger = new String[array.length * 2];
  for (int i = 0; i < array.length; i++) {
    larger[i] = array[i];
  }
  array = larger;
}
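Putting the pieces so far together, here is a self-contained sketch you can run to watch the list grow past its initial capacity. The wrapper class `GrowingListDemo`, the nested `SketchStringArrayList`, the `filled` helper, and the `capacity()` method are all made up for this demo (a real list would not expose its capacity); the method bodies are the ones from the notes above.

```java
class GrowingListDemo {
    // Hypothetical stand-in assembled from the snippets above.
    static class SketchStringArrayList {
        private String[] array = new String[10];
        private int size = 0;

        public int size() {
            return size;
        }

        public String get(int i) {
            if (i >= size || i < 0) {
                throw new IndexOutOfBoundsException();
            }
            return array[i];
        }

        public void add(String s) {
            if (size == array.length) {
                enlarge();
            }
            array[size] = s;
            size++;
        }

        private void enlarge() {
            String[] larger = new String[array.length * 2];
            for (int i = 0; i < array.length; i++) {
                larger[i] = array[i];
            }
            array = larger;
        }

        // Exposed only so the demo can show the hidden array growing.
        int capacity() {
            return array.length;
        }
    }

    // Builds a list holding "item 0" through "item n-1".
    static SketchStringArrayList filled(int n) {
        SketchStringArrayList list = new SketchStringArrayList();
        for (int i = 0; i < n; i++) {
            list.add("item " + i);
        }
        return list;
    }

    public static void main(String[] args) {
        SketchStringArrayList list = filled(25);
        System.out.println(list.size());     // prints 25
        System.out.println(list.get(24));    // prints item 24
        System.out.println(list.capacity()); // prints 40 (10, then 20, then 40)
    }
}
```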

Why double, and not, say, just + 10? The full answer is beyond the scope of this course, but in short: when you don’t know anything else, doubling is a good default, because it keeps the total amount of copying proportional to the number of elements you add. If you do know other things, you might expose ways to grow (or shrink) the underlying array, but that has its own problems (like: now users of your code are tied to your specific implementation, even if a better one comes along later).
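If you want some evidence without the full analysis, a rough experiment is to count how many element copies enlarge would perform under each policy. The sketch below simulates that bookkeeping only (no actual arrays); `GrowthDemo` and `copies` are hypothetical names, and the starting capacity of 10 matches the constructor above.

```java
class GrowthDemo {
    // Counts the element copies performed by enlarge-style growth while
    // adding n elements one at a time, starting from capacity 10.
    static long copies(int n, boolean doubling) {
        int capacity = 10;
        long copies = 0;
        for (int size = 0; size < n; size++) {
            if (size == capacity) {
                copies += size; // enlarge() copies every element stored so far
                capacity = doubling ? capacity * 2 : capacity + 10;
            }
        }
        return copies;
    }

    public static void main(String[] args) {
        System.out.println(copies(1000, true));  // prints 1270
        System.out.println(copies(1000, false)); // prints 49500
    }
}
```

For 1000 adds, doubling copies about 1.3 elements per add, while growing by 10 copies about 50 per add, and the gap keeps widening as the list gets longer.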

We’ll stop here for now. More next week!
