Lecture 05: Classes, objects, and namespaces; the List API

Welcome

Announcements

ExSEL info: Wednesdays at 7 in DuBois 1302.

Classes and objects

Java is said to be “object-oriented”, which means that the language strongly encourages the use of objects as a paradigm for problem solving.

What’s an object? A collection of related state and behavior. In Java, this means objects are typically variables and associated methods. The “blueprint” for objects is a class. Classes are also the place where we stick stuff that’s not necessarily part of an object. (This falls out of Java’s design: everything is part of a class.)

Classes in the JVM

(Note that like our discussion of the call stack, this is a simplified version to give you a mental model of how things work; the implementation in the JVM differs somewhat.)

A single copy of the class (note: the class! not an object of the class!) lives in memory, annotated with its name, each method, and space for each static variable. There is also some other stuff: a pointer to the class’s superclass, a list of interfaces it implements, and the like.

Objects that are instances of the class will be able to refer to these methods and variables here.

Instantiating objects

Objects are defined by a class. But an object does not exist until you instantiate it, that is, “make a new one” using the class as a template for the new object. For example:

Bus bus44 = new Bus();

There’s a class called Bus, and we’ve instantiated an object of type Bus. The object is named by the variable bus44.

Thinking back to our first class, what’s going on in memory when this happens?

Unlike primitive types, which are stored directly in the variable’s allocated space, the variable “holding” an object isn’t really the object’s value. It’s a pointer or reference to the place where the object really lives.

This is really important to understand, so I’ll say it again: what Java stores in primitive type variables and object variables is different, conceptually: The former stores the value itself, the latter stores as its value the memory address of the object. Weird but true!

Let’s do an example. What happens here?

String message = new String("Hi!");
String wassup = message;

First a new String object is allocated on the heap and initialized. Then a new variable of type String, named message, is created. Its value is set to the address of the actual String object we created.

Note that this String object implicitly refers to a String class somewhere else. When you call methods on an object, it uses this reference to find the method – only one copy of the code for a method exists at a time – all objects of a class share it.

Next, another new variable, again of type String, named wassup is created. The value stored in message (that is, the address of the String object) is looked up, and is then assigned to wassup. Both variables now “point to” or “refer to” the same object, which is a String object containing the data "Hi!".

Having two variables refer to the same object is called “aliasing”; it is a common source of program bugs. Always think carefully about the value stored in a variable, not just the name of the variable.
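
Strings can’t be modified once created, so the example above can’t show how aliasing actually bites. Here is a minimal sketch using an int array instead (arrays are objects in Java, so array variables hold references too). The class and method names are made up for illustration.

```java
public class AliasingDemo {
    // Copies a primitive, changes the copy, and returns the original.
    static int copyPrimitive() {
        int a = 3;
        int b = a;   // b gets its own copy of the value 3
        b = 7;       // changing b has no effect on a
        return a;
    }

    // Copies a reference, changes the array through the copy,
    // and returns what the original variable now sees.
    static int copyReference() {
        int[] first = {3};
        int[] second = first;  // second holds the same address as first
        second[0] = 7;         // there is only one array, so first sees this
        return first[0];
    }

    public static void main(String[] args) {
        System.out.println(copyPrimitive());  // prints 3
        System.out.println(copyReference());  // prints 7
    }
}
```

In the second method nothing ever assigns to first, yet the data it refers to changes. That’s the kind of surprise aliasing causes.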

So this leads to an important thing: == vs .equals(). With primitive types (int and so on), you only have one option: ==. What does it do? It looks up the values stored in the variables, and returns true if and only if they are the same. For example, after int x = 3; int y = 3;, the expression x == y is true.

But what happens when we use == on variables that refer to objects? Exactly the same thing! Which might or might not be what we mean. Following from above, let’s add String hello = new String("Hi!");, and ask, does message == wassup? Yes, because they refer to the same object.

Does message == hello? No. Even though the two String objects represent the same value (“Hi!”), they are separate objects stored at different addresses. So the two variables hold two different addresses.

But oftentimes we don’t actually want to know if two variables point to the same object. Instead, we want to know if the two objects’ values are equivalent. In the case of Strings, this means that they hold the same text; you can imagine that for more complicated objects, we might need a more complicated comparison of the two objects’ instance variables. There is a method .equals() that a class can implement to provide a class-specific test of equality. (If you don’t implement it, the Object class’s equals() method, which defaults to ==, is used.)

So we can write message.equals(hello) to check if the two objects store equivalent Strings, rather than the two variables storing the same address as a value.
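
Putting the three variables from this section together, a small runnable sketch might look like the following. The class name and the sameObject helper are made up for illustration; the variables are the ones from the discussion above.

```java
public class StringEquality {
    // True only when both variables hold the same address.
    static boolean sameObject(Object a, Object b) {
        return a == b;
    }

    public static void main(String[] args) {
        String message = new String("Hi!");
        String wassup = message;            // alias: same object as message
        String hello = new String("Hi!");   // a second object with the same text

        System.out.println(sameObject(message, wassup));  // prints true
        System.out.println(sameObject(message, hello));   // prints false
        System.out.println(message.equals(hello));        // prints true
    }
}
```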

Example

Suppose we define a Bus class. Note we can use Eclipse to write a semantically meaningful equals method for us (for now – later in the semester we’ll see what all this means and how to do it ourselves, then go back to letting Eclipse write the boilerplate).

public class Bus {
	private int number;
	
	public Bus(int n) {
		number = n;
	}
	
	public static void main(String[] args) {
		Bus busA = new Bus(44);
		Bus busB = new Bus(44);
		
		Bus busC = busA;
		
		System.out.println(busA == busB);
		System.out.println(busA.equals(busB));
		
		busC.setNumber(13);
		System.out.println(busA == busC);
		System.out.println(busA.equals(busC));
		
		busC = new Bus(13);
	}

	public int getNumber() {
		return number;
	}

	public void setNumber(int number) {
		this.number = number;
	}

	@Override
	public int hashCode() {
		final int prime = 31;
		int result = 1;
		result = prime * result + number;
		return result;
	}

	@Override
	public boolean equals(Object obj) {
		if (this == obj)
			return true;
		if (obj == null)
			return false;
		if (getClass() != obj.getClass())
			return false;
		Bus other = (Bus) obj;
		if (number != other.number)
			return false;
		return true;
	}
}

What are the four lines of output (true/false)? (On board: F/T/T/T, and why.)

In-class Exercise

private String x = new String("Jane");

public void printStrings() {
  String x = new String("Ren");
  
  System.out.println(x);
  System.out.println(this.x);
}

What are the lines output by printStrings()?

Another exercise:

private String x = new String("Jane");

public void printEquals() {
  String x = new String("Jane");
  
  System.out.println(x == this.x);
  System.out.println(x.equals(this.x));
}

What are the lines output by printEquals()?

Using objects and methods

If the flow of control is executing within an object, and we access a variable, say x, what happens?

First the local scope is checked, inside out, for example:

class Bar {
  int[] x = new int[10];  // instance variable; an array, so x.length and x[i] make sense

  private void foo(int j) {
    for (int i = 0; i < x.length; i++) {
      x[i] = j;
    }
  }
}

x is not declared in the for loop, nor in the method body, nor as a parameter, so the next place that’s checked is the object itself, where it’s found. (If it weren’t found there, the superclasses would be checked next, and so on.) This all happens at compile-time; you can’t fail this lookup in a source file that compiles correctly.

Relatedly, when methods are called on an object, the type of the object is examined, and the method is looked up in the corresponding class. If it’s not found, the class’s superclass is checked, and so on. Again, whether a suitable method exists is checked at compile-time. This is how the default “equals” method works; it’s implemented in Object, which is by default the superclass of all classes that don’t otherwise declare a superclass. In other words, a class that doesn’t define its own equals “inherits” the method from its superclass.
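
To see the inherited equals in action, here is a minimal sketch with a hypothetical Stop class that deliberately does not define equals. Calling equals on it finds nothing in Stop, so the version in Object (which just uses ==) is the one that runs.

```java
public class InheritedEquals {
    // A made-up class with no equals method of its own.
    static class Stop {
        private final String name;

        Stop(String name) {
            this.name = name;
        }
    }

    // Two distinct objects holding the same data.
    static boolean compareTwoStops() {
        Stop a = new Stop("Main Street");
        Stop b = new Stop("Main Street");
        return a.equals(b);  // Object's equals: same as a == b
    }

    // One object compared with itself.
    static boolean compareStopToItself() {
        Stop a = new Stop("Main Street");
        return a.equals(a);
    }

    public static void main(String[] args) {
        System.out.println(compareTwoStops());      // prints false
        System.out.println(compareStopToItself());  // prints true
    }
}
```

Contrast this with the Bus class earlier, where the overridden equals compares the number fields instead.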

We’ll do (slightly) more on inheritance, but best practices over the years have moved toward a relatively flat class hierarchy. Usually you won’t see (or use) deep inheritance, with some exceptions for older codebases and large libraries (like, say, the Java Platform API).

Namespaces

Many programming languages, including Java, incorporate the idea of a “namespace”. A namespace is a way to provide context to a particular name.

For a real-world analogy, you might think of a person’s name, say, “Nicholas”. There might be more than one in this class, so we add some context (a surname, or a student ID, or an address, or all of the above) to disambiguate which we mean.

This is very similar to the idea of a variable’s scope in Java, but slightly different, as it’s how we precisely name and identify classes.

For example, we all write System.out.println() all the time, and we all know that System is probably a class, since it starts with a capital letter. Where does it come from, though?

Packages

It’s part of the java.lang package. Java organizes classes into packages, which are named by a hierarchical sequence of tokens (words) separated by dots. The built-in parts of the Java standard library are all part of the “java.” package, though it’s further subdivided.

For example, the aforementioned java.lang package defines the classes that are fundamental to the design of the language itself: things like System and String are defined here.

http://docs.oracle.com/javase/8/docs/api/java/lang/package-summary.html#package.description

By default, things in this namespace are automatically “imported” into the local namespace. That is, you don’t have to type java.lang.String to declare a String (though you can); String suffices.
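
A quick sketch to convince yourself that the short and long names refer to the same class. The class and method names here are made up; getClass() and getName() are standard.

```java
public class FullyQualified {
    // Declares one variable with the short name and one with the full name,
    // then checks that both objects have the same class.
    static boolean sameClass() {
        String shortName = "Hi!";
        java.lang.String longName = "Hi!";
        return shortName.getClass() == longName.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameClass());             // prints true
        System.out.println(String.class.getName());  // prints java.lang.String
    }
}
```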

Interestingly, it’s not against the rules to define your own System class. But it’s like if you have both an instance variable named x and a local variable named x. By default, Java assumes you want the “more local” one.

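int x = 5;

void test() {
  int x = 3;                   // local variable that shadows the instance variable x
  System.out.println(x);       // prints 3 (the local x)
  System.out.println(this.x);  // prints 5 (the instance variable)
}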

To get the “outer” one, you need to prefix it with this., which tells Java you want the current instance variable with the same name.

Similarly, with a class, you need to fully specify the class if you want access to it. Inside your custom System class, if you want to access the “normal” System.out.println, you’ll need to refer to it by its full name, java.lang.System.out.println.
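
Here’s a rough sketch of that situation. To keep it in one file, it assumes our own System is a nested class rather than a separate top-level one; the names ShadowDemo, describe, and whichSystem are made up for illustration.

```java
public class ShadowDemo {
    // Our own class that happens to be named System.
    static class System {
        static String describe() {
            return "our own System";
        }
    }

    // Inside ShadowDemo, the plain name System means the nested class above.
    static String whichSystem() {
        return System.describe();
    }

    public static void main(String[] args) {
        // The standard class is still reachable by its full name.
        java.lang.System.out.println(whichSystem());  // prints our own System
    }
}
```

Writing System.out.println(...) inside ShadowDemo would not compile, because the nested System has no out field.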

java and javax are reserved by the JVM for built-in and extension classes, but much like anyone can register a domain name on the Internet, anyone can declare a package namespace. There’s a loose convention that you should use your reverse domain name as a prefix (for example, if our department released a package for autograding, we might put it in the edu.umass.cs.autograder package). But many modern Java packages declare a top-level namespace – you see this in most assignments for this class, where we just define a similar top-level namespace. In practice, projects that do similar things don’t usually have the same name, and/or agree to avoid namespace collisions.

Importing packages

Sometimes you want to use something not in the current namespace. Then you need to “import” it.

For example, if I want to print a random number between one and six:


public class Dice {
  public static void main(String[] args) {
    Random r = new Random();
    System.out.println(r.nextInt(6) + 1);
  }
}

it won’t compile, because Random is not in the namespace. But I can use an import statement: import java.util.Random; to add it to the namespace.
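
For reference, a version with the import in place might look like this. The roll helper is an addition for illustration, not part of the snippet above.

```java
import java.util.Random;

public class Dice {
    // Returns a value from 1 to 6 inclusive:
    // nextInt(6) gives 0 through 5, so we add 1.
    static int roll(Random r) {
        return r.nextInt(6) + 1;
    }

    public static void main(String[] args) {
        Random r = new Random();
        System.out.println(roll(r));
    }
}
```

The import has to come before the class declaration (and after the package declaration, if there is one).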

Eclipse will do this for you in one of its “quick fixes” but beware: there’s sometimes more than one class with the same name! If you import the wrong one, it likely won’t have the behavior you expect!

Finding packages

Where does Java look for packages? By default the JVM has access to a set of “built-in” packages, that form the Java Platform API:

https://docs.oracle.com/javase/8/docs/api/index.html?overview-summary.html

Again, mostly in the java and javax namespace, but also some others.

But where do the compiled classes, that is, the virtual machine code for them, actually live? On my machine, a big chunk of them live in /Library/Java/JavaVirtualMachines/jdk1.8.0_102.jdk/Contents/Home/jre/lib in the JARs there.

JARs are essentially ZIP files of compiled Java classes with some extra stuff (a Java-specific manifest, describing their contents, that the JVM knows how to read). Let’s take a look in rt.jar. Hey look! Our friends System and String!

You may have noticed that the file, say String.class (which is the compiled representation of String.java) lives in a directory java/lang/. That looks a lot like java.lang., doesn’t it?

Not a coincidence! The JVM requires that packages map to (that is, directly correspond to) directories with the same name(s), and that classes map to .class files within those directories.
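
The mapping is mechanical enough to write down as code. This is only a sketch of the naming rule, not of how the JVM actually loads classes; classFilePath is a made-up helper.

```java
public class PackagePaths {
    // Turns a fully qualified class name into the relative path
    // where its .class file is expected to live.
    static String classFilePath(Class<?> c) {
        return c.getName().replace('.', '/') + ".class";
    }

    public static void main(String[] args) {
        System.out.println(classFilePath(String.class));  // prints java/lang/String.class
        System.out.println(classFilePath(System.class));  // prints java/lang/System.class
    }
}
```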

But there’s still a piece of the puzzle missing. How does the JVM know where to look for these directories? How did it know, for example, that /Library/Java/JavaVirtualMachines/jdk1.8.0_102.jdk/Contents/Home/jre/lib/rt.jar was a place to search?

CLASSPATH and friends

There are three mechanisms the JVM uses. One is under your control: the “CLASSPATH”. The other two you can’t (easily) change – there is a “bootstrap CLASSPATH” and an “extensions directory”.

The latter two are configured when your JVM is installed, and they contain classes that are part of the JRE, JDK platform, and vendor-distributed extensions.

But the CLASSPATH you do control. If you run from within Eclipse, you can add JARs (and directories) to the CLASSPATH by selecting the appropriate menu items, either “Build Path -> Add to Build Path” for JARs or “Build Path -> Use as Source Folder” for a directory. You can also manage the Build Path (“Configure Build Path”) and view what’s on it, which is how I found the rt.jar I showed you earlier.
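
If you’re curious what ended up on the CLASSPATH for a running program, the JVM exposes it through the standard java.class.path system property. A small sketch (the class and method names are made up) that prints one entry per line:

```java
import java.io.File;

public class ShowClasspath {
    // Splits the CLASSPATH into its individual entries.
    // Entries are separated by ':' on macOS/Linux and ';' on Windows.
    static String[] classpathEntries() {
        String cp = System.getProperty("java.class.path");
        return cp.split(File.pathSeparator);
    }

    public static void main(String[] args) {
        for (String entry : classpathEntries()) {
            System.out.println(entry);
        }
    }
}
```

What it prints depends on how the program was launched, so run it from within Eclipse and from the command line and compare.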

Putting your own classes into packages

If you want to put a class into a package (say you want to move our Dice class into the nerdy.gaming package), you need to explicitly declare the package at the top of the file and move the file into an appropriate directory in order for it to compile and be recognized by the JVM. Eclipse will prompt you to do one if you do the other.

Notice that now the top of the file has a nerdy.gaming package declaration; it lists just the package, not the classname (unlike imports, which list a full class name). Also notice that the package is rooted at a directory that’s in our CLASSPATH; in this case, the src/ directory in the Eclipse project. You can tell it’s in the CLASSPATH due to the little “target” on the folder; it’s on the so-called “build path” which Eclipse uses as one component of the CLASSPATH.

In-class exercise

(first, on board:) Suppose I had the following directory hierarchy:

src/marc/liberatore/Banana.java
support/marc/liberatore/Smoothie.java

and the following code in Smoothie.java:

class Smoothie {
  public static void main(String[] args) {
    System.out.println("Add a " + new Banana());
  }
}

and I intended the two classes to both live in the marc.liberatore package.

Questions:

  1. What directories need to be on the classpath in order for this to compile?
  2. What package should we declare at the top of this file?
  3. Do we need to import anything? If so, what?

Review: arrays

In our review of 121 material so far, we’ve exclusively used a single container type, the array. Container types are types that “hold” other types, and the array is probably the most basic: a fixed-size sequence of values (that are themselves either primitive types or references to objects) that can be read or written in approximately “constant time”. We call these values “cells” or “elements” in the array.

Constant time means that no matter how big the array is, (to a first approximation) it takes the same amount of time to access (read or write) an element. We expect each of these statements to execute in about the same amount of time:

array[0] = 5;
System.out.println(array[1]);

array[1000000] = 12;
System.out.println(array[123456789]);

(modulo some caching effects, which are COMPSCI 230/335 material).

But arrays have some downsides, as well. For example, they’re fixed size: you need to know how many elements you want in advance. You can cheat here by allocating a giant array, but that’s wasteful for small inputs, and potentially won’t work anyway if your data is of size (giant array + 1).
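
To make “fixed size” concrete: an array’s length can’t change after it’s allocated, so the only way to fit one more element is to allocate a bigger array and copy everything over. A minimal sketch, with a made-up helper name, using the standard Arrays.copyOf:

```java
import java.util.Arrays;

public class GrowArray {
    // Returns a NEW array one element longer than full, with value at the end.
    // The original array is left untouched.
    static int[] appendByCopying(int[] full, int value) {
        int[] bigger = Arrays.copyOf(full, full.length + 1);
        bigger[full.length] = value;
        return bigger;
    }

    public static void main(String[] args) {
        int[] scores = {90, 85};
        int[] more = appendByCopying(scores, 77);
        System.out.println(scores.length);  // prints 2
        System.out.println(more.length);    // prints 3
    }
}
```

Doing this by hand every time you add an element is tedious and easy to get wrong.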

Instead, we might use a higher-level abstraction; that is, a more general “container type” than an array. This week we’ll describe the List.

We’ll show how our List compares to the Java API’s List, and this will lead into generics and container types, two topics that will come up again and again in material this semester.

Operations on arrays

To recap, what can you do with an array? You can declare it, allocate it, read or write individual elements, and determine its length at runtime.

Declare an array of Strings called strings.

Allocate a String array of size 50 and assign it to strings.

Set the zeroth element of strings to “Hi”.

Print the zeroth element of strings.

Print the length of strings.

String[] strings;
strings = new String[50];
strings[0] = "Hi";
System.out.println(strings[0]);
System.out.println(strings.length);

That’s it for builtins of the array. If you want to do much else, you gotta build it yourself. (Note that there is a java.util.Arrays class with some helpful static methods that operate on arrays; in particular, Arrays.toString is helpful when caveman debugging arrays of primitive types. Otherwise the debugger can be helpful.)
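
Here’s a small sketch of why Arrays.toString is worth knowing about. The class name and the show helper are made up for illustration.

```java
import java.util.Arrays;

public class PrintArray {
    // Formats the contents of an int array as text.
    static String show(int[] a) {
        return Arrays.toString(a);
    }

    public static void main(String[] args) {
        int[] scores = {90, 85, 77};

        // Printing the array directly shows a type tag and hash code
        // (something like [I@1b6d3586), not the contents.
        System.out.println(scores);

        System.out.println(show(scores));  // prints [90, 85, 77]
    }
}
```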

So let’s talk about the List abstract data type.

Lists

Note I said “abstract data type.” First we’ll talk about the properties and assumptions we might expect from a List, in the abstract. Then we’ll do an actual, concrete implementation of the data type and see how it measures up.

“List” is a very overloaded term; we’ll simplify this by choosing a specific set of assumptions, that implicitly define an abstraction:

Declare a List of Strings called strings.

Allocate a String List and assign it to strings.

Insert the string “Hi” at the front of the list strings.

Append the string “Bye” at the end of the list strings.

Print the zeroth element of strings.

Print the length of strings.

List<String> strings;
strings = new ArrayList<>();
strings.add(0, "Hi");
strings.add("Bye");
System.out.println(strings.get(0));
System.out.println(strings.size());

Some things to notice:

The type of strings is List<String>. It’s a variable of type List; we could also make it of type ArrayList or LinkedList, but what we care about is that it satisfies the List interface (see the javadocs).

Further, it’s a parameterized type. Much like methods can take arguments, so can types! Usually, we see this with container classes (like Lists), where the argument (in <>s) is the type of thing it’s holding. More on this later.

Sorta like arrays, Lists let you store and look up elements by index. They have an add operation that either appends, or inserts at a specified point – we illustrate both here. They also have a set operation, but it can only replace existing items, not insert new ones. And a get, which is much like array lookups.
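
The difference between add and set is easiest to see side by side. Here’s a minimal sketch using java.util.ArrayList; the class and method names are made up, and the comments show the list contents after each call.

```java
import java.util.ArrayList;
import java.util.List;

public class ListOps {
    static List<String> build() {
        List<String> strings = new ArrayList<>();
        strings.add(0, "Hi");     // insert at front: [Hi]
        strings.add("Bye");       // append:          [Hi, Bye]
        strings.add(1, "Hmm");    // insert at 1:     [Hi, Hmm, Bye]
        strings.set(1, "Hello");  // replace at 1:    [Hi, Hello, Bye]
        return strings;
    }

    // set can only replace an element that already exists;
    // on an empty list there is nothing at index 0 to replace.
    static boolean setOnEmptyListFails() {
        List<String> strings = new ArrayList<>();
        try {
            strings.set(0, "Hi");
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(build());                // prints [Hi, Hello, Bye]
        System.out.println(setOnEmptyListFails());  // prints true
    }
}
```

Notice that add changes the size of the list and set never does.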