CMPSCI 383: Artificial Intelligence

Fall 2014 (archived)

Assignment 05 Sample Solution in Java

At multiple students’ requests, I’ve transliterated the Python solution to Assignment 05 into Java. You can download it here: FJDQuery.tar.gz.

This is a fairly direct translation that takes the same approach as the previously posted Python solution. The entire dataset is loaded, then filtered by the condition in the query. Each line of the output is then generated by creating another subset of the data, corresponding to exactly one possible setting of the query variables.
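To make the filter-then-subset idea concrete, here is a minimal sketch. It is not the code in FJDQuery.tar.gz: the class name, the method names, and the choice to represent each row as a map from variable name to value are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; names and data representation are assumptions,
// not taken from the posted solution.
class FilterSketch {

    // Keep only the rows that agree with every variable=value pair given.
    static List<Map<String, String>> filter(List<Map<String, String>> rows,
                                            Map<String, String> assignment) {
        List<Map<String, String>> kept = new ArrayList<>();
        for (Map<String, String> row : rows) {
            boolean matches = true;
            for (Map.Entry<String, String> e : assignment.entrySet()) {
                if (!e.getValue().equals(row.get(e.getKey()))) {
                    matches = false;
                    break;
                }
            }
            if (matches) {
                kept.add(row);
            }
        }
        return kept;
    }

    // Fraction of the rows satisfying the condition that also match one
    // particular setting of the query variables.
    static double conditionalFraction(List<Map<String, String>> rows,
                                      Map<String, String> condition,
                                      Map<String, String> setting) {
        List<Map<String, String>> conditioned = filter(rows, condition);
        if (conditioned.isEmpty()) {
            return Double.NaN; // the condition never occurs in the data
        }
        return (double) filter(conditioned, setting).size() / conditioned.size();
    }
}
```

Calling conditionalFraction once per setting of the query variables would produce one output line per setting, under the assumptions above.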

I suggest you eyeball the Cartesian product code in Query.java. The implementation I wrote is recursive. Recursion is a convenient way to think about things, but can cause problems in some languages (like Java) if too many recursive calls are nested: each call consumes stack space, and Java throws a StackOverflowError when the stack runs out. With prior knowledge of the dataset we were working with, I knew it wouldn’t be a problem here.
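For reference, a recursive Cartesian product can be sketched as follows. This is an illustration written for this page, not a copy of the method in Query.java; the class and method names are hypothetical, and the domains are assumed to be lists of strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a recursive Cartesian product over value domains.
class RecursiveProduct {

    // Returns every tuple that picks one value from each domain, in order.
    static List<List<String>> product(List<List<String>> domains) {
        List<List<String>> out = new ArrayList<>();
        build(domains, 0, new ArrayList<>(), out);
        return out;
    }

    private static void build(List<List<String>> domains, int depth,
                              List<String> partial, List<List<String>> out) {
        if (depth == domains.size()) {
            // Base case: one value chosen per domain, so record a copy.
            out.add(new ArrayList<>(partial));
            return;
        }
        for (String value : domains.get(depth)) {
            partial.add(value);                  // choose
            build(domains, depth + 1, partial, out);
            partial.remove(partial.size() - 1);  // un-choose
        }
    }
}
```

In this sketch the recursion goes one level deep per domain, so the nesting depth equals the number of variables rather than the number of tuples produced.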

You should understand how this method works, and perhaps take the time to translate it to an iterative form if you’re rusty on the equivalence between recursive and iterative algorithms. If you don’t know how to do that, then dust off your 220 notes, or come see me and we can talk about it.
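If you want something to check your own translation against, here is one possible iterative form. It is only a sketch: the class and method names are hypothetical, and it uses an "odometer" of indices, which is one of several ways to remove the recursion.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of an iterative Cartesian product. An array of indices
// plays the role of the recursive call stack.
class IterativeProduct {

    static List<List<String>> product(List<List<String>> domains) {
        List<List<String>> out = new ArrayList<>();
        for (List<String> domain : domains) {
            if (domain.isEmpty()) {
                return out; // an empty domain means there are no tuples
            }
        }
        int[] idx = new int[domains.size()]; // current position in each domain
        while (true) {
            // Emit the tuple the indices currently point at.
            List<String> tuple = new ArrayList<>();
            for (int i = 0; i < idx.length; i++) {
                tuple.add(domains.get(i).get(idx[i]));
            }
            out.add(tuple);

            // Advance like an odometer: bump the last index, carrying left
            // whenever an index wraps around to zero.
            int pos = idx.length - 1;
            while (pos >= 0 && ++idx[pos] == domains.get(pos).size()) {
                idx[pos] = 0;
                pos--;
            }
            if (pos < 0) {
                return out; // every index wrapped, so all tuples are emitted
            }
        }
    }
}
```

The loop uses a fixed amount of stack no matter how many domains there are, which is the practical reason to prefer this form when the nesting depth is not known in advance.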

The solution clocks in at about 150 source lines of code, which is within the typical range of expansion I expect when going from Python to Java (1.5x – 3x). The extra length in Java comes from a few things. For example, Python has less boilerplate and supports very useful syntactic sugar to reduce code length (such as list comprehensions).

Python also has a more comprehensive standard library. The input code in Python is a line or two, whereas Java took about a dozen lines; similarly, generating the Cartesian product of variable settings is a library call in Python (itertools.product) but has to be written by hand in Java. You can add third-party JARs to Java to help cut down on code length (e.g., OpenCSV to read car.data and Guava for many things, including Cartesian product). I chose not to do that so as to keep this solution self-contained, but you might consider it in future assignments. A drawback of relying on third-party libraries is that you have to understand their APIs well enough to be sure you’re using them correctly.
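To give a sense of what the standard-library-only input code involves, here is a minimal sketch of reading comma-separated rows. It is not the reader from the posted solution: the class and method names are hypothetical, and it assumes plain comma-separated lines with no quoting or escaped commas.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of reading simple comma-separated data using only
// the Java standard library. Assumes no quoted fields.
class CsvSketch {

    // Reads every non-blank line and splits it on commas.
    static List<String[]> readRows(Reader source) {
        List<String[]> rows = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                String trimmed = line.trim();
                if (trimmed.isEmpty()) {
                    continue; // skip blank lines
                }
                rows.add(trimmed.split(","));
            }
        } catch (IOException e) {
            // Rethrow unchecked so callers need no throws clause.
            throw new UncheckedIOException(e);
        }
        return rows;
    }
}
```

To read a file, pass in a reader for it, for example one obtained from java.nio.file.Files.newBufferedReader. Data with quoted fields needs a real CSV parser, which is where a library like OpenCSV earns its keep.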