CMPSCI 187: Programming With Data Structures

David Mix Barrington

Fall, 2011

Programming Project #7: Some Games With Words

Posted 20 November 2011, due at 5:00 p.m. EST on Friday 9 December 2011, by placing .java files and .class files in your cs187 directory on your edlab account. (I am not allowed to have a non-final assignment due after classes end.) Please place files for this assignment in a subdirectory of cs187 called "project7".

As we get questions on this assignment we will put answers on the Q&A page.

Some useful code is available in this directory on the course web site. We will make a sample driver available when we get to it.

Note that this is a double-sized programming assignment, counting for 6% of your final grade rather than 3%.

More detail added on 27 November.

Goals of this project:

  1. Build a prefix tree from a large real-world data set, read from a file, to allow fast checks for membership in the set.
  2. Implement a heap-based priority queue, without using the canned data structures.

You may borrow code from any of L&C's classes, or from our solutions, with specific attribution in a comment.

In this project we are going to play some games with words, using two data structures that you will implement directly -- a prefix tree and a heap-based priority queue.

If I sold koalas, I might want to have the telephone number 1-800-KOALAxx, where xx were any two digits, because this would be easier for my customers to remember. Telephone numbers use digits, but each digit (except 0 and 1) can be used to represent any of three or four letters, so that a telephone number may serve as a phoneword.
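
The digit-to-letter correspondence can be captured in a small lookup table. Here is a minimal sketch, assuming the standard telephone keypad layout (2=abc through 9=wxyz). The class name Keypad and its methods are hypothetical and are not part of the provided code.

```java
// Hypothetical helper, not part of the provided project code.
// Assumes the standard keypad: 2=abc, 3=def, 4=ghi, 5=jkl, 6=mno, 7=pqrs, 8=tuv, 9=wxyz.
class Keypad {
    // Index i holds the letters for digit i; 0 and 1 carry no letters.
    private static final String[] LETTERS = {
        "", "", "abc", "def", "ghi", "jkl", "mno", "pqrs", "tuv", "wxyz"
    };

    // Returns the letters for a digit character, or "" if it has none.
    static String lettersFor(char digit) {
        if (digit < '0' || digit > '9') return "";
        return LETTERS[digit - '0'];
    }

    // Returns the digit that a lower-case letter is dialed with, or '?' if none.
    static char digitFor(char letter) {
        for (int d = 2; d <= 9; d++) {
            if (LETTERS[d].indexOf(letter) >= 0) return (char) ('0' + d);
        }
        return '?';
    }

    // Converts a lower-case word to the digit string that spells it.
    static String toDigits(String word) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < word.length(); i++) sb.append(digitFor(word.charAt(i)));
        return sb.toString();
    }
}
```

With this table, Keypad.toDigits("koala") gives "56252", so 1-800-KOALAxx would be dialed as 1-800-56252xx.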

You're going to build a utility that will input a five-digit number and output all the possible phonewords for it that are actual five-letter words in English. How do we decide whether a word is legitimate? The legendary computer scientist Donald Knuth has lovingly assembled a list, and made it available on the web here. (I incorrectly said in lecture that "jihad" was not on the list -- it is.)

We don't want to make a linear search through this list of 5757 words every time we need to test membership, so you will put the words in a prefix tree, or trie. I will put code for the classes PTNode and PrefixTree in the solutions directory for this project. You will need to read the words from the file and add each one to the prefix tree, along with nodes for the prefixes of each word. (If the word "koala" is added to the tree, it will then have nodes for "", "k", "ko", "koa", and "koal" as well as "koala" itself.)
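
To make the idea of "a node for every prefix" concrete, here is a minimal sketch of a prefix tree. It is not the PTNode or PrefixTree code from the solutions directory, which may be organized quite differently. The class SketchTrie and its method names are hypothetical, and the sketch assumes every word uses only the lower-case letters 'a' through 'z'.

```java
// Hypothetical sketch; the course's PTNode and PrefixTree classes may differ.
// Assumes every word consists only of lower-case letters 'a' through 'z'.
class SketchTrie {
    private static class Node {
        Node[] children = new Node[26]; // one slot per letter
        boolean isWord;                 // true if the path to this node spells a word
    }

    private final Node root = new Node(); // represents the empty prefix ""

    // Adds a word, creating a node for each of its prefixes as needed.
    void add(String word) {
        Node cur = root;
        for (int i = 0; i < word.length(); i++) {
            int c = word.charAt(i) - 'a';
            if (cur.children[c] == null) cur.children[c] = new Node();
            cur = cur.children[c];
        }
        cur.isWord = true;
    }

    // Follows the letters of s from the root; returns null if the path breaks.
    private Node find(String s) {
        Node cur = root;
        for (int i = 0; i < s.length() && cur != null; i++) {
            int c = s.charAt(i) - 'a';
            if (c < 0 || c >= 26) return null;
            cur = cur.children[c];
        }
        return cur;
    }

    // True if some path from the root spells s (the empty string always qualifies).
    boolean containsPrefix(String s) { return find(s) != null; }

    // True only if s itself was added as a word.
    boolean containsWord(String s) {
        Node n = find(s);
        return n != null && n.isWord;
    }
}
```

A membership test follows at most five links for a five-letter word, no matter how many words are in the tree.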

Then you will need to write a method that takes in the five digits, as a String, and outputs the list of possible phonewords. (One optional wrinkle -- allow the input string to have stars in it, so that on input "22*69" you would give all possible phonewords for the numbers "22269", "22369", "22469", "22569", "22669", "22769", "22869", and "22969". There will be extra credit for handling this, so that a perfect program that handles it will be an A+ instead of an A.) One final complication -- you must give the output phonewords in decreasing order of Scrabble score.
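
One way to organize the search is to build candidate words one letter at a time and give up on a candidate as soon as no word starts that way. The sketch below illustrates this under some assumptions: a HashSet of prefixes stands in for the prefix tree so that the example is self-contained, the class name PhonewordSearch is hypothetical, and the dictionary is passed in directly rather than read from Knuth's file. It returns words in the order found, not in score order.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch. A HashSet of prefixes stands in for the prefix tree
// so that the block is self-contained; the project itself calls for a trie.
class PhonewordSearch {
    private static final String[] LETTERS = {
        "", "", "abc", "def", "ghi", "jkl", "mno", "pqrs", "tuv", "wxyz"
    };
    private final Set<String> words = new HashSet<>();
    private final Set<String> prefixes = new HashSet<>();

    PhonewordSearch(String... dictionary) {
        for (String w : dictionary) {
            words.add(w);
            for (int i = 0; i <= w.length(); i++) prefixes.add(w.substring(0, i));
        }
    }

    // Returns every dictionary word that the digit string can spell.
    // A '*' matches any of the digits 2 through 9, that is, any letter.
    List<String> find(String digits) {
        List<String> found = new ArrayList<>();
        extend(digits, "", found);
        return found;
    }

    private void extend(String digits, String soFar, List<String> found) {
        if (!prefixes.contains(soFar)) return; // prune: no word starts this way
        if (soFar.length() == digits.length()) {
            if (words.contains(soFar)) found.add(soFar);
            return;
        }
        char d = digits.charAt(soFar.length());
        String choices;
        if (d == '*') choices = "abcdefghijklmnopqrstuvwxyz";
        else if (d >= '0' && d <= '9') choices = LETTERS[d - '0'];
        else choices = ""; // invalid character: no words possible
        for (int i = 0; i < choices.length(); i++) {
            extend(digits, soFar + choices.charAt(i), found);
        }
    }
}
```

For example, with a dictionary containing "balmy" and "canny", the input "22*69" finds both, since they spell "22569" and "22669" respectively.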

To accomplish this last requirement, we'd like you to use a priority queue as follows. When you discover a phoneword for your input number, make an entry for the priority queue with the word and its Scrabble score, and add it to the queue. The key for the queue is the score, ordered so that the "smallest" entry is the one with the highest score. Once all the words are in the queue you may read them out and give them to the user in the proper order.
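
The reversed ordering is easy to get backwards, so here is a minimal sketch of it, using java.util.PriorityQueue as a stand-in for the queue. The names ScoredWord and ScoreOrderDemo are hypothetical, and the scores are supplied by the caller rather than computed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical entry type: a word paired with its score.
class ScoredWord implements Comparable<ScoredWord> {
    final String word;
    final int score;

    ScoredWord(String word, int score) {
        this.word = word;
        this.score = score;
    }

    // Reversed comparison: a higher score counts as "smaller",
    // so a min-queue hands back the highest-scoring word first.
    @Override
    public int compareTo(ScoredWord other) {
        return Integer.compare(other.score, this.score);
    }
}

class ScoreOrderDemo {
    // Puts all entries in a queue, then drains it, highest score first.
    static List<String> inScoreOrder(List<ScoredWord> entries) {
        PriorityQueue<ScoredWord> pq = new PriorityQueue<>();
        pq.addAll(entries);
        List<String> out = new ArrayList<>();
        while (!pq.isEmpty()) out.add(pq.poll().word);
        return out;
    }
}
```

Given entries for "canny" (10), "balmy" (12), and "koala" (9), the words come out as balmy, canny, koala. The sketch makes no promise about the order of words with equal scores.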

You will get partial credit for doing this with a PriorityQueue object from java.util, defining a class of PQNode objects with an appropriate compareTo method. But for full credit we would like you to define a HeapPQ class that implements the PriorityQueueADT interface given on Practice Midterm #3 (and in the solutions directory). L&C's code in Chapter 11 should be useful here.
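
For orientation, here is a rough sketch of a binary min-heap stored in an array-like list. It does not implement PriorityQueueADT, since that interface is not reproduced here; the class name SketchHeap and the method names add, removeMin, isEmpty, and size are assumptions and may not match the interface. It also uses an ArrayList for brevity, where a plain array may be closer to what is intended by "without using the canned data structures".

```java
import java.util.ArrayList;

// Hypothetical array-backed binary min-heap. The method names here are
// assumptions; the course's PriorityQueueADT interface may declare others.
class SketchHeap<T extends Comparable<T>> {
    // The children of the item at index i live at indices 2i+1 and 2i+2.
    private final ArrayList<T> items = new ArrayList<>();

    boolean isEmpty() { return items.isEmpty(); }
    int size() { return items.size(); }

    // Appends at the bottom, then sifts up while smaller than the parent.
    void add(T x) {
        items.add(x);
        int i = items.size() - 1;
        while (i > 0) {
            int parent = (i - 1) / 2;
            if (items.get(i).compareTo(items.get(parent)) >= 0) break;
            swap(i, parent);
            i = parent;
        }
    }

    // Removes and returns the smallest element, or null if the heap is empty.
    T removeMin() {
        if (items.isEmpty()) return null;
        T min = items.get(0);
        T last = items.remove(items.size() - 1);
        if (!items.isEmpty()) {
            // Move the last item to the root, then sift it down.
            items.set(0, last);
            int i = 0;
            while (true) {
                int left = 2 * i + 1, right = left + 1, smallest = i;
                if (left < items.size()
                        && items.get(left).compareTo(items.get(smallest)) < 0) smallest = left;
                if (right < items.size()
                        && items.get(right).compareTo(items.get(smallest)) < 0) smallest = right;
                if (smallest == i) break;
                swap(i, smallest);
                i = smallest;
            }
        }
        return min;
    }

    private void swap(int a, int b) {
        T tmp = items.get(a);
        items.set(a, items.get(b));
        items.set(b, tmp);
    }
}
```

Both add and removeMin touch only one root-to-leaf path, so each takes time proportional to the height of the heap, which is logarithmic in the number of entries.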

Added 27 November: I corrected two small errors in the code I put up, and added a stub for the PhonewordLister class. A PhonewordLister object will contain a prefix tree for Knuth's word list -- you should construct this tree in the constructor for this class. I wrote a static utility method to compute the Scrabble score of a word. (It works on upper-case letters, but since Knuth's list is all lower-case you may as well assume that all letters are lower-case.) Finally, there is a stub for the list method, which is to return an array of the phonewords that can be made from its input string, in descending order of Scrabble score. It should return an array of length 0 if the input is invalid or if no phonewords can be made.
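
For reference, a Scrabble score is just the sum of the tile values of the letters. The sketch below is not the utility method from the provided code, which may differ in detail. It assumes the standard English Scrabble tile values, and the class name ScrabbleSketch is hypothetical.

```java
// Hypothetical sketch; the provided utility method may differ in detail.
// Uses the standard English Scrabble tile values.
class ScrabbleSketch {
    // Values for 'a' through 'z', in alphabetical order.
    private static final int[] VALUES = {
        1, 3, 3, 2, 1, 4, 2, 4, 1, 8, 5, 1, 3,   // a-m
        1, 1, 3, 10, 1, 1, 1, 1, 4, 4, 8, 4, 10  // n-z
    };

    // Sums the tile values; characters outside a-z (after lower-casing) score 0.
    static int score(String word) {
        int total = 0;
        for (int i = 0; i < word.length(); i++) {
            char c = Character.toLowerCase(word.charAt(i));
            if (c >= 'a' && c <= 'z') total += VALUES[c - 'a'];
        }
        return total;
    }
}
```

Under these values "koala" scores 5+1+1+1+1 = 9 and "jihad" scores 8+1+4+1+2 = 16.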

The priority queue mentioned above should be a local variable of the list method.

Last modified 30 November 2011