The Assignment 07 submissions have been graded. I have posted grades on Moodle.
The test cases referenced in the autograder’s output are available for download here: assignment07-test-cases.tar.gz
A Java sample solution is available here: NaiveBayesClassifier.tar.gz. The useAlternativeMethod flag indicates which way the classifier will handle ? values (see below).
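As a rough illustration of one common convention for unknown values (this is an assumption for illustration only; the sample solution's two actual methods may differ, and likelihoodSkippingUnknowns is a hypothetical helper, not part of the solution): skip any attribute whose value is ? when computing the per-class likelihood.

```java
public class UnknownValueDemo {
    // One common convention (an illustrative assumption, not necessarily the
    // assignment's method): attributes with value "?" are skipped entirely,
    // so only the known attributes' likelihoods are multiplied in.
    static double likelihoodSkippingUnknowns(String[] values, double[] knownLik) {
        double p = 1.0;
        for (int i = 0; i < values.length; i++) {
            if (!values[i].equals("?")) {
                p *= knownLik[i];  // P(value_i | class), illustrative numbers
            }
        }
        return p;
    }

    public static void main(String[] args) {
        String[] record = {"y", "?", "n"};          // one vote record with an unknown
        double[] lik = {0.9, 0.5, 0.8};             // per-attribute likelihoods (made up)
        // The middle attribute is "?" and contributes nothing to the product.
        System.out.println(likelihoodSkippingUnknowns(record, lik));
    }
}
```

An alternative convention treats ? as just another attribute value with its own counts; which of the two the useAlternativeMethod flag selects is documented in the sample solution itself.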
Some notes on the tests and grading:
- Correctness was defined as outputting both the correct class and the correct probability of that class. With a few minor exceptions, detailed here and accounted for in grading, there was no ambiguity in the correct output. Outputting only the right class label was not sufficient for credit, nor was a probability that wasn't close (±0.01) to the correct one.
- There were two sets of tests. The small tests were on synthetic data and intended to evoke specific types of errors. For one of them, I used a student's test that was posted to Moodle; many people used it to validate their submissions, so I thought you should get credit for it. The large tests were on subsets of the full vote data.
- No test explicitly pushed at the boundaries of underflow. (Do not take this to mean that the tests in Assignment 10 won't!) The log transform can lose a small bit of precision, but not enough to matter for the tests we ran (within ±0.01 of the correct value). It's not actually necessary to use it unless underflow occurs; see estimateProbability() in the solution for an example.
- We accounted for floating-point roundoff errors around 0.5: for example, democrat,0.5000000000000004 was acceptable when republican,0.5 was the correct answer.
- As noted in the assignment question-and-answer section, we accepted two distinct ways of handling unknown values. If your program's output didn't match the way that I specified, the autograder re-checked it against the way that Patrick specified. These are the .alt (alternative) solutions.
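To see why the log transform matters when underflow does occur, here is a minimal sketch. The class names, priors, and likelihood values are made up, and logScore/normalizedFromLogs are hypothetical helpers, not the sample solution's actual estimateProbability():

```java
import java.util.Arrays;

public class LogProbDemo {
    // Sum of log-probabilities in place of the product of probabilities.
    static double logScore(double prior, double[] likelihoods) {
        double s = Math.log(prior);
        for (double p : likelihoods) {
            s += Math.log(p);
        }
        return s;
    }

    // Normalize two log scores into P(first class) without underflow by
    // subtracting the max before exponentiating (the log-sum-exp trick).
    static double normalizedFromLogs(double logA, double logB) {
        double m = Math.max(logA, logB);
        double ea = Math.exp(logA - m);
        double eb = Math.exp(logB - m);
        return ea / (ea + eb);
    }

    public static void main(String[] args) {
        // Toy per-attribute likelihoods for two classes; values are illustrative.
        double[] demLik = new double[500];
        double[] repLik = new double[500];
        Arrays.fill(demLik, 0.02);
        Arrays.fill(repLik, 0.01);

        // The naive product underflows to 0.0 for both classes...
        double demProduct = 0.5, repProduct = 0.5;
        for (int i = 0; i < demLik.length; i++) {
            demProduct *= demLik[i];
            repProduct *= repLik[i];
        }
        System.out.println("products: " + demProduct + ", " + repProduct); // 0.0, 0.0

        // ...but the log scores still order the classes correctly, and a
        // normalized probability can be recovered from them.
        double demLog = logScore(0.5, demLik);
        double repLog = logScore(0.5, repLik);
        System.out.println("democrat wins: " + (demLog > repLog));
        System.out.println("P(democrat) = " + normalizedFromLogs(demLog, repLog));
    }
}
```

When the scores are far enough apart (as here), the losing class's contribution vanishes and the normalized probability rounds to 1.0, which is why the tests stayed well within the ±0.01 tolerance despite the transform's small precision loss.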
As usual, if you think something is amiss, please email or come see me or Patrick.