Kamal Nigam wrote code for test/train splits, for some of the word-probability smoothing, and for splitting nodes in hierarchical clustering, and contributed many other fixes and improvements.
Sean Slattery wrote heap-based code for iterating through all the documents in an inverted index and implemented k-nearest-neighbor classification.
Jason Rennie added the hdb interface and contributed many other fixes.
Many people have provided bug reports and minor fixes. They are listed in the ChangeLog provided with the source.