Complete results for all hyperparameter settings on the 20 Newsgroups dataset
The experiments are described in Rooshenas and Lowd, "Discriminative Structure Learning of Arithmetic Circuits," AISTATS 2016.

Split Penalty  Standard Deviation  L1 Penalty  Train LL     Validation LL  Test LL      Node#   Edge#   Feature#  Learning Time
10             0.5                 2           -142.118093  -125.063769    -158.950035  142004  303473  7370      75291.263987
10             0.5                 1           -142.057639  -125.072530    -158.915246  121782  255646  7584      75326.145684
10             0.5                 0.1         -142.035238  -125.072644    -158.918726  122978  259521  7602      75602.978599
10             0.1                 2           -142.085052  -125.069620    -158.820284  197999  435216  7308      75442.958926
2              0.1                 1           -142.159419  -124.962992    -158.675351  93427   181974  9420      75219.044966
2              0.5                 1           -142.330620  -125.112547    -159.030712  100069  199147  9048      75249.213380
2              0.1                 2           -142.024692  -124.856558    -158.602876  92615   180971  9358      75299.384753
2              0.5                 0.1         -142.450823  -125.244458    -159.116658  100473  200090  8886      75283.983094
5              0.5                 1           -142.144101  -125.029958    -158.940937  89578   177573  8526      75253.093790
2              0.1                 0.1         -142.223944  -125.010829    -158.805136  95907   187937  9340      75318.745809
2              0.5                 2           -142.212875  -125.045312    -158.845595  97193   191953  9092      75335.587249
5              0.1                 0.1         -142.013264  -124.949660    -158.698540  111646  229130  8430      75333.623547
5              0.1                 2           -142.116139  -125.034212    -158.743644  109806  224676  8120      75334.374433
5              0.1                 1           -142.160713  -125.085464    -158.744773  109044  223662  8206      75346.833539
10             0.1                 0.1         -142.310054  -125.338517    -159.011130  105776  219476  7298      75442.831945
5              0.5                 0.1         -142.096618  -125.009971    -158.832504  91757   181516  8684      75485.660434
10             0.1                 1           -141.967263  -124.990242    -158.781947  202726  445953  7474      75470.860685
5              0.5                 2           -142.115104  -125.022996    -158.765749  106005  217209  8244      75521.000062
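
Because the page no longer offers interactive sorting, the rows above can be ranked offline. The following is a minimal Python sketch, assuming each data row holds ten whitespace-separated numeric fields in the order of the header. The column identifiers (split_penalty, valid_ll, and so on) and the helper names are hypothetical labels introduced here, not names from the paper. Choosing the setting with the highest validation log-likelihood is a common model-selection convention and is only assumed here; it is not necessarily the procedure the authors used. SAMPLE contains three rows copied from the table; paste the full table to rank every setting.

```python
from typing import Dict, List

# Hypothetical identifiers for the ten fields of each row, in header order.
COLUMNS = [
    "split_penalty", "std_dev", "l1_penalty",
    "train_ll", "valid_ll", "test_ll",
    "nodes", "edges", "features", "learning_time",
]


def parse_rows(text: str) -> List[Dict[str, float]]:
    """Parse whitespace-separated result rows into dictionaries.

    Lines that do not contain exactly ten numeric fields (for example
    the header line) are skipped.
    """
    rows = []
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) != len(COLUMNS):
            continue
        try:
            values = [float(field) for field in fields]
        except ValueError:
            continue
        rows.append(dict(zip(COLUMNS, values)))
    return rows


def best_by_validation(rows: List[Dict[str, float]]) -> Dict[str, float]:
    """Return the row with the highest (least negative) validation LL."""
    return max(rows, key=lambda row: row["valid_ll"])


# Three rows copied verbatim from the table above.
SAMPLE = """\
10 0.5 2 -142.118093 -125.063769 -158.950035 142004 303473 7370 75291.263987
2 0.1 2 -142.024692 -124.856558 -158.602876 92615 180971 9358 75299.384753
5 0.1 0.1 -142.013264 -124.949660 -158.698540 111646 229130 8430 75333.623547
"""

rows = parse_rows(SAMPLE)
best = best_by_validation(rows)
print(best["split_penalty"], best["std_dev"], best["l1_penalty"], best["test_ll"])
```

Sorting by another column only requires changing the key, for example sorted(rows, key=lambda row: row["learning_time"]).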