News (08/12/2013): RankLib is now a part of The Lemur Project, which develops search engines, browser toolbars, text analysis tools, and data resources that support research and development of information retrieval and text mining software, including the Indri and Galago search engines and the ClueWeb09 dataset. I am still managing RankLib and will continue to do so, and I now have a proper channel for bug reports, feature requests, and community contributions. RankLib's license is still BSD, like most (if not all) of the software in The Lemur Project. Please visit the new home of RankLib for more details.

Overview

RankLib is a library of learning-to-rank algorithms. Eight popular algorithms are currently implemented:

- MART (Multiple Additive Regression Trees, a.k.a. gradient boosted regression trees) [6]
- RankNet [1]
- RankBoost [2]
- AdaRank [3]
- Coordinate Ascent [4]
- LambdaMART [5]
- ListNet [7]
- Random Forests [8]

With appropriate parameters, the Random Forests implementation can also bag several MART/LambdaMART rankers. RankLib also implements many retrieval metrics and provides many ways to carry out evaluation.

License

RankLib is available under the BSD license.

Mailing list

If you find RankLib useful and want to receive email notices when a new version or bug fix comes out, please subscribe to this Google Groups mailing list. The list is used strictly for announcements, so you won't see any discussion or spam emails.

Older versions (source code & binaries)

RankLib-v2.1 [Download] (July 2012)
- Added ListNet.
- Added Random Forests. With a little manual work, it can do BagBoo/bagging LambdaMART too.
- Changed the default values of some parameters.

RankLib-v2.0 [Download] (May 2012)
- Added MART.
- Added LambdaMART.
- Changed the NDCG calculation to the standard version: (2^{rel_i} - 1) / log_2(i + 1). As a result, absolute NDCG scores might be slightly lower than before.
- Added z-score normalization.
- Fixed a divide-by-zero bug in sum normalization: for a query q with documents D = {d_1, d_2, ..., d_n} and features F = {f_1, f_2, ..., f_m}, each value is normalized as f_k(d_i) = f_k(d_i) / sum_{d_j in D} |f_k(d_j)|.
- Added the ability to split the training file into x% train and (100-x)% validation (previous versions only allowed a train/test split, not train/validation).
- Added some minor command-line parameters.
- Cleaned up internal code for a slight improvement in efficiency/speed.
- Changed some command-line parameter strings.

RankLib-v1.2.1 [Download] (April 2012)
- Fixed a bug where RankBoost did not properly handle features with negative values.

RankLib-v1.2 [Download removed]
- Fixed an error with sparse train/test/validation files (with v1.1, when features whose value is 0 were not specified, the system crashed in some cases).
- Sped up RankNet with batch learning.
- Changed the default number of epochs to 50 for RankNet.

RankLib-v1.1 [Download] (November 2010)

References

[1] C.J.C. Burges, T. Shaked, E. Renshaw, A. Lazier, M. Deeds, N. Hamilton, and G. Hullender. Learning to rank using gradient descent. In Proc. of ICML, pages 89-96, 2005.
[2] Y. Freund, R. Iyer, R. Schapire, and Y. Singer. An efficient boosting algorithm for combining preferences. The Journal of Machine Learning Research, 4: 933-969, 2003.
[3] J. Xu and H. Li. AdaRank: a boosting algorithm for information retrieval. In Proc. of SIGIR, pages 391-398, 2007.
[4] D. Metzler and W.B. Croft. Linear feature-based models for information retrieval. Information Retrieval, 10(3): 257-274, 2007.
[5] Q. Wu, C.J.C. Burges, K. Svore, and J. Gao. Adapting boosting for information retrieval measures. Journal of Information Retrieval, 2007.
[6] J.H. Friedman. Greedy function approximation: a gradient boosting machine. Technical Report, IMS Reitz Lecture, Stanford, 1999; see also Annals of Statistics, 2001.
[7] Z. Cao, T. Qin, T.Y. Liu, M.F. Tsai, and H. Li. Learning to rank: from pairwise approach to listwise approach. In Proc. of ICML, 2007.
[8] L. Breiman. Random Forests. Machine Learning, 45(1): 5-32, 2001.
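The v2.0 changelog above references two formulas: the standard NDCG gain (2^{rel_i} - 1) / log_2(i + 1) and the sum normalization f_k(d_i) / sum_j |f_k(d_j)|. The following is a minimal Python sketch of those two computations, assuming graded relevance labels per ranked document; the function names are illustrative and are not RankLib's actual (Java) API:

```python
import math

def dcg(rels):
    """DCG with the standard gain: sum over rank i (starting at 1) of
    (2^rel_i - 1) / log2(i + 1)."""
    return sum((2 ** rel - 1) / math.log2(i + 1)
               for i, rel in enumerate(rels, start=1))

def ndcg(rels):
    """NDCG: DCG of the ranked list divided by the DCG of the ideal
    (descending-relevance) ordering of the same labels."""
    ideal = dcg(sorted(rels, reverse=True))
    return dcg(rels) / ideal if ideal > 0 else 0.0

def sum_normalize(feature_values):
    """Sum normalization of one feature column within one query:
    f_k(d_i) / sum_j |f_k(d_j)|, with a guard for the all-zero column
    (the divide-by-zero case fixed in v2.0)."""
    denom = sum(abs(v) for v in feature_values)
    if denom == 0:
        return list(feature_values)  # all zeros: leave the column unchanged
    return [v / denom for v in feature_values]
```

For example, a ranking already in ideal order, such as ndcg([3, 2, 0]), scores 1.0, while any worse ordering of the same labels scores below 1.0; this is why the v2.0 switch to the standard gain can lower absolute scores without changing the relative ordering of rankers.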