Ranking Problems In Machine Learning Theory And Applications
In the last few decades, there has been considerable progress in the understanding of binary classification (learning of binary-valued functions) and regression (learning of real-valued functions), both classical problems in machine learning. Although several questions remain to be answered, there is a well-developed theory in place for these problems, and practical successes have been demonstrated in a variety of applications.
Recently, a new class of learning problems, namely ranking problems, have begun to gain attention. In ranking, one learns a real-valued function that assigns scores to objects, but the scores themselves do not matter; instead, what is important is the relative ranking of objects induced by those scores. Ranking problems arise in a variety of domains: in information retrieval, one wants to rank documents according to relevance to some topic or query; in user-preference modeling, one wants to rank items according to a user's likes and dislikes; in computational biology, one wants to rank genes according to relevance to some disease. Ranking problems are mathematically distinct from both classification and regression, and cannot be analyzed using existing results for these problems.
In this talk, I will describe some recent results in both the theoretical understanding of ranking and its applications. In particular, I will describe generalization bounds for ranking algorithms based on the tools of uniform convergence and algorithmic stability, and some preliminary results on the sample complexity of learning ranking functions. I will conclude with some recent applications to ranking chemical structures for drug discovery.