Machine Learning and Friends Lunch

Learning and Logic: Theoretical Foundations and Efficient Systems

Abstract

An interesting class of problems in machine learning arises when
data points have internal structure. Such problems occur, for example,
when analysing biochemical data where each data point represents a
molecule and where one would like to be able to predict whether unseen
molecules are `active' (e.g. a useful drug or not). In this case the
atom-bond relations and other structural properties of the molecule
are clearly important. A natural representation for such problems uses
first-order Horn logic, or Prolog programs, to describe both the data
and the predictors.

The talk will review recent work on charaterizing the complexity of
such machine learning problems, developing algorithms for tractable
cases, and developing effective systems based on these. Theoretical
analysis is done in a model where, in addition to looking at the data
given to it, the learning algorithm can ask questions, e.g. show a new
molecule and ask whether it is active. The analysis uses properties
of the underlying logic that lead to new algorithms as well as
non-trivial upper and lower bounds for the complexity of
learning. Implementing systems for logic learning also requires
efficient solutions for a matching problem known as subsumption. Our
system LogAnH (that can work with or without the additional learner's
questions) embdeds the learning algorithm together with fast
subsumption tests. The system has been successfully applied to several
challenging datasets.

Back to ML Lunch home