Interactive Machine Learning: Algorithms and Theory

CMPSCI 691E, Fall 2016

Akshay Krishnamurthy


Optional Reading

Active Learning
  • Parametric
    • Minimax bounds for active learning
    • Active and passive learning of linear separators under log-concave distributions
    • Better algorithms for selective sampling
  • Nonparametric
    • Hierarchical Sampling for Active Learning
  • Disagreement-based
    • CAL -- Improving Generalization with Active Learning + Daniel's note
    • Agnostic Active Learning -- BBL
    • Importance-weighted active learning
    • DHM -- A general agnostic active learning algorithm
  • Overview
    • Two faces of active learning

Bandit

  • Classical
    • The non-stochastic multi-armed bandit problem
    • Regret analysis of stochastic and nonstochastic multi-armed bandit problems
  • Contextual
    • The epoch-greedy algorithm for contextual multi-armed bandits
    • Taming the monster: a fast and simple algorithm for contextual bandits
  • Parametric
    • Improved algorithms for linear stochastic bandits
  • Thompson Sampling
    • Learning to optimize via Posterior Sampling
  • GP-Bandits
    • Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental Design
  • Markovian
    • Mahajan and Teneketzis -- Multi-armed Bandit Problems

Reinforcement Learning

  • Tabular
    • Near-optimal reinforcement learning in polynomial time
    • R-max: a general polynomial time algorithm for near-optimal reinforcement learning
    • PAC model-free reinforcement learning
    • Near-optimal regret bounds for reinforcement learning
    • Generalization and Exploration via Randomized Value Functions
  • Policy Gradient
    • Policy gradient methods for reinforcement learning with function approximation
  • Contextual-MDPs for PAC-reinforcement learning with rich observations

Unsupervised

  • Adaptive Sensing
    • Distilled Sensing
  • Clustering
    • Learning the crowd kernel
    • Efficient Active algorithms for hierarchical clustering
    • Clustering with interactive feedback
  • Network Tomography
    • Network tomography: recent developments