Ian Gemp

Ian Gemp

External Links

Inspired by Sameer Singh; Powered by Bootstrap

Publications

Publications (grouped by year): 2018, 2017, 2016, 2015, 2014, 2011, 2010.
2018
  • I. Gemp, S. Mahadevan. Global Convergence to the Equilibrium of GANs using Variational Inequalities. arXiv preprint arXiv:1808.01531. 2018
    In optimization, the negative gradient of a function denotes the direction of steepest descent. Furthermore, traveling in any direction orthogonal to the gradient maintains the value of the function. In this work, we show that these orthogonal directions that are ignored by gradient descent can be critical in equilibrium problems. Equilibrium problems have drawn heightened attention in machine learning due to the emergence of the Generative Adversarial Network (GAN). We use the framework of Variational Inequalities to analyze popular training algorithms for a fundamental GAN variant: the Wasserstein Linear-Quadratic GAN. We show that the steepest descent direction causes divergence from the equilibrium, and guaranteed convergence to the equilibrium is achieved through following a particular orthogonal direction. We call this successful technique Crossing-the-Curl, named for its mathematical derivation as well as its intuition: identify the game's axis of rotation and move "across" space in the direction towards smaller "curling".
              @article{gemp2018crosscurl,
              author={Gemp, Ian and Mahadevan, Sridhar},
              title = {Global Convergence to the Equilibrium of GANs using Variational Inequalities},
              year = {2018}
              journal = {arXiv preprint arXiv:1808.01531},
              }
  • B. Liu, I. Gemp, M. Ghavamzadeh, J. Liu,S. Mahadevan, M. Petrik. Proximal Gradient Temporal Difference Learning: Stable Reinforcement Learning with Polynomial Sample Complexity. Journal of Artificial Intelligence Research. 2018
    In this paper, we introduce proximal gradient temporal difference learning, which provides a principled way of designing and analyzing true stochastic gradient temporal difference learning algorithms. We show how gradient TD (GTD) reinforcement learning methods can be formally derived, not by starting from their original objective functions, as previously attempted, but rather from a primal-dual saddle-point objective function. We also conduct a saddle-point error analysis to obtain finite-sample bounds on their performance. Previous analyses of this class of algorithms use stochastic approximation techniques to prove asymptotic convergence, and do not provide any finite-sample analysis. We also propose an accelerated algorithm, called GTD2-MP, that uses proximal ``mirror maps'' to yield an improved convergence rate. The results of our theoretical analysis imply that the GTD family of algorithms are comparable and may indeed be preferred over existing least squares TD methods for off-policy learning, due to their linear complexity. We provide experimental results showing the improved performance of our accelerated gradient TD methods.
              @article{liu2018proximal,
              title={Proximal Gradient Temporal Difference Learning: Stable Reinforcement Learning with Polynomial Sample Complexity},
              author={Liu, Bo and Gemp, Ian and Ghavamzadeh, Mohammad and Liu, Ji and Mahadevan, Sridhar and Petrik, Marek},
              journal={Journal of Artificial Intelligence Research},
              volume={63},
              pages={461--494},
              year={2018}
              }
2017
  • I. Gemp, M. Parente, S. Mahadevan. Inverting VAEs for Improved Generative Accuracy. NIPS Workshop: Advances in Approximate Bayesian Inference. 2017
    Recent advances in semi-supervised learning with deep generative models have shown promise in generalizing from small labeled datasets ($\mathbf{x}_l,\mathbf{y}_l$) to large unlabeled ones ($\mathbf{x}_u$). When the codomain ($\mathbf{y}$) has known structure, a large un\emph{featured} dataset ($\mathbf{y}_u$) is potentially available. We develop a parameter-efficient, deep semi-supervised generative model for the purpose of exploiting this untapped data source. Empirical results show improved performance in disentangling variable semantics as well as improved discriminative prediction on a new MNIST task.
              @article{gemp2017nips,
              author={Gemp, Ian and Parente, Mario and Mahadevan, Sridhar},
              title = {Inverting VAEs for Improved Generative Accuracy},
              year = {2017}
              journal = {NIPS Workshop: Advances in Approximate Bayesian Inference},
              }
  • I. Gemp, S. Mahadevan. Online Monotone Games. arXiv preprint arXiv:1710.07328. 2017
    Algorithmic game theory (AGT) focuses on the design and analysis of algorithms for interacting agents, with interactions rigorously formalized within the framework of games. Results from AGT find applications in domains such as online bidding auctions for web advertisements and network routing protocols. Monotone games are games where agent strategies naturally converge to an equilibrium state. Previous results in AGT have been obtained for convex, socially-convex, or smooth games, but not monotone games. Our primary theoretical contributions are defining the monotone game setting and its extension to the online setting, a new notion of regret for this setting, and accompanying algorithms that achieve sub-linear regret. We demonstrate the utility of online monotone game theory on a variety of problem domains including variational inequalities, reinforcement learning, and generative adversarial networks.
              @article{gemp2017omg,
              author={Gemp, Ian and Mahadevan, Sridhar},
              title = {Online Monotone Games},
              year = {2017}
              journal = {arXiv preprint arXiv:1710.07328},
              }
  • I. Gemp, M. Parente, I. Durugkar, D. Dyar, S. Mahadevan Inverting Variational Autoencoders for Improved Generative Accuracy. arXiv preprint arXiv:1608.05983. 2017
    Recent advances in semi-supervised learning with deep generative models have shown promise in generalizing from small labeled datasets (x,y) to large unlabeled ones (x). In the case where the codomain has known structure, a large unfeatured dataset (y) is potentially available. We develop a parameter-efficient, deep semi-supervised generative model for the purpose of exploiting this untapped data source. Empirical results show improved performance in disentangling latent variable semantics as well as improved discriminative prediction on Martian spectroscopic and handwritten digit domains.
              @article{gemp2017untapped,
              author={Gemp, Ian and Parente, Mario and Durugkar, Ishan and Dyar, Darby and Mahadevan, Sridhar},
              title = {Inverting Variational Autoencoders for Improved Generative Accuracy},
              year = {2017}
              journal = {arXiv preprint arXiv:1608.05983},
              }
  • I. Durugkar, I. Gemp, S. Mahadevan Generative Multi-Adversarial Networks. ICLR. 2017
    Generative adversarial networks (GANs) are a framework for producing a generative model by way of a two-player minimax game. In this paper, we propose the Generative Multi-Adversarial Network (GMAN), a framework that extends GANs to multiple discriminators. In previous work, the successful training of GANs requires modifying the minimax objective to accelerate training early on. In contrast, GMAN can be reliably trained with the original, untampered objective. We explore a number of design perspectives with the discriminator role ranging from formidable adversary to forgiving teacher. Image generation tasks comparing the proposed framework to standard GANs demonstrate GMAN produces higher quality samples in a fraction of the iterations when measured by a pairwise GAM-type metric.
              @article{durugkargemp2017gman,
              author={Durugkar, Ishan and Gemp, Ian and Mahadevan, Sridhar},
              title = {Generative Multi-Adversarial Networks},
              year = {2017}
              journal = {ICLR},
              }
  • I. Gemp, G. Theocharous, M. Ghavamzadeh. Automated Data Cleansing through Meta-Learning. IAAI Challenge Paper. 2017
    Data preprocessing or cleansing is one of the biggest hurdles in industry for developing successful machine learning applications. The process of data cleansing includes data imputation, feature normalization & selection, dimensionality reduction, and data balancing applications. Currently such preprocessing is manual. One approach for automating this process is meta-learning. In this paper we experiment with state of the art meta-learning methodologies and identify the inadequacies and research challenges for solving such a problem.
              @article{gemp2017datacleanse,
              author={Gemp, Ian and Theocharous, Georgios and Ghavamzadeh, Mohammad},
              title = {Automated Data Cleansing through Meta-Learning},
              year = {2017}
              journal = {IAAI},
              }
2016
  • I. Gemp, S. Mahadevan. Online Monotone Optimization. arXiv preprint arXiv:1608.07888. 2016
    This paper presents a new framework for analyzing and designing no-regret algorithms for dynamic (possibly adversarial) systems. The proposed framework generalizes the popular online convex optimization framework and extends it to its natural limit allowing it to capture a notion of regret that is intuitive for more general problems such as those encountered in game theory and variational inequalities. The framework hinges on a special choice of a system-wide loss function we have developed. Using this framework, we prove that a simple update scheme provides a no-regret algorithm for monotone systems. While previous results in game theory prove individual agents can enjoy unilateral no-regret guarantees, our result proves monotonicity sufficient for guaranteeing no-regret when considering the adjustments of multiple agent strategies in parallel. Furthermore, to our knowledge, this is the first framework to provide a suitable notion of regret for variational inequalities. Most importantly, our proposed framework ensures monotonicity a sufficient condition for employing multiple online learners safely in parallel.
              @article{gemp2016online,
              author={Gemp, Ian and Mahadevan, Sridhar},
              title = {Online Monotone Optimization},
              year = {2016}
              journal = {arXiv preprint arXiv:1608.07888},
              }
2015
  • I. Gemp. Exploring the Dynamics of Variational Inequality Games with Non-Concave Utilities. NIPS Workshop: Learning, Inference, and Control of Multi-Agent Systems. 2015
    Variational inequality (VI) theory has proven useful in modeling and analyzing a variety economic markets. However, in order to ensure the analysis is tractable, models are usually constrained to an unrealistic regime of concave utilities and monotone operators undermining the reliability of real-world conclusions such as the uniqueness and location of equilibria. We argue that machine learning can help address this issue. In this paper, we ignore typical monotonicity requirements and construct a generic, yet more realistic market model possessing several desirable qualities. We then borrow a tool from dynamical systems to cope with our model’s lack of theoretical guarantees. Additionally, in order to handle the large size of standard VI game formulations, we further enhance the tool to accommodate more sophisticated numerical algorithms and propose a heuristic for efficient use of generated trajectories. We illustrate these enhancements by applying the resulting pipeline in the context of cloud services.
              @article{gemp2015nonconcave,
              author={Gemp, Ian},
              title = {Exploring the Dynamics of Variational Inequality Games with Non-Concave Utilities},
              year = {2015}
              booktitle={NIPS Workshop: Learning, Inference, and Control of Multi-Agent Systems},
              }
  • I. Gemp, S. Mahadevan. Finding Equilibria in Large Games using Variational Inequalities. AAAI Spring Symposium. 2015
    In this paper, we explore an approach to computational game theory based on variational inequalities (VIs). VIs represent a comprehensive framework that provides a way to model and analyze both cooperative and non-cooperative games. Given the potentially large size of real-world games, suitable algorithms must be designed that can scale gracefully with the dimension of the problems (e.g., number of players). In this paper, we explore the effectiveness of novel Runge-Kutta methods on finding equilibrium solutions to two real-world games defined by oligopolistic economies.
              @article{gemp2015games,
              author={Gemp, Ian and Mahadevan, Sridhar},
              title = {Finding Equilibria in Large Games using Variational Inequalities},
              year = {2015}
              booktitle={AAAI Spring Symposium},
              }
  • I. Gemp, S. Mahadevan, B. Liu. Solving Large Sustainable Supply Chain Networks using Variational Inequalities. AAAI Workshop: Computational Sustainability. 2015
    In this paper, we explore a new approach to computational sustainability based on variational inequalities (VIs). Our challenge is to compute the steady state behaviors of networks of sustainable supply chains with possibly conflicting objectives. VIs provide a way to model large networks with numerous conflicting goals. Given the size of real-world networks, suitable algorithms must be selected that can scale with the dimension of the problems. In this paper, we explore the effectiveness of novel Runge-Kutta methods on finding equilibrium solutions to two real-world sustainable supply chain problems.
              @article{gemp2015supplychain,
              author={Gemp, Ian and Mahadevan, Sridhar and Liu, Bo},
              title = {Solving Large Sustainable Supply Chain Networks using Variational Inequalities},
              year = {2015}
              booktitle={AAAI Workshop: Computational Sustainability},
              }
2014
  • I. Gemp, S. Mahadevan. Modeling Context in Cognition using Variational Inequalities. AAAI Fall Symposium. 2014
    Important aspects of human cognition, like creativity and play, involve dealing with multiple divergent views of objects, goals, and plans. We argue in this paper that the current model of optimization that drives much of modern machine learning research is far too restrictive a paradigm to mathematically model the richness of human cognition. Instead, we propose a much more flexible and powerful framework of equilibration, which not only generalizes optimization, but also captures a rich variety of other problems, from game theory, complementarity problems, network equilibrium problems in economics, and equation solving. Our thesis is that creative activity involves dealing not with a single objective function, which optimization requires, but rather balancing multiple divergent and possibly contradictory goals. Such modes of cognition are better modeled using the framework of variational inequalities (VIs). We provide a brief review of this paradigm for readers unfamiliar with the underlying mathematics, and sketch out how VIs can account for creativity and play in human and animal cognition.
              @article{gemp2014context,
              author={Gemp, Ian and Mahadevan, Sridhar},
              title = {Modeling Context in Cognition using Variational Inequalities},
              year = {2014}
              booktitle={AAAI Fall Symposium},
              }
  • S. Mahadevan, B. Liu, P. Thomas, W. Dabney, S. Giguere, N. Jacek, I. Gemp, J. Liu. Proximal Reinforcement Learning: A New Theory of Sequential Decision Making in Primal-Dual Spaces. arXiv preprint arXiv:1405.6757. 2014
    In this paper, we set forth a new vision of reinforcement learning developed by us over the past few years, one that yields mathematically rigorous solutions to longstanding important questions that have remained unresolved: (i) how to design reliable, convergent, and robust reinforcement learning algorithms (ii) how to guarantee that reinforcement learning satisfies pre-specified "safety" guarantees, and remains in a stable region of the parameter space (iii) how to design "off-policy" temporal difference learning algorithms in a reliable and stable manner, and finally (iv) how to integrate the study of reinforcement learning into the rich theory of stochastic optimization. In this paper, we provide detailed answers to all these questions using the powerful framework of proximal operators. The key idea that emerges is the use of primal dual spaces connected through the use of a Legendre transform. This allows temporal difference updates to occur in dual spaces, allowing a variety of important technical advantages. The Legendre transform elegantly generalizes past algorithms for solving reinforcement learning problems, such as natural gradient methods, which we show relate closely to the previously unconnected framework of mirror descent methods. Equally importantly, proximal operator theory enables the systematic development of operator splitting methods that show how to safely and reliably decompose complex products of gradients that occur in recent variants of gradient-based temporal difference learning. This key technical innovation makes it possible to finally design "true" stochastic gradient methods for reinforcement learning. Finally, Legendre transforms enable a variety of other benefits, including modeling sparsity and domain geometry. Our work builds extensively on recent work on the convergence of saddle-point algorithms, and on the theory of monotone operators.
              @article{mahadevan2014proximal,
              author={Mahadevan, Sridhar and Liu, Bo and Thomas, Philip and Dabney, Will and Giguere, Steve and Jacek, Nicholas and Gemp, Ian and Liu, Ji},
              title = {Proximal Reinforcement Learning: A New Theory of Sequential Decision Making in Primal-Dual Spaces},
              year = {2014},
              journal={arXiv preprint arXiv:1405.6757},
              }
2011
  • I. Gemp, R. Carthew, S. Hilgenfeldt. Cadherin-dependent cell morphology in an epithelium: constructing a quantitative dynamical model. PLoS Computational Biology. 2011
    Cells in the Drosophila retina have well-defined morphologies that are attained during tissue morphogenesis. We present a computer simulation of the epithelial tissue in which the global interfacial energy between cells is minimized. Experimental data for both normal cells and mutant cells either lacking or misexpressing the adhesion protein N-cadherin can be explained by a simple model incorporating salient features of morphogenesis that include the timing of N-cadherin expression in cells and its temporal relationship to the remodeling of cell-cell contacts. The simulations reproduce the geometries of wild-type and mutant cells, distinguish features of cadherin dynamics, and emphasize the importance of adhesion protein biogenesis and its timing with respect to cell remodeling. The simulations also indicate that N-cadherin protein is recycled from inactive interfaces to active interfaces, thereby modulating adhesion strengths between cells.
              @article{gemp2011cadherin,
              title={Cadherin-dependent cell morphology in an epithelium: constructing a quantitative dynamical model},
              author={Gemp, Ian M and Carthew, Richard W and Hilgenfeldt, Sascha},
              journal={PLoS computational biology},
              volume={7},
              number={7},
              pages={e1002115},
              year={2011},
              publisher={Public Library of Science}
              }
2010
  • G. Duncan, K. Rabl, I. Gemp, R. Heidelberger, W.B. Thoreson. Quantitative analysis of synaptic release at the photoreceptor synapse. Biophysical Journal. 2010
    Exocytosis from the rod photoreceptor is stimulated by submicromolar Ca2+ and exhibits an unusually shallow dependence on presynaptic Ca2+. To provide a quantitative description of the photoreceptor Ca2+ sensor for exocytosis, we tested a family of conventional and allosteric computational models describing the final Ca2+-binding steps leading to exocytosis. Simulations were fit to two measures of release, evoked by flash-photolysis of caged Ca2+: exocytotic capacitance changes from individual rods and postsynaptic currents of second-order neurons. The best simulations supported the occupancy of only two Ca2+ binding sites on the rod Ca2+ sensor rather than the typical four or five. For most models, the on-rates for Ca2+ binding and maximal fusion rate were comparable to those of other neurons. However, the off-rates for Ca2+ unbinding were unexpectedly slow. In addition to contributing to the high-affinity of the photoreceptor Ca2+ sensor, slow Ca2+ unbinding may support the fusion of vesicles located at a distance from Ca2+ channels. In addition, partial sensor occupancy due to slow unbinding may contribute to the linearization of the first synapse in vision.
              @article{duncan2010quantitative,
              title={Quantitative analysis of synaptic release at the photoreceptor synapse},
              author={Duncan, Gabriel and Rabl, Katalin and Gemp, Ian and Heidelberger, Ruth and Thoreson, Wallace B},
              journal={Biophysical journal},
              volume={98},
              number={10},
              pages={2102--2110},
              year={2010},
              publisher={Elsevier}
              }