SELECTED PAPERS

I am no longer updating this page. See my list of publications on DBLP.

2023

Yichuan Deng, Zhihang Li, Sridhar Mahadevan, and Zhao Song, Zero-th Order Algorithm for Softmax Attention Optimization, Arxiv, July 2023.

Yichuan Deng, Sridhar Mahadevan, and Zhao Song, Randomized and Deterministic Attention Sparsification Algorithms for Over-parameterized Feature Dimension, Arxiv, April 2023.

Yeqi Gao, Sridhar Mahadevan, and Zhao Song, An Over-parameterized Exponential Regression, Arxiv, March 2023.

Sridhar Mahadevan, Universal Causality, Entropy, vol. 25, Number 4, March 2023.

Shiv Shankar, Ritwik Sinha, Saayan Mitra, Vishwanathan Swaminathan, Sridhar Mahadevan, and Moumita Sinha, Privacy Aware Experiments without Cookies, Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining (WSDM '23), February 2023.

Kai Wang, Zhao Song, Georgios Theocharous, and Sridhar Mahadevan, Smoothed Online Combinatorial Optimization using Imperfect Predictions, Proceedings of the AAAI, February 2023.

2022

Md. Mehrab Tanjim, Ritwik Sinha, Krishna Kumar Singh, Sridhar Mahadevan, David Arbour, Moumita Sinha, Garrison W. Cottrell, Generating and Controlling Diversity in Image Search, Winter Conference on Applications of Computer Vision (WACV), 2022.

Kai Wang, Zhao Song, Georgios Theocharous, Sridhar Mahadevan, Smoothed Online Combinatorial Optimization using Imperfect Predictions, Arxiv, 2022.

2021

Sridhar Mahadevan, Universal Decision Models, Arxiv, 2021.

Sridhar Mahadevan, Asymptotic Causal Inference, Arxiv, 2021.

Sridhar Mahadevan, Causal Homotopy, Arxiv, 2021.

Sridhar Mahadevan, Causal Inference in Network Economics, Arxiv, 2021.

Sridhar Mahadevan, Anup Rao, Jennifer Healey, Georgios Theocharous, Multi-scale Manifold Warping, Arxiv, 2021.

2020

Yash Chandak, Georgios Theocharous, Shiv Shankar, Martha White, Sridhar Mahadevan, Philip Thomas, Optimizing for the Future in Non-Stationary MDPs, Proceedings of the ICML, 2020.

2019

Georgios Theocharous, Jennifer Healey, Sridhar Mahadevan, Michele Saad, Personalizing with Human Cognitive Biases, 27th Conference on User Modeling, Adaptation and Personalization, 2019.

2018

Ian Gemp and Sridhar Mahadevan, Global Convergence to the Equilibrium of GANs using Variational Inequalities, Arxiv, 2018.

Bo Liu, Ian Gemp, Mohammad Ghavamzadeh, Ji Liu, Sridhar Mahadevan, and Marek Petrik, Proximal gradient temporal difference learning algorithms, Journal of AI Research (JAIR), November, pp. 461-494, 2018.

Sridhar Mahadevan, Imagination Machines: A New Challenge for Artificial Intelligence, AAAI Conference, 2018.

Sridhar Mahadevan, Bamdev Mishra, and Shalini Ghosh, A Unified Framework for Domain Adaptation using Metric Learning on Manifolds, European Conference on Machine Learning (ECML), September 2018, Dublin, Ireland.

2017

Ian Gemp, Ishan Durugkar, and Sridhar Mahadevan, Generative Multi-Adversarial Networks, International Conference on Learning Representations (ICLR), 2017.

Ian Gemp, Mario Parente and Sridhar Mahadevan, Inverting Variational Autoencoders for Improved Generative Accuracy, NIPS Workshop: Advances in Approximate Bayesian Inference, 2017.

Ian Gemp and Sridhar Mahadevan, Online Monotone Games, Arxiv 1710.07328, 2017.

Thomas Boucher, Darby Dyar, and Sridhar Mahadevan, Proximal Methods in Calibration Transfer, Journal of Chemometrics, vol. 31, Issue 4, April 2017

Steve Giguere, Thomas Boucher, C. J. Carey, Sridhar Mahadevan, and Darby Dyar, A Fully Customized Baseline Removal Framework for Spectroscopic Applications, Applied Spectroscopy, vol. 71, Issue 7, March 2017

2016

Bo Liu, Ji Liu, Mohammad Ghavamzadeh, Sridhar Mahadevan, and Marek Petrik, Proximal Gradient Temporal Difference Learning Algorithms, Proceedings of the International Joint Conference on AI (IJCAI), 2016.

M.D. Dyar, C. Fassett, S. Giguere, K. Lepore, S. Byrne, T. Boucher, C. Carey, and S. Mahadevan, Comparison of univariate and multivariate models for prediction of major and minor elements from laser-induced breakdown spectra with and without masking, Applied Spectroscopy, vol. 71, Issue 7, March 2017

2015

Bo Liu, Ji Liu, Mohammad Ghavamzadeh, Sridhar Mahadevan, and Marek Petrik, Finite-sample analysis of proximal gradient temporal difference learning, Proceedings of the Conference on Uncertainty in AI (UAI), July 12-16th, Amsterdam, Holland, 2015. Received Facebook Best (Student) Paper Award.

Lidan Wang, Minwei Feng, Bowen Zhou, Bing Xiang, and Sridhar Mahadevan, Efficient Hyper-parameter optimization for NLP applications, Empirical Methods in Natural Language Processing (EMNLP), Portugal, September 2015.

Sridhar Mahadevan and Sarath Chandar, Reasoning about Linguistic Regularities in Word Embeddings using Matrix Manifolds, Arxiv, July 2015.

Thomas Boucher, Marie Ozanne, Marco Carmosino, Darby Dyar, Sridhar Mahadevan, Elly Breves, Katherine Lepore, and Samuel Clegg, A study of machine learning regression methods for major elemental analysis of rocks using laser-induced breakdown spectroscopy, Spectrochimica Acta Part B, vol. 107, pp. 1-10, 2015.

Ian Gemp and Sridhar Mahadevan, Finding Equilibria in Large Games using Variational Inequalities, AAAI Spring Symposium on Applied Computational Game Theory, March 23-25th, Stanford, CA, 2015.

Thomas Boucher, CJ Carey, Sridhar Mahadevan, and Darby Dyar, Aligning Mixed Manifolds, Proceedings of the AAAI Conference, January 25-30, Austin, Texas, 2015.

Ian Gemp and Sridhar Mahadevan, Solving Large Scale Sustainable Supply Chain Networks using Variational Inequalities, AAAI Workshop on Computational Sustainability, January 26, Austin, Texas, 2015.

Giguere, S., Carey, C., Boucher, T., Mahadevan, S. and Dyar, M.D., An Optimization Perspective on Baseline Removal for Spectroscopy, Proceedings of the 5th IJCAI Workshop on Artificial Intelligence in Space, 2015.

Boucher, T., Carey, C., Giguere, S., Mahadevan, S., Dyar, M.D., Clegg, S. and Wiens, R., Manifold Learning for Regression of Mars Spectra, Proceedings of the 5th IJCAI Workshop on Artificial Intelligence in Space, 2015.

Carey, C., Boucher, T., Giguere, S., Mahadevan, S. and Dyar, M.D., Automatic Whole-Spectrum Matching, Proceedings of the 5th IJCAI Workshop on Artificial Intelligence in Space, 2015.

2014

Sridhar Mahadevan, Bo Liu, Philip Thomas, Will Dabney, Steve Giguere, Nicholas Jacek, Ian Gemp and Ji Liu, Proximal Reinforcement Learning: A New Theory of Sequential Decision Making in Primal-Dual Spaces, Arxiv, May 26, 2014 (126 pages)

C.J. Carey and Sridhar Mahadevan, "Manifold spanning graphs", Proceedings of the AAAI Conference on Artificial Intelligence, July 27-31, 2014, Quebec City, Canada.

Boucher, T., Dyar, M.D., Carmosino, M., Mahadevan, S., Clegg, S., and Wiens, R. (2013) Manifold regression of LIBS data from geological samples for application to ChemCam on Mars. Sci-X 2013, Milwaukee, Abstract #24.

Boucher, T., Dyar, M.D., Carey, C., and Mahadevan, S. (2014) Using manifold embeddings to preprocess LIBS spectra to improve regression model performance. Sci-X 2014, Reno, NV, in press.

Boucher, T., Dyar, M.D., Carey, C., Mahadevan, S., Mezzacappa, A., and Melikechi, N. (2014) Recognizing the contribution of dust to ChemCam spectra of rocks and minerals on Mars. Sci-X 2014, Reno, NV, in press.

Dyar, M.D., Breves, E.A., Boucher, T.F., and Mahadevan, S. (2014) Successes and challenges of laser-induced breakdown spectroscopy (LIBS) applied to chemical analyses of geological samples. Microscopy and Microanalysis 2014, Hartford, CT, in press.

2013

Philip Thomas, Will Dabney, Sridhar Mahadevan and Stephen Giguere, "Projected Natural Actor-Critic", Proceedings of the Conference on Neural Information Processing Systems (NIPS), December 5-10, 2013, Lake Tahoe, CA.

Sridhar Mahadevan, Stephen Giguere, and Nicholas Jacek, "Basis Adaptation for Sparse Nonlinear Reinforcement Learning", Proceedings of the AAAI Conference, July 14-18, 2013, Bellevue, Washington.

Chang Wang and Sridhar Mahadevan, "Multiscale Manifold Learning", Proceedings of the AAAI Conference, July 14-18, 2013, Bellevue, Washington.

Chang Wang and Sridhar Mahadevan, "Manifold Alignment Preserving Global Geometry", Proceedings of the IJCAI Conference, August 3-9, 2013, Beijing, China.

2012

Bo Liu, Sridhar Mahadevan, and Ji Liu, "Regularized Off-Policy TD-Learning", Proceedings of the Conference on Neural Information Processing Systems (NIPS), December 1-3, 2012, Lake Tahoe, CA.

Sridhar Mahadevan and Bo Liu, "Sparse Q-learning with Mirror Descent", Proceedings of the Conference on Uncertainty in AI (UAI), August 15-17, 2012, Catalina Island, CA.

Hoa Vu, CJ Carey, and Sridhar Mahadevan, "Manifold Warping: Manifold Alignment over Time", Proceedings of the 26th Conference on Artificial Intelligence (AAAI), July 22-26, 2012, Toronto, Canada.

Chang Wang and Sridhar Mahadevan, "Manifold Alignment Preserving Global Geometry", Technical Report, UMass Computer Science Department UM-CS-2012-031, 2012.

Chang Wang, Bo Liu, Hoa Vu, and Sridhar Mahadevan, "Sparse Manifold Alignment", Technical Report, UMass Computer Science UM-2012-030, 2012.

2011

Chang Wang and Sridhar Mahadevan, "Heterogeneous Domain Adaptation using Manifold Alignment", Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), July 18-23, 2011, Barcelona, Spain.

Chang Wang and Sridhar Mahadevan, "Jointly Learning Data-Dependent Label and Locality-Preserving Projections", Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), July 18-23, 2011, Barcelona, Spain.

Blake Foster, Sridhar Mahadevan, and Rui Wang, "GPU-Based Approximate SVD Algorithm", 9th International Conference on Parallel Programming and Mathematics, Torun, Poland, September 11-14, 2011 (also available as Technical Report UM-CS-2011-025, Univ. of Massachusetts, Amherst).

Chang Wang, Peter Krafft, and Sridhar Mahadevan, "Manifold Alignment", appearing in Manifold Learning: Theory and Applications, Taylor and Francis CRC Press.

Bo Liu and Sridhar Mahadevan, "Compressive Reinforcement Learning with Oblique Random Projections", Univ. of Massachusetts Technical Report UM-CS-2011-024.

2010

Sridhar Mahadevan and Bo Liu, "Basis Construction from Power Series Expansions of Value Functions", Proceedings of the 24th International Conference on Neural Information Processing Systems (NIPS), December 6-8, 2010.

Chang Wang and Sridhar Mahadevan, "Multiscale Manifold Alignment", Univ. of Massachusetts TR UM-CS-2010-049, 2010.

Chang Wang and Sridhar Mahadevan, "Learning Locality Preserving Discriminative Features", Univ. of Massachusetts TR UM-CS-2010-048, 2010.

Sarah Osentoski and Sridhar Mahadevan, Basis Function Construction in Hierarchical Reinforcement Learning, 9th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), Toronto, May 10-14, 2010.

Sridhar Mahadevan, Representation Discovery in Sequential Decision Making, 24th Conference on Artificial Intelligence (AAAI), Atlanta, July 11-15, 2010.

Georgios Theocharous and Sridhar Mahadevan, Compressing POMDPs using Locality Preserving Non-Negative Matrix Factorization, 24th Conference on Artificial Intelligence (AAAI), Atlanta, July 11-15, 2010.

2009

Chang Wang and Sridhar Mahadevan, A General Framework for Manifold Alignment, AAAI Fall Symposium on Manifold Learning, Washington, D.C., 2009.

Jeff Johns and Sridhar Mahadevan, Sparse Approximate Policy Evaluation using Graph-based Basis Functions, U.Mass Technical Report, UM-CS-2009-041, 2009.

Sridhar Mahadevan, "Learning Representation and Control in Markov Decision Processes: New Frontiers", Foundations and Trends in Machine Learning (editor, Michael Jordan), vol. 1, No. 4, pp. 403-565 (163 pages), 2009. (A printed and bound book version of this article is available at a 50% discount from Now Publishers. This can be obtained by entering the promotional code MAL001004 on the order form at Now Publishers.)

Jeff Johns, Marek Petrik and Sridhar Mahadevan, Hybrid Least-Squares Algorithms for Approximate Policy Evaluation, Machine Learning journal, vol. 76, Nos. 2-3, September, 2009. (1 of only 7 papers selected to appear in the ML journal from those to be presented at the European Conference on Machine Learning (ECML), Bled, Slovenia, 2009.)

Chang Wang and Sridhar Mahadevan, "Manifold Alignment without Correspondence", 21st International Joint Conference on Artificial Intelligence (IJCAI), pp. 1273-1278, 2009.

Chang Wang and Sridhar Mahadevan, "Multiscale Analysis of Document Corpora based upon Diffusion Models", 21st International Joint Conference on Artificial Intelligence (IJCAI), 2009.

Chang Wang and Sridhar Mahadevan, "Multiscale Dimensionality Reduction with Diffusion Wavelets", Univ. of Massachusetts TR UM-CS-2009-030, 2009.

Sarah Osentoski and Sridhar Mahadevan, "Basis Function Construction for Hierarchical Reinforcement Learning", ICML Workshop on Abstraction in Reinforcement Learning, 2009.

2008

Sridhar Mahadevan, "Representation Discovery using Harmonic Analysis", Synthesis Lectures on Artificial Intelligence and Machine Learning (edited by Ron Brachman and Tom Dietterich), Morgan Claypool Publishers, 2008.

Jeff Johns, Marek Petrik and Sridhar Mahadevan, "Hybrid Least-Squares Algorithms for Approximate Policy Evaluation", Univ. of Massachusetts, Amherst, Technical Report UMASS-CS-2008-044.

Chang Wang and Sridhar Mahadevan, "Multiscale Analysis of Document Corpora using Diffusion Models", University of Massachusetts, Technical Report 16, 2008.

Sridhar Mahadevan, "Fast Spectral Learning using Lanczos Eigenspace Projections", National Conference on Artificial Intelligence (AAAI), 2008, Chicago.

Chang Wang and Sridhar Mahadevan, "Manifold Alignment using Procrustes Analysis", International Conference on Machine Learning (ICML), 2008, Helsinki, Finland.

2007

Sridhar Mahadevan and Mauro Maggioni, "Proto-Value Functions: A Laplacian Framework for Learning Representation and Control in Markov Decision Processes", Journal of Machine Learning Research, pp. 2169-2231, vol. 8, 2007, MIT Press. (Revised version with some errors corrected.)

Mohammad Ghavamzadeh and Sridhar Mahadevan, "Hierarchical Average-Reward Reinforcement Learning", Journal of Machine Learning Research, vol. 8, pp. 2629-2669, 2007, MIT Press.

Jeff Johns, Sarah Osentoski, and Sridhar Mahadevan, "Representation Discovery in Planning using Harmonic Analysis", AAAI Fall Symposium on Computational Approaches to Representation Change during Learning and Development, Nov. 8-11, 2007, Washington, D.C.

Sridhar Mahadevan, Sarah Osentoski, Jeff Johns, Kimberly Ferguson, and Chang Wang, "Learning to Plan using Harmonic Analysis of Diffusion Models", International Conference on Automated Planning and Scheduling (ICAPS), September 22-26, 2007, Brown Univ, Providence.

Sridhar Mahadevan, "Adaptive Mesh Compression in 3D Computer Graphics using Multiresolution Manifold Learning", International Conference on Machine Learning (ICML), 2007, June 20-24, 2007, Corvallis, Oregon.

Jeff Johns and Sridhar Mahadevan, "Constructing Basis Functions from Directed Graphs for Value Function Approximation", International Conference on Machine Learning (ICML), 2007, June 20-24, 2007, Corvallis, Oregon.

Sarah Osentoski and Sridhar Mahadevan, "Learning State-Action Basis Functions for Hierarchical MDPs", International Conference on Machine Learning (ICML), 2007, June 20-24, 2007, Corvallis, Oregon.

Jeff Johns, Sridhar Mahadevan, and Chang Wang, "Compact Spectral Bases for Value Function Approximation using Kronecker Factorization", National Conference on Artificial Intelligence (AAAI), July 22-26, 2007, Vancouver, Canada.

Sridhar Mahadevan, "New Frontiers in Representation Discovery", Tutorial at the National Conference on Artificial Intelligence (AAAI), July 23, 2007, Vancouver, Canada.

Ivon Arroyo, Kimberly Ferguson, Jeff Johns, Toby Dragon, Hasmik Mehranian, Don Fisher, Andrew Barto, Sridhar Mahadevan, and Beverly Woolf, "Repairing Disengagement With Non Invasive Interventions", Artificial Intelligence in Education (AIED), July 9-13, 2007, Marina Del Rey, CA.

2006

Sridhar Mahadevan and Mauro Maggioni, "Proto-Value Functions: A Laplacian Framework for Learning Representation and Control in Markov Decision Processes", University of Massachusetts, Department of Computer Science Technical Report TR-2006-35, 2006.

Mauro Maggioni and Sridhar Mahadevan, "A Multiscale Framework For Markov Decision Processes using Diffusion Wavelets", University of Massachusetts, Department of Computer Science Technical Report TR-2006-36, 2006.

Mauro Maggioni and Sridhar Mahadevan, "Fast Direct Policy Evaluation using Multiscale Analysis of Markov Diffusion Processes", ICML 2006, June, CMU, Pittsburgh.

Sridhar Mahadevan, Mauro Maggioni, Kimberly Ferguson, and Sarah Osentoski, "Learning Representation and Control in Continuous Markov Decision Processes", AAAI 2006, Boston, July.

Mohammad Ghavamzadeh, Sridhar Mahadevan, and Rajbala Makar, "Hierarchical Multiagent Reinforcement Learning", Journal of Autonomous Agents and Multiagent Systems, April, 2006.

Sridhar Mahadevan and Mauro Maggioni, "Value Function Approximation using Diffusion Wavelets and Laplacian Eigenfunctions", Neural Information Processing Systems (NIPS), MIT Press, 2006.

Kimberly Ferguson and Sridhar Mahadevan, "Proto-Transfer Learning in Markov Decision Processes using Spectral Methods", ICML Workshop on Transfer Learning, June 29th, 2006.

Kimberly Ferguson, Ivon Arroyo, Sridhar Mahadevan, Beverly Woolf, and Andrew Barto, "Improving Intelligent Tutoring Systems: Using EM to Learn Student Skill Levels", Intelligent Tutoring Systems, 2006, Lecture Notes in Computer Science, No. 4053, pp. 453-462, 2006.

Jeffrey Johns, Sridhar Mahadevan, Beverly Woolf, "Estimating Student Proficiency using an Item Response Theory Model", Intelligent Tutoring Systems, Lecture Notes in Computer Science, No. 4053, pp. 473-480, 2006.

2005

Sridhar Mahadevan, "Representation Policy Iteration: A Unified Framework for Learning Representation and Behavior", Invited talk given at National Conference on Artificial Intelligence (AAAI 2005) (includes several recent unpublished results on representation discovery in discrete and continuous stochastic domains).

Mauro Maggioni and Sridhar Mahadevan, "Fast Direct Policy Evaluation Using Multiscale Markov Diffusion Processes", University of Massachusetts, Department of Computer Science Technical Report TR-2005-39, 2005.

Sridhar Mahadevan and Mauro Maggioni, "Value Function Approximation using Diffusion Wavelets and Laplacian Eigenfunctions", Department of Computer Science Technical Report TR-2005-38, 2005.

Georgios Theocharous, Sridhar Mahadevan and Leslie Kaelbling, "Spatial and Temporal Abstraction in POMDPs for Robot Navigation", submitted (soon to appear as MIT CSAIL TR), 2005.

Jeff Johns and Sridhar Mahadevan, "A Variational Learning Algorithm for the Abstract Hidden Markov Model", Proceedings of the National Conference on Artificial Intelligence (AAAI-2005), Pittsburgh, PA, July 9-13, 2005.

Sridhar Mahadevan, "Samuel Meets Amarel: Automating Value Function Approximation using Global State Space Analysis", Proceedings of the National Conference on Artificial Intelligence (AAAI-2005), Pittsburgh, PA, July 9-13, 2005.

Sridhar Mahadevan, "Representation Policy Iteration", Proceedings of the 21st Conference on Uncertainty in AI (UAI-2005), Edinburgh, Scotland, July 26-29, 2005.

Sridhar Mahadevan, "Proto-Value Functions: Developmental Reinforcement Learning", Proceedings of the International Conference on Machine Learning (ICML-2005), Bonn, Germany, August 7-13, 2005.

Khashayar Rohanimanesh and Sridhar Mahadevan, "Coarticulation: An Approach for Generating Concurrent Plans in Markov Decision Processes", Proceedings of the International Conference on Machine Learning (ICML-2005), Bonn, Germany, August 7-13, 2005.

Victoria Manfredi and Sridhar Mahadevan, "Hierarchical Reinforcement Learning using Graphical Models", Workshop on Rich Representation for Reinforcement Learning, Bonn, August 7th, 2005.

Victoria Manfredi and Sridhar Mahadevan, "Dynamic Abstraction Networks", University of Massachusetts, Amherst, Technical Report TR 2005-33, 2005.

Victoria Manfredi and Sridhar Mahadevan, "Kalman Filters for Prediction and Tracking in an Adaptive Sensor Network", University of Massachusetts, Amherst, Technical Report 2005-7, 2005.

Anders Jonsson, Jeff Johns, Hasmik Mehranian, Ivon Arroyo, Beverly Woolf, Andrew Barto, Donald Fisher, and Sridhar Mahadevan, "Evaluating the Feasibility of Learning Student Models from Data", AAAI Workshop on Educational Data Mining, Pittsburgh, PA, July 9, 2005.

2004

Khashayar Rohanimanesh, Robert Platt, Sridhar Mahadevan, and Roderic Grupen, "A Framework for Coarticulation in Markov Decision Processes", Technical Report 04-33, Department of Computer Science, University of Massachusetts, Amherst, Massachusetts, 2004.

Mohammad Ghavamzadeh and Sridhar Mahadevan. "Hierarchical Multiagent Reinforcement Learning". Technical Report UM-CS-2004-02. Department of Computer Science, University of Massachusetts Amherst, 2004.

Khashayar Rohanimanesh, Rob Platt, Sridhar Mahadevan, and Rod Grupen, "Coarticulation in Markov Decision Processes", Eighteenth International Conference on Neural Information Processing Systems (NIPS), 2004.

Sarah Osentoski, Victoria Manfredi, and Sridhar Mahadevan, Learning Hierarchical Models of Activity, IEEE/RSJ International Conference on Robots and Systems (IROS), 2004.

Mohammad Ghavamzadeh and Sridhar Mahadevan, Learning to Act and Communicate in Cooperative Multiagent Systems using Hierarchical Reinforcement Learning, Autonomous Agents and Multiagent Systems (AAMAS), 2004.

Suchi Saria and Sridhar Mahadevan, Probabilistic Plan Recognition in Multiagent Systems, International Conference on AI and Planning Systems (ICAPS), 287-296, 2004.

2003

Mohammad Ghavamzadeh and Sridhar Mahadevan. "Hierarchical Average Reward Reinforcement Learning". Technical Report UM-CS-2003-19, Department of Computer Science, University of Massachusetts Amherst, 2003.

Mohammad Ghavamzadeh, Sridhar Mahadevan and Rajbala Makar. "Extending Hierarchical Reinforcement Learning to Continuous-Time, Average-Reward, and Multi-Agent Models". Technical Report UM-CS-2003-23, Department of Computer Science, University of Massachusetts Amherst, 2003.

Sridhar Mahadevan, Mohammad Ghavamzadeh, Khashayar Rohanimanesh, Georgios Theocharous, "Hierarchical Approaches to Concurrency, Multiagency, and Partial Observability", Learning and Approximate Dynamic Programming: Scaling up to the Real World, Edited by Jennie Si, Andy Barto, Warren Powell, and Donald Wunsch, John Wiley & Sons, New York.

Mohammad Ghavamzadeh and Sridhar Mahadevan, "Hierarchical Policy Gradient Algorithms", Twentieth International Conference on Machine Learning, Washington, D.C., 2003.

Andrew Barto and Sridhar Mahadevan, "Recent Advances in Hierarchical Reinforcement Learning", volume 13, Special Issue on Reinforcement Learning, Discrete Event Systems journal, pp. 41-77, 2003.

Khashayar Rohanimanesh and Sridhar Mahadevan, "Learning To Take Concurrent Actions", Sixteenth International Conference on Neural Information Processing Systems (NIPS), MIT Press, 2003.

2002

Georgios Theocharous and Sridhar Mahadevan, "Learning the Hierarchical Structure of Spatial Environments using Multiresolution Spatial Models", IEEE/RSJ International Conference on Intelligent Robots and Systems, Lausanne, Switzerland, September 30th - October 4th, 2002.

Sridhar Mahadevan, "Spatiotemporal Abstraction of Stochastic Sequential Processes", Symposium on Abstraction, Reformulation, and Approximation (SARA), Lecture Notes in Artificial Intelligence, vol. 2371, Sven Koenig and Robert Holte (editors), Springer-Verlag, pp. 33-50, 2002.

Mohammad Ghavamzadeh and Sridhar Mahadevan, "Hierarchically Optimal Average Reward Reinforcement Learning", Nineteenth International Conference on Machine Learning, Sydney, Australia, July 8-12, 2002.

Georgios Theocharous and Sridhar Mahadevan, "Approximate Planning with Hierarchical Partially Observable Markov Decision Processes for Robot Navigation", IEEE Conference on Robotics and Automation (ICRA), Washington, D.C., May 2002.

Mohammad Ghavamzadeh and Sridhar Mahadevan, "A Multiagent Reinforcement Learning Algorithm by Dynamically Merging Markov Decision Processes", First International Conference on Autonomous Agents and Multiagent Systems (AAMAS), Bologna, Italy, 2002.

2001

John Henderson, Richard Falk, Silviu Minut, Fred Dyer, and Sridhar Mahadevan, "Gaze Control for Face Learning and Recognition by Humans and Machines", in T. Shipley and P. Kellman (editors), From Fragments to Objects: Segmentation and Grouping in Vision, Advances in Psychology, vol. 130, North-Holland Press, pp. 463-481.

Khashayar Rohanimanesh and Sridhar Mahadevan, "Decision-Theoretic Planning with Concurrent Temporally Extended Actions", Seventeenth Conference on Uncertainty in Artificial Intelligence, August 3-5, 2001.

Mohammad Ghavamzadeh and Sridhar Mahadevan, "Continuous-time Hierarchical Reinforcement Learning", Eighteenth International Conference on Machine Learning (ICML), June 28-July 1, 2001, Williams College, Massachusetts.

Silviu Minut and Sridhar Mahadevan, "A Reinforcement Learning Model of Selective Visual Attention", Fifth International Conference on Autonomous Agents, Montreal, 2001.

Rajbala Makar, Sridhar Mahadevan, and Mohammad Ghavamzadeh, "Hierarchical Multi-Agent Reinforcement Learning", Fifth International Conference on Autonomous Agents, Montreal, 2001. Best Student Paper Award.

Georgios Theocharous, Khashayar Rohanimanesh, and Sridhar Mahadevan, "Learning Hierarchical Partially Observable Markov Decision Processes for Robot Navigation", IEEE Conference on Robotics and Automation (ICRA), 2001, Seoul, South Korea.

2000

Natalia Hernandez and Sridhar Mahadevan, "Hierarchical Memory-based Reinforcement Learning", Fifteenth International Conference on Neural Information Processing Systems, Nov. 27-December 2nd, Denver 2000.

Georgios Theocharous, Khashayar Rohanimanesh, and Sridhar Mahadevan, "Learning and Planning with Hierarchical Stochastic Models for Robot Navigation", ICML 2000 Workshop on Machine Learning of Spatial Knowledge, July 2, 2000, Stanford University.

Khashayar Rohanimanesh, Georgios Theocharous, Sridhar Mahadevan, "Hierarchical Map Learning for Robot Navigation", AIPS Workshop on Decision-Theoretic Planning, April 14, 2000, Breckenridge, Colorado.

Silviu Minut, Sridhar Mahadevan, John Henderson, Fred Dyer, "Face Recognition using Foveal Vision", First IEEE International Workshop on Biologically Inspired Vision, Seoul, S. Korea, May 17-20, 2000. (Appeared in Lecture Notes in Computer Science, vol. 1181, pp. 424-433, Springer-Verlag, 2000.)

1999

Gang Wang, Sridhar Mahadevan, "Hierarchical Optimization of Policy-Coupled Semi-Markov Decision Processes", Proceedings of the 16th International Conference on Machine Learning (ICML '99), Bled, Slovenia, June 27-30, 1999. (nominated for best paper award at ICML-99)

Tapas Das, Abhijit Gosavi, Sridhar Mahadevan, and Nicholas Marchalleck, "Solving Semi-Markov Decision Problems using Average Reward Reinforcement Learning", Management Science, April, 1999.

R. Tummala, R. Mukherjee, D. Aslam, N. Xi, S. Mahadevan, J. Weng, "Reconfigurable Adaptable Micro-Robot", Proceedings of the IEEE Conference on Systems, Man, and Cybernetics (SMC), Tokyo, Japan, Oct. 12-15, 1999.

1998

Constantinos Papaconstantinou, Georgios Theocharous, Sridhar Mahadevan, An Expert System for Assigning Patients to Clinical Trials based on Bayesian Networks, Journal of Medical Systems, vol. 22, No. 3, pp. 189-202, June 1998.

Sridhar Mahadevan, Georgios Theocharous, Nikfar Khaleeli, "Rapid Concept Learning For Mobile Robots", Autonomous Robots Journal, (also to appear in Machine Learning Journal), Joint special Issue on Learning in Autonomous Robots, vol. 5, pp. 239-251, 1998.

Gang Wang and Sridhar Mahadevan, "A Greedy Divide-and-Conquer Approach to Optimizing Large Manufacturing Systems using Reinforcement Learning", NIPS '98 Workshop on Abstraction and Hierarchy in Reinforcement Learning, December 1998.

Sridhar Mahadevan and Georgios Theocharous, Optimizing Production Manufacturing using Reinforcement Learning, Eleventh International FLAIRS conference, pp. 372-377, AAAI Press, May 1998.

1997

Sridhar Mahadevan, Nicholas Marchalleck, Tapas Das, and Abhijit Gosavi, "Self-Improving Factory Simulation using Continuous-Time Average-Reward Reinforcement Learning", Proceedings of the 14th International Conference on Machine Learning (ICML '97), Nashville, TN, July 1997.

Sridhar Mahadevan, Nikfar Khaleeli, Nicholas Marchalleck, "Designing Agent Controllers using Discrete-Event Markov Models", AAAI Fall Symposium on Model-Directed Autonomous Systems, Nov. 8th-10th, MIT, Cambridge, 1997.

1996

Sridhar Mahadevan, "Optimality Criteria in Reinforcement Learning", Proceedings of the AAAI Fall Symposium on Learning Complex Behaviors for Intelligent Adaptive Systems, MIT, Boston, Nov. 9-11, 1996.

Sridhar Mahadevan, "Average Reward Reinforcement Learning: Foundations, Algorithms, and Empirical Results", Machine Learning, Special Issue on Reinforcement Learning (edited by Leslie Kaelbling), vol. 22, pp. 159-196, 1996.

Sridhar Mahadevan, "An Average-Reward Reinforcement Learning Algorithm for Learning Bias-Optimal Policies", Proceedings of the 13th National Conference on Artificial Intelligence (AAAI '96), August 6th-8th, Portland, Oregon.

Sridhar Mahadevan, "Sensitive-Discount Optimality: Unifying Discounted and Average-Reward Reinforcement Learning", Proceedings of the 13th International Conference on Machine Learning (ICML '96), July 3rd-6th, Bari, Italy.

Sridhar Mahadevan, "Machine Learning for Robots: A Comparison of Different Paradigms", Workshop on Towards Real Autonomy, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS-96), Osaka, Japan, November 1996.

Sridhar Mahadevan and Leslie Kaelbling, "The NSF Workshop on Reinforcement Learning: Summary and Observations", AI Magazine, Winter 1996, 89-97.

1995

1994

Sridhar Mahadevan and Prasad Tadepalli, "Quantifying Prior Determination Knowledge using the PAC Learning Model", Machine Learning, vol. 17, pp. 69-105, 1994.

Sridhar Mahadevan, "To Discount or Not to Discount in Reinforcement Learning: A Case Study Comparing R-learning and Q-learning", Proceedings of the 11th International Conference on Machine Learning, New Brunswick, N.J., pp. 164-172, July, 1994.

1993

Sridhar Mahadevan, Tom Mitchell, Jack Mostow, Louis Steinberg, and Prasad Tadepalli, "An Apprentice Based Approach to Knowledge Acquisition", Artificial Intelligence, vol. 64, No. 1, November 1993, pp. 1-52.

Jonathan Connell and Sridhar Mahadevan (editors), "Robot Learning", Kluwer Academic Press, June 1993.

1992

Sridhar Mahadevan, "Enhancing Transfer in Reinforcement Learning by Building Stochastic Models of Robot Actions", Proceedings of the Ninth International Conference on Machine Learning, Aberdeen, Scotland, pp. 290-299, July, 1992.

Sridhar Mahadevan and Jonathan Connell, "Automatic Programming of Behavior-based Robots using Reinforcement Learning", Artificial Intelligence, vol. 55, Nos. 2-3, pp. 311-365, June, 1992.