Akshay Krishnamurthy

Publications

Is Best-of-N the best of them? Coverage, scaling, and optimality in inference-time alignment.
Audrey Huang, Adam Block, Qinghua Liu, Nan Jiang, Akshay Krishnamurthy, Dylan J. Foster.
To appear in International Conference on Machine Learning, ICML 2025.
[Arxiv version]

The role of environment access in agnostic reinforcement learning.
Akshay Krishnamurthy, Gene Li, Ayush Sekhari.
To appear in Conference on Learning Theory, COLT 2025.
[Arxiv version]

Computational-statistical tradeoffs at the next-token prediction barrier: Autoregressive and imitation learning under misspecification.
Dhruv Rohatgi, Adam Block, Audrey Huang, Akshay Krishnamurthy, Dylan J. Foster.
To appear in Conference on Learning Theory, COLT 2025.
[Arxiv version]

Self-improvement in language models: The sharpening mechanism.
Audrey Huang, Adam Block, Dylan J. Foster, Dhruv Rohatgi, Cyril Zhang, Max Simchowitz, Jordan T. Ash, Akshay Krishnamurthy.
To appear in International Conference on Learning Representations, ICLR 2025. Oral presentation
[Arxiv version]

Correcting the mythos of KL-regularization: Direct alignment without overoptimization via chi-squared preference optimization.
Audrey Huang, Wenhao Zhan, Tengyang Xie, Jason D. Lee, Wen Sun, Akshay Krishnamurthy, Dylan J. Foster.
To appear in International Conference on Learning Representations, ICLR 2025. Spotlight presentation
[Arxiv version][blog]

Computationally efficient RL under linear Bellman completeness for deterministic dynamics.
Runzhe Wu, Ayush Sekhari, Akshay Krishnamurthy, Wen Sun.
To appear in International Conference on Learning Representations, ICLR 2025. Oral presentation
[Arxiv version]

Exploratory preference optimization: Harnessing implicit Q*-approximation for sample-efficient RLHF.
Tengyang Xie, Dylan J. Foster, Akshay Krishnamurthy, Corby Rosset, Ahmed Awadallah, Alexander Rakhlin.
To appear in International Conference on Learning Representations, ICLR 2025.
[Arxiv version]

Reinforcement learning under latent dynamics: Toward statistical and algorithmic modularity.
Philip Amortila, Dylan J. Foster, Nan Jiang, Akshay Krishnamurthy, Zakaria Mhammedi.
In Advances in Neural Information Processing Systems, NeurIPS 2024. Oral presentation
[Arxiv version]

Can large language models explore in-context?
Akshay Krishnamurthy, Keegan Harris, Dylan J. Foster, Cyril Zhang, Aleksandrs Slivkins.
In Advances in Neural Information Processing Systems, NeurIPS 2024.
[Arxiv version]

Mitigating covariate shift in misspecified regression with applications to reinforcement learning.
Philip Amortila, Tongyi Cao, Akshay Krishnamurthy.
In Conference on Learning Theory, COLT 2024.
[Arxiv version]

Scalable online exploration via coverability.
Philip Amortila, Dylan J. Foster, Akshay Krishnamurthy.
In International Conference on Machine Learning, ICML 2024.
[Arxiv version]

Rich-observation reinforcement learning with continuous latent dynamics.
Yuda Song, Lili Wu, Dylan J. Foster, Akshay Krishnamurthy.
In International Conference on Machine Learning, ICML 2024.
[Arxiv version]

Oracle-efficient pessimism: Offline policy optimization in contextual bandits.
Lequn Wang, Akshay Krishnamurthy, Aleksandrs Slivkins.
In International Conference on Artificial Intelligence and Statistics, AISTATS 2024.
[Arxiv version]

Butterfly effects of SGD noise: Error amplification in behavior cloning and autoregression.
Adam Block, Dylan J. Foster, Akshay Krishnamurthy, Max Simchowitz, Cyril Zhang.
In International Conference on Learning Representations, ICLR 2024.
[Arxiv version]

Model-free representation learning and exploration in low-rank MDPs.
Aditya Modi, Jinglin Chen, Akshay Krishnamurthy, Nan Jiang, Alekh Agarwal.
In Journal of Machine Learning Research, JMLR 2024.
[Arxiv version]

Exposing attention glitches with Flip-Flop language modeling.
Bingbin Liu, Jordan T. Ash, Surbhi Goel, Akshay Krishnamurthy, Cyril Zhang.
In Advances in Neural Information Processing Systems, NeurIPS 2023. Spotlight presentation
[Arxiv version]

Robust dynamic assortment optimization in the presence of outlier customers.
Xi Chen, Akshay Krishnamurthy, Yining Wang.
In Operations Research, 2023.
[Arxiv version]

A complete characterization of linear estimators for offline policy evaluation.
Juan C. Perdomo, Akshay Krishnamurthy, Peter Bartlett, Sham M. Kakade.
In Journal of Machine Learning Research, JMLR 2023.
[Arxiv version]

Learning hidden Markov models using conditional samples.
Sham M. Kakade, Akshay Krishnamurthy, Gaurav Mahajan, Cyril Zhang.
In Conference on Learning Theory, COLT 2023.
[Arxiv version]

Statistical learning under heterogeneous distribution shift.
Max Simchowitz, Anurag Ajay, Pulkit Agrawal, Akshay Krishnamurthy.
In International Conference on Machine Learning, ICML 2023.
[Arxiv version]

Streaming active learning with deep neural networks.
Akanksha Saran, Safoora Yousefi, Akshay Krishnamurthy, John Langford, Jordan T. Ash.
In International Conference on Machine Learning, ICML 2023.
[Arxiv version]

Guaranteed discovery of control-endogenous latent states with multi-step inverse models.
Alex Lamb, Riashat Islam, Yonathan Efroni, Aniket Didolkar, Dipendra Misra, Dylan J. Foster, Lekan Molu, Rajan Chari, Akshay Krishnamurthy, John Langford.
In Transactions on Machine Learning Research, TMLR 2023.
[Arxiv version]

Transformers learn shortcuts to automata.
Bingbin Liu, Jordan T. Ash, Surbhi Goel, Akshay Krishnamurthy, Cyril Zhang.
In International Conference on Learning Representations, ICLR 2023. Oral presentation
[Arxiv version]

Hybrid RL: Using both offline and online data can make RL efficient.
Yuda Song, Yifei Zhou, Ayush Sekhari, J. Andrew Bagnell, Akshay Krishnamurthy, Wen Sun.
In International Conference on Learning Representations, ICLR 2023.
[Arxiv version]

On the statistical efficiency of reward-free exploration in non-linear RL.
Jinglin Chen, Aditya Modi, Akshay Krishnamurthy, Nan Jiang, Alekh Agarwal.
In Advances in Neural Information Processing Systems, NeurIPS 2022.
[Arxiv version]

Contextual search in the presence of adversarial corruptions.
Akshay Krishnamurthy, Thodoris Lykouris, Chara Podimata, Robert E. Schapire.
In Operations Research, 2022.
Conference version: Contextual search in the presence of irrational agents in Symposium on Theory of Computing, STOC 2021.
[Arxiv version]

Offline reinforcement learning: Fundamental barriers for value function approximation.
Dylan J. Foster, Akshay Krishnamurthy, David Simchi-Levi, Yunzong Xu.
In Conference on Learning Theory, COLT 2022.
[Arxiv version]

Sample-efficient reinforcement learning in the presence of exogenous information.
Yonathan Efroni, Dylan J. Foster, Dipendra Misra, Akshay Krishnamurthy, John Langford.
In Conference on Learning Theory, COLT 2022.
[Arxiv version]

Understanding contrastive learning requires incorporating inductive biases.
Nikunj Saunshi, Jordan T. Ash, Surbhi Goel, Dipendra Misra, Cyril Zhang, Sanjeev Arora, Sham Kakade, Akshay Krishnamurthy.
In International Conference on Machine Learning, ICML 2022.
[Arxiv version]

Provable reinforcement learning with a short-term memory.
Yonathan Efroni, Chi Jin, Akshay Krishnamurthy, Sobhan Miryoosefi.
In International Conference on Machine Learning, ICML 2022.
[Arxiv version]

Universal and data-adaptive algorithms for model selection in linear contextual bandits.
Vidya Muthukumar, Akshay Krishnamurthy.
In International Conference on Machine Learning, ICML 2022.
[Arxiv version]

Sparsity in partially controllable linear systems.
Yonathan Efroni, Sham Kakade, Akshay Krishnamurthy, Cyril Zhang.
In International Conference on Machine Learning, ICML 2022.
[Arxiv version]

Anti-concentrated confidence bonuses for scalable exploration.
Jordan T. Ash, Cyril Zhang, Surbhi Goel, Akshay Krishnamurthy, Sham Kakade.
In International Conference on Learning Representations, ICLR 2022.
[Arxiv version]

Provable RL with exogenous distractors via multistep inverse dynamics.
Yonathan Efroni, Dipendra Misra, Akshay Krishnamurthy, Alekh Agarwal, John Langford.
In International Conference on Learning Representations, ICLR 2022. Oral presentation
[Arxiv version][blog]

Investigating the role of negatives in contrastive representation learning.
Jordan T. Ash, Surbhi Goel, Akshay Krishnamurthy, Dipendra Misra.
In International Conference on Artificial Intelligence and Statistics, AISTATS 2022.
[Arxiv version]

Efficient and optimal algorithms for contextual dueling bandits under realizability.
Aadirupa Saha, Akshay Krishnamurthy.
In International Conference on Algorithmic Learning Theory, ALT 2022.
[Arxiv version]

Contrastive estimation reveals topic posterior information to linear models.
Christopher Tosh, Akshay Krishnamurthy, Daniel Hsu.
In Journal of Machine Learning Research, JMLR 2021.
[Arxiv version]

Efficient first order contextual bandits: Prediction, allocation, and triangular discrimination.
Dylan J. Foster, Akshay Krishnamurthy.
In Neural Information Processing Systems, NeurIPS 2021. Oral presentation.
[Arxiv version]

Bayesian decision-making under misspecified priors with applications to meta-learning.
Max Simchowitz, Christopher Tosh, Akshay Krishnamurthy, Daniel Hsu, Thodoris Lykouris, Miroslav Dudík, Robert E. Schapire.
In Neural Information Processing Systems, NeurIPS 2021. Spotlight presentation.
[Arxiv version]

Gone fishing: Neural active learning with Fisher embeddings.
Jordan T. Ash, Surbhi Goel, Akshay Krishnamurthy, Sham Kakade.
In Neural Information Processing Systems, NeurIPS 2021.
[Arxiv version]

Trace reconstruction: Generalized and parameterized.
Akshay Krishnamurthy, Arya Mazumdar, Andrew McGregor, Soumyabrata Pal.
In IEEE Transactions on Information Theory, Levenshtein memorial issue, 2021.
Conference version in European Symposium on Algorithms, ESA 2019.
[Arxiv version]

Optimism in reinforcement learning with generalized linear function approximation.
Yining Wang, Ruosong Wang, Simon S. Du, Akshay Krishnamurthy.
In International Conference on Learning Representations, ICLR 2021.
[Arxiv version]

Contrastive learning, multi-view redundancy, and linear models.
Christopher Tosh, Akshay Krishnamurthy, Daniel Hsu.
In International Conference on Algorithmic Learning Theory, ALT 2021.
[Arxiv version]

Learning the linear quadratic regulator from nonlinear observations.
Zakaria Mhammedi, Dylan J. Foster, Max Simchowitz, Dipendra Misra, Wen Sun, Akshay Krishnamurthy, Alexander Rakhlin, John Langford.
In Neural Information Processing Systems, NeurIPS 2020.
[Arxiv version]

FLAMBE: Structural complexity and representation learning of low rank MDPs.
Alekh Agarwal, Sham Kakade, Akshay Krishnamurthy, Wen Sun.
In Neural Information Processing Systems, NeurIPS 2020. Oral presentation.
[Arxiv version][poster]

Information theoretic regret bounds for online nonlinear control.
Sham Kakade, Akshay Krishnamurthy, Kendall Lowrey, Motoya Ohnishi, Wen Sun.
In Neural Information Processing Systems, NeurIPS 2020.
[Arxiv version]

Sample-efficient reinforcement learning of undercomplete POMDPs.
Chi Jin, Sham Kakade, Akshay Krishnamurthy, Qinghua Liu.
In Neural Information Processing Systems, NeurIPS 2020. Spotlight presentation.
[Arxiv version]

Provably adaptive reinforcement learning in metric spaces.
Tongyi Cao, Akshay Krishnamurthy.
In Neural Information Processing Systems, NeurIPS 2020.
[Arxiv version fixes an error in the NeurIPS version.]

Efficient contextual bandits with continuous actions.
Maryam Majzoubi, Chicheng Zhang, Rajan Chari, Akshay Krishnamurthy, John Langford, Aleksandrs Slivkins.
In Neural Information Processing Systems, NeurIPS 2020.
[Arxiv version][code]

Contextual bandits with continuous actions: Smoothing, zooming, and adapting.
Akshay Krishnamurthy, John Langford, Aleksandrs Slivkins, Chicheng Zhang.
In Journal of Machine Learning Research, JMLR 2020.
Conference version in Conference on Learning Theory, COLT 2019.
[Arxiv version][poster]

Kinematic state abstraction and provably efficient rich-observation reinforcement learning.
Dipendra Misra, Mikael Henaff, Akshay Krishnamurthy, John Langford.
In International Conference on Machine Learning, ICML 2020.
[Arxiv version][blog]

Adaptive estimator selection for off-policy evaluation.
Yi Su, Pavithra Srinath, Akshay Krishnamurthy.
In International Conference on Machine Learning, ICML 2020.
[Arxiv version]

Doubly robust off-policy evaluation with shrinkage.
Yi Su, Maria Dimakopoulou, Akshay Krishnamurthy, Miroslav Dudik.
In International Conference on Machine Learning, ICML 2020.
[Arxiv version]

Reward-free exploration for reinforcement learning.
Chi Jin, Akshay Krishnamurthy, Max Simchowitz, Tiancheng Yu.
In International Conference on Machine Learning, ICML 2020.
[Arxiv version]

Private reinforcement learning with PAC and regret guarantees.
Giuseppe Vietri, Borja Balle, Zhiwei Steven Wu, Akshay Krishnamurthy.
In International Conference on Machine Learning, ICML 2020.
[Arxiv version]

Open problem: Model selection for contextual bandits.
Dylan J. Foster, Akshay Krishnamurthy, Haipeng Luo.
In Conference on Learning Theory, COLT 2020.
[Arxiv version]

Deep batch active learning by diverse, uncertain gradient lower bounds.
Jordan T. Ash, Chicheng Zhang, Akshay Krishnamurthy, John Langford, Alekh Agarwal.
In International Conference on Learning Representations, ICLR 2020. Oral presentation.
[Arxiv version]

Algebraic and analytic approaches for parameter learning in mixture models.
Akshay Krishnamurthy, Arya Mazumdar, Andrew McGregor, Soumyabrata Pal.
In International Conference on Algorithmic Learning Theory, ALT 2020.
[Arxiv version]

Model selection for contextual bandits.
Dylan J. Foster, Akshay Krishnamurthy, Haipeng Luo.
In Neural Information Processing Systems, NeurIPS 2019. Spotlight presentation.
[Arxiv version]

Sample complexity of learning mixtures of sparse linear regressions.
Akshay Krishnamurthy, Arya Mazumdar, Andrew McGregor, Soumyabrata Pal.
In Neural Information Processing Systems, NeurIPS 2019.
[Arxiv version]

Scalable hierarchical clustering via tree grafting.
Nicholas Monath, Ari Kobren, Akshay Krishnamurthy, Michael Glass, Andrew McCallum.
In Knowledge Discovery and Data Mining, KDD 2019. Oral presentation.
[Arxiv version]

Provably efficient RL with rich observations via latent state decoding.
Simon S. Du, Akshay Krishnamurthy, Nan Jiang, Alekh Agarwal, Miroslav Dudik, John Langford.
In International Conference on Machine Learning, ICML 2019.
[Arxiv version][code][blog]

Myopic posterior sampling for adaptive goal oriented design of experiments.
Kirthevasan Kandasamy, Willie Neiswanger, Reed Zhang, Akshay Krishnamurthy, Jeff Schneider, Barnabas Poczos.
In International Conference on Machine Learning, ICML 2019.
[Arxiv version]

Disagreement-based combinatorial pure exploration: Sample complexity bounds and an efficient algorithm.
Tongyi Cao, Akshay Krishnamurthy.
In Conference on Learning Theory, COLT 2019.
[Arxiv version][poster]

Model-based reinforcement learning in contextual decision processes: PAC bounds and exponential improvements over model-free approaches.
Wen Sun, Nan Jiang, Akshay Krishnamurthy, Alekh Agarwal, John Langford.
In Conference on Learning Theory, COLT 2019.
[Arxiv version][poster]

Active learning for cost-sensitive classification.
Akshay Krishnamurthy, Alekh Agarwal, Tzu-Kuo Huang, Hal Daume III, John Langford.
In Journal of Machine Learning Research, JMLR 2019.
Conference version in International Conference on Machine Learning, ICML 2017.
[Arxiv version][code]

Contextual bandits with surrogate losses: Margin bounds and efficient algorithms.
Dylan J. Foster, Akshay Krishnamurthy.
In Neural Information Processing Systems, NeurIPS 2018.
[Arxiv version][poster]

On oracle-efficient PAC reinforcement learning with rich observations.
Christoph Dann, Nan Jiang, Akshay Krishnamurthy, Alekh Agarwal, John Langford, Robert E. Schapire.
In Neural Information Processing Systems, NeurIPS 2018. Spotlight presentation.
[Arxiv version][poster]

Extreme compressive sampling for covariance estimation.
Martin Azizyan, Akshay Krishnamurthy, Aarti Singh.
In IEEE Transactions on Information Theory, 2018.
[Arxiv version]

Semiparametric contextual bandits.
Akshay Krishnamurthy, Zhiwei Steven Wu, Vasilis Syrgkanis.
In International Conference on Machine Learning, ICML 2018.
[Arxiv version][code]

Go for a walk and arrive at the answer: Reasoning over paths in knowledge bases using reinforcement learning.
Rajarshi Das, Shehzaad Dhuliawala, Manzil Zaheer, Luke Vilnis, Ishan Durugkar, Akshay Krishnamurthy, Alex Smola, Andrew McCallum.
In International Conference on Learning Representations, ICLR 2018.
[Arxiv version][code]

Asynchronous parallel Bayesian optimisation via Thompson Sampling.
Kirthevasan Kandasamy, Akshay Krishnamurthy, Jeff Schneider, Barnabas Poczos.
In Artificial Intelligence and Statistics, AISTATS 2018. Oral presentation.
[Arxiv version]

Off-policy evaluation for slate recommendation.
Adith Swaminathan, Akshay Krishnamurthy, Alekh Agarwal, Miroslav Dudik, John Langford, Damien Jose, Imed Zitouni.
In Neural Information Processing Systems, NeurIPS 2017. Oral presentation.
[Arxiv version][code]

An online hierarchical algorithm for extreme clustering.
Ari Kobren, Nicholas Monath, Akshay Krishnamurthy, Andrew McCallum.
In Knowledge Discovery and Data Mining, KDD 2017. Oral presentation.
[Arxiv version][code]

Contextual decision processes with low Bellman rank are PAC-learnable.
Nan Jiang, Akshay Krishnamurthy, Alekh Agarwal, John Langford, Robert E. Schapire.
In International Conference on Machine Learning, ICML 2017.
[Arxiv version]

Open problem: First-order regret bounds for contextual bandits.
Alekh Agarwal, Akshay Krishnamurthy, John Langford, Haipeng Luo, Robert E. Schapire.
In Conference on Learning Theory, COLT 2017.

PAC reinforcement learning with rich observations.
Akshay Krishnamurthy, Alekh Agarwal, John Langford.
In Neural Information Processing Systems, NeurIPS 2016.
[Arxiv version]

Contextual semibandits via supervised learning oracles.
Akshay Krishnamurthy, Alekh Agarwal, Miroslav Dudik.
In Neural Information Processing Systems, NeurIPS 2016.
[Arxiv version][code]

Improved regret bounds for oracle-based adversarial contextual bandits.
Vasilis Syrgkanis, Haipeng Luo, Akshay Krishnamurthy, Robert E. Schapire.
In Neural Information Processing Systems, NeurIPS 2016.
[Arxiv version]

Efficient algorithms for adversarial contextual learning.
Vasilis Syrgkanis, Akshay Krishnamurthy, Robert E. Schapire.
In International Conference on Machine Learning, ICML 2016.
[Arxiv version]

Minimax structured normal means inference.
Akshay Krishnamurthy.
In International Symposium on Information Theory, ISIT 2016.
[Arxiv version]

Nonparametric von Mises estimators for entropies, divergences, and mutual informations.
Kirthevasan Kandasamy, Akshay Krishnamurthy, Barnabas Poczos, Larry Wasserman, and James M. Robins.
In Neural Information Processing Systems, NeurIPS 2015.
[Arxiv version][code]

Learning to search better than your teacher.
Kai-Wei Chang, Akshay Krishnamurthy, Alekh Agarwal, Hal Daume III, John Langford.
In International Conference on Machine Learning, ICML 2015.
[Arxiv version][code]

On estimating L_2^2 divergence.
Akshay Krishnamurthy, Kirthevasan Kandasamy, Barnabas Poczos and Larry Wasserman.
In Artificial Intelligence and Statistics, AISTATS 2015.
[Arxiv version][code]

Subspace learning from extremely compressed measurements.
Akshay Krishnamurthy, Martin Azizyan, and Aarti Singh.
In Asilomar Conference on Signals, Systems and Computers, 2014.
[Arxiv version]

Nonparametric estimation of Renyi divergence and friends.
Akshay Krishnamurthy, Kirthevasan Kandasamy, Barnabas Poczos, and Larry Wasserman.
In International Conference on Machine Learning, ICML 2014.
[Arxiv version][code]

Recovering graph-structured activations using adaptive compressive measurements.
Akshay Krishnamurthy, James Sharpnack, and Aarti Singh.
In Asilomar Conference on Signals, Systems and Computers, 2013.
Winner of the Best Student Paper Award.
[Arxiv version]

Low-rank matrix and tensor completion via adaptive sampling.
Akshay Krishnamurthy and Aarti Singh.
In Neural Information Processing Systems, NeurIPS 2013.
[Arxiv version]

Near-optimal anomaly detection in graphs using Lovasz extended scan statistic.
James Sharpnack, Akshay Krishnamurthy, and Aarti Singh.
In Neural Information Processing Systems, NeurIPS 2013.
[Arxiv version]

Detecting activations over graphs using spanning tree wavelet bases.
James Sharpnack, Akshay Krishnamurthy and Aarti Singh.
In Artificial Intelligence and Statistics, AISTATS 2013. Oral presentation.
[Arxiv version]

Completion of high-rank ultrametric matrices using selective entries.
Aarti Singh, Akshay Krishnamurthy, Sivaraman Balakrishnan and Min Xu.
In International Conference on Signal Processing and Communications, SPCOM 2012.
[pdf]

Efficient active algorithms for hierarchical clustering.
Akshay Krishnamurthy, Sivaraman Balakrishnan, Min Xu, and Aarti Singh.
In International Conference on Machine Learning, ICML 2012.
[Arxiv version][code]

Robust multi-source network tomography using selective probes.
Akshay Krishnamurthy and Aarti Singh.
In International Conference on Computer Communication, INFOCOM 2012.
[pdf]

Noise thresholds for spectral clustering.
Sivaraman Balakrishnan, Min Xu, Akshay Krishnamurthy, Aarti Singh.
In Neural Information Processing Systems, NeurIPS 2011. Spotlight presentation.
[pdf]

DEGAS: De novo discovery of dysregulated pathways in human diseases.
Igor Ulitsky, Akshay Krishnamurthy, Richard Karp, Ron Shamir.
In PLoS ONE. October 2010.
[pdf]

Fine-grained privilege separation for web applications.
Akshay Krishnamurthy, Adrian Mettler, and David Wagner.
In International World Wide Web Conference, WWW 2010.
[pdf]

Old Preprints

Exploratory gradient boosting for reinforcement learning in complex domains.
David Abel, Alekh Agarwal, Fernando Diaz, Akshay Krishnamurthy, Robert E. Schapire.
[Arxiv version]

On the power of adaptivity in matrix completion and approximation.
Akshay Krishnamurthy and Aarti Singh.
[Arxiv version]

PhD Thesis

Interactive algorithms for unsupervised machine learning. [pdf][Proposal].
