Scott Niekum

Titles in bold indicate strongly peer-reviewed conference or journal papers
Regular font titles denote preprints, technical reports, workshop papers, and other documents

Preprints

T. Tripathi, M. Wadhwa, G. Durrett, S. Niekum.
Pairwise or Pointwise? Evaluating Feedback Protocols for Bias in LLM-Based Evaluation.
arXiv 2504.14716.

H. Sikchi, S. Agarwal, P. Jajoo, S. Parajuli, C. Chuck, M. Rudolph, P. Stone, A. Zhang, S. Niekum.
RL Zero: Zero-Shot Language to Behaviors Without Any Supervision.
arXiv 2412.05718.
[Website and Code]

C. Chuck, S. Vaidyanathan, S. Giguere, A. Zhang, D. Jensen, S. Niekum.
Automated Discovery of Functional Actual Causes in Complex Environments.
arXiv 2404.10883.

2025

H. Sikchi, A. Tirinzoni, A. Touati, Y. Xu, A. Kanervisto, S. Niekum, A. Zhang, A. Lazaric, M. Pirotta.
Fast Adaptation with Behavioral Foundation Models.
Reinforcement Learning Conference (RLC), August 2025.

R. Boldi, L. Ding, L. Spector, S. Niekum.
Pareto-Optimal Learning from Preferences with Hidden Context.
Reinforcement Learning Conference (RLC), August 2025.

H.T. Tse, P.S. Thomas, S. Niekum.
High-Confidence Policy Improvement from Human Feedback.
Reinforcement Learning Conference (RLC), August 2025.

Y. Chittepu, B. Metevier, W. Schwarzer, S. Niekum, P.S. Thomas.
Reinforcement Learning from Human Feedback with High-Confidence Safety Guarantees.
Reinforcement Learning Conference (RLC), August 2025.

C. Chuck, F. Feng, C. Qi, C. Shi, S. Agarwal, A. Zhang, S. Niekum.
Null Counterfactual Factor Interactions for Goal-Conditioned Reinforcement Learning.
International Conference on Learning Representations (ICLR), April 2025.

H. Xu, S. Li, H. Sikchi, S. Niekum, A. Zhang.
An Optimal Discriminator Weighted Imitation Perspective for Reinforcement Learning.
International Conference on Learning Representations (ICLR), April 2025.

2024

R. Rafailov, Y. Chittepu, R. Park, H. Sikchi, J. Hejna, W. Knox, C. Finn, S. Niekum.
Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms.
Neural Information Processing Systems (NeurIPS), December 2024.

Z. Wang, J. Hu, C. Chuck, S. Chen, R. Martín-Martín, A. Zhang, S. Niekum, P. Stone.
SkiLD: Unsupervised Skill Discovery Guided by Factor Interactions.
Neural Information Processing Systems (NeurIPS), December 2024.

S. Chung, S. Niekum, D. Krueger.
Predicting Future Actions of Reinforcement Learning Agents.
Neural Information Processing Systems (NeurIPS), December 2024.

H. Sikchi, C. Chuck, A. Zhang, S. Niekum.
A Dual Approach to Imitation Learning from Observations with Offline Datasets.
Conference on Robot Learning (CoRL), November 2024.

P. Singhal, N. Lambert, S. Niekum, T. Goyal, G. Durrett.
D2PO: Discriminator-Guided DPO with Response Evaluation Models.
Conference on Language Modeling (COLM), October 2024.

M. Rudolph, C. Chuck, K. Black, M. Lvovsky, S. Niekum, and A. Zhang.
Learning Action-based Representations Using Invariance.
Reinforcement Learning Conference (RLC), August 2024.

H. Sikchi, R. Chitnis, A. Touati, A. Geramifard, A. Zhang, S. Niekum.
Score Models for Offline Goal-Conditioned Reinforcement Learning.
International Conference on Learning Representations (ICLR), May 2024.

H. Sikchi, Q. Zheng, A. Zhang, S. Niekum.
Dual RL: Unification and New Methods for Reinforcement and Imitation Learning.
International Conference on Learning Representations (ICLR), May 2024.
Spotlight: top 5% of accepted papers
[f-DVL Code] [ReCOIL Code]

J. Hejna, R. Rafailov, H. Sikchi, C. Finn, S. Niekum, W. Knox, D. Sadigh.
Contrastive Preference Learning: Learning from Human Feedback without RL.
International Conference on Learning Representations (ICLR), May 2024.

C. Chuck, K. Black, A. Arjun, Y. Zhu, S. Niekum.
Granger-Causal Hierarchical Skill Discovery.
Transactions on Machine Learning Research (TMLR), March 2024.

W. Knox, S. Hatgis-Kessell, S.O. Adalgeirsson, S. Booth, A. Dragan, P. Stone, S. Niekum.
Learning Optimal Advantage from Preferences and Mistaking it for Reward.
AAAI Conference on Artificial Intelligence, February 2024.

W. Knox, S. Hatgis-Kessell, S. Booth, S. Niekum, P. Stone, A. Allievi.
Models of Human Preference for Learning Reward Functions.
Transactions on Machine Learning Research (TMLR), January 2024.

2023

S. Booth, W. Knox, J. Shah, S. Niekum, P. Stone, and A. Allievi.
The Perils of Trial-and-Error Reward Design: Misdesign through Overfitting and Invalid Task Specifications.
AAAI Conference on Artificial Intelligence, February 2023.

H. Sikchi, A. Saran, W. Goo, and S. Niekum.
A Ranking Game for Imitation Learning.
Transactions on Machine Learning Research (TMLR), January 2023.

2022

A. Saran, K. Desai, M.L. Chang, R. Lioutikov, A. Thomaz, and S. Niekum.
Understanding Acoustic Patterns of Human Teachers Demonstrating Manipulation Tasks to Robots.
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), October 2022.

Y. Cui, S. Niekum, A. Gupta, V. Kumar, and A. Rajeswaran.
Can Foundation Models Perform Zero-Shot Task Specification For Robot Manipulation?
Learning for Dynamics and Control Conference (L4DC), June 2022.

S. Giguere, B. Metevier, B. Castro da Silva, Y. Brun, P.S. Thomas, and S. Niekum.
Fairness Guarantees under Demographic Shift.
International Conference on Learning Representations (ICLR), April 2022.

2021

C. Yuan, Y. Chandak, S. Giguere, P.S. Thomas, and S. Niekum.
SOPE: Spectrum of Off-Policy Estimators.
Neural Information Processing Systems (NeurIPS), December 2021.

I. Durugkar, M. Tec, S. Niekum, and P. Stone.
Adversarial Intrinsic Motivation for Reinforcement Learning.
Neural Information Processing Systems (NeurIPS), December 2021.
[Code]

Y. Chandak, S. Niekum, B. Castro da Silva, E. Learned-Miller, E. Brunskill, and P.S. Thomas.
Universal Off-Policy Evaluation.
Neural Information Processing Systems (NeurIPS), December 2021.
Winner of the RLDM 2022 Best Paper Award.
[Code]

A. Jain, S. Giguere, R. Lioutikov, and S. Niekum.
Distributional Depth-Based Estimation of Object Articulation Models.
Conference on Robot Learning (CoRL), November 2021.
[Project Page and Code]

M. Kim, S. Niekum, and A. Deshpande.
SCAPE: Learning Stiffness Control from Augmented Position Control Experiences.
Conference on Robot Learning (CoRL), November 2021.
[Code]

W. Goo and S. Niekum.
You Only Evaluate Once: A Simple Baseline Algorithm for Offline RL.
Conference on Robot Learning (CoRL), November 2021.
[Code]

F. Memarian, W. Goo, R. Lioutikov, S. Niekum, and U. Topcu.
Self-Supervised Online Reward Shaping in Sparse-Reward Environments.
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), September 2021.

Y. Cui, P. Koppol, H. Admoni, S. Niekum, R. Simmons, A. Steinfeld, and T. Fitzgerald.
Understanding the Relationship between Interactions and Outcomes in Human-in-the-Loop Machine Learning.
International Joint Conference on Artificial Intelligence (IJCAI), August 2021.

D.S. Brown, J. Schneider, A. Dragan, and S. Niekum.
Value Alignment Verification.
International Conference on Machine Learning (ICML), July 2021.
[Project Page and Code]

J.P. Hanna, S. Niekum, and P. Stone.
Importance Sampling in Reinforcement Learning with an Estimated Behavior Policy.
Machine Learning Journal (MLJ), June 2021.

A. Jain, R. Lioutikov, C. Chuck, and S. Niekum.
ScrewNet: Category-Independent Articulation Model Estimation From Depth Images Using Screw Theory.
IEEE International Conference on Robotics and Automation (ICRA), June 2021.
[Code]

A. Saran, R. Zhang, E.S. Short, and S. Niekum.
Efficiently Guiding Imitation Learning Algorithms with Human Gaze.
International Conference on Autonomous Agents and Multiagent Systems (AAMAS), May 2021.
[Code]

O. Kroemer, S. Niekum, and G. Konidaris.
A Review of Robot Learning for Manipulation: Challenges, Representations, and Algorithms.
Journal of Machine Learning Research, 22(30):1-82, January 2021.

2020

D.S. Brown, S. Niekum, and M. Petrik.
Bayesian Robust Optimization for Imitation Learning.
Neural Information Processing Systems (NeurIPS), December 2020.

Y. Cui, Q. Zhang, A. Allievi, P. Stone, S. Niekum, and W. Knox.
The EMPATHIC Framework for Task Learning from Implicit Human Feedback.
Conference on Robot Learning (CoRL), November 2020.
[Project Page and Code]

P. Goyal, S. Niekum, and R. Mooney.
PixL2R: Guiding Reinforcement Learning Using Natural Language by Mapping Pixels to Rewards.
Conference on Robot Learning (CoRL), November 2020.
[Project Page and Code]

C. Chuck, S. Chockchowwat, and S. Niekum.
Hypothesis-Driven Skill Discovery for Hierarchical Deep Reinforcement Learning.
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), October 2020.

A. Jain and S. Niekum.
Learning Hybrid Object Kinematics for Efficient Hierarchical Planning Under Uncertainty.
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), October 2020.

D.S. Brown, R. Coleman, R. Srinivasan, and S. Niekum.
Safe Imitation Learning via Fast Bayesian Reward Inference from Preferences.
International Conference on Machine Learning (ICML), July 2020.
[Project Page and Code]

R. Zhang, A. Saran, B. Liu, Y. Zhu, S. Guo, S. Niekum, D. Ballard, M. Hayhoe.
Human Gaze Assisted Artificial Intelligence: A Review.
International Joint Conference on Artificial Intelligence (IJCAI), July 2020.

2019

D.S. Brown, W. Goo, and S. Niekum.
Better-than-Demonstrator Imitation Learning via Automatically-Ranked Demonstrations.
Conference on Robot Learning (CoRL), October 2019.
[Project Page and Code]

A. Saran, E.S. Short, A.L. Thomaz, and S. Niekum.
Understanding Teacher Gaze Patterns for Robot Learning.
Conference on Robot Learning (CoRL), October 2019.
[Code]

P. Goyal, S. Niekum, and R. Mooney.
Using Natural Language for Reward Shaping in Reinforcement Learning.
International Joint Conference on Artificial Intelligence (IJCAI), August 2019.

D.S. Brown, W. Goo, P. Nagarajan, and S. Niekum.
Extrapolating Beyond Suboptimal Demonstrations via Inverse Reinforcement Learning from Observations.
International Conference on Machine Learning (ICML), June 2019.
[Project Page and Code]

J.P. Hanna, S. Niekum, and P. Stone.
Importance Sampling Policy Evaluation with an Estimated Behavior Policy.
International Conference on Machine Learning (ICML), June 2019.

W. Goo and S. Niekum.
One-Shot Learning of Multi-Step Tasks from Observation via Activity Localization in Auxiliary Video.
IEEE International Conference on Robotics and Automation (ICRA), May 2019.

Y. Cui, D. Isele, S. Niekum, and K. Fujimura.
Uncertainty-Aware Data Aggregation for Deep Imitation Learning.
IEEE International Conference on Robotics and Automation (ICRA), May 2019.

D.S. Brown and S. Niekum.
Machine Teaching for Inverse Reinforcement Learning: Algorithms and Applications.
AAAI Conference on Artificial Intelligence, February 2019.

2018

A. Jain and S. Niekum.
Efficient Hierarchical Robot Motion Planning Under Uncertainty and Hybrid Dynamics.
Conference on Robot Learning (CoRL), October 2018.
[Code] [Video]

D.S. Brown, Y. Cui, and S. Niekum.
Risk-Aware Active Inverse Reinforcement Learning.
Conference on Robot Learning (CoRL), October 2018.

A. Saran, S. Majumdar, E.S. Short, A.L. Thomaz, and S. Niekum.
Human Gaze Following for Human-Robot Interaction.
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), October 2018.
[Code]

Y. Cui and S. Niekum.
Active Reward Learning from Critiques.
IEEE International Conference on Robotics and Automation (ICRA), May 2018.

R.A. Gutierrez, V. Chu, A.L. Thomaz, and S. Niekum.
Incremental Task Modification via Corrective Demonstrations.
IEEE International Conference on Robotics and Automation (ICRA), May 2018.

D.S. Brown and S. Niekum.
Efficient Probabilistic Performance Bounds for Inverse Reinforcement Learning.
AAAI Conference on Artificial Intelligence, February 2018.
[Code]

M. Alshiekh, R. Bloem, R. Ehlers, B. Könighofer, S. Niekum, and U. Topcu.
Safe Reinforcement Learning via Shielding.
AAAI Conference on Artificial Intelligence, February 2018.

2017

A. Saran, B. Lakic, S. Majumdar, J. Hess, and S. Niekum.
Viewpoint Selection for Visual Failure Detection.
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), September 2017.

H.A. Poonawala, M. Alshiekh, S. Niekum, and U. Topcu.
Classification Error Correction: A Case Study in Brain-Computer Interfacing.
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), September 2017.

J.P. Hanna, P.S. Thomas, P. Stone, and S. Niekum.
Data-Efficient Policy Evaluation Through Behavior Policy Search.
Proceedings of the 34th International Conference on Machine Learning (ICML), August 2017.

J.P. Hanna, P. Stone, and S. Niekum.
Bootstrapping with Models: Confidence Intervals for Off-Policy Evaluation.
Proceedings of the 16th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), May 2017.

2016

P. Khandelwal, E. Liebman, S. Niekum, and P. Stone.
On the Analysis of Complex Backup Strategies in Monte Carlo Tree Search.
Proceedings of the 33rd International Conference on Machine Learning (ICML), June 2016.

2015

P.S. Thomas, S. Niekum, G. Theocharous, and G.D. Konidaris.
Policy Evaluation Using the Omega-Return.
Advances in Neural Information Processing Systems 28 (NeurIPS), pages 334-342, December 2015.

S. Niekum, S. Osentoski, C.G. Atkeson, and A.G. Barto.
Online Bayesian Changepoint Detection for Articulated Motion Models.
IEEE International Conference on Robotics and Automation (ICRA), May 2015.
[Code] [bibtex]

K. Hausman, S. Niekum, S. Osentoski, and G. Sukhatme.
Active Articulation Model Estimation through Interactive Perception.
IEEE International Conference on Robotics and Automation (ICRA), May 2015.
[Code] [bibtex]

S. Niekum, S. Osentoski, G.D. Konidaris, S. Chitta, B. Marthi, and A.G. Barto.
Learning Grounded Finite-State Representations from Unstructured Demonstrations.
International Journal of Robotics Research (IJRR), Vol. 34(2), pages 131-157, February 2015.
[Video] [Code] [bibtex] [Freely accessible draft]

S. Niekum.
A Brief Introduction to Bayesian Nonparametric Methods for Clustering and Time Series Analysis.
Technical report CMU-RI-TR-15-02, Robotics Institute, Carnegie Mellon University, January 2015.
[bibtex]

2014

S. Niekum, S. Osentoski, C.G. Atkeson, and A.G. Barto.
Learning Articulation Changepoint Models from Demonstration.
R:SS Workshop on Learning Plans with Context from Human Signals, July 2014.
[Code]

S. Niekum, S. Osentoski, C.G. Atkeson, and A.G. Barto.
CHAMP: Changepoint Detection Using Approximate Model Parameters.
Technical report CMU-RI-TR-14-10, Robotics Institute, Carnegie Mellon University, June 2014.
[Code] [bibtex]

2013

S. Niekum.
Semantically Grounded Learning from Unstructured Demonstrations.
Doctoral Dissertation, Department of Computer Science, University of Massachusetts Amherst, September 2013.

S. Niekum, S. Osentoski, S. Chitta, B. Marthi, and A.G. Barto.
Incremental Semantically Grounded Learning from Demonstration.
Robotics: Science and Systems 9 (RSS), June 2013.
[Video] [bibtex]

G.D. Konidaris, S. Kuindersma, S. Niekum, R.A. Grupen, and A.G. Barto.
Robot Learning: Some Recent Examples.
In Proceedings of the Sixteenth Yale Workshop on Adaptive and Learning Systems, pages 71-76, June 2013.

S. Niekum.
An Integrated System for Learning Multi-Step Robotic Tasks from Unstructured Demonstrations.
AAAI Spring Symposium: Reintegrating AI II, March 2013.

2012

S. Niekum, S. Osentoski, G.D. Konidaris, and A.G. Barto.
Learning and Generalization of Complex Tasks from Unstructured Demonstrations.
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 5239-5246, October 2012.
[Video] [Code] [bibtex]

S. Niekum.
Complex Task Learning from Unstructured Demonstrations.
AAAI Doctoral Consortium, July 2012.

2011

G.D. Konidaris, S. Niekum, and P.S. Thomas.
TDγ: Reevaluating Complex Backups in Temporal Difference Learning.
Advances in Neural Information Processing Systems 24 (NeurIPS), pages 2402-2410, December 2011.
[bibtex]

S. Niekum and A.G. Barto.
Clustering via Dirichlet Process Mixture Models for Portable Skill Discovery.
Advances in Neural Information Processing Systems 24 (NeurIPS), pages 1818-1826, December 2011.
[bibtex]

S. Niekum and A.G. Barto.
Clustering via Dirichlet Process Mixture Models for Portable Skill Discovery.
AAAI Workshop on Lifelong Learning from Sensorimotor Experience, August 2011.

2010

S. Niekum, A.G. Barto, and L. Spector.
Genetic Programming for Reward Function Search.
IEEE Transactions on Autonomous Mental Development, vol. 2, no. 2, pages 83-90, June 2010.
[Code] [bibtex]

S. Niekum.
Evolved Intrinsic Reward Functions for Reinforcement Learning.
Proceedings of the Twenty-Fourth Conference on Artificial Intelligence (AAAI), July 2010. (Extended abstract)

2005

S. Niekum.
Reliable Rock Detection and Classification for Autonomous Science.
Carnegie Mellon Senior Thesis, April 2005.

D.R. Thompson, S. Niekum, T. Smith, and D. Wettergreen.
Automatic Detection and Classification of Geological Features of Interest.
IEEE Aerospace Conference Proceedings, March 2005.
[bibtex]

T. Smith, S. Niekum, D.R. Thompson, and D. Wettergreen.
Concepts for Science Autonomy During Robotic Traverse and Survey.
IEEE Aerospace Conference Proceedings, March 2005.
[bibtex]