Beyond Prediction: Managing the Repercussions of Machine Learning Applications
Machine learning models are often designed to maximize a primary goal, such as accuracy. However, as these models are increasingly used to inform decisions that affect people's lives or well-being, it is often unclear what the real-world repercussions of their deployment might be — making it crucial to understand and manage such repercussions effectively. Models maximizing user engagement on social media platforms, e.g., may inadvertently contribute to the spread of misinformation and content that deepens political polarization. This issue is not limited to social media — it extends to other applications where machine learning-informed decisions can have real-world repercussions, such as education, employment, and lending. Existing methods addressing this issue require prior knowledge or estimates of analytical models describing the relationship between a classifier's predictions and their corresponding repercussions. We introduce Theia, a novel classification algorithm capable of optimizing a primary objective, such as accuracy, while providing high-confidence guarantees about its potential repercussions. Importantly, Theia solves the open problem of providing such guarantees based solely on existing data with observations of previous repercussions. We prove that it satisfies constraints on a model's repercussions with high confidence and that it is guaranteed to identify a solution, if one exists, given sufficient data. We empirically demonstrate, using real-life data, that Theia can identify models that achieve high accuracy while ensuring, with high confidence, that constraints on their repercussions are satisfied.
@inproceedings{Weber25neurips,
author = {Aline Weber and Blossom Metevier and Yuriy Brun and Philip S. Thomas and Bruno Castro da Silva},
title =
{Beyond Prediction:
Managing the Repercussions of Machine Learning Applications},
booktitle = {Proceedings of the 39th Annual Conference on Neural
Information Processing Systems (NeurIPS), Advances in Neural Information
Processing Systems 38},
venue = {NeurIPS},
address = {San Diego, CA, USA and Mexico City, Mexico},
month = {December},
date = {2--7},
year = {2025},
abstract = {Machine learning models are often designed to maximize a primary goal, such
as accuracy. However, as these models are increasingly used to inform
decisions that affect people's lives or well-being, it is often unclear what
the real-world repercussions of their deployment might be --- making it
crucial to understand and manage such repercussions effectively. Models
maximizing user engagement on social media platforms, e.g., may inadvertently
contribute to the spread of misinformation and content that deepens political
polarization. This issue is not limited to social media --- it extends to
other applications where machine learning-informed decisions can have
real-world repercussions, such as education, employment, and lending.
Existing methods addressing this issue require prior knowledge or estimates
of analytical models describing the relationship between a classifier's
predictions and their corresponding repercussions. We introduce Theia, a
novel classification algorithm capable of optimizing a primary objective,
such as accuracy, while providing high-confidence guarantees about its
potential repercussions. Importantly, Theia solves the open problem of
providing such guarantees based solely on existing data with observations of
previous repercussions. We prove that it satisfies constraints on a model's
repercussions with high confidence and that it is guaranteed to identify a
solution, if one exists, given sufficient data. We empirically demonstrate,
using real-life data, that Theia can identify models that achieve high
accuracy while ensuring, with high confidence, that constraints on their
repercussions are satisfied.},
accept = {$\frac{5{,}290}{21{,}575} \approx 25\%$},
}