by Ivo Krka, Yuriy Brun, Nenad Medvidovic
Abstract:
Software library documentation often describes individual methods' APIs, but not the intended protocols and method interactions. This can lead to library misuse, and restrict runtime detection of protocol violations and automated verification of software that uses the library. Specification mining, if accurate, can help mitigate these issues, which has led to significant research into new model-inference techniques that produce FSM-based models from program invariants and execution traces. However, there is currently a lack of empirical studies that, in a principled way, measure the impact of the inference strategies on model quality. To this end, we identify four such strategies and systematically study the quality of the models they produce for nine off-the-shelf, real-world libraries. We find that (1) using invariants to infer an initial model significantly improves model quality, increasing precision by 4% and recall by 41%, on average; (2) effective invariant filtering is crucial for quality and scalability of strategies that use invariants; and (3) using traces in combination with invariants greatly improves robustness to input noise. We present our empirical evaluation, implement new and extend existing model-inference techniques, and make public our implementations, subject libraries, ground-truth models, and experimental data. Our work can lead to higher-quality model inference, and directly improve the many techniques and tools that rely on model, specification, and API inference.
Citation:
Ivo Krka, Yuriy Brun, and Nenad Medvidovic, Automatic Mining of Specifications from Invocation Traces and Method Invariants, in Proceedings of the 22nd ACM SIGSOFT Symposium on the Foundations of Software Engineering (FSE), 2014, pp. 178–189.
Related:
A previous version appeared as University of Southern
California, Center for Software Engineering technical report
USC-CSSE-2013-509.
Bibtex:
@inproceedings{Krka14fse,
author = {Ivo Krka and Yuriy Brun and Nenad Medvidovic},
title =
{\href{http://people.cs.umass.edu/brun/pubs/pubs/Krka14fse.pdf}{Automatic
Mining of Specifications from Invocation Traces
and Method Invariants}},
booktitle = {Proceedings of the 22nd ACM SIGSOFT Symposium on the
Foundations of Software Engineering (FSE)},
venue = {FSE},
month = {November},
year = {2014},
date = {16--22},
address = {Hong Kong, China},
accept = {$\frac{61}{273} \approx 22\%$},
pages = {178--189},
doi = {10.1145/2635868.2635890},
note = {A previous version appeared as University of Southern
California, Center for Software Engineering technical report
USC-CSSE-2013-509.
\href{https://doi.org/10.1145/2635868.2635890}{DOI:
10.1145/2635868.2635890}},
previous = {A previous version appeared as University of Southern
California, Center for Software Engineering technical report
USC-CSSE-2013-509.},
abstract = {Software library documentation often describes
individual methods' APIs, but not the intended protocols and method
interactions. This can lead to library misuse, and restrict runtime
detection of protocol violations and automated verification of
software that uses the library. Specification mining, if accurate,
can help mitigate these issues, which has led to significant
research into new model-inference techniques that produce FSM-based
models from program invariants and execution traces. However, there
is currently a lack of empirical studies that, in a principled way,
measure the impact of the inference strategies on model quality. To
this end, we identify four such strategies and systematically study
the quality of the models they produce for nine off-the-shelf,
real-world libraries. We find that (1) using invariants to infer an
initial model significantly improves model quality, increasing
precision by 4\% and recall by 41\%, on average; (2) effective
invariant filtering is crucial for quality and scalability of
strategies that use invariants; and (3) using traces in combination
with invariants greatly improves robustness to input noise. We
present our empirical evaluation, implement new and extend existing
model-inference techniques, and make public our implementations,
subject libraries, ground-truth models, and experimental data. Our
work can lead to higher-quality model inference, and directly
improve the many techniques and tools that rely on model,
specification, and API inference.},
fundedBy = {NSF CCF-1117593, NSF CCF-1218115, NSF CCF-1321141,
Infosys Technologies Ltd},
}