Leveraging existing instrumentation to automatically infer invariant-constrained models
by Ivan Beschastnikh, Yuriy Brun, Sigurd Schneider, Michael Sloan, Michael D. Ernst
Abstract:

Computer systems are often difficult to debug and understand. A common way of gaining insight into system behavior is to inspect execution logs and documentation. Unfortunately, manual inspection of logs is an arduous process and documentation is often incomplete and out of sync with the implementation.

This paper presents Synoptic, a tool that helps developers by inferring a concise and accurate system model. Unlike most related work, Synoptic does not require developer-written scenarios, specifications, negative execution examples, or other complex user input. Synoptic processes the logs most systems already produce and requires developers only to specify a set of regular expressions for parsing the logs.

Synoptic has two unique features. First, the model it produces satisfies three kinds of temporal invariants mined from the logs, improving accuracy over related approaches. Second, Synoptic uses refinement and coarsening to explore the space of models. This improves model efficiency and precision, compared to using just one approach.

In this paper, we formally prove that Synoptic always produces a model that satisfies exactly the temporal invariants mined from the log, and we argue that it does so efficiently. We empirically evaluate Synoptic through two user experience studies, one with a developer of a large, real-world system and another with 45 students in a distributed systems course. Developers used Synoptic-generated models to verify known bugs, diagnose new bugs, and increase their confidence in the correctness of their systems. None of the developers in our evaluation had a background in formal methods but were able to easily use Synoptic and detect implementation bugs in as little as a few minutes.

Citation:
Ivan Beschastnikh, Yuriy Brun, Sigurd Schneider, Michael Sloan, and Michael D. Ernst, Leveraging existing instrumentation to automatically infer invariant-constrained models, in Proceedings of the 8th Joint Meeting of the European Software Engineering Conference and ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE), 2011, pp. 267–277.
Bibtex:
@inproceedings{Beschastnikh11fse,
  author = {Ivan Beschastnikh and Yuriy Brun and Sigurd Schneider and Michael
  Sloan and Michael D. Ernst},
  title =
  {\href{http://people.cs.umass.edu/brun/pubs/pubs/Beschastnikh11fse.pdf}{Leveraging
  existing instrumentation to automatically infer invariant-constrained
  models}},
  booktitle = {Proceedings of the 8th Joint Meeting of the European Software
  Engineering Conference and ACM SIGSOFT Symposium on the Foundations of
  Software Engineering (ESEC/FSE)},
  venue = {ESEC/FSE},
  month = {September},
  year = {2011},
  date = {5--9},
  pages = {267--277},
  address = {Szeged, Hungary},
  accept = {$\frac{34}{203} \approx 17\%$},
  doi = {10.1145/2025113.2025151},

  note = {\href{http://dx.doi.org/10.1145/2025113.2025151}{DOI:
  10.1145/2025113.2025151}},

  abstract = {<p>Computer systems are often difficult to debug and understand. A
  common way of gaining insight into system behavior is to inspect execution
  logs and documentation. Unfortunately, manual inspection of logs is an
  arduous process and documentation is often incomplete and out of sync with
  the implementation.</p>

  <p>This paper presents Synoptic, a tool that helps developers by inferring a
  concise and accurate system model. Unlike most related work, Synoptic does
  not require developer-written scenarios, specifications, negative execution
  examples, or other complex user input. Synoptic processes the logs most
  systems already produce and requires developers only to specify a set of
  regular expressions for parsing the logs.</p>

  <p>Synoptic has two unique features. First, the model it produces satisfies
  three kinds of temporal invariants mined from the logs, improving accuracy
  over related approaches. Second, Synoptic uses refinement and coarsening to
  explore the space of models. This improves model efficiency and precision,
  compared to using just one approach.</p>

  <p>In this paper, we formally prove that Synoptic always produces a model that
  satisfies exactly the temporal invariants mined from the log, and we argue
  that it does so efficiently. We empirically evaluate Synoptic through two
  user experience studies, one with a developer of a large, real-world system
  and another with 45 students in a distributed systems course. Developers
  used Synoptic-generated models to verify known bugs, diagnose new bugs, and
  increase their confidence in the correctness of their systems. None of the
  developers in our evaluation had a background in formal methods but were
  able to easily use Synoptic and detect implementation bugs in as little as a
  few minutes.</p>},

  fundedBy = {NSF CNS-0937060 to the CRA for the CIFellows Project, NSF CNS-0963754,
  Saarbr{\"{u}}cken Graduate School of Computer Science},
}