Fault and adversary tolerance as an emergent property of distributed systems' software architectures
by Yuriy Brun, Nenad Medvidovic
Abstract:
Fault and adversary tolerance have become not only desirable but required properties of software systems because mission-critical systems are commonly distributed on large networks of insecure nodes. In this paper, we describe how the tile style, an architectural style designed to distribute computation, can inject fault and adversary tolerance. The result is a notion of tolerance that is entirely abstracted away from the functional properties of the software system. The client may specify what fraction of the network is faulty or malicious (e.g., $25\%$) and the acceptable system failure rate (e.g., $2^-10$), and the system's architecture adjusts automatically to ensure a failure rate no higher than the one specified. The technique is entirely automated and consists of a ``smart redundancy'' mechanism that brings the failure rate exponentially close to $0$ by slowing down the execution speed linearly.
Citation:
Yuriy Brun and Nenad Medvidovic, Fault and adversary tolerance as an emergent property of distributed systems' software architectures, in Proceedings of the 2nd International Workshop on Engineering Fault Tolerant Systems (EFTS), 2007, pp. 38–43.
Bibtex:
@inproceedings{Brun07efts,
  author = {Yuriy Brun and Nenad Medvidovic},
  title = {\href{http://people.cs.umass.edu/brun/pubs/pubs/Brun07efts.pdf}{Fault
  and adversary tolerance as an emergent property of distributed systems'
  software architectures}},
  booktitle = {Proceedings of the 2nd International Workshop on Engineering
  Fault Tolerant Systems (EFTS)},
  venue = {EFTS},
  month = {September},
  date = {4},
  year = {2007},
  pages = {38--43},
  address = {Dubrovnik, Croatia},
  doi = {10.1145/1316550.1316557},

  note = {\href{http://dx.doi.org/10.1145/1316550.1316557}{DOI:
  10.1145/1316550.1316557}},

  abstract = {Fault and adversary tolerance have become not only desirable but
  required properties of software systems because mission-critical systems are
  commonly distributed on large networks of insecure nodes. In this paper, we
  describe how the tile style, an architectural style designed to distribute
  computation, can inject fault and adversary tolerance. The result is a notion
  of tolerance that is entirely abstracted away from the functional properties
  of the software system. The client may specify what fraction of the network is
  faulty or malicious (e.g., $25\%$) and the acceptable system failure rate
  (e.g., $2^{-10}$), and the system's architecture adjusts automatically to
  ensure a failure rate no higher than the one specified. The technique is
  entirely automated and consists of a ``smart redundancy'' mechanism that
  brings the failure rate exponentially close to $0$ by slowing down the
  execution speed linearly.},

  fundedBy = {NSF ITR-0312780},
}