Machine Learning and Friends Lunch

Using Structure Indices for Efficient Approximation of Network Properties

Abstract

Statistics on networks have become vital to the study of relational data drawn from areas including bibliometrics, fraud detection, bioinformatics, and the Internet. Calculating many of the most important measures~Wsuch as betweenness centrality, closeness centrality, and graph diameter~Wrequires identifying short paths in these networks. However, finding these short paths can be intrac-table for even moderate-size networks. We introduce the concept of a network structure index (NSI), a composition of (1) a set of annotations on every node in the network and (2) a function that uses the annotations to estimate graph distance between pairs of nodes. We present several varieties of NSIs, examine their time and space complexity, and analyze their performance on synthetic and real data sets. We show that creating an NSI for a given net-work enables extremely efficient and accurate estimation of a wide variety of network statistics on that network.

Back to ML Lunch home