Server-side traffic analysis reveals mobile location information over the Internet

Abstract

Users can try to prevent third-party services from discovering their location by disabling location services on their mobile devices. In this paper, we show that web services can use throughput information to identify, from a set of candidate paths, the one taken by a phone and its owner. For example, a TCP-based music streaming service can compile a sequence of throughputs over several minutes. We collected hundreds of traces of music that we streamed to phones in two different scenarios: a user traveling from campus to four different towns (or in the reverse direction), and a user traveling within our campus. We evaluate three classifiers: k-Nearest Neighbors (k-NN), which compares a test sequence with the corresponding time points of training sequences; a Hidden Markov Model (HMM), which computes the transition and emission probabilities of different geographic areas and chooses the most likely sequence for a test trace; and a Naive Bayes classifier with kernel density estimation (KDE)-based throughput estimates (NB-KDE), which looks at the density of throughputs at each time point along a path. In our study, the k-NN, HMM, and NB-KDE approaches can distinguish between a small number of geographic routes taken by mobile users using only throughput measurements. The NB-KDE method performed best, using throughput alone to identify the path and direction among two roads within a university campus (4 classes) with 77% accuracy, and the path and direction among four roads out of town (8 classes) with 83% accuracy. Furthermore, it was able to classify among 8 paths with greater than 59% accuracy after one minute. We also examine the limitations of these techniques.
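As a rough illustration of the NB-KDE idea described above, the following minimal sketch scores a test trace against each candidate path by summing, over time points, the log of a Gaussian kernel density estimate built from that path's training throughputs at the same time point. It is not the authors' implementation: the function names, the fixed bandwidth, the density floor, the uniform prior over paths, the equal-length alignment of traces, and the synthetic throughput values (nominally in kbit/s) are all assumptions made for the example.

```python
import math


def gaussian_kde_density(x, samples, bandwidth):
    """Gaussian kernel density estimate at x, built from the given samples."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)


def classify_trace(test_trace, training, bandwidth=50.0, floor=1e-12):
    """Pick the path label whose per-time-point densities best explain the trace.

    training maps a (hypothetical) path label to a list of throughput traces.
    A uniform prior over paths is assumed, so only likelihoods are compared.
    """
    # Compare every path over the same number of time points.
    shortest = min(len(tr) for traces in training.values() for tr in traces)
    n = min(len(test_trace), shortest)

    scores = {}
    for label, traces in training.items():
        log_likelihood = 0.0
        for t, x in enumerate(test_trace[:n]):
            samples = [trace[t] for trace in traces]
            density = gaussian_kde_density(x, samples, bandwidth)
            # Floor the density so a single outlier cannot produce log(0).
            log_likelihood += math.log(max(density, floor))
        scores[label] = log_likelihood

    best = max(scores, key=scores.get)
    return best, scores


# Synthetic, made-up training traces for two hypothetical paths.
training = {
    "route_A_outbound": [[800, 600, 300], [780, 640, 280]],
    "route_B_outbound": [[300, 500, 900], [320, 450, 870]],
}

best, scores = classify_trace([790, 610, 310], training)
print(best)  # route_A_outbound for this synthetic data
```

Passing only a prefix of the trace, for example `classify_trace([790], training)`, gives an early guess from the first time point alone, which loosely mirrors the idea of classifying after a short observation window. In practice the bandwidth would need to be chosen from the data rather than fixed.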

Publication
IEEE Transactions on Mobile Computing