14: OneSwarm

OneSwarm is a “privacy-preserving p2p data sharing” system. Forensic investigation is in some ways the mirror image of privacy-preservation; there is an implicit arms race between them.

We’re reading this paper and talking about (forensic) problems with the system as a way to guide your thinking about privacy-preserving systems. How private are they? How private are they in the face of various attacker models? Does the evidence one can gather from them point to legal culpability? Did the authors of the system consider these angles? And so on.

OneSwarm

OneSwarm (OS) has a clear motivator. At the time it was released, your choices for p2p file sharing were fast and public (BitTorrent) or slow and janky (something-over-Tor, or maybe Freenet). OS is an attempt at a novel-ish set of design choices to give users a better choice in the tradeoffs here.

Goals are to reduce the costs (make setup easy, make different trust models straightforward, allow interop with BitTorrent when acceptable, be efficient and robust).

In OneSwarm, data objects are located and transferred through a mesh of untrusted and trusted peers populated from user social networks. Content lookup and transfer is anonymous, congestion-aware, and multipath.

Example

Figure 1 illustrates the range of privacy-preserving options supported by OneSwarm. Bob downloads public data using OneSwarm’s backwards compatibility with existing BitTorrent implementations, and makes the downloaded file available to other OneSwarm users. Alice downloads the file from Bob without attribution using OneSwarm’s privacy-preserving overlay, but she is then free to advertise the data to friends. Advertisements include a cryptographic capability, which allows only permitted friends to observe the file at Alice.

During the download, Bob also acts as a replica for sharing without attribution using an overlay consisting of OneSwarm peers only. This overlay acts as a mix [9], using source-address rewriting and multi-hop overlay forwarding to obscure the identities of a path’s source and destination. Alice is one such destination. She is free to advertise the file explicitly to friends who may also be interested in the content.

Each case shown in Figure 1 imposes a different tradeoff between privacy and efficiency.

Public distribution: All data sharing need not be private. This is the case for which existing P2P systems excel, and OneSwarm draws on this strength by serving as a fully backwards compatible BitTorrent client.
With permissions: Persistent identities allow OneSwarm users to define per-file permissions. In this case, access to files is restricted (rather than attribution of source or destination). In OneSwarm, capabilities restrict access to protected files, allowing all permitted users to recognize one another and engage in swarming downloads for scalability.
Without attribution: When sharing sensitive data, privacy depends on obscuring attribution of source and/or destination. Unlike data shared with permissions, which is directly advertised, data shared without attribution is located using privacy-preserving keyword search, and data transfers are relayed through an unknown number of intermediaries to obscure source and destination.

Identities and Trust in OneSwarm

Each OneSwarm user is named using a cryptographic key that identifies that user among its peers. OneSwarm identities are persistent, allowing two users that have exchanged keys to locate and connect to one another whenever both are online. Long-term identities are linked to transient IP-addresses and port numbers via a distributed hash table (DHT) maintained among all users.

(Aside: How do DHTs work? Very high-level handwavy: each peer is responsible for some part of an address space, and stores the key-value pairs that fall into that address space. There are lots of details related to how they bootstrap, how you provide redundancy, failover, etc., but they’re not super material here.)

DHT entries for a client P are signed by P and encrypted with the public key of a given peer. Each client’s location in the DHT is independent of its identity and is determined by hashing the client’s current IP address and DHT port. This inhibits systematic monitoring of targeted regions of the DHT key space since the region for which each client is responsible is determined by that client’s network address and port, which is certified during DHT operations by other OneSwarm peers. (Editorial by Marc: Well, you can at least vary port to jump around the DHT space. And any institutional actor has access to many IPs; this argument is kinda weak as a result.)
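To make the editorial point concrete, here is a minimal sketch of address-derived DHT placement. The hash function (SHA-1) and the `ip:port` string encoding are assumptions chosen for illustration, not details taken from the OneSwarm implementation; `dht_position` and `positions_for_ports` are hypothetical names.

```python
import hashlib


def dht_position(ip: str, port: int) -> int:
    """Map a client's current network address to a point in a 160-bit key space.

    Assumption: SHA-1 over the ASCII string "ip:port". The real client may
    encode the address differently or use a different hash.
    """
    digest = hashlib.sha1(f"{ip}:{port}".encode("ascii")).digest()
    return int.from_bytes(digest, "big")


def positions_for_ports(ip: str, ports) -> dict:
    """Show where one IP address would land for each candidate DHT port."""
    return {port: dht_position(ip, port) for port in ports}


if __name__ == "__main__":
    # 192.0.2.10 is a documentation-only address (RFC 5737).
    for port, pos in positions_for_ports("192.0.2.10", range(6881, 6885)).items():
        print(port, f"{pos:040x}")
```

Under these assumptions, each port choice yields an unrelated position, so a single host can pick among tens of thousands of locations in the key space, and an actor with many IP addresses can pick among many more.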

Linking peers

Between two OneSwarm users that share a real-world trust relationship, OneSwarm automates key exchange in three ways: Local area network discovery; existing social networks (Facebook, etc.); and email invitations.

Key management

You can do per-key per-file management in OS, but you can also “subscribe” to a community of keys. Most details are not important here, but one is: there are centralized “community servers” that maintain lists of registered users. You can also use these servers to find a set of untrusted peers, which increases robustness and privacy when sharing data without attribution.

Since registration with public community servers is unrestricted, all peers obtained from one are treated as untrusted by default. Registration itself is a three step process. First, the OneSwarm client provides its public key, which the server then verifies by issuing a challenge nonce value and verifying the incremented, encrypted response. Finally, the server uses consistent hashing of the key to compute a subset of peers to return to the client.

Community server registration is designed to inhibit systematic crawling of the membership list of a public community server. Verifying keys with a challenge/response allows the server to limit the number of registrations by a single IP address, and consistent hashing limits the information obtained from repeated membership queries. Although an attacker with significant resources can evade these restrictions and obtain a complete view, doing so is of limited value. The overlay topology is an amalgam of links from community servers, manual exchanges, email invitations, and other social networks; a crawl of community servers provides only a partial view, and more privacy conscious users need not subscribe to any community server whatsoever.
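One way to picture why consistent hashing limits what repeated membership queries reveal is the sketch below. It is an illustration under assumptions, not the community server's actual algorithm: keys are placed on a ring by SHA-1, and a client is handed the `n` registered keys that follow its own position. The names `ring_point` and `peers_for` are hypothetical.

```python
import hashlib
from bisect import bisect_right


def ring_point(public_key: bytes) -> int:
    """Place a public key on a hash ring (assumption: SHA-1 of the raw key bytes)."""
    return int.from_bytes(hashlib.sha1(public_key).digest(), "big")


def peers_for(client_key: bytes, registered: list, n: int) -> list:
    """Return the n registered keys that follow the client's point on the ring.

    The answer depends only on the client's key and the membership list, so
    asking again with the same key reveals nothing new.
    """
    ring = sorted((ring_point(k), k) for k in registered if k != client_key)
    if not ring:
        return []
    start = bisect_right(ring, (ring_point(client_key), client_key))
    count = min(n, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


if __name__ == "__main__":
    members = [f"key-{i}".encode() for i in range(20)]
    print(peers_for(b"client-key", members, 5))
```

To see a different slice of the membership list, a crawler in this model has to register a different key, which is exactly what the per-IP registration limit is meant to make expensive.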

Locating and transferring data

How does OS name, search for, and transfer data?

Public data is just torrented.

Data shared with trusted peers, likewise.

But there are some differences. First, instead of sharing all data publicly with a dynamic set of peers, OneSwarm users explicitly define the trust level of a persistent set of peers (by default peers are untrusted). Second, instead of centralizing information about which peers have which data objects, e.g., at a coordinating tracker as in BitTorrent, OneSwarm peers locate distant data sources by flooding object lookups through the overlay. Third, instead of sources sending data directly to receivers, data transfers occur over the reverse overlay search path, using address rewriting to obscure sender and receiver identities (at least when you need non-attributable transfers).

Naming

Shared files (or groups of files) are named in OneSwarm using the 160 bit SHA-1 hash of their name and content. The low order 64 bits of this hash are used to identify swarms in search messages that are flooded to discover potential data sources. For public data, users obtain content hashes 1) out-of-band, e.g., from an email or website, 2) from file list messages exchanged with peers, or 3) from keyword search in the overlay. For private data the user must obtain both the hash of the data as well as capability used for decryption. We describe transfer setup via search since this subsumes the other cases.
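The naming scheme can be sketched as follows. Two things here are assumptions made for illustration: that name and content are simply concatenated before hashing, and that "low order 64 bits" means the least significant bits of the digest read as a big-endian integer. The function names are hypothetical.

```python
import hashlib


def swarm_hash(name: str, content: bytes) -> bytes:
    """160-bit SHA-1 over name and content (assumption: plain concatenation)."""
    h = hashlib.sha1()
    h.update(name.encode("utf-8"))
    h.update(content)
    return h.digest()  # 20 bytes = 160 bits


def swarm_search_key(digest: bytes) -> int:
    """Low-order 64 bits of the digest, used to identify the swarm in searches."""
    return int.from_bytes(digest, "big") & ((1 << 64) - 1)


if __name__ == "__main__":
    digest = swarm_hash("example.txt", b"hello, swarm")
    print(digest.hex(), f"{swarm_search_key(digest):016x}")
```

Because only 64 of the 160 bits travel in search messages, a search key identifies a swarm without disclosing the full hash to intermediaries.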

Search

To discover shortest paths, OneSwarm relies on flooding. Keyword search messages include a randomly generated search ID and list of keywords. Unlike flooding search in other P2P file sharing networks, OneSwarm search messages do not include a time-to-live value since this information would allow intermediaries nearby the source or destination to easily reason about behavior. Instead, OneSwarm forwards searches to trusted peers provided the forwarder has idle capacity and the search has not been forwarded previously. Clients maintain a history of search messages to avoid forwarding duplicates.

Among untrusted peers, forwarding is randomized to prevent collusion attacks. Instead of forwarding unmatched search messages to all peers, OneSwarm forwards searches to untrusted peers probabilistically. This inhibits colluding untrusted peers from inferring a data source by observing the lack of a forwarded search message. To prevent information leakage through repeated queries, the decision to forward a search is made randomly —but deterministically— so repeated queries for the same data will yield the same result.
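A "random but deterministic" decision can be built by deriving the coin flip from a keyed hash instead of a fresh random draw. The sketch below is one possible construction, not OneSwarm's code: the per-node secret, the choice of HMAC-SHA-256, and the name `should_forward` are all assumptions. Note that the input is the searched-for data rather than the random search ID, since the ID changes on every query.

```python
import hashlib
import hmac


def should_forward(local_secret: bytes, search_key: bytes, peer_id: bytes,
                   p_forward: float) -> bool:
    """Decide whether to forward a search to one untrusted peer.

    The same (search_key, peer_id) pair always gives the same answer on this
    node, so repeating a query does not produce a new draw to average over.
    """
    mac = hmac.new(local_secret, search_key + b"|" + peer_id, hashlib.sha256).digest()
    u = int.from_bytes(mac[:8], "big") / 2 ** 64  # pseudo-uniform value in [0, 1)
    return u < p_forward


if __name__ == "__main__":
    secret = b"per-node secret (hypothetical)"
    for peer in (b"peer-a", b"peer-b", b"peer-c"):
        print(peer, should_forward(secret, b"some-swarm", peer, 0.5))
```

Keeping the secret local matters in this sketch: if peers could compute the decision themselves, a missing forward would again be informative.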

To avoid the propagation of every search to every client in the overlay, each client delays each search message for at least 150 milliseconds before forwarding it to peers. The search source (or any forwarder) may terminate popular searches for which many data sources have already been discovered by sending a search cancel message to nodes to which they have sent or forwarded a search message. (Search cancels are also sent if the upstream peer disconnects.) The search cancel message is forwarded along the same paths as the corresponding search message but without any forwarding delay, allowing cancel messages to quickly reach the search frontier.

In addition to the fixed forwarding delay for search cancellation, OneSwarm also delays messages based on the load at each intermediary. Where load is high, search propagation will tend to route around it, improving performance. When excess capacity exists, search messages will follow the shortest path, reducing transfer overhead.

Path setup

If a node is sharing a file that matches a search query, it does not forward the search and instead responds with a search reply message. Among trusted peers, this response is immediate.

But, receiving a search reply message in less than 150 ms would reveal the responder as a data source to potentially untrusted peers. (Do you see why?)

To prevent this, users delay search reply messages (and all protocol messages) sent to untrusted peers in order to emulate the delay of a longer path. This value is chosen randomly between 150–300 ms (i.e., 1–2 hops). As with forwarding of search messages, the delay value is persistent for a particular file and a particular peer to prevent information leakage from repeated queries. (Again, do you see why?)
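A persistent per-file, per-peer delay does not require storing a table of past draws; it can be recomputed on demand from stable inputs. The following is a sketch under assumptions (a hypothetical per-node secret, HMAC-SHA-256, and whole-millisecond granularity), not the client's actual code.

```python
import hashlib
import hmac

R_MIN_MS = 150  # lower end of the reply delay range given in the notes
R_MAX_MS = 300  # upper end of the reply delay range given in the notes


def reply_delay_ms(local_secret: bytes, file_hash: bytes, peer_id: bytes) -> int:
    """Artificial reply delay for one (file, untrusted peer) pair.

    Derived from a keyed hash, so the same pair always sees the same delay.
    """
    msg = b"reply-delay|" + file_hash + b"|" + peer_id
    mac = hmac.new(local_secret, msg, hashlib.sha256).digest()
    span = R_MAX_MS - R_MIN_MS + 1
    return R_MIN_MS + int.from_bytes(mac[:8], "big") % span


if __name__ == "__main__":
    secret = b"per-node secret (hypothetical)"
    print(reply_delay_ms(secret, b"file-hash-1", b"peer-a"))
    print(reply_delay_ms(secret, b"file-hash-1", b"peer-b"))
```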

(Some details about PathIDs elided.)

Transfers can start as soon as one path is discovered, and new searches can be launched to replace paths that fail.

Transfers

OneSwarm uses the wire-level protocol from BitTorrent to transfer data, first obtaining a list of block hashes corresponding to the metadata stored in .torrent files. But, rather than connecting directly to peers, OneSwarm tunnels BitTorrent traffic through overlay paths. Each overlay path is treated as a virtual peer, even those that terminate at the same endpoint. Of course, the receiver has no definitive way to know which paths terminate where. Rather than obtaining a list of peers from a centralized tracker, as in BitTorrent, OneSwarm discovers new paths by periodically flooding search messages for active downloads.

Security analysis

High-level analysis:

Persistent peering relationships limit monitoring power: In BitTorrent, peers are dynamically assigned, allowing attackers to become a peer of virtually everyone, given enough time. By contrast, OneSwarm peers are persistent, improving contribution incentives but also limiting the ability of attackers to inject nodes at arbitrary locations in the overlay.

Heterogeneity of trust relationships foils timing attacks: OneSwarm users define links as either trusted or untrusted and keep this information private. As the protocol behavior varies with link type, the combined use of trusted and untrusted links greatly diminishes an attacker’s ability to infer the length of an overlay path based on timing information.

Lack of source routing limits correlation attacks: OneSwarm does not provide peers with the ability to construct arbitrary overlay paths. Attackers could use this to correlate performance with ongoing transfers. Such an attack is known to degrade privacy in Tor. Individual clients have a limited view of the overlay and cannot control path setup beyond directly connected neighbors.

Constrained randomness frustrates statistical attacks: The uncertainty arising from random perturbations in the protocol could be reduced through statistical analysis if repeated probes yielded different draws. OneSwarm prevents such analysis by making all random decisions deterministically with respect to a given query and link.

Network dynamics limit value of historical data: While relationships in OneSwarm are long lived, the end-to-end paths between senders and receivers change rapidly due to churn and transient congestion. This reduces the window of opportunity for adversaries to combine data from multiple observations in order to reverse-engineer user behavior.

Inferring data sources

(This is from the original OS paper, not the one that breaks it.)

Timing:

By measuring the round trip time (RTT) of search / response pairs, an attacker can estimate the proximity of a data source. Usually, paths are lengthy, making the chances of being next to any particular data source quite low. For a small number of requests, however, an attacker might be directly connected to a data source and also be able to identify it as such based on the low RTT of response messages. To frustrate this attack, OneSwarm artificially inflates delays for queries received from untrusted peers; all responses to untrusted peers are delayed by a random but deterministic amount (computed based on the content hash) in order to emulate the delay profile of forwarded traffic from one or more hops away.

Collusion:

A sends a targeted search to T, receives a search response, and observes whether the search was forwarded to colluding peers C1, …, Ck. Recall that forwarding search messages is probabilistic to provide deniability. Each search message has a configurable probability, pf, of not being forwarded to a particular peer. As a result, a lack of forwarding does not definitively identify a data source; missing search messages may arise from random chance. But, a lack of forwarding observed by many colluding peers is highly suggestive of T sourcing the object. Assuming a fixed forwarding probability of pf and k colluders, Pr[Not source | response received] = (1 − pf)^k. With just a few colluders, an attacker can gain very high confidence.

This is “prevented” by community servers giving a fixed random set of peers to each client.

Deconstructing overlay paths:

(Empirically they show it’s hard to do this, because of path length, stability, and diversity.)

Appendix analyzes a few other attacks.

A Forensic Analysis of OneSwarm

The investigator is essentially an attacker, attempting to violate OneSwarm’s privacy promises, though more limited in ability than is typically assumed. Their goal is to identify a subset of all OneSwarm peers that are each sharing (or conspiring to share) child pornography; such content represents a small fraction of the files shared on the network. The overriding goal of the attacker is to gather evidence sufficient for a search warrant; i.e., probable cause.

Probable cause is a lower standard than the beyond a reasonable doubt standard needed for conviction. There is no quantitative standard for probable cause, and courts have defined it only qualitatively. Accordingly, for the purposes of our study, we say a peer has been identified if the investigator’s statistical confidence is above a sufficient level.

Model and Assumptions. The general approach adopted by law enforcement for investigating CP trafficking is based on a series of legal restrictions depending on the country of jurisdiction. We assume our attacker is an investigator following common restrictions. Specifically, we assume that the investigation can gather only information available publicly, which includes all traffic sent to other peers (since anyone in the public can be a peer). We do not allow attackers to seize or compromise peers through privilege escalation.

Timing attacks

In OS:

File search queries require message flooding.
Query propagation is stopped by a flooded cancelation message.

The query flooding and forwarding delay have a profound impact on communication cost in the network. Specifically, from these properties alone it is possible to derive lower bounds on query response round trip time (RTT).

Naive Timing Attack. Here attacker A compares the application-level response time to the network-based roundtrip time; if they are similar, then the attacker concludes that T is the source. A’s reasoning is that a query that takes no longer than the network-level round trip time to T, could not have been forwarded. And since T forwards all requests that it cannot fulfill itself, T must be a source for file f.

To defeat the attack, T introduces a randomly chosen (but consistent for a given file) response delay. The delay should be long enough that the application and network level round trip times appear to vary significantly. Thus the attacker cannot easily determine if the difference is due to network jitter or if the query is actually being forwarded to other peers. OneSwarm employs this defense by introducing a response delay of 150–300 ms before answering queries.

Simple Timing Attack

While it’s relatively easy to defeat the naive timing attack described above, we define a slightly more sophisticated attack that can be carried out in the event that the maximum response delay rmax is not chosen carefully. The attack refers to the following quantities. Each is in units of milliseconds.

r: query response delay between rmin and rmax
q: query forwarding delay between qmin and qmax
l: one-way network-layer latency between two peers
δ: application-layer roundtrip query response latency

The simple timing attack consists of the observation that if T did in fact forward A’s query on to S, then the total query response latency, δ, must be larger than the sum of network- and application-level delays associated with that path.

OBSERVATION 1:

T is the source of file f if its query response time to A is such that δ < qmin + rmin + 4l ms.

(See Figure 1.)
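The bound in Observation 1 comes from adding up the cheapest possible forwarded path A → T → S → T → A: four one-way network hops of latency l, plus T's minimum forwarding delay qmin and S's minimum response delay rmin. A minimal sketch of the attacker's test, assuming the same latency l on both links (as the single symbol l suggests) and the 150 ms minimums quoted in these notes; the function names are hypothetical:

```python
Q_MIN_MS = 150.0  # minimum query forwarding delay (from the notes)
R_MIN_MS = 150.0  # minimum query response delay (from the notes)


def forwarded_lower_bound_ms(l_ms: float, q_min: float = Q_MIN_MS,
                             r_min: float = R_MIN_MS) -> float:
    """Smallest response latency A could see if T had forwarded the query to S."""
    return q_min + r_min + 4.0 * l_ms


def looks_like_source(delta_ms: float, l_ms: float, q_min: float = Q_MIN_MS,
                      r_min: float = R_MIN_MS) -> bool:
    """Observation 1: a reply faster than any forwarded path implicates T."""
    return delta_ms < forwarded_lower_bound_ms(l_ms, q_min, r_min)


if __name__ == "__main__":
    # Illustrative numbers only: 20 ms one-way latency, 250 ms observed reply.
    print(forwarded_lower_bound_ms(20.0), looks_like_source(250.0, 20.0))
```

In practice the attacker has to estimate l from network-level measurements, so a real test would need some margin for jitter; the sketch ignores that.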

Observation 1 establishes a test that A can use to decide if T hosts file f. Naturally, T would like to disguise the fact that it hosts f by delaying its query response in order to confuse A. But T is constrained by rmax, which is the maximum amount of time it can wait to respond to a query request. So rmax should be chosen so that it’s possible that the query request was forwarded to S. Specifically, the value of r chosen by T must exceed the minimum application-level query response time from T to S, which is equivalent to the bound in Observation 1 less the network delay from A to T. Based on this reasoning and Fig. 1 (left), we have the following constraint.

OBSERVATION 2: In order for T to disguise the fact that it hosts f it must be the case that,

r ≥ qmin + rmin + 2l ms.

Based on its technical description and Observation 2, the OneSwarm network is susceptible to the Simple Timing Attack. Recall from Section 2, that query response delay r and query forwarding delay q are bounded as follows.

150 ms ≤ q ≤ 300 ms
150 ms ≤ r ≤ 300 ms

This means that

r ≤ 300 ms < 300 ms + 2l = qmin + rmin + 2l for any l > 0, which violates the constraint in Observation 2.
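The same arithmetic can be checked mechanically. This sketch takes the delay bounds quoted above and asks whether any permitted r can satisfy Observation 2 for a given one-way latency l; the 20 ms latency in the example is an illustrative value, and the function names are hypothetical.

```python
def can_disguise(r_max_ms: float, q_min_ms: float, r_min_ms: float,
                 l_ms: float) -> bool:
    """True if the largest permitted response delay satisfies Observation 2."""
    return r_max_ms >= q_min_ms + r_min_ms + 2.0 * l_ms


def smallest_safe_r_max_ms(q_min_ms: float, r_min_ms: float, l_ms: float) -> float:
    """Smallest r_max for which Observation 2 could hold at latency l."""
    return q_min_ms + r_min_ms + 2.0 * l_ms


if __name__ == "__main__":
    # OneSwarm's bounds as quoted in the notes, with an example latency of 20 ms.
    print(can_disguise(300.0, 150.0, 150.0, 20.0))
    print(smallest_safe_r_max_ms(150.0, 150.0, 20.0))
```

With qmin = rmin = 150 ms, the required ceiling is 300 ms plus 2l, so the 300 ms cap falls short whenever the T-to-S link has any latency at all.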

On trust

Messages sent to a “trusted friend” are direct and without added delays. The Simple Timing Attack will not distinguish between three cases: (a) the target is the source of a CP file; (b) the target is a direct proxy for a trusted friend that it knows is sourcing CP files; (c) the target is a direct proxy for a trusted friend that it does not know is sourcing CP files. In all cases, the machine contains evidence of a crime and a warrant is appropriate. The Simple Timing Attack far exceeds the probable cause standard required to grant a warrant whose execution will clarify which case applies.

Notably: Case (b) occurs when the user of the target machine is a trusted friend of the possessor and views the filenames he shares. Filenames are shared by the possessor to its trusted peers by default and shown in each trusted peer’s GUI. And though not a source of the content, in this case, a target proxying for a trusted friend is part of a conspiracy to knowingly distribute CP. Furthermore, by participating in the trusted relationship with the source, the target gains the non-pecuniary benefits of bandwidth and better performance, which can incur a greater punishment in some jurisdictions. In short, trusting a neighbor can raise significant criminal liability.

Best-Case Query Response Latency

(A nice analysis of performance problems in OS, but not pertinent here.)

Simple Timing Attack in Practice

Fig. 3: highly separable distributions observed in practice.

Collusion attack, revisited

OneSwarm aims to thwart traffic analysis of a peer by its neighbor with a particular defense: queries are forwarded to its many neighbors only with probability p when the peer doesn’t have requested content. Attacks can defeat this defense as follows: one Sybil sends a request for content and one or more colluding Sybils listen to see if the request is forwarded. This collusion attack gains precision as more colluders are involved. If all other attacks we present in this paper are patched, the collusion attack remains.

Suppose that, after C1 sends an initial query message for f to T, none of the Sybils C2, …, Ck receives a query message from T. The probability that T possesses f can now be bounded by observing that each Sybil has independent probability p of receiving a query when T does not possess f. Therefore, the chance that T will forward the query to at least one of the k − 1 colluders when it does not have f (i.e., the attack precision) is 1 − (1 − p)^(k−1).

Achieving 95% precision requires at least k = 6 attackers (a querier and 5 colluders) to be directly connected to target T. Isdal et al. made some modeling errors. When corrected, we see that P{A ≥ 6} = 0.07. It turns out that as f’s popularity increases, fewer colluders are actually necessary to achieve the same attack precision. (details in paper)
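The k = 6 figure can be reproduced from the precision formula 1 − (1 − p)^(k−1). The forwarding probability is not stated in these notes; the sketch below assumes p = 0.5, which is the value consistent with needing five colluders for 95% precision. Function names are hypothetical.

```python
def attack_precision(p: float, k: int) -> float:
    """Precision with k attackers: one querier plus k - 1 listening colluders."""
    return 1.0 - (1.0 - p) ** (k - 1)


def min_attackers(p: float, target: float) -> int:
    """Smallest k (querier included) whose precision reaches the target."""
    if not (0.0 < p <= 1.0 and 0.0 <= target < 1.0):
        raise ValueError("need 0 < p <= 1 and 0 <= target < 1")
    k = 2  # at least one querier and one colluder
    while attack_precision(p, k) < target:
        k += 1
    return k


if __name__ == "__main__":
    P_ASSUMED = 0.5  # assumption: not given in the notes
    for k in range(2, 8):
        print(k, attack_precision(P_ASSUMED, k))
    print("attackers needed for 95%:", min_attackers(P_ASSUMED, 0.95))
```

With p = 0.5, five attackers give 0.9375 and six give 0.96875, so six is the first count that clears 95%. A higher forwarding probability lowers the number of colluders needed.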

TCP-based attacks

In this section, we demonstrate a novel adaptation of a known TCP-based attack [20] that can identify whether a OneSwarm peer is the source of data or a proxy. Peers that do not rate limit outgoing traffic are vulnerable.

The attack leverages optimistic acking, where a receiver sends TCP acknowledgements for data before it is received, increasing throughput. Fig. 9 illustrates our TCP attack scenario. An attacker requests a file from a target. The attacker induces a higher-bandwidth connection between itself and the target than between the target and a potential source of data. If the attacker-target bandwidth t can be made greater than any potential target-source bandwidth s and the target is not the source, the transfer will stall out. The stall occurs because the target’s application-level buffers will run out before the actual source can fill them.
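The stall condition can be illustrated with a toy model. This is a deliberately crude sketch and not a TCP simulation: time advances in ticks, rates are whole blocks per tick, and optimistic acking is represented only by its effect (the attacker pulls at rate t no matter what). All names and numbers are assumptions for illustration.

```python
def count_stalls(file_blocks: int, attacker_rate: int, source_rate: int,
                 target_is_source: bool) -> int:
    """Count ticks in which the target cannot supply what the attacker pulls.

    attacker_rate plays the role of t and source_rate the role of s.
    """
    if file_blocks < 0 or attacker_rate <= 0 or source_rate <= 0:
        raise ValueError("rates must be positive and file size non-negative")
    buffered = file_blocks if target_is_source else 0  # blocks held by the target
    fetched = buffered  # blocks the target has obtained so far
    sent = 0
    stalls = 0
    while sent < file_blocks:
        if fetched < file_blocks:
            # A proxy refills its buffer from the real source at rate s.
            new = min(source_rate, file_blocks - fetched)
            fetched += new
            buffered += new
        want = min(attacker_rate, file_blocks - sent)
        out = min(want, buffered)
        if out < want:
            stalls += 1  # buffer ran dry: the observable stall
        buffered -= out
        sent += out
    return stalls


if __name__ == "__main__":
    print("source:", count_stalls(100, 10, 2, target_is_source=True))
    print("proxy: ", count_stalls(100, 10, 2, target_is_source=False))
```

In this model a true source never stalls, while a proxy stalls whenever t exceeds s. Rate limiting outgoing traffic caps t, which is why peers that rate limit are not vulnerable.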