I was recently quoted in a story in the New Scientist about a new attack on Tor. The quote was a combination of somewhat sloppy wording on my part and a lack of context on the reporter’s part, so I’d like to provide context and more precise wording here. The quote is:
“There are lots of vulnerabilities in Tor, and Tor has always been open about the various vulnerabilities in its system,” says Hal Roberts at Harvard University, who studies censorship and privacy technologies. “Tor is far from perfect but better than anything else widely available.”
The basic idea of the attack described in the article is to use a rogue Tor exit node to insert an address owned by the attacker into a BitTorrent stream to fool the client into connecting to that address via UDP, which is not anonymized by Tor. So when the BitTorrent client connects to the UDP address, the attacker can discover the user’s real IP address. This sort of attack on Tor is well known; the paper’s authors call it a ‘bad apple’ attack. Tor’s core job is just to provide a secure TCP tunnel, but most real-world applications do much more than communicate over a single TCP connection. For example, in addition to HTTP requests for web pages, web browsers make DNS requests to look up host names, so any end user packaging of Tor has to make sure that DNS lookups happen over the Tor tunnel (as TorButton does). Tor does not ultimately control the applications that use its tunnels but relies on those applications to use its TCP tunnel exclusively to maintain the privacy of the user.
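To make the leak concrete, here is a toy model in Python of the ‘bad apple’ step. It is only a sketch under stated assumptions: it opens no real connections, every address and host name is a made-up placeholder from the reserved documentation ranges, and the function names are hypothetical and come from neither Tor nor any BitTorrent client. The single rule it encodes is the one described above: TCP goes through the tunnel, UDP does not.

```python
from dataclasses import dataclass

# All addresses below are hypothetical placeholders (documentation ranges).
USER_IP = "198.51.100.7"       # the user's real address
EXIT_IP = "192.0.2.10"         # the rogue exit node's address
ATTACKER_PEER = "203.0.113.5"  # address where the attacker listens for peers


@dataclass(frozen=True)
class Packet:
    transport: str    # "tcp" or "udp"
    source_ip: str    # the address the receiver sees
    destination: str


def send(transport: str, destination: str) -> Packet:
    """Toy rule: only TCP is carried by the tunnel; UDP leaves directly."""
    source = EXIT_IP if transport == "tcp" else USER_IP
    return Packet(transport, source, destination)


def rogue_exit_rewrites(peer_list: list[str]) -> list[str]:
    """The rogue exit appends its own listener to the tracker's peer list."""
    return peer_list + [ATTACKER_PEER]


# The tracker request travels over TCP, so the tracker sees the exit's address.
tracker_request = send("tcp", "tracker.example")

# The tracker's response passes back through the rogue exit, which tampers with it.
peers = rogue_exit_rewrites(["192.0.2.200"])

# The client then contacts every listed peer over UDP, outside the tunnel.
peer_packets = [send("udp", peer) for peer in peers]

# Source addresses observed by the attacker's listener.
seen_by_attacker = [
    p.source_ip for p in peer_packets if p.destination == ATTACKER_PEER
]
print(seen_by_attacker)
```

In this model the tracker only ever sees the exit node's address, while the attacker's listener sees the user's own address, and that gap is the whole attack.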
Tor’s conundrum is that at the end of the day what end users need is anonymous communications through applications, not secure TCP tunnels. So even though Tor can’t be responsible for making every application in existence behave nicely with it, to be actually useful it has to take some responsibility for the most common end user applications. To this end, Tor works closely with the Firefox developers to make Firefox work as well as possible with Tor, and Tor and associated folks have invested lots of effort into tools that improve the interface between the browser, the user, and Tor. But there’s only so much that Tor can do here in the world of all applications.
These attacks might not be considered ‘vulnerabilities in Tor’, as I say above, so I should have been more careful with my language (though most folks who do these press interviews struggle with the danger that any given sentence from an hour-long conversation will lack the precision to stand on its own, outside the context of the rest of the conversation). But the basic point remains: there are lots of ways to break through the privacy of Tor as it is used in the real world, and Tor has been completely open about those in an effort to educate its user base and provide ‘open research questions’ (Roger Dingledine’s favorite phrase!) for its developer community. Roger’s response to the specific BitTorrent problem is simply to tell Tor users not to use BitTorrent over Tor because there’s no way that Tor itself can fix all of the broken BitTorrent clients in the world, but one of the core findings of the above paper is that lots of people do use BitTorrent clients over Tor. So that’s a really hard problem.
The attack described in the paper has a second component that is more directly a vulnerability of Tor than a ‘bad apple’ application attack. The second component is that Tor does not create a new circuit of nodes for every connection, but instead re-uses the same circuit for several connections from the same client to improve performance. This behavior makes it possible to identify the origin IP address of not just the one ‘bad apple’ connection (the BitTorrent connection in the paper’s attack) but also the origin IP address of other current connections by the same user. So a user who is using BitTorrent and browsing the web at the same time exposes not just her BitTorrent activities but also her web browsing activities to the attacker (the paper’s authors say ‘one bad apple spoils the bunch’).
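The circuit re-use problem can be sketched the same way. The Python below is an illustration under assumptions, not the paper's method or Tor's code: the log format, circuit numbers, host names, and the `attribute_streams` helper are all invented for the example. It assumes only what is described above, namely that an exit node can tell which streams arrived over the same circuit.

```python
from collections import defaultdict

# Hypothetical log kept by a rogue exit node: (circuit id, stream description).
exit_log = [
    (17, "bittorrent tracker request"),
    (17, "http request to webmail.example"),
    (17, "http request to news.example"),
    (42, "http request to forum.example"),
]

# Circuits whose origin was exposed by a leaky 'bad apple' connection.
# The address is a placeholder from a documentation range.
leaked_origins = {17: "198.51.100.7"}


def attribute_streams(log, origins):
    """Map each exposed origin address to every stream on its circuit."""
    by_circuit = defaultdict(list)
    for circuit_id, stream in log:
        by_circuit[circuit_id].append(stream)
    return {origin: by_circuit[cid] for cid, origin in origins.items()}


attributed = attribute_streams(exit_log, leaked_origins)
for origin, streams in attributed.items():
    print(origin, streams)
```

The web requests on circuit 17 are attributed to the user even though they never leaked anything themselves, while the stream on circuit 42 stays unattributed.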
This attack can be more traditionally described as a ‘vulnerability in Tor.’ Claiming ‘lots’ of these is sloppy language, but there is certainly a whole class of timing / tagging attacks that allow an attacker who has control of an entry and an exit node to identify users (and I think the risk of these attacks is more than theoretical in a world in which one ISP in China controls about 63% of the country’s IP addresses).
So to return to the quote and story, I spoke to the author of the piece for about an hour, most of which I spent trying to convince him not to write a ‘TOR IS BROKEN!’ piece that hyped this attack as the one, new chink in Tor’s otherwise pristine armor. I walked through the above, trying to explain that Tor is intended to do a single specific thing (anonymize communication through a TCP tunnel) but that there are various attacks that exploit the layer between Tor and the applications that use it. And there are also attacks like the circuit association described above that are more properly vulnerabilities in Tor itself. But many examples of both of these sorts of attacks have been around for as long as Tor has been around, and Tor has been very vocal about them.
I was trying (unsuccessfully!) to steer the reporter toward explaining the vulnerability as an example of why it is important for users to understand that even a project like Tor, which is very strongly focused on anonymity over other properties, can’t provide perfect privacy for its users: there are some things it does well but not perfectly (setting up anonymous TCP tunnels) and other things it does not do as well (automagically making any application that uses Tor anonymous). To borrow Roger’s favorite phrase, how to explain complex social / technical issues like this one to reporters is still an open research question for which I’m eager to hear solutions!
Update: The reporter who wrote the article reminded me nicely that the only contact he had with me for this article was a single email exchange, so evidently I made up the long conversation with the reporter in my mind. In my defense, I give a lot of interviews on circumvention related topics, and I can actually still (falsely!) remember standing in my house having this call with the reporter.
6 Comments
Hi Hal,
I’d like to reply to a few points you’ve made and I’ll do so in order of your statements.
First – on the topic of the “bad apple” attack.
There have been many attacks on BitTorrent clients and this is another of those attacks. The way that this specific attack was carried out is only one of many ways to convince a client to connect with an attacker’s BitTorrent client. This is an application specific issue and Tor actually does protect users who properly use Tor – they may use the TransPort option rather than the SOCKS proxy. They may also use TAIL(S); this ensures that no traffic will leave the computer except Tor traffic; no UDP, TCP or other traffic can leak out with this method of utilizing Tor.
To the best of my knowledge the authors of this paper did not de-anonymize any of the users who properly utilized Tor – be it with a privacy preserving BitTorrent client and SOCKS, with the TransPort option, with TAIL(S) and the system-wide transparent proxy setup, or with another method that protects against this kind of issue.
The paper deanonymized people who were not safely using Tor or using privacy insensitive BitTorrent clients. This is not a vulnerability in Tor or even in a supported Tor setup. There are ways to use BitTorrent safely with Tor and the de-anonymized users did not use Tor with BitTorrent safely.
To suggest that this is an open research question is disingenuous – we have long stated that if you do not use Tor for all of your application traffic, your traffic cannot be protected by the same guarantees that Tor promises to provide. To that end, the Tor community has created many solutions for users who want to use all kinds of applications without configuration of any kind – TAIL(S) is the most comprehensive but far from the only solution:
https://tails.boum.org/about/
You’ll also notice that our Tor Browser bundles do not include a BitTorrent client and you’ll also notice that this “vulnerability” does not impact any of our shipping software.
Secondly, the stream and circuit correlation issue only really seems to be an issue if you have application leaks, a reasonable concern of course, or if you’re really misusing Tor. We’ve covered future directions for people who are concerned about this issue in prop 171:
https://gitweb.torproject.org/torspec.git/blob/HEAD:/proposals/171-separate-streams.txt
Still – if you’re going to take advantage of the changes in prop 171, you’ll need to understand what you wish to defend against. Just as you do with any system – there is no exception to this problem – with or without circumvention, VPN, or anonymity systems.
Using the TransPort option or using TAIL(S) is an example of how this is not a vulnerability in Tor; consider a user who utilizes a VPN but does not route all their traffic through the VPN – would it be reasonable to call this a vulnerability in the VPN? I don’t think that would be reasonable at all. If a user configures a VPN incorrectly, I hardly think it’s fair to blame the VPN software – it does the job but it isn’t ever put to work on the task at hand.
None of these attacks “break through the privacy of Tor” as you have stated. People successfully use BitTorrent with Tor and are not de-anonymized by these kinds of attacks.
You might at best be able to correlate some web browsing traffic (which is perhaps encrypted or perhaps not) for a small window of time but you will not de-anonymize the user’s actual IP address. This risk holds true for every VPN and every circumvention system – the exit node or exit network can sniff insecure protocols. Unlike the other systems, Tor will properly anonymize the users – even more than a VPN or other circumvention systems.
In general, Tor users are not impacted by this “vulnerability” at all. Your quote might have been more appropriate if you had explained that BitTorrent clients are not privacy preserving. That is an entirely true statement. Combining safe and unsafe applications puts you at risk – so web browsing with improperly anonymized BitTorrent is unsafe for the same reason that improperly anonymized BitTorrent is unsafe.
Putting the burden on Tor is factually incorrect – there are no patches for us to ship or changes for us to make; we’re not shipping the BitTorrent clients. While we’re aware that the streams and circuits issue is a bit more nuanced – it’s only news because of a much larger mistake that won’t really change unless the user takes educated action. This is the same for all systems in this category of software, no?
Calling it “lots of vulnerabilities” is a sensationalist quote and I hope you push for them to retract it. I’m pretty disappointed all around, I expected better from you and the Berkman Center. 🙁
Hi Jake,
I think the core of our disagreement here lies where you quote me about using application attacks to “break through the privacy of Tor.” The rest of that quote is: “break through the privacy of Tor as it is used in the real world.” The paper authors claim that they were able to discover the IP addresses of 9% of all Tor streams over the course of their research. I think that 9% finding qualifies for “as it is used in the real world.” And I think that 9% finding contradicts your argument that “In general, Tor users are not impacted by this ‘vulnerability’ at all”, unless you think 9% of streams does not qualify for ‘in general’. You argue that the BitTorrent attack described in the paper has nothing to do with Tor, but for the users of those 9% of Tor streams I think it is very likely that, despite the efforts of the Tor team to educate them otherwise, the privacy that they thought they were getting from Tor is not working as they expect it to (one open research question here is whether those impacted users actually care about privacy and indeed why they are using BitTorrent through Tor at all, but I think a reasonable null hypothesis here is that they thought their use of Tor was granting them anonymity).
You say that “there are no patches for us to ship or changes for us to make,” but that statement assumes that Tor is merely a technical system and that the only responsibility of the Tor project is to build tools. If that were the case, the Tor project would not expend so much of its energy trying to explain what Tor does and does not do to its end users. The Tor project does spend a lot of its energy on education, because for Tor to actually work in the real world with real users, the project has to undertake that social work as well as the technical work. And so the question of what changes the Tor project can make extends beyond changes to the code. Unfortunately, this sort of user education is really, really hard — that’s what I mean by it being an open research problem. I am not being at all disingenuous with that phrase (I am in fact a researcher whose primary interest is exactly those kind of hard social / technical problems!).
So for an example of one change Tor might make, Tor might change the language on its website. The flow for a user to download Tor from the website includes a lot of very broad, very strong statements of privacy protection on the home page, including “# Tor prevents anyone from learning your location or browsing habits” and “# Tor is for web browsers, instant messaging clients, remote logins, and more.” Clicking on the download link below those statements leads to a page with a large disclaimer that “You need to change some of your habits, and reconfigure your software! Tor by itself is NOT all you need to maintain your anonymity. Read the full list of warnings.” Of the full list of warnings linked by that statement, only the first could arguably warn a user not to use Tor with some BitTorrent clients: “Tor only protects Internet applications that are configured to send their traffic through Tor — it doesn’t magically anonymize all your traffic just because you install it.” Even if a user clicks through to the warnings and reads them carefully (which, as anyone who designs websites will tell you, most users unfortunately will not), it’s perfectly reasonable to come away from that statement thinking that as long as an application has a SOCKS configuration option, it will work with Tor. I and you and any Tor expert know that not to be the case, but most folks don’t. The really hard part here is that the list is already too long to expect most folks to read through it, so any addition (like “don’t use BitTorrent with Tor!”) is likely to hurt the cause of educating users more than help it by discouraging even more folks from reading the list at all. You can argue that the moral responsibility here lies on the lazy user who doesn’t bother to educate herself (or likely even to click on the link to the list of disclaimers), but the moral responsibility is beside the point.
The goal is to design a system that works for the general public, regardless of how lazy or not they may be.
Again, getting folks as they operate in the real world to understand a complex technology like Tor is a really hard problem that I know the Tor project has tried — to my knowledge much harder than any other major anonymity or circumvention project — to get as right as it can, but it’s perfectly legitimate for me to point out that the Tor system as it is used in the real world, including its technical and social components, is vulnerable to many attacks like the one described in the paper.
Hi Hal,
The paper specifically said: “In particular, we traced 9% of all Tor streams carried by our instrumented exit nodes.”
That is not 9% of all Tor traffic and I don’t think they claimed it was reasonable to extrapolate their results to the full network. Perhaps I misread the paper.
They categorize their attack as “Using BitTorrent as the insecure application, we design two attacks tracing BitTorrent users on Tor” when they describe it. Why suggest that this is just a Tor problem and that we’re chock full of vulnerabilities?
That is to say – their attacks are about BitTorrent and proxy compliance – they made it sexy by including Tor in the mix. Their claim of being able to “associate the top secret documents with the IP address of the anonymous source” is entirely dependent on the use of BitTorrent and some luck. They of course phrased it in the most hyperbolic manner possible – who has data on that kind of activity? Not them and nothing in their paper indicates otherwise.
I expect people to see past their hyperbolic phrasing and certainly I’d expect people not to play it up.
You said: “You argue that the BitTorrent attack described in the paper has nothing to do with Tor” – I maintain that the Tor code itself is not vulnerable, nor is any application bundle that we ship. Users using The Tor Browser Bundle can ignore this paper entirely, no?
I think it’s reasonable to point out that it’s a non-issue if you take BitTorrent out of the picture and that is why I take issue with the phrasing. Will people reading that article understand the subtleties? I don’t think so – I think it’s a likely citation for future FUD.
I agree with you entirely that users need to understand their risks and educate themselves; we do spend a lot of effort educating people – our warning page presented to users downloading Tor tries to explain these issues:
https://www.torproject.org/download/download.html.en#warning
Point ‘a’ in that warning generally covers this issue – I’ve advocated that we make a specific item (see https://trac.torproject.org/projects/tor/ticket/3025 ) for BitTorrent but I think users who didn’t read the first point are really unlikely to read the last one. Perhaps I’m mistaken?
So I’ll ask again – how is this a vulnerability in Tor and not every single system that I previously presented?
How many people need to succeed using BitTorrent with Tor for this to not be a “vulnerability in Tor” or a failure of user education? Just over fifty percent? I’m pretty sure the answer is zero, actually.
You argue “The goal is to design a system that works for the general public, regardless of how lazy or not they may be.” and you’ve not addressed my statements about TAIL(S) or similar systems.
How does TAIL(S) fail to meet this while providing Tor that isn’t vulnerable to this problem at all? If we’re now discussing security usability, I’m curious why all of the failures fall on Tor – not on the other applications or even on other major systems?
To belabor the point, I don’t think that it’s legitimate to call this a vulnerability in Tor. Perhaps it’s fair to make a criticism of The Tor Project’s documentation – we do fail to specifically discuss BitTorrent in our warning page but I didn’t read your quote as an example of how we need to educate people.
Every system has issues and we’re very open about our problems and our responses.
However, day to day, I don’t think that even the authors of this paper claim to trace or de-anonymize *any* Tor user. I’ve discussed systems where the vulnerability is not present. The article is incorrect because it’s actually safe, though we really dislike it, to use BitTorrent with Tor if you use the correct setup. Some of those setups merely require a user to download them, some users may prefer to do it by hand. In any case, the Tor Network in the real world includes people not de-anonymized with these techniques.
I think it’s legitimate for anyone to criticize The Tor Project; still, I think your criticisms are largely incorrect. I do not believe your comments in the article will be seen in the context you presented here. While I still disagree with your assertions in general, they’re certainly more nuanced in this blog post. This blog post seems to be an indication that you partially agree with me that your quote is overly harsh. I do appreciate you addressing my concerns but in some ways, I think that my concerns were validated by the discussion.
You said we have “lots of vulnerabilities” and that’s frankly at the core, not true. If you believe otherwise, I encourage you to file some bugs: https://trac.torproject.org/projects/tor/newticket
Jake,
Is there any reason to think that bittorrent traffic would congregate more heavily on their ‘instrumented’ exit nodes? Is there some way that bittorrent traffic might tend toward specific sorts of exit nodes? That’s an honest question — my knowledge of Tor stops well before understanding the details of its circuit creation!
I’m not sure what you’re trying to argue by saying that the authors of the paper are not actually trying to de-anonymize any Tor user. Of course not — they are security researchers. But they have documented an attack that can be used by bad guys (and might be being used right now) to de-anonymize some significant number of Tor users. That’s the whole point of publishing vulnerabilities, no?
Sure, again, it’s technically possible for users to use BitTorrent safely with Tor. But the paper shows that a significant number of users are using BitTorrent unsafely. To argue that it’s not a problem as long as a technical solution is available is like saying a flu epidemic is not a public health problem as long as we have vaccines sitting around that people might have taken if they chose to. If no one takes the vaccine, it’s still a problem no matter how good the vaccine!
I don’t say anywhere that Tor is the only anonymity / circumvention app with problems. I do think this particular problem of applications leaking traffic outside of the tunnel is particular to Tor and other proxies that act as general purpose SOCKS proxies. To my understanding, the attack described in the paper would not work against a default VPN setup because VPNs by default tunnel all traffic (and in fact generally take a high level of technical skill to change that default). Likewise, it wouldn’t work against a CGI proxy because they are not at all capable of proxying the BitTorrent protocol.
HTTP proxies are more complicated — a user could use an HTTP proxy for the tracker request and be subject to this attack. But the attack would be pointless because the single hop HTTP proxy already has the user’s IP address. I know you will argue that this just proves the utter uselessness of HTTP proxies, but that’s a separate debate. This attack does not change that trust model for HTTP proxies (broken or not as it may be in the first place).
There are certainly other circumvention projects that offer SOCKS proxies (yourfreedom is the most prominent other than yours, I think, other than just using a generic ssh socks tunnel), and they do suffer from this same problem. But Tor is by far the most prominent circumvention tool promoting itself as a general purpose SOCKS proxy and therefore vulnerable to this sort of attack. You’ve stated a couple of times that any other VPN or proxy is vulnerable to this same attack, but I don’t understand how anything other than a SOCKS proxy system can be vulnerable to this attack.
I think it’s fair to call this a vulnerability in Tor in the same way that I’d call an unfixed vulnerability in OpenSSL a vulnerability in Tor as well. Even though other projects are certainly affected and Tor didn’t write the particular OpenSSL code, Tor chooses to rely on OpenSSL as part of its security model. Likewise, Tor chooses to rely on general purpose SOCKS proxying as part of its social and technical design, a decision that makes it vulnerable to these sorts of attacks. If you implemented Tor as a VPN or an HTTP proxy, it would be less vulnerable to these attacks, or not vulnerable at all.
The Tor Browser Bundle does not suffer from this problem if only the Tor Browser Bundle software is used, but the more generally purposed Vidalia download does. Why provide the Vidalia download at all if not to encourage users to use Tor as a general purpose SOCKS proxying system (which is inherently vulnerable to these sorts of application leaks)?
There are a few things that the Tor project could do to mitigate these application leak attacks, in addition to changing to VPN or HTTP proxying. You could provide only the Tor Browser Bundle in the main download path and require users to dig harder for the stand alone Vidalia system.
You could come up with a whitelist of supported applications and state prominently that Tor is only secure when used with those specific applications.
You could provide TAIL(S) as a default within Vidalia so that users who use Tor as a general purpose proxy are by default protected. The reason I have not considered TAIL(S) before now is that its general impact on Tor usage is very low if it is not provided in the main download path of the Tor website, where the vast majority of Tor downloads happen. As a technical fix, it may be great, but as a public health fix it’s not useful if the Tor project is not putting it where people will find and use it.
These of course all have their various costs (again social and technical!) and might or might not actually help with this particular problem and might or might not make sense for the project as a whole. But they are changes that could be made.
I feel like you are treating this like a zero sum game — that the problem is that I am harming Tor in comparison to other tools. But read the quote again: “There are lots of vulnerabilities in Tor, and Tor has always been open about the various vulnerabilities in its system,” says Hal Roberts at Harvard University, who studies censorship and privacy technologies. “Tor is far from perfect but better than anything else widely available.”
I say right in the quote that Tor is better than all the other tools! Notwithstanding the problem described in the paper or the ‘lots of vulnerabilities’ that fall into this and other classes of attacks. I focus on Tor because the article and the interview were specifically about Tor, though I spent plenty of time in the interview placing Tor in the context of other tools, which all have a host of other privacy issues (and again the quote does include the context that Tor is the best of the breed, by which I meant best of the breed for privacy).
I do agree with you that the single quote (culled from an hour long conversation with the reporter!) does not convey any of the context or nuance in this (quite good, I think!) debate we’re having here. I’ve sent the reporter a link to this blog post to do with what he will.
Hi Hal,
I don’t know if their nodes would be more heavily loaded – it really depends on a lot of different factors – their exit policy may have attracted more BitTorrent traffic or perhaps their geographic location attracted the traffic. Nonetheless, their general claim is not fair to extend to the entire network – even they do not do this and I’m not sure why anyone else would feel comfortable doing so.
I suggest that this attack they perform is highly dependent on a number of issues – the ability to inject a peer into the tracker connection (is that possible over SSL or with encrypted BitTorrent? Doubtful?), the user must be using a vulnerable BitTorrent setup (not all of them), and more. This is different from a vulnerability in OpenSSL that _impacts_ Tor – if OpenSSL has a bug, it’s not _instantly_ a bug in Tor – Tor doesn’t use every line of code in OpenSSL. These bugs aren’t always transitive and it’s important to know where to lay blame. To suggest that any bug in OpenSSL is also a bug in Tor is precisely where I think you’re incorrect. Many bugs have been discovered in OpenSSL that do not impact Tor in any way at all – it isn’t correct to assume that use of OpenSSL means that Tor has one hundred percent bug/code coverage in common.
In all cases – this is not a vulnerability in Tor but in the applications that may utilize any proxy (HTTP or SOCKS, local or remote) – if I’m using Tor to chat with Jabber over SSL, I’m not traceable. This is why I suggest that it’s disingenuous to suggest that Tor has lots of vulnerabilities.
Still, I don’t think this is a zero sum game by any means; there is simply an untruth in your statement of “There are lots of vulnerabilities in Tor” – you’re saying in the present that Tor has lots of vulnerabilities. That is simply not true. It’s true that there are lots of risks with trying to use Tor properly and it’s also true that we have had issues in the past. We’ve been quite open about those issues.
Your statements imply that Tor or the Tor network itself has a problem – the issue is at best about user education. At worst, we’re mis-calculating what users expected – it could be that people are simply trying to protect against tracker logs being collected by unsavory types. We can’t know that but it’s worth considering that there is more than one answer.
To say “lots of vulnerabilities” in the present tense has a very specific meaning – it implies that Tor is not entirely safe to use (it is) and that we should patch (not needed) some unspecified _software_ issues. It’s nice that you say we’re “better than anything else widely available” but what isn’t widely available that is better?
Hi Jake,
I think we’ve established our respective positions well, so I’ll leave you with the last word.