Yesterday, Microsoft released version 9 of Internet Explorer, which includes two significant new privacy features: Tracking Protection Lists (TPLs) and a Do Not Track (DNT) header that allows users to request that websites not track them. We've written about the virtues of the Do Not Track header previously. In this post we'll look more closely at privacy blocklists, a category of technologies that includes the popular AdBlock Plus browser extension and the new IE9 Tracking Protection Lists. These lists are an important type of privacy technology, with distinct advantages over other approaches. They also have a number of inherent limitations, which makes them natural complements, rather than alternatives, to DNT.
Advantages of Privacy Blocklists
Blocklists like IE9's TPLs and AdBlock Plus for Firefox and Chrome are great privacy technologies because they can prevent your browser from transmitting information (via HTTP requests) about your reading habits to third parties, and they can achieve that in a fairly user-friendly way. The importance of usability in the design of privacy tools cannot be overstated. To enable a blocklist, the user subscribes to a list maintained by a group that spends a lot of time spotting the URLs used by tracking technologies, and that in many cases succeeds in sorting out which requests fetch useful parts of webpages and which fetch web bugs and beacons. Other, non-list-based technologies like RequestPolicy can achieve similar results, but require much more expertise to use. The beauty of stopping the HTTP requests altogether is that it is comparatively easy for the user to be sure that the information at stake is protected. Even in the extreme case of a third party that is nefarious and willing to break the law to track you, you get a fairly high degree of protection if your browser never connects to that third-party domain. The fact that blocking lists do not depend on legal or regulatory support is a significant advantage. For that reason they may turn out to be an important way of enforcing DNT and protecting users against domains that simply refuse to honor Do Not Track requests.
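To make the idea of stopping the request concrete, here is a minimal sketch of the check a blocklist-based tool might run before the browser sends anything. It is an illustration only: the domain names, the BLOCKED_DOMAINS set and the function names are hypothetical, and real lists such as EasyPrivacy or IE9's TPLs use a richer rule syntax than a plain set of domains.

```python
from urllib.parse import urlsplit

# Hypothetical entries; a real subscription would contain thousands of rules.
BLOCKED_DOMAINS = {"tracker.example", "beacons.example"}


def host_matches(host, domain):
    """Return True if host is the listed domain or one of its subdomains."""
    return host == domain or host.endswith("." + domain)


def should_block(request_url, page_url, blocked_domains=BLOCKED_DOMAINS):
    """Decide whether a request made while loading page_url should be dropped."""
    request_host = (urlsplit(request_url).hostname or "").lower()
    page_host = (urlsplit(page_url).hostname or "").lower()
    # Assumption of this sketch: requests to the page's own host are never blocked.
    if request_host == page_host:
        return False
    return any(host_matches(request_host, d) for d in blocked_domains)
```

If should_block returns True, the request is never sent, so the third party receives no URL, cookie or referring page at all. That is why the protection does not depend on the third party's cooperation.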
Limitations of Privacy Blocklists
Unfortunately, there are a number of complications that mean that we can't set up tracking protection lists for web privacy and declare victory. Some of these limitations simply come from the difficulty of writing a correct, comprehensive anti-tracking list. Others arise because there are ways for trackers to sneak around the blocking list (and the Do Not Track header will be an important way to create legal disincentives for that kind of behaviour). Let's look at these issues more closely.
1. Maintaining a correct privacy blocklist is hard
With thousands of new websites and dozens of new cross-site APIs appearing on any given day, it's quite hard to sort out which ones are engaged in the business of tracking people. It's also quite hard to tell which cross-site requests can be blocked without breaking anything. The EasyPrivacy project makes a praiseworthy attempt to maintain a list of these tracking URLs, and hopefully with the IE9 launch that project and others like it will receive more support and reinforcement from the community. Note that the EasyPrivacy list is less comprehensive than its twin, the ad-blocking EasyList, for the simple reason that it's easy to see when you've failed to block an advertising URL, but hard to see when you've failed to block an invisible tracking beacon!
2. Consumers need to trust the list maintainers
There is an important implicit trust relationship that has to exist between the maintainers of a particular privacy-oriented blocklist and the users who will depend on it to prevent their online reading habits from being leaked to third parties. Projects like EasyPrivacy or even commercial blocklist maintainers like Abine, Evidon or PrivacyChoice may be fairly trustworthy, but it's hard for users to verify that in a practical way. It's important to have good UI controls that let users see what is and isn't being blocklisted (such as the "Open Blockable Items" context menu in AdBlock Plus). It will also be necessary to warn users about untrustworthy blocklists. One example of an untrustworthy blocklist is the IE9 TPL maintained by TRUSTe. TRUSTe's blocking list (you can inspect it here) appears to mostly be an anti-privacy technology, not a privacy enhancing technology. It whitelists thousands of tracking domains (3954 of them) and blocks only a handful (23). The main consequence of subscribing to that list in IE9 is to ensure that web users are tracked, not that they are protected from tracking. Unfortunately, the way IE9's TPLs are implemented greatly exacerbates that problem, because once a site has been whitelisted in TRUSTe's list, that overrides a blocklisting in any other list (such as EasyPrivacy). As a result, users should never install the dangerous TRUSTe list, because doing so significantly compromises the blocklist mechanism.
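The precedence problem is easier to see in a small model. The sketch below is not IE9's actual implementation or its TPL file format; it only mirrors the behaviour described above, in which a whitelist entry in any subscribed list wins over a block entry in every other list. The list contents and the decide function are hypothetical.

```python
def _matches(host, domains):
    """Return True if host is one of the domains or a subdomain of one."""
    return any(host == d or host.endswith("." + d) for d in domains)


def decide(host, subscribed_lists):
    """Combine several lists, letting any 'allow' entry override every 'block' entry.

    Each list is a dict with an 'allow' set and a 'block' set of domains.
    Returns 'allow', 'block', or 'default' (no list mentions the host).
    """
    if any(_matches(host, lst["allow"]) for lst in subscribed_lists):
        return "allow"
    if any(_matches(host, lst["block"]) for lst in subscribed_lists):
        return "block"
    return "default"


# Hypothetical lists: one protective, one that mostly whitelists.
protective = {"allow": set(), "block": {"tracker.example"}}
permissive = {"allow": {"tracker.example"}, "block": set()}
```

With only the protective list subscribed, decide("tracker.example", [protective]) yields 'block'. Adding the permissive list flips the answer to 'allow', which is how a single bad subscription can silently undo the protection offered by the others.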
3. Some tracking mechanisms are hard to blocklist
Many tracking mechanisms tend to reside at domains and URLs that are fairly clear 3rd party trackers and are therefore easy to blacklist. But there's no guarantee that will continue to be true in the future, especially if tracking protection lists are widely used. Fingerprinting techniques like the ones we experimented with at Panopticlick, and which are already being used to track people, are extremely hard to blacklist because, unlike tracking cookies, they don't have to come from a constant domain in order to be an effective tracking mechanism.
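A rough sketch shows why fingerprints are awkward to blocklist. This is not Panopticlick's actual method; the attribute names and values below are invented for illustration. The point is only that the identifier is computed from properties of the browser itself, so it comes out the same no matter which domain serves the script that collects them.

```python
import hashlib


def fingerprint(attributes):
    """Hash a dict of browser characteristics into a stable identifier."""
    # Sort keys so the result does not depend on the order attributes were collected in.
    canonical = "\n".join(f"{key}={attributes[key]}" for key in sorted(attributes))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical values a script could read from a visiting browser.
browser = {
    "user_agent": "ExampleBrowser/1.0 (ExampleOS)",
    "screen": "1280x800x24",
    "timezone_offset": "-480",
    "fonts": "Arial,Courier New,Times New Roman",
}
```

Nothing in fingerprint(browser) refers to a cookie or a tracker's hostname. A tracker that moves to a new domain every week to dodge the list would still compute the same value for the same browser.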
4. Some web requests simultaneously serve tracking and other purposes
When you look at a web page with an embedded map (whether innocuous or potentially sensitive), the map provider has the ability to track you: it can see what page the map was embedded from, knows that you were reading that page, and can observe any interactions you might have with the map such as searching for directions. Similarly, every time you see embedded widgets like the Digg, Reddit or Facebook Like buttons, the HTTP Referer header allows those social media sites to track your reading habits. Mostly that's just a bad thing, but occasionally you will want to click the button in order to broadcast what you're reading. These examples show a dilemma that anti-tracking blacklist authors will have to grapple with: should they blacklist any of these kinds of embedded objects? If yes, how will users cope with all the aspects of the web that break as a result? If no, how can they protect users against these forms of third party tracking? A special case of this dual-use problem is advertising. AdBlock Plus solves the problem of being tracked by advertising by simply blocking all the advertising. But presumably a lot of people who care about privacy don't mind seeing ads. And it's great that advertising is one of the revenue sources available to support people who want to publish on the web, provided, of course, that there are ways to prevent the advertising from invading our privacy.
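To illustrate the dual-use problem, the sketch below looks at a widget request from the receiving server's side. The function name, the URLs and the cookie value are hypothetical; the Referer, Cookie and DNT header names are the real ones (note that the header is spelled "Referer" on the wire, and that a browser with Do Not Track enabled sends "DNT: 1").

```python
def what_widget_server_learns(headers):
    """Summarize what an embedded widget's server can read from one request."""
    # HTTP header names are case-insensitive, so normalize before looking them up.
    normalized = {name.lower(): value for name, value in headers.items()}
    return {
        "page_being_read": normalized.get("referer"),
        "returning_visitor_cookie": normalized.get("cookie"),
        "asked_not_to_be_tracked": normalized.get("dnt") == "1",
    }


# Hypothetical request for a share button embedded in a news article.
example_request = {
    "Referer": "http://news.example/health/sensitive-story",
    "Cookie": "uid=abc123",
    "DNT": "1",
}
```

The same request that draws the button also delivers the article's URL and the visitor's cookie. A blocklist can only drop the whole request, button included, whereas the DNT header leaves the button working and asks the server not to use what it has learned for tracking.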
Conclusion: DNT and TPLs work well together
The Do Not Track header and user-controllable privacy blacklists are complementary innovations, and we're going to need both of them if we're going to succeed in moving from the privacy-intrusive web of 2009 to the privacy-friendly web of the future. Do Not Track is mostly a policy tool, whereas blacklists are basically privacy enhancing technologies. And fortunately, they each have the capability to help in situations where the other is weaker. Do Not Track is a good way to give consumers some meaningful privacy choices in a web where the line between useful functionality and tracking mechanisms has become increasingly blurred — and it won't have to break websites to achieve that. But TPLs and other blacklists are a self-enforcing fallback that we can use to protect ourselves against websites that refuse to respect consumers' preferences.