{"id":1540,"date":"2012-11-06T09:17:58","date_gmt":"2012-11-06T14:17:58","guid":{"rendered":"http:\/\/blogs.law.harvard.edu\/herdict\/?p=1540"},"modified":"2012-11-06T09:17:58","modified_gmt":"2012-11-06T14:17:58","slug":"ample-room-for-improvement-in-spam-filtering-practices","status":"publish","type":"post","link":"https:\/\/archive.blogs.harvard.edu\/herdict\/2012\/11\/06\/ample-room-for-improvement-in-spam-filtering-practices\/","title":{"rendered":"Ample room for improvement in spam filtering practices"},"content":{"rendered":"<p dir=\"ltr\">A recent <a href=\"http:\/\/yro.slashdot.org\/story\/12\/10\/16\/175248\/zero-errors-spamhaus-flubs-causing-domain-deletions\">article<\/a> from Slashdot contributor Bennett Haselton highlights the risks inherent in any automated filtering system&#8211; even one that is well intentioned. Haselton runs a <a href=\"http:\/\/www.peacefire.org\/circumventor\/\">mailing list<\/a> through which he informs his users of web proxies, which can be used to circumvent filtering. He regularly distributes information about new proxies as existing ones become blocked. In September, Haselton emailed the list with information about ten new proxies he had created. Two weeks later, two of the proxies were placed on the <a href=\"http:\/\/www.spamhaus.org\/dbl\/\">domain blocklist<\/a> of Spamhaus, a spam-tracking organization. This itself was not new, as spam filters had previously identified Haselton\u2019s proxies as spam. \u00a0What was new, however, was that following Spamhaus\u2019s action, all ten of Haselton\u2019s new proxies (not just the two erroneously identified as spam) were disabled. \u00a0The proxies were taken down because Haselton\u2019s domain registrar preemptively disabled all ten domains.<\/p>\n<p dir=\"ltr\">Why were all ten of Haselton\u2019s proxies disabled? It was actually the fault of two organizations working in concert: Spamhaus and Haselton\u2019s registrar, Afilias. 
\u00a0Spamhaus is one of several organizations maintaining databases to help flag spam. Spamhaus maintains blacklists that flag, for example, IP addresses associated with spam operations. \u00a0One of those blacklists, which Spamhaus calls the <a href=\"http:\/\/www.spamhaus.org\/dbl\/\">DBL<\/a>, lists domains typically found in spam messages. Spamhaus <a href=\"http:\/\/www.spamhaus.org\/whitepapers\/effective_filtering\/\">recommends<\/a> that ISPs and other entities use the DNS blacklists to reject mail from bad IP addresses before it is processed by the mail server, then use the DBL to scan the content of remaining messages for blacklisted domains. Spamhaus\u2019s lists are publicly accessible and receive, according to the website, billions of queries every day.<\/p>\n<p dir=\"ltr\">The problem with Haselton\u2019s proxies began when Spamhaus placed two of the proxy domains on its DBL. An effective domain blocklist must be careful not to list domains that are actually legitimate. \u00a0In fact, Spamhaus crows that its DBL has a \u201czero-false-positive reputation.\u201d But as Haselton discovered, there is no way to guarantee a zero-false-positive rate. Indeed, it might even be possible for a malicious party to force a domain onto the DBL by repeatedly inserting it into spam messages.<\/p>\n<p dir=\"ltr\">Afilias, Haselton\u2019s registrar, disabled all ten of the proxies once Spamhaus placed two of them on its DBL. \u00a0As a domain registrar, Afilias uses the DBL to try to shut down spammers. \u00a0When it saw two of the domains it had registered appear on the DBL, it noticed that Haselton had registered eight others at the same time and preemptively suspended all ten. Haselton was not notified; instead, he had to wade through a circuitous series of calls to three different companies before he was told that his proxies had been placed on a blacklist. 
Haselton then found that he could instantly and automatically remove his sites from the DBL by submitting a <a href=\"http:\/\/www.spamhaus.org\/lookup\/\">form<\/a> on Spamhaus\u2019s site. This itself gave Haselton pause, because a blacklist serves little purpose if any site on the list can be removed so easily.<\/p>\n<p dir=\"ltr\">Haselton\u2019s experience demonstrates the draconian spam-prevention policies of some domain registrars. First, Afilias should have notified Haselton of the reason his sites were being taken down; instead, Haselton found out only when the members of his mailing list emailed him. Furthermore, Afilias should not have automatically suspended all of Haselton\u2019s domains. At the very least, it should have examined the content of each of the sites to see whether they were actually connected to spam operations. Finally, Afilias should have given Haselton better resources for dealing with the suspension of his domains. Haselton received no help from Afilias and had to investigate by himself how to get his sites removed from the blacklist. Afilias\u2019s current policy toward spam filtering casts too wide a net and seems to offer no due process to site owners.<\/p>\n<p>Haselton\u2019s experience also underscores the need for transparency in spam-filtering practices. According to Spamhaus, the suggested implementation for its blacklists \u201cwill identify and reject approximately 85% of an average mail relay&#8217;s incoming mail traffic.\u201d That is, 85% of messages sent to a mail server will be rejected outright&#8211; the potential recipients have no way of ever accessing or seeing those messages. This is not necessarily a problem, but the potential for abusive filtering needs to be kept in check. 
Both email providers and blacklist maintainers should be as transparent and public in their practices as possible and should give reasonable recourse to parties who have been wrongly marked as spammers.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>A recent article from Slashdot contributor Bennett Haselton highlights the risks inherent in any automated filtering system&#8211; even one that is well intentioned. Haselton runs a mailing list through which he informs his users of web proxies, which can be used to circumvent filtering. He regularly distributes information about new proxies as existing ones become [&hellip;]<\/p>\n","protected":false},"author":4643,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[1],"tags":[253],"class_list":["post-1540","post","type-post","status-publish","format-standard","hentry","category-uncategorized","tag-filtering"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p4LdGs-oQ","_links":{"self":[{"href":"https:\/\/archive.blogs.harvard.edu\/herdict\/wp-json\/wp\/v2\/posts\/1540","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/archive.blogs.harvard.edu\/herdict\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/archive.blogs.harvard.edu\/herdict\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/a
rchive.blogs.harvard.edu\/herdict\/wp-json\/wp\/v2\/users\/4643"}],"replies":[{"embeddable":true,"href":"https:\/\/archive.blogs.harvard.edu\/herdict\/wp-json\/wp\/v2\/comments?post=1540"}],"version-history":[{"count":1,"href":"https:\/\/archive.blogs.harvard.edu\/herdict\/wp-json\/wp\/v2\/posts\/1540\/revisions"}],"predecessor-version":[{"id":1541,"href":"https:\/\/archive.blogs.harvard.edu\/herdict\/wp-json\/wp\/v2\/posts\/1540\/revisions\/1541"}],"wp:attachment":[{"href":"https:\/\/archive.blogs.harvard.edu\/herdict\/wp-json\/wp\/v2\/media?parent=1540"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/archive.blogs.harvard.edu\/herdict\/wp-json\/wp\/v2\/categories?post=1540"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/archive.blogs.harvard.edu\/herdict\/wp-json\/wp\/v2\/tags?post=1540"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}