
Herdict 2012: By the Numbers

2012 was a great year for Herdict during which we saw substantial growth in user reporting.  I thought I’d kick off 2013 by highlighting our year through facts and figures.  For those of you who aren’t familiar with the project, Herdict is a crowdsourced platform for collecting reports about accessible and inaccessible websites, regardless of the root cause of the issue—from filtering to server problems to any other Web blockage.

As a crowdsourced project, Herdict data reflects the Internet usage of “the Herd” – the people who report to us through our website, Twitter, e-mail, or our browser add-ons.  Because it is crowdsourced, a significant level of inaccessible reports from a particular region or about a particular site at a given time serves as a real-time proxy for where filtering is taking place and what is being filtered, as experienced by the Herd.  The more users report on Herdict, the more robust and accurate a proxy the platform becomes.

Substantial Increase in Reports in 2012:

In 2012, we saw a large increase in reporting, which is critical for building an informative and useful database of site accessibility.

Total Number of Reports for 2012: 97,931

Total Inaccessible Reports for 2012: 57,262

Total Accessible Reports for 2012: 39,538

By comparison, in 2011, we received 43,481 reports, of which 12,567 were inaccessible reports.  Herdict’s 2012 reporting represents a 125% increase in total reports compared to 2011 and a 358% increase in inaccessible reports. The increase brings our cumulative number of inaccessible reports to 140,323 and total reports to 311,804.
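For readers who want to check the arithmetic, here is a minimal sketch of how a year-over-year percentage increase is computed, using the total report counts quoted above. The function name `percent_increase` is my own, chosen for illustration.

```python
def percent_increase(old: int, new: int) -> float:
    """Return the growth from `old` to `new` as a percentage of `old`."""
    return (new - old) / old * 100


# Total report counts quoted in this post.
total_2011 = 43_481
total_2012 = 97_931

growth = percent_increase(total_2011, total_2012)
print(f"Total reports grew by about {growth:.0f}%")
```

The same function can be applied to any other pair of yearly figures.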

Many New Inaccessible Domains in 2012:

In 2012 we collected reports on 19,631 unique domains.  Of those, 15,499 were domains that were new to Herdict.  This brings the total number of domains in our database to 26,636.

Among the top domains reported to Herdict in 2012, the most inaccessible site globally was Facebook.com (853 inaccessible reports), mostly coming from Vietnam and China.  This is not surprising given Facebook’s widespread popularity and the aggressive stances that Vietnam and China have taken to restrict access to the site within their borders.

Top 10 inaccessible domains in 2012 by reports:

  1. facebook.com (853)
  2. viet.rfi.fr (826)
  3. viettan.org (815)
  4. steves-digicams.com (772)
  5. voanews.com (680)
  6. bbc.co.uk (632)
  7. danlambaovn.blogspot.com (503)
  8. rfa.org (463)
  9. x-cafevn.org (456)
  10. danchimviet.info (437)

Also high on the list were the domains for YouTube (#12), Scribd (#18), and Twitter (#19).   With the exception of steves-digicams.com, Herdict’s top domains are all sites that we might expect to be filtered—domains for newspapers, social networking tools, and activist organizations.  When Herdict first began, some critics feared that spammers would overwhelm the site with meaningless data; however, our experience proves otherwise.

Among the sites with the largest increases in inaccessibility from 2011 was viettan.org, which had 781 more inaccessible reports this year than last.  This increase reflects a large growth in reporting from Vietnam in 2012.  In contrast, Facebook.com was relatively steady, with inaccessible reports declining slightly from 987 reports in 2011.  Among other sites with significant increases in inaccessible reports were Radio Free Asia, Human Rights Watch, Scribd, Freedom House, and the New York Times.

Among sites for which we have categories, in 2012 we received by far the most inaccessible reports for political sites (16,505 reports), which had more inaccessible reports than social sites (5,685 reports) and Internet tools (3,713 reports) combined.  That said, we received 30,817 reports for uncategorized sites, suggesting that we need to find a better way to assign categories to domains—a challenge given the increase in new URLs this year.

Most Reports from China in 2012:

In 2012 we received 30,831 inaccessible reports from China and 10,931 accessible reports.  A large part of this influx is due to our partnership with GreatFire.org.  This data-sharing agreement has granted us greater insight into what sites are and are not accessible in China.

Top 10 countries by inaccessible reports in 2012:

  1. China (30,831)
  2. Vietnam (20,639)
  3. Thailand (1,333)
  4. India (1,259)
  5. United States (1,134)
  6. France (206)
  7. Iran (198)
  8. Germany (195)
  9. United Kingdom (193)
  10. Cambodia (158)

The substantial gap between the number of reports from the top five countries and the number from the rest shows that we must do more work to increase awareness about Herdict.  I hope that you will help us in this effort, whether by reporting yourself or by encouraging others to help us.

Looking to 2013:

Both in terms of data and site functionality, Herdict saw great improvements in 2012.  In 2013 we will be adding a few more important features, including customizable lists that will allow you to create and share sets of sites that you care about.  These queues will allow users to focus on particular countries; you will be able to designate a few countries in which you are interested, and when we get visitors from those countries, we will encourage them to test the sites on your list.  This will strengthen the Herdict community around the world.

While this was a year of significant growth, we have much more to do.  We are dependent upon users’ reports, and we need more reports from more places.  I encourage all of you to help spread the word about Herdict to your friends, family, and followers. With more reports from even more locations, we will be able to provide ever more useful data about inaccessible sites in 2013!

Thanks,

Ryan

UK Filtering Proposal: An Analysis

Back in October of this year, David Cameron proposed a censorship scheme which would make it harder to access pornography in the UK. Under the plan, the UK’s four biggest ISPs (BT, TalkTalk, Virgin, and Sky) would automatically block access to pornographic sites for anyone with an existing broadband contract unless they opted out. Ostensibly, users would be given a choice, but it would be a choice that many users would be reluctant to make due to embarrassment or fear of recrimination.

The proposal paves the way for increased governmental regulation of the Internet in the UK. Despite the fact that Cameron suggested censorship would apply only to pornographic sites (and supposedly only to prevent the increasing sexualisation of children), the reality is that there is no reliable way to filter out only pornographic sites. The result would be overblocking: many acceptable sites would be blocked simply for containing related terms. The sites affected would be those that discuss topics such as sexuality, sexual health, and safe sex; these topics are essential to education, public safety, and public health.
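To see why term-based filtering sweeps in acceptable sites, here is a minimal sketch of a naive keyword filter. It is not how any particular ISP's filter works; the blocklist terms and the sample page texts are hypothetical, chosen only to illustrate the problem.

```python
# Hypothetical blocklist; real filters are more elaborate, but any
# term-matching approach shares this weakness to some degree.
BLOCKED_TERMS = {"sex", "porn"}


def is_blocked(page_text: str) -> bool:
    """Return True if any blocked term appears anywhere in the text."""
    text = page_text.lower()
    return any(term in text for term in BLOCKED_TERMS)


# Hypothetical page snippets, none of them pornographic.
pages = {
    "clinic": "Free advice on sexual health and safe sex",
    "council": "Essex County Council bin collection times",
    "news": "Local council approves new cycle lanes",
}

blocked = {name: is_blocked(text) for name, text in pages.items()}
for name, result in blocked.items():
    print(name, "blocked" if result else "allowed")
```

In this sketch both the health clinic page and the council page are blocked, the latter only because "Essex" happens to contain a blocked term.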

Glenys Roberts, writing in the UK’s Daily Mail, heartily agreed with the proposal in an article entitled ‘We need protection from more than just porn on the internet.’ The article stated that adults, not just children, need protection from ‘sudden unexpected exposure to internet porn’ because ‘there is nothing quite so upsetting as coming upon pornographic images you do not want to see’. In fact, the article went even further, suggesting that all vaguely ‘upsetting’ imagery should be excised from the Internet including ‘people in unnatural positions doing unimaginable things to each other,’ ‘beautiful girls in come-hither postures,’ ‘laboratory animals undergoing horrific experiments,’ ‘images posted by the anti-fur brigade or those against the production of foie gras or cruelty to dogs in China,’ and even ‘anything supernatural designed to terrify me out of my wits’.

On the contrary, I would suggest that for most people there are far more upsetting things than coming across the occasional porn site. In fact, even imagery that is intentionally upsetting serves an important speech function. Certain upsetting images draw our attention to human and animal abuses all over the world, helping to rally support for ending these abuses. And these supposedly ‘terrifying’ supernatural images and tales are simply part of all the weird and wonderful ways in which human beings choose to express themselves. I wonder if Glenys Roberts would also prefer to have all of the ‘upsetting’ imagery taken out of news reports? Would she prefer we never had the opportunity to see anything at all which remotely threatens the stultifying status quo?

The most dangerous part of Cameron’s proposal, of course, is that it makes censorship the default position. This makes it possible for many other sites to be covertly and automatically censored, without prompting the furor that overt censorship would cause in situations unrelated to the cause of child protection.

Cameron’s proposal is unlikely to be implemented as is. Recent reports indicate that the four biggest ISPs in the UK have refused to agree to the proposal. While they will give new users the option to block all pornography, all existing contracts will remain the same. This will also be an active choice rather than a default position, requiring new subscribers to opt into censorship. And only a few people will even have that option because only 5% of broadband customers ever change providers, meaning that very few people enter into new contracts.

It seems as though Cameron’s attempt to regulate UK Internet access has been thwarted. All of those who were worried about their children’s access to pornography will simply have to use their parental controls as they always have done. Glenys Roberts and all those who are ‘upset’ by the strange things on the Internet will simply have to restrain themselves from searching for them. However, with leaks from December’s World Conference on International Telecommunications suggesting that many world leaders view freedom of expression on the Internet as a problem, and the US’s participation in the dubious and non-transparent ‘Trans-Pacific Partnership’ negotiations, it seems that this freedom is in imminent danger.

With David Cameron having reportedly tried to stop rioters from communicating via social media during the London riots of 2011 (before his plan became known to the public and was compared to the actions of Egyptian President Hosni Mubarak), it is clear that Cameron sees the freedom of the Internet, and social media in particular, as a threat. Although the UK sided with the US in their refusal of the new ITRs (International Telecommunication Regulations) proposed at the WCIT last week, this only indicates that the UK government understands that such a move would be extremely unpopular. This makes it abundantly clear that as long as the opposition remains strong, censorship can be fought and the freedom of the Internet defended.

Jean-Loup Richet, Special Herdict Contributor

DPI Threat to Freedom of Expression

The International Telecommunication Union has approved the adoption of a technical standard for deep packet inspection (DPI) technology, arousing concerns about the potential effects of standardizing invasive technology that can be used for censorship and surveillance.

What is DPI?

As described in our blog post on Russia’s new internet bill, information over the internet is sent in packets. Just like letters dropped in the mail, each packet contains a header indicating its destination. Routing the packet to its destination requires only that the network look at the header of the packet.  DPI, however, is more invasive, examining the content of the packet as well.  In other words, DPI enables ISPs or other network operators to peek at the letter inside the envelope.
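The difference between ordinary routing and DPI can be sketched in a few lines. This is a toy model, not real networking code: the `Packet` class, the routing table, the addresses, and the search pattern are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    destination: str  # header field: the address on the envelope
    payload: bytes    # content: the letter inside the envelope


def route(packet: Packet, routing_table: dict[str, str]) -> str:
    """Ordinary forwarding: reads only the header to pick a next hop."""
    return routing_table.get(packet.destination, "default-gateway")


def deep_inspect(packet: Packet, patterns: list[bytes]) -> bool:
    """DPI-style check: looks inside the payload for given byte patterns."""
    return any(pattern in packet.payload for pattern in patterns)


# Hypothetical packet and routing table.
packet = Packet("203.0.113.7", b"GET /news/protest HTTP/1.1")
next_hop = route(packet, {"203.0.113.7": "link-a"})
flagged = deep_inspect(packet, [b"protest"])

print(next_hop, flagged)
```

Note that `route` never touches `payload`, while `deep_inspect` reads it directly; that second step is what makes DPI useful for malware detection and for censorship alike.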

Network operators can use DPI for innocuous and useful applications such as network security and malware detection. However, ISPs have also used DPI for more invasive applications, such as blocking competitors’ products, bandwidth shaping, and targeted advertising. Additionally, governments have found DPI to be an effective means for both censorship and surveillance. It is known that China, Iran, and Russia currently use DPI; it has been alleged that the United States has also used DPI for warrantless surveillance.

The ITU Standard

The ITU, or International Telecommunication Union, is a United Nations agency whose mission is to help coordinate international cooperation in information and communication technology. Whereas the International Telecommunication Regulations is a binding treaty overseen by the ITU, ITU standards and recommendations are non-binding.

The new ITU standard, titled “Requirements for Deep Packet Inspection in Next Generation Networks,” is still under development and has not been officially made public, although a draft version was inadvertently released earlier this month. The standard proposes requirements for DPI capabilities in networks but does not describe how those capabilities are to be implemented.

The existence of a standard for DPI is not inherently harmful; as noted, DPI has applications beyond censorship and surveillance. But the ITU’s development of this particular DPI standard is problematic for several reasons. First, as a political rather than purely technical body, the ITU’s proposal of a standard for DPI may legitimize and facilitate government use of that technology for censorship and surveillance. Germany, for example, argued that the ITU should “not standardize any technical means that would increase the exercise of control over telecommunications content, could be used to empower any censorship of content, or could impede the free flow of information and ideas.” Second, because DPI is highly invasive (not even properly encrypted web traffic is completely secure), a technical standard for DPI ought to be accompanied, at the very least, by a discussion of how to safeguard communications privacy. The ITU’s draft standard, on the other hand, is not at all attentive to DPI’s troubling privacy implications.

The ITU’s non-public development of a DPI standard is alarming given that the ITU is providing a standard for a technology that can be, and has been, used for extensive fine-grained censorship and surveillance.  Moreover, it is doing so without any accompanying discussion of how to preserve internet freedom.  Perhaps this is unsurprising given how several countries recently tried to use the ITU as a vehicle for increasing filtering and government control of the Internet.  Although these proposals were not implemented, it seems that the ITU may still be a vehicle for standardizing censorship through other means.
