Archive for the 'information quality' Category

Information Quality and Reputation


Heavily influenced by the work of Jean Nicolas Druey and Herbert Burkert, among others, I’ve been working on information quality issues in various contexts for the past 8 years or so. Today, I have the pleasure of attending the Yale Information Society Project’s conference on Reputation Economies in Cyberspace and contributing to a panel on reputational quality and information quality. Essentially, I would like to share three observations that are based on previous research projects. The three points I will talk about later today are:

  1. From both a theoretical and empirical viewpoint, information quality is a horse that is difficult to catch. As a complicating factor, information quality in the context of reputation systems is a meta-question, concerning the quality of statements about the qualities of a person, service, advice, or the like. As such, it is important to be specific about the particular aspect of the quality challenge that is up for discussion in a given quality discourse. A taxonomy of quality problems/issues in the context of online reputation might be a good first step. Such a taxonomy needs to conceptualize informational quality of reputation as a composite of syntactic (data), semantic (meaning), and pragmatic (effects) factors. [We will present an initial draft of such a taxonomy at the conference]
  2. While addressing specific quality issues, it’s important to consider the full range of possible approaches (“tools”) that are available. The role of market-based approaches (“pricing”, “incentives”) has already been explored in detail in the context of reputation systems. We also have a growing understanding about the social norms at work (research on online identity). As far as technology (“platform design”) is concerned, insights from social signaling theory might be a source of inspiration (e.g. conditions to foster honest signaling). Largely unexplored, by contrast, is the substantive (e.g. privacy) or procedural (e.g. due process) role that law may play in the context of a blended approach.
  3. Information quality conflicts can’t be avoided, only managed. Each “regulatory” approach mentioned before comes at a cost and has inherent (factual and/or normative) limitations. A general limitation is the contextual and subjective nature of human information processing and decision-making processes (e.g. buying a digital camera) in which the quality of statements about quality (reputation) plays a role. The case of “teenagers” might be illustrative given our knowledge about the neurobiological state of development of brain areas (prefrontal cortex) involved in information selection, interpretation, and evaluation. But the cognitive biases of adults also mark the limits of what can be achieved at the level of governance of reputation systems.

Comments, as always, welcome.

The Future of Books in the Digital Age: Conference Report


Today, I attended a small but really interesting conference chaired by my colleagues Prof. Werner Wunderlich and Prof. Beat Schmid from the Institute for Media and Communication Management, our sister institute here at the Univ. of St. Gallen. The conference was on “The Future of the Gutenberg Galaxy” and looked at trends and perspectives of the medium “book”. I learned a great deal today about the current state of the book market and future scenarios from a terrific line-up of speakers. It was a particular pleasure, for instance, to meet Prof. Wulf D. von Lucius, who teaches at the Univ. of Hohenheim and is also the Chairman of the Board of Carl Hanser Verlag, which will be publishing the German version of our forthcoming book Born Digital.

We covered a lot of terrain, ranging from definitional questions (what is a book? Here is a legal definition under Swiss VAT law, for starters) to open access issues. The focus of the conversation, though, was on the question of how digitization shapes the book market and, ultimately, whether the Internet will change the concept “book” as such. A broad consensus emerged among the participants (a) that digitization has a profound impact on the book industry, but that it’s still too early to tell what it means in detail, and (b) that the traditional book is very unlikely to be replaced by electronic formats (partly referring to the superiority-of-design argument that Umberto Eco made some time ago).

I was the last speaker at the forum and faced the challenge of talking about the future of books from a legal perspective. Based on the insights we gained in the context of our Digital Media Project and the discussion at the forum, I came up with the following four observations and theses, respectively:

Technological innovations – digitization in tandem with network computing – have changed the information ecosystem. From what we’ve learned so far, it’s safe to say that at least some of the changes are tectonic in nature. These structural shifts in the way in which we create, disseminate, access, and (re-)use information, knowledge, and entertainment have both direct and indirect effects on the medium “book” and the corresponding subsystem.

Some examples and precursors in this context: collaborative and evolutionary production of books (see Lessig’s Code 2.0); e-Books and online book stores (see, e.g., ciando); online access to books (see, e.g., libreka, Google Book Search, digital libraries); creative re-uses such as fan fiction, podcasts, and the like (see, e.g., LibriVox, Project Gutenberg).

Law is responding to the disruptive changes in the information environment. It not only reacts to innovations related to digitization and networks, but also has the power to actively shape the outcome of these transformative processes. However, law is not the only regulatory force, and gaining a deeper understanding of the interplay among these forces is crucial when considering the future of books.

While fleshing out this second thesis, I argued that the reactions to innovations in the book sector may follow the pattern of ICT innovation described by Debora Spar in her book Ruling the Waves (Innovation – Commercialization – Creative Anarchy – Rules and Regulations). I used the ongoing digitization of books and libraries by Google Book Search as a mini-case study to illustrate the phases. With regard to the different regulatory forces, I referred to Lessig’s framework and used book-relevant examples such as DRM-protected eBooks (“code”), the use of collaborative creativity (“norms”), and book-price fixing (“markets”) to illustrate it. I also tried to emphasize that the law has the power to shape each of the forces mentioned above in one way or another (I used examples such as anti-circumvention legislation, the legal ban on book-price fixing, and mandatory copyright provisions that preempt certain contractual provisions).

The legal “hot-spots” when it comes to the future of the book in the digital age are the questions of distribution, access, and – potentially – creative re-use. The areas of law that are particularly relevant in this context are contracts, copyright/trademark law, and competition law.

Based on the discussion at the forum, I tried to map some of the past, current, and emerging conflicts among the different stakeholders of the ecosystem “book”. In the area of contract law, I focused on the relationship between authors and increasingly powerful book publishers that are tempted to use their unequal bargaining power to impose standard contracts on authors and transfer as many rights as possible (e.g. “buy out” contracts).

With regard to copyright law, I touched upon a small, but representative selection of conflicts, e.g. the relation between right holders and increasingly active users (referring to the recent hp-lexicon print-version controversy); the tensions between right holders and (new) Internet intermediaries (e.g. liability of platforms for infringements of their users in case of early leakage of bestsellers; e.g. interpretation of copyright limitations and exemptions in case of full-text book searches without permission of right holders); the tension between publishers and libraries (e.g. positive externalities of “remote access” to digital libraries vs. lack of exemptions in national and international copyright legislation – a topic my colleague Silke Ernst is working on); and the tension between right holders and educational institutions (with reference to this report).

As far as competition law is concerned, I sketched a scenario in which Google Book Search would reach a dominant market position with strong user lock-in due to network effects and would decline to digitize and index certain books or book programs, for instance for operational reasons. Based on this scenario, I speculated about a possible response by competition law authorities (with European authorities in mind) and raised the question of whether Google Book Search could be regarded, at some point, as an essential facility. (In the subsequent panel discussion, Google’s Jens Redmer and I had a friendly back-and-forth on this issue.)

Not all of the recent legal conflicts involving the medium “book” are related to the transition from an analog/offline to a digital/online environment. Law continues to address book-relevant issues that are not new, but rather variations on traditional doctrinal themes.

I used the Michael Baigent et al. v. Random House Group decision by London’s High Court of Justice as one example (has the author of The Da Vinci Code infringed copyright by “borrowing” a theme from the earlier book Holy Blood, Holy Grail?), and the recent Esra decision by the German BVerfG as a second one (author’s freedom of expression vs. privacy right of a person in a case where it was too obvious that the figure used in a novel was a real and identifiable person and where intimate details of the real person were disclosed in the book).

Unfortunately, we didn’t have much time to discuss several other interesting issues and topics that were brought up in relation to the generation born digital and its use of books, along with the consequences of kids’ changed media usage in a changed media environment, e.g. with regard to information overload and the quality of information. These are topics, to be sure, that John Palfrey and I are addressing in our forthcoming book.

In sum, an intense, but very inspiring conference day.

Update: Dr. David Weinberger, among the smartest people I’ve ever met, has just released a great article on ebooks and libraries.

“Born Digital” and “Digital Natives” Project Presented at OECD-Canada Foresight Forum


Here in Ottawa, I had the pleasure of speaking at the OECD Technology Foresight Forum of the Information, Computer and Communications Policy Committee (ICCP) on the participative web – a forum aimed at contributing to the OECD Ministerial Meeting “The Future of the Internet Economy” that will take place in Seoul, Korea, in June 2008.

My remarks (what follows is a summary; a full transcript is available, too) were based on our joint and ongoing Harvard-St. Gallen research project on Digital Natives and included some of the points my colleague and friend John Palfrey and I are making in our forthcoming book “Born Digital” (Basic Books, 2008).

I started with the observation that increased participation is one of the features at the very core of the lives of many Digital Natives. Since most of the speakers at the Forum were putting emphasis on creative expression (like making mash-ups, contributing to Wikipedia, or writing a blog), I tried to make the point that participation needs to be framed in a broad way and includes not only “semiotic democracy”, but also increased social participation (cyberspace is a social space, as Charlie Nesson has argued for years), increased opportunities for economic participation (young digital entrepreneurs), and new forms of political expression and activism.

Second, I argued that the challenges associated with the participative web go far beyond intellectual property rights and competition law issues – two of the dominant themes of the past years as well as at the Forum itself. I gave a brief overview of the three clusters we’re currently working on in the context of the Digital Natives project:

  • How does the participatory web change the very notion of identity, privacy, and security of Digital Natives?
  • What are its implications for creative expression by Digital Natives and the business of digital creativity?
  • How do Digital Natives navigate the participative web, and what are the challenges they face from an information standpoint (e.g. how to find relevant information, how to assess the quality of online information)?

The third argument, in essence, was that there is no (longer a) simple answer to the question “Who rules the Net?”. We argue in our book (and elsewhere) that the challenges we face can only be addressed if all stakeholders – Digital Natives themselves, peers, parents, teachers, coaches, companies, software providers, regulators, etc. – work together and make respective contributions. Given the purpose of the Forum, my remarks focused on the role of one particular stakeholder: governments.

While still research in progress, it seems plain to us that governments may play a very important role in one of the clusters mentioned above, but only a limited one in another cluster. So what’s much needed is a case-by-case analysis. I briefly illustrated the different roles of governments in areas such as

  • online identity (currently no obvious need for government intervention, but “interoperability” among ID platforms on the “watch-list”);
  • information privacy (important role of government, probably less regarding more laws, but better implementation and enforcement as well as international coordination and standard-setting);
  • creativity and business of creativity (use power of market forces and bottom-up approaches in the first place, but role of governments at the margins, e.g. using leeway when legislating about DRM or law reform regarding limitations and exceptions to copyright law);
  • information quality and overload (only limited role of governments, e.g. by providing quality minima and/or digital service publique; emphasis on education, learning, media & information literacy programs for kids).

Based on these remarks, we identified some trends (e.g. multiple stakeholders shape our kids’ future online experiences, which creates the need for collaboration and coordination) and closed with some observations about the OECD’s role in such an environment, proposing four functions: awareness raising and agenda setting; knowledge creation (“think tank”); international coordination among various stakeholders; alternative forms of regulation, incl. best practice guides and recommendations.

Berkman Fellow Shenja van der Graaf was also speaking at the Forum (transcripts here), and Miriam Simun presented our research project at a stand.

Today and tomorrow, the OECD delegates are discussing the take-aways of the Forum behind closed doors. Given the broad range of issues covered at the Forum, it will be interesting to see what items will finally be on the agenda of the Ministerial Conference (IPR, intermediary liability, and privacy are likely candidates).

Figures tell: hacker Tron more popular than ever after restraining order


In my first post on the controversy (concerning the late German hacker “Tron”) I predicted that as a result of the legal action taken by Tron’s family against Wikipedia, many more people would learn about the real name of Tron than would have otherwise.

From that day on, my colleagues at the FIR, James Thurman and Daniel Haeusermann, have performed Google searches using the phrases (1) (Tron “[real name]”) and (2) (Tron “[real name]” Wikipedia). They discovered that within five days after the site was shut down by the German court, the overall number of hits for search (1) went up from 428 to 928. Over the same period, the number of hits for search (2) increased from 178 to 792.

Hence, Tron and his real name have gained substantial exposure, not only in terms of people who now know his real name, but also in terms of mentions on the World Wide Web. Our little experiment suggests a) that it might be counterproductive to enforce the right to privacy on the Web by legal means and b) that there is no (legal) remedy available that could prevent such a thing from happening – this is of course due to the decentralized, multijurisdictional character of the Web.
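
To put the hit counts reported above in relative terms, here is a minimal Python sketch. The function name and the dictionary labels are my own illustrative choices; only the four hit counts come from the little experiment itself.

```python
def relative_growth(before: int, after: int) -> float:
    """Return the percentage increase from `before` to `after`."""
    return (after - before) / before * 100


# Hit counts as reported in the post; the labels are illustrative only.
hit_counts = {
    'search (1): Tron "[real name]"': (428, 928),
    'search (2): Tron "[real name]" Wikipedia': (178, 792),
}

for label, (before, after) in hit_counts.items():
    growth = relative_growth(before, after)
    print(f"{label}: {before} -> {after} (+{growth:.1f}%)")
```

On these figures, hits for search (1) slightly more than doubled, while hits for search (2) grew more than fourfold. Since search engine hit counts are only estimates, the percentages are best read as rough indicators rather than precise measurements.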

Update on Tron Controversy


Heise online reports that a Berlin District Court overturned the temporary restraining order against Wikimedia Deutschland. According to Heise, the application of the plaintiff has been dismissed. Consequently, Wikimedia is legally entitled to redirect visitors from its German domain to the international one. Read more here, background here.

Controversy


Reportedly (see, e.g., here, here, here, and here), Wikipedia Germany (i.e., Wikimedia Deutschland – Gesellschaft zur Förderung Freien Wissens e.V.) has been barred by a temporary restraining order of the District Court of Berlin-Charlottenburg from redirecting visitors from its German domain to the international one. The story seems to be straightforward: Wikipedia features a story on the deceased German hacker Tron and – as many other online sources do – also reveals his real name in the respective article. The hacker’s family has taken legal action against Wikipedia based on the argument that the post qualifies as an invasion of privacy.

The interesting part of the story: Apparently, the German version of the article is stored on a server in the U.S. controlled by the Wikimedia Foundation. While it is not that surprising that the family’s lawyers were able to get a preliminary injunction against Wikimedia Germany, it is much more challenging to take effective action against the content provider in the U.S. In my personal view, it’s almost impossible to enforce a similar court order (targeting the article itself, though) in the U.S. based on the privacy argument mentioned above. It’s yet another variation on the theme of the global internet versus local free speech and privacy laws. And once again the story is likely to boil down to an enforcement issue. In any event, another illustrative example for our privacy classes.

My question to the family’s lawyer: Did you tell your client in advance that legal actions against Wikimedia/Wikipedia will get a lot of public attention (trust me on this one) – with the result that many more people will learn about the real name of Tron than would have otherwise? It’s a basic information law – and I use the term ‘law’ not in the legal sense…

Update: check here.

Regulating Search? Discussion Paper I


I have the pleasure of participating in a terrific conference on “Regulating Search?” organized and hosted by our friends at the Information Society Project at Yale Law School. Here is my first discussion paper. I will post a second one later on:

Regulating Search?
Sketching a Normative Framework for Assessing Regulatory Proposals

1. The question of this symposium – Regulating Search? – can be approached from various angles and at different levels. In any event, one might expect, inter alia, that several proposals of legal and/or regulatory actions aimed at regulating search engines, ranging from consumer protection laws, IP reform, etc., will be up for discussion. Presumably, the respective proposals will pursue different policy goals and use different regulatory techniques.

2. In a later phase, proposals like this are likely to enter into competition with one another. Lawmaking and regulation are costly processes, requiring that choices about goals and means be made. Against this backdrop, a systematic comparison and isolated evaluations of regulatory proposals become essential in order to make well-informed and sustainable decisions. A look back at the history of what has been termed “cyberlaw,” however, reveals a prevalent lack of thorough assessment of legislative and/or regulatory actions, in part because such an assessment requires an open discussion and shared understanding of what fundamental policy objectives should underlie today’s information society in the first place. This failure should not be repeated in the future and with regard to a potential regulation of “search.”

3. I would like to suggest three core values (or policy goals) of a democratic information ecosystem that may serve as the benchmarks for assessing proposals aimed at regulating search engines in particular and search more generally: Autonomy, diversity, and quality. Informational autonomy includes at least three elements. First, an individual must have the freedom to make choices among alternative sets of information, ideas, opinions, and the like. This includes the freedom to decide what information someone wants to receive and process. Second, informational autonomy as an aspect of individual liberty necessitates that everyone has the right to express her own beliefs and opinions. Third, autonomy in the digitally networked environment arguably requires that every user can participate in the creation of information, knowledge, and entertainment.

4. The development of an individual’s own personality and self-fulfillment intersects with a second core value of the digitally networked ecosystem: its diversity. Diversity in the sense of a wide distribution of information from a great variety of competing sources can either be seen as a valuable mechanism to attain truth, or as a crucial instrument for protecting democratic process and deliberation. In the digital environment, however, the diversity of information, knowledge, and entertainment is an important aspect of the broader concept of cultural diversity.

5. As individuals, groups, and societies, we heavily depend in our decision-making processes on information, which is increasingly acquired over the Internet. In order to make good decisions, we depend on quality information, i.e., information that meets the functional, cognitive, aesthetic, and ethical requirements of different stakeholders such as users, creators, experts, and administrators. Consequently, legal and regulatory regimes should contribute to the creation and further development of a high-quality information ecosystem.

6. Each proposal that seeks to regulate search in general and search engines in particular can be evaluated based on these normative criteria. Even with this normative framework in place, however, the assessment of alternative governance regimes gets complicated, since the three policy goals “autonomy,” “diversity,” and “quality” are not necessarily always aligned. Unleashed diversity in the digitally networked environment, for instance, might have negative feedback effects on user autonomy because it increases an individual’s risk to be exposed to undesired information. A regulatory approach aimed at ensuring high-quality information, by contrast, might be in tension with informational autonomy, because it may impose a quality requirement leading to a level of quality that does not meet an individual’s informational needs.

7. As a consequence, governance proposals for search engines and their environments face the challenge of achieving a balance among three policy goals that are not perfectly aligned. In the case of search engine regulation, this problem is accentuated by the fact that search engines simultaneously affect all three aspects. For example, since search engine users often do not know in advance what specific piece of information they are looking for, the quality of the information that users get depends to a great extent on search engines. Consequently, the quality of information is intertwined with the quality of the search engine that defines which information becomes available based on any given query. Similarly, search engines have effects on autonomy and diversity in the digitally networked environment. Against this backdrop, regulation of search (engines) is a particularly complex task because each regulatory intervention focusing on one issue almost certainly affects another element of the normative framework.

8. In conclusion, this discussion paper calls not only for a careful design of legal or regulatory actions aimed at governing “search,” but also for a thorough assessment of legislative and/or regulatory proposals and their potential effects against the backdrop of core values of a democratic digital environment (system of “moving elements”). In that sense, the paper also advocates for a systemic view of “search” regulation, where “search” is understood as one element that interacts with other elements of the digitally networked environment, including decentralized content production and peer-to-peer distribution of digital content.

Comments welcome.

Misuse(s) of the Information Quality Act


My colleague and friend Derek Bambauer, Fellow at the Berkman Center, was kind enough to send me a link to an interesting article in the Boston Globe on the strategic (mis-)use of the U.S. Data Quality Act by the industry. According to the Globe, the Data Quality Act has become a “device that defenders of industry have increasingly relied upon to attack all range of scientific studies whose results or implications they disagree with, from government global warming reports to cancer research using animal subjects. …. [A]s interpreted by the Bush administration, it creates an unprecedented and cumbersome process that saddles agencies with a new workload while empowering businesses to challenge not just government regulations–something they could do anyway–but scientific information that could potentially lead to regulation somewhere down the road.”

The Globe draws our attention to a lawsuit before the federal appeals court in Virginia brought by the US Chamber of Commerce and the Salt Institute. The suit challenges a National Institutes of Health study showing that reduced salt intake lowers blood pressure. The court is expected to decide “whether companies can sue agencies that reject their ‘data quality’ complaints, thereby dragging individual studies into the courtroom …. If the judge in the case writes a precedent-setting opinion, and if higher courts agree, a brand-new body of law could emerge, consisting largely of corporate lawsuits against scientific analyses.”

I analyzed the Data Quality Act from an information law perspective soon after its enactment and also mentioned potential misuse scenarios. The paper is available here and might be worth a skim.

Law & Economics of Blogging


Larry E. Ribstein, Univ. of Illinois College of Law, offers on SSRN Initial Reflections on the Law and Economics of Blogging. After an overview of the technology of blogging, the author explores the economics of blogging, discussing private costs and benefits (such as self-expression and reputation, among others) on the one hand and social benefits and costs on the other hand. Personally, I’m particularly interested in the low-quality information argument as a potential social cost, since it links nicely to my research-in-progress on information quality on the Internet. Unfortunately, the quality argument made by Ribstein is indeed “initial”, as the paper’s title suggests.
Ribstein then focuses on public choice of blogging, followed by a discussion of specific legal issues, including the journalists’ privilege, the application of election laws, copyright issues, media ownership restrictions, as well as defamation and licensing laws.
Overall, a nice and short Saturday morning “food-for-thought” read.

Birnhack on Public Domain


Michael Birnhack posted an interesting article on SSRN (forthcoming in THE PUBLIC DOMAIN OF INFORMATION, P. Bernt Hugenholtz & Lucie Guibault, eds., Kluwer Law International, 2005): “More or Better? Shaping the Public Domain”. I’m particularly interested in the way he frames information quality issues in the context of free speech theories and copyright. Here’s the abstract:

One of the most interesting concepts that emerged from the battle over the continuous expansion of copyright law in the last decade is that of the public domain. After the public domain was identified, many authors struggled to define it, map it, locate its constitutional sources and explain its crucial role in copyright law. This important work poses a viable alternative to the pro-property or commodification of information alternative. The public domain project reminds us that at least under an instrumentalist view of copyright law, the public domain is not merely – or rather should not be – an unintended byproduct, or graveyard of copyrighted works, but rather a playground for speech-experiments. Copyright is one of the main tools aimed to create the public domain. This domain is a commons, owned by all and none, a resource which we can use without asking permission. It has a crucial role in personal self-development, learning, experiencing, imagining, speaking with others, creating new works for the benefit of ourselves and wider circles, starting from the immediate interlocutor and up to the entire community. The public domain is the means and the end to promote the progress of science. It is where knowledge is created and where it lies, awaiting new interpretations, new applications and new meanings.

Once we accept that the public domain is not only a negative, we need to figure out how we would like it to be constructed. In this article I would like to add my contribution to the construction of the public domain. In performing this task, we need not ignore the elaborate political thought about freedom of speech. The public domain and free speech are two sides of the same coin. Both notions aim at constructing a communicative sphere, where people can interact with each other in various circles, whether it is an interpersonal circle, a communitarian one or a wider political circle. In this sense, both are derivatives of a political notion, which is a particular conception of democracy. Accordingly, it is useful to learn from the lessons of the free speech-copyright conflict in our task of constructing the public domain, within copyright law.

What kind of public domain are we interested in? I apply the notions of quality and quantity. These are fuzzy terms. At best, we would like to have a combination of both: we would like to construct a public domain that has more information and more speech of better quality. The article explores how these fuzzy terms interact with various theoretical justifications of both free speech jurisprudence, and then with various theories of copyright law, and concludes with tying all the ends together – examining how we can better construct the public domain.