{"id":387,"date":"2020-02-25T02:54:05","date_gmt":"2020-02-25T02:54:05","guid":{"rendered":"http:\/\/blogs.harvard.edu\/copyrightosc\/?p=387"},"modified":"2020-02-25T03:35:25","modified_gmt":"2020-02-25T03:35:25","slug":"fair-use-week-2020-day-two-with-guest-expert-brandon-butler","status":"publish","type":"post","link":"https:\/\/archive.blogs.harvard.edu\/copyrightosc\/2020\/02\/25\/fair-use-week-2020-day-two-with-guest-expert-brandon-butler\/","title":{"rendered":"Fair Use Week 2020: Day Two With Guest Expert Brandon Butler"},"content":{"rendered":"<h2>The\u00a0<em>Feist<\/em>-y Reason That Text and Data Mining is Fair Use<\/h2>\n<h4>by Brandon Butler<\/h4>\n<p><a href=\"https:\/\/www.fairuseweek.org\/\">Happy Fair Use Week<\/a>! This is a happy week, indeed, for me, because fair use is my favorite copyright doctrine. But my favorite copyright decision just may be\u00a0<em><a href=\"https:\/\/cite.case.law\/us\/499\/340\/\">Feist v. Rural Telephone Co.<\/a><\/em>, a case about\u2026telephone books!<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-389\" src=\"http:\/\/blogs.harvard.edu\/copyrightosc\/files\/2020\/02\/whitepages.jpg\" alt=\"\" width=\"344\" height=\"229\" srcset=\"https:\/\/archive.blogs.harvard.edu\/copyrightosc\/files\/2020\/02\/whitepages.jpg 800w, https:\/\/archive.blogs.harvard.edu\/copyrightosc\/files\/2020\/02\/whitepages-300x200.jpg 300w, https:\/\/archive.blogs.harvard.edu\/copyrightosc\/files\/2020\/02\/whitepages-768x511.jpg 768w\" sizes=\"auto, (max-width: 344px) 100vw, 344px\" \/><\/p>\n<p>Among the many wonderful qualities of the\u00a0<em>Feist<\/em>\u00a0opinion is the bright neon line that it draws between the purpose of copyright (to give incentives for the creation and distribution of creative, expressive works) and what way, way, WAY too many people\u00a0<em>think<\/em>\u00a0is copyright\u2019s purpose: to ensure that someone who works hard to make something gets paid every time someone else uses 
it. If you understand why\u00a0<em>Feist<\/em>\u00a0draws that line, you\u2019ll understand why text and data mining is clearly a fair use. (See, I got there! Now hang in a little longer and I\u2019ll get back to fair use in a minute\u2026)<\/p>\n<p>The idea that whoever makes something should control it, or get paid whenever it gets used, is sometimes called \u201clabor-desert theory,\u201d and it sounds pretty tempting. There\u2019s even an Enlightenment philosopher that people invoke to support it: John Locke, who is said to have argued that when someone takes something from \u201cthe commons\u201d and mixes it with their labor, the result is a delicious property gumbo, and it is\u00a0<em>theirs<\/em>.<\/p>\n<p>It\u2019s been a minute since I last read Locke, so I can\u2019t promise that\u2019s the most faithful representation of his thinking. But I can tell you it is a pretty faithful representation of the arguments that some copyright holders and property rights enthusiasts make in favor of long, strong copyright. They talk about how hard it is to make a movie, how much time and energy must be devoted to various forms of creative work, how many jobs are required to make the creative economy hum, and so on.<\/p>\n<p>That may all be true, but the fact (ha!) is that how hard you work to make something is irrelevant to the question of whether copyright protects it. Why? Well, it is an axiom of US copyright law that the author\u2019s monopoly protects her\u00a0<em>expressive contributions<\/em>\u00a0to a work, but does not protect any facts (or ideas) that might be embedded in the work.<\/p>\n<p>For example, where two authors write about the same underlying historical event, the first author may prevent the second author from copying too much of her expressive prose (these were the facts of the pioneering fair use decision\u00a0<a href=\"https:\/\/cite.case.law\/f-cas\/9\/342\/\">Folsom v. 
Marsh<\/a>, in which verbatim copying from an exhaustive biography of George Washington to create a second, shorter biography was found to be infringing), but she certainly can\u2019t prevent the second author from relying on facts uncovered in her research (as, for example, in\u00a0<em><a href=\"https:\/\/cite.case.law\/f2d\/650\/1365\/\">Miller v. Universal<\/a><\/em>, where an author\u2019s \u201cresearch\u201d on a famous kidnapping case was held not to be the proper subject of copyright protection as against a second author). Facts are not created by anyone (<em>pace<\/em>\u00a0post-modernism etc.), and are no one\u2019s property, according to copyright law. And, crucially, wrapping facts in a crunchy, flaky layer of your copyrighted expression is not enough to give you rights in the underlying facts.<\/p>\n<p>Despite the bedrock status of this proposition, and its seemingly clear embodiment in the statute at\u00a0<a href=\"https:\/\/urldefense.proofpoint.com\/v2\/url?u=https-3A__www.law.cornell.edu_uscode_text_17_102&amp;d=DwMGaQ&amp;c=WO-RGvefibhHBZq3fL85hQ&amp;r=sLjykPVK6rYnb5xQBJWWgzvTiqS5Ic0JMO5L6p0mJkw&amp;m=82E0iVqM7QyZhjhWslj9KRZ36kauBQKVXoImmR3ZRY0&amp;s=0NvZ6mFJ5WxqHTw90Rp5cIZoDkeXGWEZ1zcl1LNWFHA&amp;e=\">\u00a7 102(b) of the Copyright Act<\/a>, courts had trouble resisting the impulse to reward\u00a0<a href=\"https:\/\/en.wikipedia.org\/wiki\/Sweat_of_the_brow\">\u201csweat of the brow\u201d<\/a>\u00a0or \u201cindustrious collection\u201d by granting copyright protection to facts first revealed in a work of authorship. 
It wasn\u2019t until the 1991 resolution of a dispute over the wholesale copying of names and numbers in telephone directories in\u00a0<em><a href=\"https:\/\/scholar.google.com\/scholar_case?case=1195336269698056315\">Feist<\/a><\/em>\u00a0that the Supreme Court gave us a strong, clear articulation of both the principle and its deep Constitutional foundations:<\/p>\n<blockquote><p>The mere fact that a work is copyrighted does not mean that every element of the work may be protected. Originality remains the sine qua non of copyright; accordingly, copyright protection may extend only to those components of a work that are original to the author. [citations omitted] Thus, if the compilation author clothes facts with an original collocation of words, he or she may be able to claim a copyright in this written expression.\u00a0<em>Others may copy the underlying facts from the publication<\/em>, but not the precise words used to present them.<\/p><\/blockquote>\n<p>[snip]<\/p>\n<blockquote><p>It may seem unfair that much of the fruit of the compiler\u2019s labor may be used by others without compensation. As Justice Brennan has correctly observed, however, this is not \u201csome unforeseen byproduct of a statutory scheme.\u201d <a href=\"https:\/\/cite.case.law\/us\/471\/539\/#p589\">Harper &amp; Row, 471 U. S., at 589<\/a> (dissenting opinion). It is, rather, \u201cthe essence of copyright,\u201d ibid., and a constitutional requirement. The primary objective of copyright is not to reward the labor of authors, but \u201c[t]o promote the Progress of Science and useful Arts.\u201d <a href=\"https:\/\/fairuse.stanford.edu\/law\/us-constitution\/\">Art. I, \u00a7 8, cl. 8<\/a>. Accord, <a href=\"https:\/\/cite.case.law\/us\/422\/151\/#p156\">Twentieth Century Music Corp. v. Aiken, 422 U. S. 
151, 156 (1975)<\/a>.\u00a0<strong>To this end, copyright assures authors the right to their original expression, but encourages others to build freely upon the ideas and information conveyed by a work.<\/strong>\u00a0<a href=\"https:\/\/cite.case.law\/us\/471\/539\/#p556\">Harper &amp; Row, supra, at 556-557<\/a>. This principle, known as the idea\/expression or fact\/expression dichotomy, applies to all works of authorship. \u2026This result is neither unfair nor unfortunate. It is the means by which copyright advances the progress of science and art. (Emphases added.)<\/p><\/blockquote>\n<p>The Supreme Court\u00a0<a href=\"https:\/\/urldefense.proofpoint.com\/v2\/url?u=https-3A__scholar.google.com_scholar-5Fcase-3Fcase-3D12147684852241107557-26hl-3Den-26as-5Fsdt-3D6-26as-5Fvis-3D1-26oi-3Dscholarr&amp;d=DwMGaQ&amp;c=WO-RGvefibhHBZq3fL85hQ&amp;r=sLjykPVK6rYnb5xQBJWWgzvTiqS5Ic0JMO5L6p0mJkw&amp;m=82E0iVqM7QyZhjhWslj9KRZ36kauBQKVXoImmR3ZRY0&amp;s=Tbg7mlJ5-xjFCzWPfydOmRevwhSp8wMe4tjM3AB4dlU&amp;e=\">subsequently<\/a>\u00a0<a href=\"https:\/\/urldefense.proofpoint.com\/v2\/url?u=https-3A__scholar.google.com_scholar-5Fcase-3Fq-3DGolan-2Bv-2Bholder-26hl-3Den-26as-5Fsdt-3D20000006-26as-5Fvis-3D1-26case-3D3239612723066820072-26scilh-3D0&amp;d=DwMGaQ&amp;c=WO-RGvefibhHBZq3fL85hQ&amp;r=sLjykPVK6rYnb5xQBJWWgzvTiqS5Ic0JMO5L6p0mJkw&amp;m=82E0iVqM7QyZhjhWslj9KRZ36kauBQKVXoImmR3ZRY0&amp;s=Deuq06288wL8j9OXcBYxmevzHXViM9XjOuxcxHfUxV8&amp;e=\">called<\/a>\u00a0this distinction (also known as the \u201cidea\/expression dichotomy\u201d) part of the \u201ctraditional contours of copyright\u201d and a \u201cbuilt-in First Amendment safety valve.\u201d This is, in other words, about as fundamental a proposition as there can be in copyright law, grounded in both the Copyright Clause and the First Amendment of the Constitution. To the extent that fact and expression in a protected work can be separated, the facts are free for the taking. 
Whether it\u2019s a phonebook or a newspaper article, expression is protected, but facts are free.<\/p>\n<p>It turns out that one of the most powerful ways to extract and use all the facts embedded in a wide variety of creative works, to separate them from the expression in which they subsist, is to use text and data mining. To perform text and data mining, however, you have to do things that ordinarily require the permission of the copyright holder, namely,\u00a0<em>copying<\/em>\u00a0the full text of the works into a computer, and in many cases\u00a0<em>displaying to the public<\/em>\u00a0contextual snippets that substantiate your claims. All this takes place thanks to technology that the Founders certainly couldn\u2019t have foreseen, and that even the drafters of the 1976 Copyright Act might not have anticipated. Enter fair use, with the flexibility required to adapt to a changing world.<\/p>\n<p>While there was already plenty of smart writing on the issue, and a long line of cases pointing in the right direction, the question of whether it is fair use to use computers to read in-copyright texts and extract facts from them got its fullest, and perhaps final, answer when Judge Pierre Leval decided the\u00a0<a href=\"https:\/\/cite.case.law\/f3d\/804\/202\/\">Google Books case<\/a>. Google Books was the result of a massive digitization effort in which university libraries (including\u00a0<a href=\"https:\/\/news.virginia.edu\/content\/uva-library-joins-google-books-library-project\">ours<\/a>) provided millions of books to Google to digitize and crawl, just as it crawls websites, to help people\u00a0<a href=\"https:\/\/books.google.com\/\">find books<\/a>. 
(Libraries got to keep the digital copies, which we deposited with the\u00a0<a href=\"https:\/\/urldefense.proofpoint.com\/v2\/url?u=https-3A__hathitrust.org&amp;d=DwMGaQ&amp;c=WO-RGvefibhHBZq3fL85hQ&amp;r=sLjykPVK6rYnb5xQBJWWgzvTiqS5Ic0JMO5L6p0mJkw&amp;m=82E0iVqM7QyZhjhWslj9KRZ36kauBQKVXoImmR3ZRY0&amp;s=V3RQuiB2artGZLXBgWHu7f-UfGlNLu1hSBLn4jfzL9w&amp;e=\">HathiTrust Digital Library<\/a>.) Leval more or less created the modern fair use doctrine in a\u00a0<a href=\"https:\/\/en.wikipedia.org\/wiki\/Toward_a_Fair_Use_Standard\">law review article<\/a>\u00a0first published 30 years ago, so it was fitting that he was the judge to finally give a broad blessing to text and data mining. In his opinion, Judge Leval answers two fundamental questions:<\/p>\n<ol>\n<li>Is Google\u2019s purpose transformative, i.e., is it different from the author\u2019s original expressive purpose and does it \u201cserve[] copyright\u2019s goal of enriching\u00a0<a href=\"https:\/\/www.publicknowledge.org\/\">public knowledge<\/a>\u201d by using the protected material to \u201ccommunicate[] something new and different from the original or expand[] its utility\u201d? And,<\/li>\n<li>Does Google\u2019s use provide the public with a \u201csubstitute\u201d in the market for the original works in a way that does \u201cmeaningful\u201d or \u201csignificant\u201d harm to the market for the work?<\/li>\n<\/ol>\n<p>The ethos of\u00a0<em>Feist<\/em>\u00a0informs these two questions in a fundamental way. First, Judge Leval finds Google\u2019s purpose to be transformative because of its fundamentally factual, informative character. The core purposes of Google Book Search\u2014to locate relevant books by providing facts about the occurrence of search terms inside of books, and to reveal facts about the occurrence of words and phrases throughout the entire corpus of books\u2014are of course radically different from the expressive purpose(s) of any particular book. 
And, not only is that purpose different, but it is consonant with the design of copyright itself, which is tailored to facilitate the free circulation of facts. It also serves the ultimate purpose of copyright, which is to \u201cpromote the Progress of Science\u201d (where \u201cScience\u201d means all manner of learning and culture). Google Books is transformative because it is\u00a0<em>Feist<\/em>-y &#8211;\u00a0it liberates facts from expression in a way that adds to the world\u2019s knowledge and doesn\u2019t implicate the expressive monopoly of authors.<\/p>\n<p>Which brings us to the question of market harm and substitution, which is also filtered through a\u00a0<em>Feist<\/em>-ian lens. In addition to the obvious point that Google Book Search results are not a substitute for access to the underlying books (snippets are too small, and they are impossible to reassemble into the original work), which is certainly of fundamental importance, the court must contend with two other market-based challenges.<\/p>\n<p>First, the Authors Guild argued that some users will find the information they need in snippets, which will forestall sales of the relevant works (either directly to researchers, or to libraries that serve them). The court\u2019s response here is fundamentally <em>Feist-ian<\/em>: so what? That is, to the extent that the snippet reveals a\u00a0<em>fact<\/em>\u00a0that obviates a researcher\u2019s need to buy a copy of the book containing that fact, that is all to the good.<\/p>\n<p>Leval observes, by way of example, that a student looking for the year <a href=\"https:\/\/en.wikipedia.org\/wiki\/Franklin_D._Roosevelt\">Franklin D. Roosevelt<\/a> was first stricken by polio can find it in a snippet from Richard Thayer Goldberg\u2019s\u00a0<em>The Making of Franklin D. Roosevelt<\/em>\u00a0(1981) that is returned from a Google Book Search query. The student will not have to buy Goldberg\u2019s book, or even check it out from a library, to find this fact. 
And that\u2019s fine; this is not a \u201charm\u201d that copyright cares about. Judge Leval writes:<\/p>\n<blockquote><p><em>[The author\u2019s] copyright does not extend to the facts communicated by his book. It protects only the author\u2019s manner of expression.\u2026 Google would be entitled, without infringement of [the author\u2019s] copyright, to answer the student\u2019s query about the year Roosevelt was afflicted, taking the information from Goldberg\u2019s book. The fact that, in the case of the student\u2019s snippet search, the information came embedded in three lines of Goldberg\u2019s writing, which were superfluous to the searcher\u2019s needs, would not change the taking of an unprotected fact into a copyright infringement.<\/em><\/p><\/blockquote>\n<p>Or, as Justice O\u2019Connor says in\u00a0<em>Feist<\/em>, \u201cThis result is neither unfair nor unfortunate.\u201d<\/p>\n<p>The Authors Guild also argued that Google\u2019s scanning harms a \u201cderivative\u201d market, namely the market for creating search databases and displaying snippets. At first glance, this may be the Guild\u2019s most compelling argument. Maybe Google Book Search\u00a0<em>users<\/em>\u00a0never see the entire work, but of course Google itself necessarily does copy the full text, so the status of Google\u2019s use behind the curtain could be less clear.<\/p>\n<p>Judge Leval doesn\u2019t think so. To the contrary, he says, \u201cThere is no merit to this argument.\u201d Why? Because<\/p>\n<blockquote><p><em>\u201cThe copyright resulting from the Plaintiffs\u2019 authorship of their works does not include an exclusive right to furnish the kind of information about the works that Google\u2019s programs provide to the public. 
For substantially the same reasons, the copyright that protects Plaintiffs\u2019 works does not include an exclusive derivative right to supply such information through query of a digitized copy.\u201d<\/em><\/p><\/blockquote>\n<p>Judge Leval goes on to argue that the right to create derivative works is limited to works that \u201cre-present the protected aspects of the original work, i.e., its expressive content, converted into an altered form.\u201d As has already been established, the Google Book Search project does no such thing. Indeed, Judge Leval distinguishes Google Book Search from other projects that have sought permission to display shorter portions of books or songs (as in ringtones) by observing that,<\/p>\n<blockquote><p><em>Unlike the reading experience that the Google Partners program or the Amazon Search Inside the Book program provides [or the listening experience that Ringtones provide], the snippet function does not provide searchers with any meaningful experience of the\u00a0expressive content\u00a0of the book. (emphasis added)<\/em><\/p><\/blockquote>\n<p>So, the fact\/expression dichotomy, defended most memorably in\u00a0<em>Feist<\/em>, does a\u00a0<em>lot<\/em>\u00a0of work in the Google Books opinion. And that is a good thing, because it grounds the right to text and data mine in fundamental copyright and Constitutional principles with roots as deep and broad as the fair use doctrine itself.<\/p>\n<p><em><a href=\"https:\/\/www.library.virginia.edu\/staff\/bcb4y\/\">Brandon Butler<\/a>\u00a0is Director of Information Policy at University of Virginia. \u00a0There he works on\u00a0implementing programs to guide the University Library on issues of intellectual property, copyright, and rights management for scholarly materials. He was a Practitioner-in-Residence at the Glushko-Samuelson Intellectual Property Law Clinic at American University\u2019s Washington College of Law from 2013 to 2016. 
Before that, Brandon was Director of Public Policy Initiatives at ARL from 2009 to 2013.<\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>The\u00a0Feist-y Reason That Text and Data Mining is Fair Use by Brandon Butler Happy Fair Use Week! This is a happy week, indeed, for me, because fair use is my favorite copyright doctrine. But my favorite copyright decision just may be\u00a0Feist v. Rural Telephone Co., a case about\u2026telephone books! Among the many wonderful qualities of [&hellip;]<\/p>\n","protected":false},"author":6259,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"quote","meta":{"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_feature_clip_id":0,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2},"jetpack_post_was_ever_published":false},"categories":[257,690],"tags":[138869,138870,138871],"class_list":["post-387","post","type-post","status-publish","format-quote","hentry","category-copyright","category-fair-use","tag-copyright","tag-fair-use","tag-fair-use-week","post_format-post-format-quote"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p7gxeS-6f","_links":{"self":[{"href":"https:\/\/archive.blogs.harvard.edu\/copyrightosc\/wp-json\/wp\/v2\/posts\/387","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/archive.blogs.harvard.edu\/copyrightosc\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/archive.blogs.harvard.edu\/copyrightosc\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"
https:\/\/archive.blogs.harvard.edu\/copyrightosc\/wp-json\/wp\/v2\/users\/6259"}],"replies":[{"embeddable":true,"href":"https:\/\/archive.blogs.harvard.edu\/copyrightosc\/wp-json\/wp\/v2\/comments?post=387"}],"version-history":[{"count":4,"href":"https:\/\/archive.blogs.harvard.edu\/copyrightosc\/wp-json\/wp\/v2\/posts\/387\/revisions"}],"predecessor-version":[{"id":392,"href":"https:\/\/archive.blogs.harvard.edu\/copyrightosc\/wp-json\/wp\/v2\/posts\/387\/revisions\/392"}],"wp:attachment":[{"href":"https:\/\/archive.blogs.harvard.edu\/copyrightosc\/wp-json\/wp\/v2\/media?parent=387"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/archive.blogs.harvard.edu\/copyrightosc\/wp-json\/wp\/v2\/categories?post=387"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/archive.blogs.harvard.edu\/copyrightosc\/wp-json\/wp\/v2\/tags?post=387"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}