{"id":1319,"date":"2012-04-27T15:09:24","date_gmt":"2012-04-27T19:09:24","guid":{"rendered":"http:\/\/blogs.law.harvard.edu\/pamphlet\/?p=1319"},"modified":"2012-04-27T15:09:24","modified_gmt":"2012-04-27T19:09:24","slug":"the-new-harvard-library-open-metadata-policy","status":"publish","type":"post","link":"https:\/\/archive.blogs.harvard.edu\/pamphlet\/2012\/04\/27\/the-new-harvard-library-open-metadata-policy\/","title":{"rendered":"The new Harvard Library open metadata policy"},"content":{"rendered":"<table width=\"200\" align=\"right\" bgcolor=\"#F7EFE5\">\n<tbody>\n<tr>\n<td align=\"center\"><a href=\"http:\/\/blogs.law.harvard.edu\/pamphlet\/files\/2012\/04\/oldbooks.jpg\"><img decoding=\"async\" src=\"http:\/\/blogs.law.harvard.edu\/pamphlet\/files\/2012\/04\/oldbooks-199x300.jpg\" alt=\"\u201cOld Books\u201d photo by flickr user Iguana Joe, used by permission (CC-by-nc)\" width=\"200\" \/><\/a><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center\"><span style=\"color: #999999\">\u201cOld Books\u201d<br \/>\n<span style=\"font-size: 85%\"><a href=\"http:\/\/www.flickr.com\/photos\/iguanajo\/3332803370\/in\/photostream\/\">photo<\/a> by flickr user <a href=\"http:\/\/www.flickr.com\/photos\/iguanajo\/\">Iguana Joe<\/a>, used by permission (<a href=\"http:\/\/creativecommons.org\/licenses\/by-nc\/2.0\/\">CC-by-nc<\/a>)<\/span><\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Earlier this week, the Harvard Library <a href=\"http:\/\/isites.harvard.edu\/icb\/icb.do?keyword=k77982&amp;pageid=icb.page498373\">announced<\/a> its new <a href=\"http:\/\/openmetadata.lib.harvard.edu\/\" target=\"_blank\">open metadata policy<\/a>, which was approved by the <a href=\"http:\/\/isites.harvard.edu\/icb\/icb.do?keyword=k77982&amp;pageid=icb.page399544\" target=\"_blank\">Library Board<\/a> earlier this year, along with an initial two metadata releases. 
<a href=\"http:\/\/openmetadata.lib.harvard.edu\/\">The policy<\/a> is\u00a0straightforward:<\/p>\n<p style=\"padding-left: 30px\">The Harvard Library provides open access to library metadata, subject to legal and privacy factors. In particular, the Library makes available its own catalog metadata under appropriate broad use licenses. The Library Board is responsible for interpreting this policy, resolving disputes concerning its interpretation and application, and modifying it as necessary.<\/p>\n<p>The first releases under the policy include the metadata in <a href=\"http:\/\/dash.harvard.edu\/\">the DASH repository<\/a>. Though this metadata has been available through <a href=\"http:\/\/openmetadata.lib.harvard.edu\/dash\">open APIs<\/a> since early in the repository&#8217;s history, the open metadata policy makes clear the open licensing terms under which the data is provided.<\/p>\n<p>The release of a huge percentage of the Harvard Library&#8217;s bibliographic metadata for its holdings is likely to have a much bigger impact. 
We&#8217;ve provided 12 million records \u2014 the vast majority of Harvard&#8217;s\u00a0bibliographic\u00a0data \u2014 describing Harvard&#8217;s library holdings in <a href=\"http:\/\/www.loc.gov\/marc\/\">MARC<\/a> format <a href=\"http:\/\/openmetadata.lib.harvard.edu\/bibdata\/useterms\">under a CC0 license that requests adherence to a set of community norms<\/a> that I think are quite reasonable, primarily calling for attribution to Harvard and our major partners in the release, <a href=\"http:\/\/www.oclc.org\/\" target=\"_blank\">OCLC<\/a> and the <a href=\"http:\/\/www.loc.gov\/\" target=\"_blank\">Library of Congress<\/a>.<\/p>\n<p>OCLC in\u00a0particular\u00a0has <a href=\"http:\/\/hangingtogether.org\/?p=1647\">praised the effort<\/a>, saying it\u00a0\u201cfurthers [Harvard&#8217;s] mandate from their Library Board and Faculty to make as much of their metadata as possible available through open access in order to support learning and research, to disseminate knowledge and to foster innovation and aligns with the very public and established commitment that Harvard has made to open access for scholarly communication. I\u2019m pleased to say that they worked with OCLC as they thought about the terms under which the release would be made.\u201d\u00a0We&#8217;ve gotten nice coverage from the <a href=\"http:\/\/bits.blogs.nytimes.com\/2012\/04\/24\/harvard-releases-big-data-for-books\/?pagemode=print\">New York Times<\/a>,\u00a0<a href=\"http:\/\/www.thedigitalshift.com\/2012\/04\/metadata\/harvard-releases-metadata-into-public-domain\/\">Library Journal<\/a>, and\u00a0<a href=\"http:\/\/boingboing.net\/2012\/04\/24\/massive-public-domain-catalog.html\" target=\"_blank\">Boing Boing<\/a>\u00a0as well.<\/p>\n<p>Many people have asked what we expect people to do with the data. Personally, I have no idea, and that&#8217;s the point. 
I&#8217;ve seen over and over that when data is made openly available with the fewest impediments \u2014 legal and technical \u2014 people are incredibly creative about finding innovative uses for the data that we never could have predicted. Already, we&#8217;re seeing people picking up the data, exploring it, and building on it.<\/p>\n<ul>\n<li>The <a href=\"http:\/\/dp.la\/\">Digital Public Library of America<\/a> is making the data available through <a href=\"http:\/\/dp.la\/dev\/wiki\/Item_API\">an API<\/a> that provides data in a much nicer way than the pure MARC record dump that Harvard is making available.<\/li>\n<li>Within hours of release, Benjamin Bergstein had already set up <a href=\"http:\/\/benjaminbergstein.com\/dpla\/\">his own search interface<\/a> to the Harvard data using the <a href=\"http:\/\/dp.la\/dev\/wiki\/Item_API\">DPLA API<\/a>.<\/li>\n<li>Carlos Bueno\u00a0has developed code for the Harvard Library Bibliographic Dataset to parse its &#8220;wonky&#8221; MARC21 format, and has <a href=\"https:\/\/github.com\/aristus\/copymine-harvard#readme\">open-sourced the code<\/a>.<\/li>\n<li>Alf Eaton has <a href=\"http:\/\/hublog.hubmed.org\/archives\/001953.html\">documented his own efforts<\/a> to\u00a0work with the bibliographic dataset, providing instructions for downloading and extracting the records and putting up all of the code he developed to massage and render the data. He outlines his plans for further\u00a0extensions\u00a0as well.<\/li>\n<\/ul>\n<p>(I&#8217;m sure I&#8217;ve missed some of the ways people are using the data. Let me know if you&#8217;ve heard of others, and I&#8217;ll update this list.)<\/p>\n<p>As <a href=\"http:\/\/bits.blogs.nytimes.com\/2012\/04\/24\/harvard-releases-big-data-for-books\/?pagemode=print\" target=\"_blank\">I&#8217;ve said before<\/a>,\u00a0\u201cThis data serves to link things together in ways that are difficult to predict. 
The more information you release, the more you see people doing innovative things.\u201d\u00a0These examples are the first evidence of that potential.<\/p>\n<p><a href=\"http:\/\/cyber.law.harvard.edu\/people\/jpalfrey\" target=\"_blank\">John Palfrey<\/a>, who was really the instigator of the open metadata project, has been especially interested in getting other institutions\u00a0to make their own collection metadata publicly available, and the DPLA stands ready to help. They&#8217;re running a wiki with instructions on\u00a0<a href=\"http:\/\/dp.la\/dev\/wiki\/Metadata_upload\" target=\"_blank\">how to add your own institution&#8217;s metadata to the DPLA service<\/a>.<\/p>\n<p>It&#8217;s hard to list all the people who make initiatives like this possible, since there are so many, but I&#8217;d like to mention a few major participants (in addition to John): Jonathan Hulbert, Tracey Robinson, David Weinberger, and Robin Wendler. Thanks to them and the many others who have helped in various ways.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>\u201cOld Books\u201d photo by flickr user Iguana Joe, used by permission (CC-by-nc) Earlier this week, the Harvard Library announced its new open metadata policy, which was approved by the Library Board earlier this year, along with an initial two metadata releases. 
The policy is\u00a0straightforward: The Harvard Library provides open access to library metadata, subject to [&hellip;]<\/p>\n","protected":false},"author":2110,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2},"jetpack_post_was_ever_published":false},"categories":[2780,618,68],"tags":[],"class_list":["post-1319","post","type-post","status-publish","format-standard","hentry","category-libraries","category-open-access","category-scholarly-communication"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p5pLfN-lh","jetpack-related-posts":[{"id":960,"url":"https:\/\/archive.blogs.harvard.edu\/pamphlet\/2011\/10\/13\/the-future-of-the-library-expressed-in-sculpture\/","url_meta":{"origin":1319,"position":0},"title":"The future of the library, expressed in sculpture","author":"Stuart Shieber","date":"Thursday, October 13, 2011","format":false,"excerpt":"Petrus Spronk, \u201cArchitectural Fragment\u201d, 1992. Photo \u00a9 2005 Robert Laddish (www.laddish.net), used by permission. I've just been\u00a0at the conference in honor of the 30th anniversary\u00a0of the University of Sao Paulo Integrated Library System (SIBi USP). 
David Palmer, one of the speakers at the conference, used in his presentation a picture\u2026","rel":"","context":"In &quot;libraries&quot;","block_context":{"text":"libraries","link":"https:\/\/archive.blogs.harvard.edu\/pamphlet\/category\/scholarly-communication\/libraries\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":1376,"url":"https:\/\/archive.blogs.harvard.edu\/pamphlet\/2012\/05\/29\/shieber\/","url_meta":{"origin":1319,"position":1},"title":"Processing special collections: An archivist&#8217;s workstation","author":"Stuart Shieber","date":"Tuesday, May 29, 2012","format":false,"excerpt":"John Tenniel, c. 1864. Study for illustration to Alice's adventures in wonderland.\u00a0Harcourt Amory collection of Lewis Carroll, Houghton Library, Harvard University. We've just completed spring semester during which I taught a system design course jointly in Engineering Sciences and Computer Science.\u00a0The aim of ES96\/CS96 is to help the students learn\u2026","rel":"","context":"In &quot;computer science&quot;","block_context":{"text":"computer science","link":"https:\/\/archive.blogs.harvard.edu\/pamphlet\/category\/computer-science\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":456,"url":"https:\/\/archive.blogs.harvard.edu\/pamphlet\/2010\/05\/27\/green-oa-as-appropriation\/","url_meta":{"origin":1319,"position":2},"title":"Green OA as &#8220;appropriation&#8221;","author":"Stuart Shieber","date":"Thursday, May 27, 2010","format":false,"excerpt":"Sandy Thatcher feels \"very uneasy about the massive postings of Green OA articles at sites like Harvard\u2019s, which given that university\u2019s great prestige may well lead to the widespread appropriation of those versions by scholars who find it easier to access them OA than to hunt down (and perhaps pay\u2026","rel":"","context":"In &quot;open access&quot;","block_context":{"text":"open 
access","link":"https:\/\/archive.blogs.harvard.edu\/pamphlet\/category\/scholarly-communication\/open-access\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":1106,"url":"https:\/\/archive.blogs.harvard.edu\/pamphlet\/2012\/01\/04\/switching-to-open-access-for-the-new-year\/","url_meta":{"origin":1319,"position":3},"title":"Switching to open access for the new year","author":"Stuart Shieber","date":"Wednesday, January 4, 2012","format":false,"excerpt":"\u201c...time to switch...\u201d A very old light switch (2008) by RayBanBro66 via flickr. Used by permission (CC by-nc-nd) The journal Research in Learning Technology has switched its approach from closed to open access as of New Year's 2012. Congratulations to the Association for Learning Technology (ALT) and its Central Executive\u2026","rel":"","context":"In &quot;computational linguistics&quot;","block_context":{"text":"computational linguistics","link":"https:\/\/archive.blogs.harvard.edu\/pamphlet\/category\/linguistics\/computational-linguistics\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":32,"url":"https:\/\/archive.blogs.harvard.edu\/pamphlet\/2009\/05\/28\/open-access-policies-and-academic-freedom\/","url_meta":{"origin":1319,"position":4},"title":"Open-access policies and academic freedom","author":"Stuart Shieber","date":"Thursday, May 28, 2009","format":false,"excerpt":"I very occasionally hear expressed a concern about the Harvard open-access policy that it violates some aspect of academic freedom. 
The argument seems to be that by granting a prior license to Harvard, faculty may be forced to forgo publication in certain venues.\u00a0 Our rights as scholars to determine the\u2026","rel":"","context":"In &quot;open access&quot;","block_context":{"text":"open access","link":"https:\/\/archive.blogs.harvard.edu\/pamphlet\/category\/scholarly-communication\/open-access\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":383,"url":"https:\/\/archive.blogs.harvard.edu\/pamphlet\/2010\/02\/28\/harvard-business-school-approves-open-access-policy\/","url_meta":{"origin":1319,"position":5},"title":"Harvard Business School approves open-access policy","author":"Stuart Shieber","date":"Sunday, February 28, 2010","format":false,"excerpt":"Two years to the day after the Faculty of Arts and Sciences became the first school at Harvard to vote an open-access policy, the Harvard Business School enacted their own policy on February 12, 2010, becoming the fifth Harvard school with a similar policy. 
Under the HBS policy, Like the\u2026","rel":"","context":"In &quot;open access&quot;","block_context":{"text":"open access","link":"https:\/\/archive.blogs.harvard.edu\/pamphlet\/category\/scholarly-communication\/open-access\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"https:\/\/archive.blogs.harvard.edu\/pamphlet\/wp-json\/wp\/v2\/posts\/1319","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/archive.blogs.harvard.edu\/pamphlet\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/archive.blogs.harvard.edu\/pamphlet\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/archive.blogs.harvard.edu\/pamphlet\/wp-json\/wp\/v2\/users\/2110"}],"replies":[{"embeddable":true,"href":"https:\/\/archive.blogs.harvard.edu\/pamphlet\/wp-json\/wp\/v2\/comments?post=1319"}],"version-history":[{"count":20,"href":"https:\/\/archive.blogs.harvard.edu\/pamphlet\/wp-json\/wp\/v2\/posts\/1319\/revisions"}],"predecessor-version":[{"id":1342,"href":"https:\/\/archive.blogs.harvard.edu\/pamphlet\/wp-json\/wp\/v2\/posts\/1319\/revisions\/1342"}],"wp:attachment":[{"href":"https:\/\/archive.blogs.harvard.edu\/pamphlet\/wp-json\/wp\/v2\/media?parent=1319"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/archive.blogs.harvard.edu\/pamphlet\/wp-json\/wp\/v2\/categories?post=1319"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/archive.blogs.harvard.edu\/pamphlet\/wp-json\/wp\/v2\/tags?post=1319"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}