{"id":867,"date":"2023-03-28T21:32:22","date_gmt":"2023-03-28T21:32:22","guid":{"rendered":"http:\/\/blogs.harvard.edu\/perma\/?p=867"},"modified":"2023-03-28T21:33:22","modified_gmt":"2023-03-28T21:33:22","slug":"867","status":"publish","type":"post","link":"https:\/\/archive.blogs.harvard.edu\/perma\/2023\/03\/28\/867\/","title":{"rendered":"New Release: High Fidelity Capture Engine for Witnessing the Web &#x1f368;"},"content":{"rendered":"<p><span style=\"font-weight: 400\">This week the Perma team is releasing new software that is a building block for any individual or organization creating a web archive. Scoop is a <\/span><span style=\"font-weight: 400\">highly tunable single-page capture library that prioritizes fidelity and provenance, drawing on our decade of experience archiving citations for law journals and courts.<\/span><\/p>\n<hr \/>\n<p><span style=\"font-weight: 400\">When designing Scoop, we focused on making high-quality, signed web captures that you can take with you and host anywhere you want, while anyone can still verify where they came from.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Why does that matter? Because we want to update how people talk to each other, and convince each other, about what content has been on the web.<\/span><\/p>\n<p><span style=\"font-weight: 400\">We\u2019ve all seen them: the contextless but authentic-looking screenshot tweeted out to thousands of followers and proliferated throughout different networks, often jumping platforms.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Media literacy and years of experience seeing photoshopped or faked content can help you parse what is real and what is fake, but the more time we spend on the internet and the more we seek our news there, the more often we will fall victim to inauthentic web content. 
Given the state of information on the web, this is likely to <\/span><a href=\"https:\/\/www.theatlantic.com\/technology\/archive\/2021\/09\/eric-schmidt-artificial-intelligence-misinformation\/620218\/\"><span style=\"font-weight: 400\">only get worse<\/span><\/a><span style=\"font-weight: 400\">, not better.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Our IT departments and savvy friends will give us tips on how to avoid phishing scams: call the friend directly, log in to your bank account via the online portal instead of clicking that link, or otherwise meet the information at its source to guarantee you\u2019re not being duped. How do we validate authenticity, though, when the thing we\u2019re seeking is fragile: a dynamic web page, <\/span><a href=\"https:\/\/harvardlawreview.org\/2014\/03\/perma-scoping-and-addressing-the-problem-of-link-and-reference-rot-in-legal-citations\/\"><span style=\"font-weight: 400\">vulnerable<\/span><\/a><span style=\"font-weight: 400\"> to <\/span><a href=\"https:\/\/www.cjr.org\/analysis\/linkrot-content-drift-new-york-times.php\"><span style=\"font-weight: 400\">link rot<\/span><\/a><span style=\"font-weight: 400\">?<\/span><\/p>\n<p><span style=\"font-weight: 400\">We trust things that we know represent reality as faithfully as possible, and we trust things whose origin we know. Basically, we trust witnesses.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Here\u2019s the thing: up to this point, running a web archive has been so complex that it has necessarily been centralized. 
Witnessing what is on the web has come down to just a few centralized archives that are trusted to maintain their collections, whether that is the indispensable Internet Archive or our own Perma.cc.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Even the most established archives have the potential to <\/span><a href=\"https:\/\/seclab.cs.washington.edu\/2017\/10\/30\/rewriting-history-manipulating-the-archived-web-from-the-present\/\"><span style=\"font-weight: 400\">be manipulated<\/span><\/a><span style=\"font-weight: 400\">, and no single archive can serve all the needs our users have for web witnessing. With tools like Scoop, and others pioneered by our friends at the Webrecorder project, no single archive has to. Advances in web technology are making it more plausible to decentralize the means of web archiving throughout the entire pipeline, from creation to storage and playback. But what about that trust factor?<\/span><\/p>\n<p><span style=\"font-weight: 400\">Scoop is a highly tunable single-page capture engine that is compatible with <\/span><a href=\"https:\/\/github.com\/webrecorder\/authsign\"><span style=\"font-weight: 400\">recently crafted .wacz signing standards<\/span><\/a><span style=\"font-weight: 400\">. By default, extensive provenance information is included for traceability and transparency. Additionally, as a guiding light in our design, we capture the web under a no-alterations principle, prioritizing an \u201cas is\u201d state over potentially smoother playback to strengthen the value of the record\u2019s testimony.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Scoop is a library that can be used as a witness. 
Learn more about the specs on <\/span><a href=\"https:\/\/github.com\/harvard-lil\/scoop\"><span style=\"font-weight: 400\">GitHub<\/span><\/a><span style=\"font-weight: 400\">, and keep an eye out for stories about how this technology can be used and deep dives into its capabilities.<\/span><\/p>\n<hr \/>\n<p><i><span style=\"font-weight: 400\">Perma Links for sources in this blog post:<\/span><\/i><\/p>\n<ul>\n<li><a href=\"https:\/\/www.theatlantic.com\/technology\/archive\/2021\/09\/eric-schmidt-artificial-intelligence-misinformation\/620218\/\"><span style=\"font-weight: 400\">https:\/\/www.theatlantic.com\/technology\/archive\/2021\/09\/eric-schmidt-artificial-intelligence-misinformation\/620218\/<\/span><\/a><span style=\"font-weight: 400\"> archived at <\/span><a href=\"https:\/\/perma.cc\/3VSA-S6MX\"><span style=\"font-weight: 400\">https:\/\/perma.cc\/3VSA-S6MX<\/span><\/a><\/li>\n<li><a href=\"https:\/\/www.cjr.org\/analysis\/linkrot-content-drift-new-york-times.php\"><span style=\"font-weight: 400\">What the ephemerality of the Web means for your hyperlinks<\/span><\/a><span style=\"font-weight: 400\"> archived at <\/span><a href=\"https:\/\/perma.cc\/TYW6-FQ5F\"><span style=\"font-weight: 400\">https:\/\/perma.cc\/TYW6-FQ5F<\/span><\/a><\/li>\n<li><a href=\"https:\/\/harvardlawreview.org\/2014\/03\/perma-scoping-and-addressing-the-problem-of-link-and-reference-rot-in-legal-citations\/\"><span style=\"font-weight: 400\">Perma: Scoping and Addressing the Problem of Link and Reference Rot in Legal Citations<\/span><\/a><span style=\"font-weight: 400\"> archived at <\/span><a href=\"https:\/\/perma.cc\/D29D-MV4L\"><span style=\"font-weight: 400\">https:\/\/perma.cc\/D29D-MV4L<\/span><\/a><\/li>\n<li><a href=\"https:\/\/seclab.cs.washington.edu\/2017\/10\/30\/rewriting-history-manipulating-the-archived-web-from-the-present\/\"><span style=\"font-weight: 400\">REWRITING HISTORY: MANIPULATING THE 
ARCHIVED WEB FROM THE PRESENT<\/span><\/a><span style=\"font-weight: 400\"> archived at <\/span><a href=\"https:\/\/perma.cc\/K853-FF3V\"><span style=\"font-weight: 400\">https:\/\/perma.cc\/K853-FF3V<\/span><\/a><span style=\"font-weight: 400\">\u00a0<\/span><br style=\"font-weight: 400\" \/><br style=\"font-weight: 400\" \/><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>This week the Perma team is releasing new software that is a building block for any individual or organization creating a web archive. Scoop is a highly-tunable single page capture library that prioritizes fidelity and provenance, drawing on our decade of experience archiving citations for law journals and courts. \u00a0 \u00a0 \u00a0 When designing Scoop, [&hellip;]<\/p>\n","protected":false},"author":9608,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2},"jetpack_post_was_ever_published":false},"categories":[1],"tags":[],"class_list":["post-867","post","type-post","status-publish","format-standard","hentry","category-uncategorized","post-preview"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/s4RYx6-867","_links":{"self":[{"href":"https:\/\/archive.blogs.harvard.edu\/perma\/wp-json\/wp\/v2\/posts\/867","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/archive.blogs.harvard.edu\/perma\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/
archive.blogs.harvard.edu\/perma\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/archive.blogs.harvard.edu\/perma\/wp-json\/wp\/v2\/users\/9608"}],"replies":[{"embeddable":true,"href":"https:\/\/archive.blogs.harvard.edu\/perma\/wp-json\/wp\/v2\/comments?post=867"}],"version-history":[{"count":10,"href":"https:\/\/archive.blogs.harvard.edu\/perma\/wp-json\/wp\/v2\/posts\/867\/revisions"}],"predecessor-version":[{"id":878,"href":"https:\/\/archive.blogs.harvard.edu\/perma\/wp-json\/wp\/v2\/posts\/867\/revisions\/878"}],"wp:attachment":[{"href":"https:\/\/archive.blogs.harvard.edu\/perma\/wp-json\/wp\/v2\/media?parent=867"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/archive.blogs.harvard.edu\/perma\/wp-json\/wp\/v2\/categories?post=867"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/archive.blogs.harvard.edu\/perma\/wp-json\/wp\/v2\/tags?post=867"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}