Today is nominally the last day it will be editable, though it will stay up for archiving and export for another month. The WordPress dashboard has lately had an expandable bar in the corner titled ‘Recent Updates’, but I’d never expanded it to see that it was local news about the platform, so this came as a surprise.
Checklist:
1) Ping people who still need to migrate
2) Draft final blog post, honoring the network
In the early days of blogging, Dave Winer was an energetic advocate of the form, as something important for writing and communication and not just another modern pastime. He set up the first version of Blogs@Harvard while he was a Berkman fellow (a Manila instance hosted by the Berkman Center, at blogs.law.harvard.edu), and started blogging there as well as at Scripting News. It moved to WordPress in 2007. The community revisited it in 2011 to reaffirm the value of keeping it online. (JP, then head of the center, warmly summarized the project’s history to that point.)
Over the next decade, new blogs could be created only by Harvard affiliates. In 2014, technical maintenance of the blogs moved to the Harvard Library’s Office for Scholarly Communication, and the domain changed to blogs.harvard.edu. In 2018 maintenance shifted to Harvard University Information Technology, and old blogs run by authors who were no longer affiliates were closed [and taken offline, if they had not set up an archive]. This also affected a number of past affiliates who no longer had university or alum email addresses, including the pathbreaking info/law and j’s scratchpad, the blog of the founding organizer of the Blogging Group.
Now the rest are being shut down. Bloggers still at Harvard can migrate to the existing sites.harvard.edu with a bit of effort, but blogs are not migrated by default, and most have not been. Those without new posts in the past year were not even notified of the change. This also affects people like Doc Searls, a long-time pillar of free software and the open web whom we’ve been lucky to have in the local eddy, and whose active projects live on nearby.
There are plans for a full archive to be preserved; let’s make it one befitting this decentralized community, which has hosted many students and practitioners of digital creation and archiving. Going through the archiving process myself reminds me of the [extraordinary, wonderful] service of the Wayback Machine, which may also let us restore former blogs currently hidden behind its veil.
Checklist:
3) Salvage old drafts
4) Make a proper export
It is a curious sensation to revisit my old tempo of posting by seeing the proportionate tempo of unpublished drafts; some quite good and close to completion, but written in a week or month when many other works were going out. These days I would publish a good three-section post without hesitation. Most drafts removed or published; new “unfinished draft” category added.
I am also reminded that fully half of the links from over five years ago are no longer online; other websites have a much shorter time-to-linkrot than this blog family. Again, the Wayback Machine is not only the default salvation but one of the only options; if it disappeared, readers, researchers, and historians would be entirely out of luck (short of bringing up one of the Wayback mirrors). If you are in a position to host a full mirror (currently around 100PB), please get in touch with the archiveteam or the Internet Archive.
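For anyone doing the same triage: the Wayback Machine has a public availability endpoint that reports the closest snapshot for a dead link. A minimal sketch in Python (the archive.org endpoint is real; the helper name and lack of error handling are mine):

```python
# Check dead links against the Wayback Machine's public availability API.
# A sketch: no rate limiting or error handling, which a real pass would need.
import json
import urllib.parse
import urllib.request

def closest_snapshot(url: str, timestamp: str = "") -> str | None:
    """Return the URL of the closest archived snapshot, or None if absent."""
    query = urllib.parse.urlencode({"url": url, "timestamp": timestamp})
    with urllib.request.urlopen(
        f"https://archive.org/wayback/available?{query}"
    ) as resp:
        data = json.load(resp)
    snap = data.get("archived_snapshots", {}).get("closest")
    return snap["url"] if snap and snap.get("available") else None

for link in ["http://blogs.law.harvard.edu/", "http://example.com/long-gone"]:
    print(link, "->", closest_snapshot(link, timestamp="2010"))
```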
Exports should be easy, though mine is not small. Preserving the directory structure on import requires a target style that uses the same schema for dated posts. Alternatively, I could scrape the entire site into a .wacz file and restore its public appearance exactly as it stands today, then move to a different format for a future blog. I’d like something more collaborative by nature: easy to have a cohort working together. I have hopes that Tana could be turned towards this end, as shared writing is naturally a more social activity than just linking to one another’s blogs (and even here, some of the best outlier blogs have been multi-author, during times when many were active together).
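One way to see what a new host must preserve is to walk the WordPress export and list the dated permalink paths. A minimal sketch, assuming a standard WXR 1.2 export and /YYYY/MM/slug/ permalinks (both assumptions worth checking against your own settings):

```python
# List the dated permalink paths in a WordPress WXR export, so a target
# platform can be checked against them. Assumes WXR 1.2 and /YYYY/MM/slug/
# permalinks; adjust the namespace and path template to match your export.
import xml.etree.ElementTree as ET

WP_NS = {"wp": "http://wordpress.org/export/1.2/"}

def permalink_paths(wxr_file: str):
    for item in ET.parse(wxr_file).getroot().iter("item"):
        status = item.findtext("wp:status", default="", namespaces=WP_NS)
        post_type = item.findtext("wp:post_type", default="", namespaces=WP_NS)
        if status != "publish" or post_type != "post":
            continue  # skip drafts, pages, and attachments
        slug = item.findtext("wp:post_name", default="", namespaces=WP_NS)
        date = item.findtext("wp:post_date", default="", namespaces=WP_NS)
        yield f"/{date[:4]}/{date[5:7]}/{slug}/"  # post_date is "YYYY-MM-DD HH:MM:SS"

for path in permalink_paths("blog-export.xml"):
    print(path)
```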
In the end, Elsevier retracted Kostoff’s anti-vax article, along with a pro-ivermectin study in the same issue that was similarly statistically-challenged. (It was that ivermectin study that led me to discover the issue in the first place, via scite.ai)
But not before his article dominated media and social media references to the Journal for months; and the author parlayed his peer-reviewed work into a DailyClout essay that was even more extreme, and did a tour on the social media anti-vax circuit. Thousands of people spent time debunking this nonsense, including a dozen on PubPeer alone. Millions of people saw references to it on social media.
The editor-in-chief who regularly published his own articles (or added himself as author to articles in his journal) stepped down as EIC, but continues to edit other toxicology journals and publish research at a healthy clip of three articles a month. Global understanding of COVID-19 is advancing steadily, with no further confusion or misdirection whatever. Everything is fine 🐶🔥
Comments Off on Kostoff, reprised: peer review secured again, everything is fine.
A few years ago I wrote about how our civilization was forfeiting the zeroth AI war — allowing individual attention hacks, deployed at scale, to diminish and replace our natural innovation and productivity in every society. We gained efficiency in every area of life, and then let our new wealth and spare time get absorbed by newly-efficient addictive spirals.
Exploit culture
This war for attention affects what sort of society we can hope to live in. Channeling so much wealth to attention-hackers, and the networks of crude AI tools and gambling analogs that support them, has strengthened an entire industry of exploiters, allowing a subculture of engineers and dealmakers to flourish. That industry touches on fraud, propaganda, manipulation of elections and regulation, and more, all of which influence what social equilibria are stable.
The first real AI war
Now we are facing the first real artificial-intelligence war — dominated by entities that appear as avatars of independent, intelligent people, but are artificial, scripted, automated.
What is new in this? Earlier low-tech versions of this required no machine learning or programming: they used the veil of pseudonymity to fake authorship, votes, and small-scale consensus. In response, we developed layers of law and regulation around those earlier attacks — fraud, impersonation, and scams are illegal. AI can smoothly scale this to millions of comments on public bills, and to forging microtargeted social proof in millions of smaller group interactions online. And these scaled attacks are often still legal, or only lightly penalized and rarely enforced. (more…)
Comments Off on Forging Social Proof: the Networked Turing Test Rules the First AI War
It is time we stop talking about “golf time” as leisure time away from the presidency, and start treating it as a primary channel for meetings, negotiations, and decision-making. (See for instance the last line of this remarkable story.)
Trump’s presidential schedule is full of empty days and golf weekends: roughly two days a week, throughout his presidency, have been spent at his own resorts. Combined with his historically light work schedule, averaging under two hours of meetings per day, this means the majority of small-group meetings may be taking place at his resorts.
He has also directed hundreds of government groups, and countless diplomatic partners and allies, to stay at his resorts and properties.
On his properties, his private staff control the access list, the security videos, and other records. They can also provide a degree of privacy from both press and government representatives that no federal property could match.
How might we address the issues involved with more clarity?
Paying himself with government funds
To start with, this is self-dealing on an astronomical scale: the 300+ days spent at his golf clubs and other properties have cost the US government, by conservative estimate, $110 million. The cost of encouraging the entire government to stay at Trump properties is greater still, if harder to estimate. (more…)
Comments Off on Trump’s tee-totalling: why are so many meetings held on the golf course?
Updated for the new year: with specific things we can all start doing 🙂
Wikipedia currently tracks and stores almost no data about its readers and editors. This persistently foils researchers and analysts inside the WMF and across its projects, and it is largely unnecessary.
Not tracked last I checked: sessions, clicks, where on a page readers spend their time, time spent on page or site, returning users. There is a small exception: data that can fingerprint a user’s use of the site is stored for a limited time, made visible only to developers and checkusers, in order to combat sockpuppets and spam.
This is all done in the spirit of preserving privacy: not gathering data that could be used by third parties to harm contributors or readers for reading or writing information that some nation or other powerful group might want to suppress. That is an essential concern, and Wikimedia’s commitment to privacy and pseudonymity is wonderful and needed.
However, the data we need to improve the site and understand how it is used in aggregate doesn’t require storing personally identifiable data that could meaningfully be used to target specific editors. Rather than throwing out data that we worry would expose users to risk, we should be fuzzing and hashing it to preserve the aggregates we care about. Browser fingerprints, including the username or IP, can be hashed; timestamps and anything that could be interpreted as geolocation can have noise added to them.
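For concreteness, a minimal sketch of that kind of pipeline, with invented field names: a keyed hash with a rotating secret in place of the raw fingerprint, noise on the timestamp, and geolocation coarsened to country level. This illustrates the technique; it is not Wikimedia’s actual implementation:

```python
# Pseudonymize a raw request event before storage: key-hash the identifying
# fields, add noise to the timestamp, keep only a coarse region.
# All field names here are invented for illustration.
import hashlib
import hmac
import random

SECRET_KEY = b"rotate-me-regularly"  # discarding old keys makes old hashes unlinkable

def pseudonymize(event: dict) -> dict:
    fingerprint = f"{event['ip']}|{event['user_agent']}|{event.get('username', '')}"
    hashed = hmac.new(SECRET_KEY, fingerprint.encode(), hashlib.sha256).hexdigest()
    return {
        "visitor": hashed[:16],                    # stable only within one key period
        "ts": event["ts"] + random.gauss(0, 300),  # a few minutes of timing noise
        "region": event["country"],                # country-level, never city or IP
        "project": event["project"],
    }

raw = {"ip": "203.0.113.7", "user_agent": "Mozilla/5.0", "username": "ExampleUser",
       "ts": 1700000000.0, "country": "BR", "project": "ptwiki"}
print(pseudonymize(raw))
```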
We could then know things such as, for instance:
the number of distinct users in a month, by general region
how regularly each visitor comes to the projects; which projects + languages they visit [throwing away user and article-title data, but seeing this data across the total population of ~1B visitors]
particularly bounce rates and times: people finding the site, perhaps running one search, and leaving
the number of pages viewed in a session, its tempo, or the namespaces they are in [throwing away titles]
the reading + editing flows of visitors on any single page, aggregated by day or week
clickflows from the main page or from search results [this data is gathered to some degree; I don’t know how reusably]
These are just rough descriptions; great care must be taken to vet each aggregate to preserve privacy. But this is a known practice that we could carry out well with expert attention; a sketch of two such aggregates follows.
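Working from pseudonymized events like those in the sketch above (again with invented fields, not an existing pipeline), distinct visitors per region and bounce rates never need raw identifiers:

```python
# Compute two aggregates from pseudonymized events: distinct visitors per
# (month, region), and the bounce rate (share of single-page sessions).
# A sketch over invented event fields; a real pipeline would use proper
# calendar months and session windows.
from collections import defaultdict

def monthly_aggregates(events):
    visitors = defaultdict(set)         # (month, region) -> set of visitor hashes
    pages_per_visit = defaultdict(int)  # (visitor, month) -> page count
    for e in events:
        month = int(e["ts"] // (30 * 86400))  # crude month bucket for the sketch
        visitors[(month, e["region"])].add(e["visitor"])
        pages_per_visit[(e["visitor"], month)] += 1
    distinct = {key: len(v) for key, v in visitors.items()}
    bounces = sum(1 for n in pages_per_visit.values() if n == 1)
    return distinct, bounces / max(len(pages_per_visit), 1)

demo = [
    {"visitor": "a1", "ts": 1700000000.0, "region": "BR"},
    {"visitor": "a1", "ts": 1700000600.0, "region": "BR"},
    {"visitor": "b2", "ts": 1700001000.0, "region": "DE"},
]
print(monthly_aggregates(demo))  # one bounce (b2) out of two visits
```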
What keeps us from doing this today? Some aspects of this have surely been discussed somewhere, but such discussion is hard to find. Past discussions I recall were brought to an early end by [devs worrying about legal] or [legal worrying about what is technically possible].
Discussion of obstacles and negative-space is generally harder to find on wikis than discussion of works-in-progress and responses to them: a result of a noun-based document system that requires discussions to be attached to a clearly-named topic!
What we can do, both researchers and data fiduciaries:
As site-maintainers: Start gathering this data, and appoint a couple privacy-focused data analysts to propose how to share it.
Identify challenges, open problems, solved problems that need implementing.
Name the (positive, future-crafting, project-loving) initiative to do this at scale, and the reasons to do so.
By naming the positive aspect, distinguish this from a tentative caveat appended to a list of bad things to avoid, which leads to inaction. (“never gather data! unless you have extremely good reasons, someone else has done it before, it couldn’t possibly be dangerous, and no one could possibly complain.”)
As data analysts (internal and external): write about what better data enables. Expand the list above, include real-world parallels.
How would this illuminate the experience of finding and sharing knowledge?
Invite other sociologists, historians of knowledge, and tool-makers to start working with stub APIs that at first may not return much data.
Without this we remain in the dark; and, like libraries that have found patrons leaving their privacy-preserving (but less helpful) environs for data-hoarding (and very handy) book-explorers, we remain vulnerable to disuse.
Recently, statistician Andrew Gelman has been brilliantly breaking down the transformation of psychology (and social psych in particular) through its adoption and creative use of statistical methods, leading to an improved understanding of how statistics can be abused in any field, and of how empirical observations can be [unwittingly and unintentionally] flawed. This led to the concept of p-hacking and other methodological fallacies that can be observed in careless uses of statistics throughout scientific and public analyses. And as these new tools were used to better understand psychology and improve its methods, existing paradigms and accepted truths have changed rapidly over the past five years. This shocks and anguishes researchers who are true believers in “hypotheses vague enough to support any evidence thrown at them”, and who have built careers around work supporting those hypotheses.
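As a toy illustration of the arithmetic behind p-hacking (my own sketch, not one of Gelman’s examples): a researcher who tries twenty independent comparisons on pure noise has about a 1 - 0.95^20 ≈ 64% chance that at least one comes out “significant” at p < 0.05.

```python
# Simulate p-hacking on pure noise: run 20 two-group comparisons where the
# true effect is zero, and count how often at least one reaches p < 0.05.
# A toy sketch; the normal approximation to the t-test is fine at n = 100.
import random
from statistics import NormalDist, mean, stdev

def noise_comparison(n=100):
    a = [random.gauss(0, 1) for _ in range(n)]
    b = [random.gauss(0, 1) for _ in range(n)]
    t = (mean(a) - mean(b)) / ((stdev(a) ** 2 / n + stdev(b) ** 2 / n) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(t)))  # two-sided p-value

random.seed(0)
trials = 500
hits = sum(any(noise_comparison() < 0.05 for _ in range(20)) for _ in range(trials))
print(f"{hits / trials:.0%} of simulated studies found a 'significant' effect in noise")
```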
Here is a beautiful discussion a week later, from Gelman, about how researchers respond to statistical errors or other disproofs of part of their work. In particular, how co-authors handle such new discoveries, either together or separately.
At the end, one of its examples turns up a striking instance of someone taking these sorts of discoveries and updates to her work seriously: Dana Carney‘s public CV includes inline notes next to each paper wherever significant methodological or statistical concerns were raised, or significant replications failed.
Carney appears in his examples because of her most controversially popular research, with Cuddy and Yap, on power posing. A non-obvious result (that holding certain open physical poses leads to feeling and acting more powerfully) became extremely popular in the mass media, and generated dozens of related extensions and replication studies, which starting in 2015 began to be run with large samples and at high power; at that point the effects disappeared. Interest within social psychology in the phenomenon, as an outlier of “a popular but possibly imaginary effect”, is so great that the journal Comprehensive Results in Social Psychology has an entire issue devoted to power posing coming out this Fall.
Perhaps motivated by Gelman’s blog post, perhaps by knowledge of the results that will be coming out in this dedicated journal issue [which she suggests are negative], she put out a full two-page summary of her changing views on her own work over time, from conceiving of the experiment, to running it with the funds and time available, to now deciding there was no meaningful effect. My hat is off to her. We need this sort of relationship to data, analysis, and error to make sense of the world. But it is a pity that she had to publish such a letter alone, and that her co-authors didn’t feel they could sign onto it.
Update: Nosek also wrote a lovely paper in 2012 on Restructuring incentives to promote truth over publishability [with input from the estimable Victoria Stodden], which describes many points at which researchers have incentives to stop research and publish preliminary results as soon as they have something they could convince a journal to accept.
Comments Off on Psych statistics wars: new methods are shattering old-guard assumptions
“A snub,” defined Lady Roosevelt, “is the effort of a person who feels superior to make someone else feel inferior. To do so, he has to find someone who can be made to feel inferior.”
Wikipedia has gotten more elaborate and complex to use. Adding a reference, marking something for review, uploading a file or creating a new article now take many steps — and failing to follow them can lead to starting all over. The curators of the core projects are concerned with uniformly high quality, and impatient with contributors who don’t have the expertise and wiki-experience to create something according to policy. Good stubs or photos are deleted for failing to comply with one of a dozen policies, or for inadequate cites or license templates, even when they are in fact derived from reliable sources and freely licensed.
The Article Creation Wizard has a five-step process for drafting an article, after which it is submitted for review by a team of experienced editors, and finally moved to the article namespace. Seven steps to approval is too much overhead for many. And the current notability guidelines on the big Wikipedias exclude most local and specialist knowledge.
We need a simpler scratch-space to develop new material:
A place not designed to be high quality, where everything can be in flux, possibly wrong, in need of clarification and polishing and correction.
A place that can be used to build draft articles, images, and other media before posting them to Wikipedia
A place where everyone is welcome to start a new topic, and share what they know: relying on verifiability over time (but not requiring it immediately), and without any further standard for notability
A place with no requirements to edit: possibly style guidelines to aspire to, but where newbies who don’t know how the tools or system works are welcomed and encouraged to contribute more, and not chastised for getting things wrong.
Since this will be a new sort of compendium or comprehensive cyclopedia, covering all topics, it should have a new name. Something simple, say Newpedia. Scripts can be written to help editors work through the most polished Newpedia items and push them to Wikipedia and Wikisource and Commons. We could invite editors to start doing their rough work on Newpedia, to avoid the conflict and fast reversion on the larger wiki references that make it hard to use for quick new work.
Update: Mako discussed Newpedia (or double-plus-newpedia) in his panel about “Wikipedia in 2022“, and Erik Moeller talked about how the current focus on notability is keeping all of our projects from growing, in his “Ghosts of Wikipedia Future“. I look forward to the video and transcripts.
What do you think? I started a mailing list for people who are interested in developing such a knowledge-project. I look forward to your thoughts, both serious and otherwise 😉
In 1996, two French food scientists, André Briend and Michel Lescanne, developed a nut-based food formulation to serve as an emergency food relief product in famine-stricken areas. The goal was to have a high-density balanced food with a long and robust shelf life – one which, unlike the previous standard of milk-based therapeutic food, could be taken at home rather than in a hospital.
They soon formed the company Nutriset to further develop and commercialize the idea. Their most popular product, Plumpy’Nut, has shipped millions of units and currently makes up roughly 90% of UNICEF’s stocks of ready-to-use therapeutic foods [RUTFs] for famine relief.
In forming their company, they captured their idea in the form of a patent (a standard way to declare ownership of, and investment in, an idea) and went on to build a production chain around it. This included tweaked formulas and a family of products; production and packaging factories; and grant-writing and research to get certification + field-feedback + approval from various UN bodies. This involved a few years of up-front investment and reputation-building, and then ramping up mass production of millions of pounds of Plumpy’Nut and its derivatives. They later set up a novel “patentleft” process allowing companies in developing countries to use the patent commercially, and make derivatives from it, at no cost — after a brief online registration. This has received surprisingly little attention since, considering how simple and elegant their solution is. Read on for details! (more…)
From a recent discussion about Web 3.0 and the far future, on the AIR-L list:
In fact, the Web is currently developing Web <30, to be rolled out with Chrome 25, Firefox 20, Opera 15, and IE 10 later this winter.

If you are interested in cutting-edge research and convolving observation with participation, you can take part in the design of Web <30 yourself. It is being developed through a massively multistakeholder open online crowd-refined platform generation (MMOOCRPG) design.

Building on the exponential success of past efforts, the development mailing list includes a periodic distributed auto-immolating critique of its own work, where the future web is continuously redefined as its own dual.
Comments Off on Web <30 – the Future of the Web is Intertextual