The NIH responds to my letter
June 7th, 2011
Front steps of National Library of Medicine, 2008, photo courtesy of NIH Image Bank
Imagine my surprise when I actually received a response to my letters in recognition of the NIH public access policy, a form letter undoubtedly, but nonetheless gratefully received. And as a side effect, it allows us to gauge the understanding of the issues in the pertinent offices.
The letter, which I’ve duplicated below in its entirety, addresses two of the issues that I raised in my letter: the expansion of the policy to other agencies and the desirability of a reduction in the embargo period.
With regard to expanding the NIH policy to other funding agencies, the response merely notes the America COMPETES Act’s charge to establish a working group to study the matter. That is fine as far as it goes, but it is not an indication of support for expansion itself.
With regard to the embargo issue, the response seems a bit confused as to how things work in the real world. Let’s look at some sentences from the pertinent paragraph:
- “As you may know, the 12-month delay period specified by law (Division G, Title II, Section 218 of P.L. 110-161) is an upper limit. Rights holders (sometimes the author, and sometimes they transfer some or all of these rights to publishers) are free to select a shorter delay period, and many do.” This is of course true. My hope, and that of many others, is to decrease this maximum.
- “The length of the delay period is determined through negotiation between authors and publishers as part of the copyright transfer process.” Well, not so much. Authors don’t so much negotiate with publishers as just sign whatever publishers put in their path. When one actually attempts to engage in negotiation, sadly rare among academic authors, things often go smoothly, but sometimes take a turn for the odd, and authors in the thrall of publish or perish are short on negotiating leverage.
- “These negotiations can be challenging for authors, and our guidance (http://publicaccess.nih.gov/FAQ.htm#778) encourages authors to consult with their institutions when they have questions about copyright transfer agreements.” I have a feeling that the word challenging is a euphemism for something else, but I’m not sure what. The cited FAQ doesn’t in fact provide guidance on negotiation, but just language to incorporate into a publisher agreement to make it consistent with the 12-month embargo. No advice on what to do if the publisher refuses, much less how to negotiate shorter embargoes. As for the excellent advice to “consult with their institutions”, in the case of Harvard, that kind of means to talk with my office, doesn’t it? Which, I suppose, is a vote of confidence.
So there is some room for improvement in understanding the dynamic at play in author-publisher relations, but overall, I’m gratified that NIH folks are on top of this issue and making a good faith effort to bring the fruits of research to the scholarly community and the public at large, and reiterate my strong support of NIH’s policy.
Here’s the full text of the letter:
DEPARTMENT OF HEALTH & HUMAN SERVICES
Public Health Service
National Institutes of Health
Bethesda, Maryland 20892
May 27 2011
Stuart M. Shieber, Ph.D.
Welch Professor of Computer Science, and
Director, Office for Scholarly Communication
1341 Massachusetts Avenue
Cambridge, Massachusetts 02138
Dear Dr. Shieber:
Thank you for your letters to Secretary Sebelius and Dr. Collins regarding the NIH Public Access Policy. I am the program manager for the Policy, and have been asked to respond to you directly.
We view the policy as an important tool for ensuring that as many Americans as possible benefit from the public’s investment in research through NIH.
I appreciate your suggestions about reducing the delay period between publication and availability of a paper on PubMed Central. As you may know, the 12-month delay period specified by law (Division G, Title II, Section 218 of P.L. 110-161) is an upper limit. Rights holders (sometimes the author, and sometimes they transfer some or all of these rights to publishers) are free to select a shorter delay period, and many do. The length of the delay period is determined through negotiation between authors and publishers as part of the copyright transfer process. These negotiations can be challenging for authors, and our guidance (http://publicaccess.nih.gov/FAQ.htm#778) encourages authors to consult with their institutions when they have questions about copyright transfer agreements.
I also appreciate your suggestion to expand this Policy to other Federal science funders, and the confidence it implies in our approach. The National Science and Technology Council (NSTC) has been charged by the America COMPETES Reauthorization Act of 2010 (P.L. 111-358) to establish a working group to explore the dissemination and stewardship of peer reviewed papers arising from Federal research funding. I am copying Dr. Celeste Rohlfing at the Office of Science and Technology Policy on this correspondence, as she is coordinating the NSTC efforts on Public Access.
Sincerely,
Neil M. Thakur, Ph.D.
Special Assistant to the NIH Deputy Director for Extramural Research
cc: Ms. Celeste M. Rohlfing
Assistant Director for Physical Sciences
Office of Science and Technology Policy
Executive Office of the President
725 17th Street, Room 5228
Washington, DC 20502
The benefits of copyediting
June 4th, 2011
Dictionary and red pencil, photo by novii, on Flickr
Sanford Thatcher has written a valuable, if anecdotal, analysis of some papers residing on Harvard’s DASH repository (Copyediting’s Role in an Open-Access World, Against the Grain, volume 23, number 2, April 2011, pages 30-34), in an effort to get at the differences between author manuscripts and the corresponding published versions that have benefited from copyediting.
“What may we conclude from this analysis?” he asks. “By and large, the copyediting did not result in any major improvements of the manuscripts as they appear at the DASH site.” He finds that “the vast majority of changes made were for the sake of enforcing a house formatting style and cleaning up a variety of inconsistencies and infelicities, none of which reached into the substance of the writing or affected the meaning other than by adding a bit more clarity here and there” and expects therefore that the DASH versions are “good enough” for many scholarly and educational uses.
Although more substantive errors did occur in the articles he examined, especially in the area of citation and quotation accuracy, they were typically carried over to the published versions as well. He notes that “These are just the kinds of errors that are seldom caught by copyeditors.”
One issue that goes unmentioned in the column is the occasional introduction of errors by the typesetting and copyediting process itself. This used to happen with great frequency in the bad old days when publishers rekeyed papers to typeset them. It was especially problematic in fields like my own, in which papers tend to have large amounts of mathematical notation, which the typesetting staff had little clue about the niceties of. These days more and more journals allow authors to submit LaTeX source for their articles, which the publisher merely applies the house style file to. This practice has been a tremendous boon to the accuracy and typesetting quality of mathematical articles. Still, copyediting can introduce substantive errors in the process. Here’s a nice example from a paper in the Communications of the ACM:
“Besides getting more data, faster, we also now use much more sophisticated learning algorithms. For instance, algorithms based on logistic regression and that support vector machines can reduce by half the amount of spam that evades filtering, compared to Naive Bayes.” (Joshua Goodman, Gordon V. Cormack, and David Heckerman, Spam and the ongoing battle for the inbox, Communications of the Association for Computing Machinery, volume 50, number 2, 2007, page 27. Emphasis added.)
Any computer scientist would immediately see that the sentence as published makes no sense. There is no such thing as a “vector machine” and in any case algorithms don’t support them. My guess is that the author manuscript had the sentence “For instance, algorithms based on logistic regression and support vector machines can reduce by half…” — without the word that. The copyeditor apparently didn’t realize that the noun phrase support vector machine is a term of art in the machine learning literature; the word support was not intended to be a verb here. (Do a Google search for vector machine. Every hit has the phrase in the context of the term support vector machine, at least for the pages I looked at before boredom set in.)
Presumably, the authors didn’t catch the error introduced by the copyeditor. The occurrence of errors of this sort is no argument against copyediting, but it does demonstrate that it should be viewed as a collaborative activity between copyeditors and authors, and better tools for collaboratively vetting changes would surely be helpful.
In any case, back to Dr. Thatcher’s DASH study. Ellen Duranceau at MIT Libraries News views the study as “support for the MIT faculty’s approach to sharing their articles through their Open Access Policy”, and the same could be said for Harvard as well. However, before we declare victory, it’s worth noting that Dr. Thatcher did find differences between the versions, and in general the edits were beneficial.
The title of Dr. Thatcher’s column gets at the subtext of his conclusions, that in an open-access world, we’d have to live with whatever errors copyediting would have caught, since we’d be reading uncopyedited manuscripts. But open-access journals can and do provide copyediting as one of their services, and to the extent that doing so improves the quality of the articles they publish and thus the imprimatur of the journal, it has a secondary benefit to the journal of improving its brand and its attractiveness to authors.
I admit that I’m a bit of a grammar nerd (with what I think is a nuanced view that manages to be linguistically descriptivist and editorially prescriptivist at the same time) and so I think that copyediting can have substantial value. (My own writing was probably most improved by Savel Kliachko, an outstanding editor at my first employer SRI International.) To my mind, the question is how to provide editing services in a rational way. Given that the costs of copyediting are independent of the number of accesses, and that the value accrues in large part to the author (by making him or her look like less of a halfwit for exhibiting “inconsistencies and infelicities” and occasionally more substantive errors), it seems reasonable that authors ought to pay publishers a fee for these services. And that is exactly what happens in open-access journals. Authors can decide if the bargain is a good one on the basis of the services that the publisher provides, including copyediting, relative to the fee the publisher charges. As a result, publishers are given incentive to provide the best services for the dollar. A good deal all around.
Most importantly, in a world of open-access journals the issue of divergence between author manuscripts and publisher versions disappears, since readers are no longer denied access to the definitive published version. Dr. Thatcher concludes that the benefits of copyediting were not as large as he would have thought. Nonetheless, however limited the benefits might be, properly viewed those benefits argue for open access.

Letters in recognition of the NIH Public Access Policy anniversary
April 13th, 2011
In recognition of the third anniversary of the establishment of the NIH Public Access Policy on April 7, 2008, I’ve sent letters to John Holdren, Director of the Office of Science and Technology Policy; Francis Collins, Director of the National Institutes of Health; and Kathleen Sebelius, Secretary of Health and Human Services. The letter to Dr. Holdren is duplicated below; the others are substantially similar. The Alliance for Taxpayer Access provides further background.
April 13, 2011
John Holdren
Assistant to the President for Science and Technology
Director, Office of Science and Technology Policy, Executive Office of the President
New Executive Office Building
725 – 17th Street NW
Washington, DC 20502
Dear Dr. Holdren:
I write to you in my role as the Director of the Office for Scholarly Communication at Harvard University, where I lead efforts to broaden access to the research and scholarly results of our university. I and others at Harvard working towards these goals so central to the university’s mission have been inspired by the National Institutes of Health Public Access Policy, now celebrating its third anniversary. The NIH policy has had an enormous impact in increasing availability of government-funded research to the citizens that have supported it through their tax dollars. Every day nearly half a million people access the over two million articles that the NIH policy makes available through the PubMed Central repository. I am especially proud that Harvard affiliates have contributed over thirty thousand of these articles.
The NIH should be applauded for these efforts to bring the fruits of scientific research to the public, and should be encouraged to provide even more timely access by shortening the embargo period in the policy. I believe that the NIH example should be broadly followed by all government agencies engaged in substantial research funding, as envisioned in the Federal Research Public Access Act (FRPAA) that has several times been introduced in Congress, and encourage you to extend this kind of policy to other science and technology funding agencies as soon as possible.
The tremendous success of the NIH policy should be celebrated. It provides a sterling example of government acting in the public interest, leading to broader access to the important scientific results that inform researchers and lay citizens alike.
Sincerely,
Stuart M. Shieber
Welch Professor of Computer Science, and
Director, Office for Scholarly Communication

The importance of dark deposit
March 12th, 2011
Hubble’s Dark Matter Map from flickr user NASA Goddard Photo and Video, used by permission
The Harvard repository, DASH, comprises several thousand articles in all fields of scholarship. These articles are stored and advertised through an item page providing metadata — such as title, author, citation, abstract, and link to the definitive version of record — which typically allows downloading of the article as well. But not all articles are distributed. On some of the item pages, the articles themselves can’t be downloaded; they are “dark”. The decision whether or not to allow for dark articles in a repository comes up sufficiently often that it is worth rehearsing the several reasons to allow it.
- Posterity: Repositories have a role in providing access to scholarly articles of course. But an important part of the purpose of a repository is to collect the research output of the institution as broadly as possible. Consider the mission of a university archives, well described in this Harvard statement: “The Harvard University Archives (HUA) supports the University’s dual mission of education and research by striving to preserve and provide access to Harvard’s historical records; to gather an accurate, authentic, and complete record of the life of the University; and to promote the highest standards of management for Harvard’s current records.” Although the roles of the university archives and the repository are different, that part about “gather[ing] an accurate, authentic, and complete record of the life of the University” reflects this role of the repository as well. Since at any given time some of the articles that make up that output will not be distributable, the broadest collection requires some portion of the collection to be dark.
- Change: The rights situation for any given article can change over time — especially over long time scales, librarian time scales — and having materials in the repository dark allows them to be distributed if and when the rights situation allows. An obvious case is articles under a publisher embargo. In that case, the date of the change is known, and repository software can typically handle the distributability change automatically. There are also changes that are more difficult to predict. For instance, if a publisher changes its distribution policies, or releases backfiles as part of a corporate change, this might allow distribution where not previously allowed. Having the materials dark means that the institution can take advantage of such changes in the rights situation without having to hunt down the articles at that (perhaps much) later date.
- Preservation: Dark materials can still be preserved. Preservation of digital objects is by and large an unknown prospect, but one thing we know is that the more venues and methods available for preservation, the more likely the materials will be preserved. Repositories provide yet another venue for preservation of their contents, including the dark part.
- Discoverability: Although the articles themselves can’t be distributed, their contents can be indexed to allow for the items in the repository to be more easily and accurately located. Articles deposited dark can be found based on searches that hit not only the title and abstract but the full text of the article. And it can be technologically possible to pass on this indexing power to other services indexing the repository, such as search engines.
- Messaging: When repositories allow both open and dark materials, the message to faculty and researchers can be made very simple: Always deposit. Everything can go in; the distribution decision can be made separately. If authors have to worry about rights when making the decision whether to deposit in the first place, the cognitive load may well lead them to just not deposit. Since the hardest part about running a successful repository is getting a hold of the articles themselves, anything that lowers that load is a good thing. This point has been made forcefully by Stevan Harnad. It is much easier to get faculty in the habit of depositing everything than in the habit of depositing articles subject to the exigencies of their rights situations.
- Availability: There are times when an author has distribution rights only to unavailable versions of an article. For instance, an author may have rights to distribute the author’s final manuscript, but not the publisher’s version. Or an art historian may not have cleared rights for online distribution of the figures in an article and may not be willing to distribute a redacted version of the article without the figures. The ability to deposit dark enables depositing in these cases too. The publisher’s version or unredacted version can be deposited dark.
- Education: Every time an author deposits an article dark is a learning moment reminding the author that distribution is important and distribution limitations are problematic.
For all these reasons, I believe that it is important to allow for dark items in an article repository. Better dark than missing.
[Hat tip to Sue Kriegsman for discussions on this issue.]

Some open-access publishers offer institutional memberships, whereby a fixed annual fee, often based on the size of faculty or expected number of submitted articles, covers all or a percentage of article-processing fees for the institution for the year.
The issue of OA publisher memberships is interesting and fraught. Harvard University is not currently a member of any of the major OA publishers—BioMed Central, Hindawi, or Public Library of Science. (Actually, Harvard Medical School is a PLoS member.) I’m not involved in Harvard’s decisions about institutional memberships, although I am not a fan of memberships in general, as you will see. I’ll explain my own view of the difficulty with memberships in terms of the market design for publisher services, and then talk about what alternatives there are.
There are a variety of different kinds of membership models but I will start by discussing the basic kind, where a fixed annual institutional fee is charged to reduce the per-article fee for articles emanating from that institution to zero, that is, a 100% reduction. This kind of membership is now rare—most memberships reduce fees by a smaller fraction, say 15%—but it is useful to examine as a thought experiment. There are other sorts of membership based on prepayment or cost sharing, which I’ll come back to.
The problem with institutional memberships (and here I mean the 100% reduction model for the moment) is that they have the potential to create the same moral hazard in the publication-fee revenue model that institutional subscriptions do in the subscription-fee business model. Institutional subscriptions hide the cost of a journal from the readers, leading to overconsumption, inelasticity of demand, and knock-on hyperinflation that is called, somewhat inaccurately, the “serials crisis”. Institutional memberships potentially have the same effect for authors, hiding the cost of the journal fees from the authors, presumably leading to overconsumption and raising the specter of hyperinflation of publication fees and membership fees down the line.
What does “overconsumption” of an OA journal mean? It means that authors publish in that journal more than they should given the relative tradeoff of fee for services. Imagine a university is a PLoS member but not a BMC member (and let’s imagine, contrary to fact, that the PLoS membership uses the 100% reduction model). Authors will see a fee of $0 for PLoS but $1500 (say) for BMC, leading them to preferentially publish in PLoS over BMC journals, even though the true cost to the university is well over $0 for PLoS publication. The result, ceteris paribus, will be that the numbers of PLoS articles will go up at that university, leading PLoS to raise the membership fee for the university over time.
Who benefits from the institutional membership at a COPE-compliant university? Whether there is an institutional membership or not, an author pays no fee; either the grant does or, if there is no grant, the university’s COPE fund does. If the article is grant funded, however, the author doesn’t need to use grant funds if there’s a membership. So authors benefit from memberships a bit by husbanding grant resources. But the major beneficiary of a university membership is the funding agency, which no longer needs to fund the publisher’s fee. Institutional memberships are essentially a transfer payment from universities to funding agencies.
The whole premise of COPE is that each stakeholder in the scholarly publishing milieu needs to do its fair share, no less and no more. Funders need to fund the fees for articles they support. Universities need to fund the fees for articles written under their auspices (but not those under a funder’s auspices). Memberships break this model.
I know what you’re thinking. Memberships mean that overall less money is being sent to the publisher. Isn’t that cost savings a good thing? If the university typically publishes 20 BMC articles a year at $1,500 ($30,000 total) and the membership is $20,000, someone or other has just saved $10,000. It’s not likely to be the university, but someone.
Here’s why it’s not the university that saves money: Imagine that 15 of the 20 articles were grant funded. (For science journals, this is a conservative estimate.) Then without the membership, grants would have paid for 15 of the articles, for $22,500, and the university just 5, for an OA fund cost of $7,500. So the university pays $12,500 more with a membership than without. But the funder pays $22,500 less. Together, funder and university pay $10,000 less, but the university is still paying much more. In essence, the university pays $12,500 for the privilege of saving the funders $22,500.
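The arithmetic above can be checked with a minimal Python sketch. The function name and its parameters are hypothetical, chosen only for illustration; the figures are the ones used in the example (20 articles, 15 of them grant-funded, a $1,500 fee, a $20,000 membership), and the membership is assumed to be of the 100% reduction kind.

```python
def membership_cost_split(articles, grant_funded, fee, membership_fee):
    """Compare who pays with and without a 100%-reduction membership.

    Hypothetical helper: assumes every grant-funded article would
    otherwise be paid from the grant, and every other article from
    the university's OA fund.
    """
    # Without a membership: funders pay for grant-funded articles,
    # the university's OA fund pays for the rest.
    funder_without = grant_funded * fee
    university_without = (articles - grant_funded) * fee

    # With the membership: the university pays the flat fee and
    # no per-article fee is charged to anyone.
    university_with = membership_fee
    funder_with = 0

    return {
        "university_change": university_with - university_without,
        "funder_change": funder_with - funder_without,
        "total_change": (university_with + funder_with)
        - (university_without + funder_without),
    }


result = membership_cost_split(
    articles=20, grant_funded=15, fee=1500, membership_fee=20000
)
print(result)
```

A positive change means paying more with the membership than without, a negative change means paying less. With the numbers above, the university's change is +$12,500, the funders' is -$22,500, and the combined change is -$10,000.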
So university memberships have the effect of saving funding agencies some money and authors some grant funds, at the cost of skewing author behavior toward the particular publisher. When you buy a membership, you tilt the playing field toward that publisher; you are playing favorites. I suppose in the short term, there’s probably nothing wrong with playing favorites towards PLoS and BMC. They are nice folks, and maybe a little thumb on the scale isn’t a bad thing. But in the long run, it’s not the recipe for an efficient market.
Note that this effect holds even if the membership doesn’t reduce the fee to zero. A membership that reduces the fee by 15% still hides 15% of the true cost from the author, so it has a smaller but still non-zero skewing effect. And the cost savings issue is the same as well. With a 15% membership, the university is still paying (though less) to save the funding agencies (though commensurately less).
There may be a way of making memberships consistent with an efficient market, namely, by transferring some portion of the cost of publishing back to the authors even where a membership is in place. When an article is published, the author’s funder could be charged their share of the membership fee. If the article is not grant-funded but there is a COPE-compliant OA fund where (as in the Harvard and Cornell funds, and perhaps others’ as well) there is an annual per-author cap on fund reimbursements, the author’s cap would be charged against. If the cap is maxed out, the author would be charged for the remainder. Moral hazard eliminated. The bookkeeping is daunting, and the whole thing is cumbersome, but in theory at least it would work.
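The chargeback rule just described can be sketched as a small Python function. This is only an illustration under stated assumptions: the function name, the idea of passing in a precomputed per-article share of the membership fee, and the dollar amounts in the usage example are all hypothetical, since how that share would be computed is left open here.

```python
def allocate_charge(article_share, grant_funded, cap_remaining):
    """Decide who is charged for one article under the chargeback idea.

    article_share: the article's share of the membership fee
                   (hypothetical input; its computation is not specified).
    grant_funded:  True if the article arose from grant-funded research.
    cap_remaining: what is left of the author's annual OA fund cap.
    """
    if grant_funded:
        # The funder is charged its share; the OA fund is untouched.
        return {"funder": article_share, "oa_fund": 0, "author": 0}

    # Otherwise charge against the author's cap, and bill the author
    # for whatever the remaining cap does not cover.
    from_fund = min(article_share, cap_remaining)
    return {"funder": 0, "oa_fund": from_fund, "author": article_share - from_fund}


# Illustrative amounts only.
grant_case = allocate_charge(1500, grant_funded=True, cap_remaining=3000)
capped_case = allocate_charge(1500, grant_funded=False, cap_remaining=1000)
print(grant_case)
print(capped_case)
```

In the second case the author has only $1,000 of cap left, so the fund is charged $1,000 and the author the remaining $500, which is the point at which the cost becomes visible to the author again.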
To my mind, the right thing is just not to pay for (these kinds of) memberships. Let the publishers charge what they think is an appropriate fee, and let them have to worry about whether the fee is so high they will scare off authors. If, in the short run, an institution wants to put a thumb on the scale for certain OA publishers (say because it thinks that in these early days OA publishers need special help, affirmative action so to speak), then buying a membership may be a good idea, but I’d hope it would realize that that’s what it is doing, and plan on dropping the memberships once OA publishing is fully robust and can stand on its own.
Now, what about two different membership models: (i) BMC’s prepayment model in which the university prepays funds in return for a discount on processing fees, and (ii) BMC’s shared support model in which the university prepays funds to be used to cover a fixed percentage (say 50%) of each processing fee, again in return for a discount on processing fees.
In the prepayment model, there is still a per-article fee directly attributable to a given author (even if it has been prepaid), and the author—or the author’s funder if grant-funded, or the author’s OA fund cap—can be charged that fee. In the first two cases, the mechanism for doing so may be problematic. You’d have to invoice the funder for the article fee even though it was prepaid perhaps a year before. I’m not sure how the accounting would work. Similarly for charging the author if he or she had used up the annual allotment of OA funds. But assuming you could make that work, the prepayment model at least doesn’t have the moral hazard problems of the institutional membership. Given all the accounting and logistics difficulties of the prepayment model and given that almost all of the savings from prepayment is being recouped by the funders and not the university, I wonder if it is worth the trouble.
The shared support model only makes sense if you are not planning on charging back the university’s prepaid share of the fee, in which case it has all the same moral hazard and funder-gift problems as the institutional memberships.
There is something simple—and, if I say so myself, elegant—about the bare COPE model without memberships. The journal charges the author a fee. The author charges it off to the funder if there is one, or, if there isn’t, to the OA fund up to the capped limit. Done. The university supports the OA publishers (like it does the subscription publishers) without a major moral hazard or transfer payments to funders. I’d hope that universities considering OA publisher memberships would consider COPE-compliant OA funds instead.
[For my non-computer-scientist readers, the title of this post is a reference to a famous phrase used in the title of this classic CS article.]
Update June 21, 2013: The Open Access Scholarly Publishers Association has posted a response to a request for input on the Finch report, in which they pick up on many of the same themes:
However, OASPA also recognizes the possibility that such schemes could lead to a lack of transparency regarding the cost of publication in different Open Access outlets, particularly if the terms of these deals are not publicly disclosed, which could be detrimental to the functioning of the market. Moreover, OASPA feels that membership schemes that are based on up-front commitments for a university to publish a particular volume of content with a given publisher can potentially reduce competition within the Open Access ecosystem, making it difficult for smaller publishers to compete on a level playing field with larger publishers, who are inherently better positioned to negotiate individual deals with universities.
It’s good to see that OA publishers recognize the incentive problems with membership arrangements.
Dissertation distribution online: my comments at the AHA
February 14th, 2011
I spoke at a panel last month at the annual meeting of the American Historical Association devoted to the question of electronic dissertations and intellectual property rights, entitled “When Universities Put Dissertations on the Internet: New Practice; New Problem?” My co-panelists included Edward Fox, professor of computer science at Virginia Tech and director of the Networked Digital Library of Theses and Dissertations, and Susan Ferber, history editor at Oxford University Press, with moderation by Sarah Maza, professor of history at Northwestern University. I have to believe that this was the only AHA panel ever with two computer scientists on it.
The panel was precipitated by a particular complaint about online distribution by then PhD candidate Ulrich Groetsch against his alma mater, Rutgers. Dr. Groetsch was initially supposed to be on the panel as well, but unfortunately was not able to attend. Dr. Maza read a statement that he had prepared outlining his concerns.
By way of background, Dr. Groetsch was basically concerned that the online availability of his dissertation from Rutgers’ open-access repository RUcore would affect his later ability to publish a book based on the dissertation. The details of the case (when embargoes were granted or expired, whether proper notifications were sent or received, and so forth) are disputed and in any case not particularly relevant; the case is of interest because it raises more general questions about the conditions under which, and the basis on which, dissertation distribution should be controlled.
On the assumption that someone or other might be interested, I’ve paraphrased my comments on the issue here. Much of my thinking is based on my nascent, ongoing efforts to provide for open online distribution of theses and dissertations at Harvard.
The Tetrahedron test case
February 1st, 2011
Phil Davis’s recent post over at The Scholarly Kitchen on whether open access might save the academic world some money misses the point of the COPE initiative and Harvard’s open-access fund (HOPE). Davis speculates that for the case of one set of journals that happened to be mentioned in my colleague Bob Darnton’s recent NYRB piece, HOPE would cost the university more than its current subscriptions, echoing a more general claim he has made in previous work that OA article processing charges (APCs) will cost many universities more than they now pay in subscription fees. In particular, with regard to Tetrahedron’s $39,082 price tag, he says “Of the nearly 6,000 articles in the Tetrahedron bundle, Harvard researchers authored 22 of them in 2010. Given that COPE will pay $3,000 for each article out of this fund, paying for open access would cost Harvard $66K in 2010, $27K more than its subscription price.”
Harvard’s HOPE fund, like all COPE-responsive funds, is intended to cover Harvard’s fair share of OA article processing fees. Harvard is dedicated to doing its part to underwrite OA fees — but not others’ parts. For that reason, it does not cover articles based on grant-funded research; the granting agency should cover that. (The same is true of the open-access funds at many other COPE-signatory institutions.)
For the particular case at hand, I found 24 articles from 2010 in the Tetrahedron bundle with Harvard affiliations using a Scirus search. All of the articles were grant-funded (16 by NIH, 11 by one or more other foundations, 5 by companies; the sum is greater than 24 as some articles had more than one funder). Thus none of them would have been eligible for HOPE funding; the HOPE fund cost to Harvard would have been $0.
But even if none of them had been grant-funded, HOPE covers fees prorated based on Harvard authorship. Since only 65 of the 144 authors on these 24 articles were Harvard affiliated, it would have covered only 65/144 (about 45%) of the fees. The cost would be (65/144)×$3000×24 = $32,500. Further, it covers only authors at schools with open-access policies. Of the 65 Harvard affiliates, only 22 were at schools with OA policies, so payment would be restricted to 22/144 (about 15%) of the fees, for a cost of (22/144)×$3000×24 = $11,000.
Of course, BioMed Central journals with similar impact factors to the Tetrahedron journals charge publication fees of $1820. (PLoS journals with considerably higher impact factors charge $1350 or $2250.) Presumably, if the Tetrahedron journals were open-access journals, as hypothesized in the thought experiment Davis is implicitly undertaking, they would have to compete for authors with other open-access journals and would need to charge similar rates (just as Nature Publishing Group’s Scientific Reports is doing relative to PLoS ONE). Redoing the calculation with the BMC rate gives (22/144)×$1820×24 ≈ $6,700. Even covering authors at ineligible schools, the cost would only be (65/144)×$1820×24 ≈ $20,000.
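For anyone who wants to check the arithmetic, here is a minimal sketch of the prorating calculations above. The function and variable names are my own inventions for illustration, not anything from the HOPE fund's actual procedures, and the sketch pools authors across all 24 articles exactly as the formulas in the text do, rather than prorating article by article.

```python
def prorated_cost(eligible_authors, total_authors, apc, articles):
    """Fund's share of fees: (eligible / total authors) x fee per article x article count."""
    return eligible_authors * apc * articles / total_authors


# Figures taken from the discussion above.
TOTAL_AUTHORS = 144      # all authors on the 24 Harvard-affiliated articles
ARTICLES = 24
HARVARD_AUTHORS = 65     # Harvard-affiliated authors
OA_POLICY_AUTHORS = 22   # Harvard authors at schools with OA policies
DAVIS_APC = 3000         # fee assumed in Davis's post
BMC_APC = 1820           # BioMed Central fee cited above

# Hypothetical scenario labels, one per calculation in the text.
scenarios = {
    "All Harvard authors, $3000 fee": (HARVARD_AUTHORS, DAVIS_APC),
    "OA-policy schools only, $3000 fee": (OA_POLICY_AUTHORS, DAVIS_APC),
    "All Harvard authors, BMC fee": (HARVARD_AUTHORS, BMC_APC),
    "OA-policy schools only, BMC fee": (OA_POLICY_AUTHORS, BMC_APC),
}

for label, (authors, apc) in scenarios.items():
    cost = prorated_cost(authors, TOTAL_AUTHORS, apc, ARTICLES)
    print(f"{label}: ${cost:,.0f}")
```

The two BMC-rate results come out slightly under the rounded figures quoted above, which is why those are given with "≈".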
Two of the five bundled journals — accounting for all but 5 of the 24 articles — are “Letters” journals publishing quite short articles of a couple of pages. Presumably, they should require even lower APCs, reducing the likely cost further.
So, the upshot is that the cost to the HOPE fund for all of the Tetrahedron journals would be $0, and even if it were to cover the fees for grant-funded articles, the cost would be $32,500. Or $20,000. Or $11,000. Or $6,700. Or less.
The truth is that no one knows how much costs would be in a counterfactual open-access world with competitive APC fees. The kinds of calculations in Davis’s post (and this one and other previous work) are a kind of silly game. But given that the highest APC for an OA journal ($2,900 for PLoS Biology) is far less than the average revenue per article for a subscription journal ($5,000 according to the Scholarly Publishing Roundtable), it seems extraordinarily unlikely that the overall costs would be higher. And we’d drop all of the access restrictions as a nice side effect. Seems like a bargain to me.
But the most important point is that the idea behind COPE and the HOPE fund is not to save an individual institution money. If in the long term COPE has the intended effect of shifting journals such as Tetrahedron to an OA model within a journal ecology based on an efficient market — a situation that we manifestly do not now have — and if under those conditions Harvard ends up paying a bit more, well, then the market has spoken.
By the way, the Tetrahedron bundle price is now up to $41,361.
Are open-access fees disenfranchising?
January 18th, 2011
I had an interesting discussion over coffee at the recent SOAP Symposium about the question of whether the article processing fee revenue model for open-access journals disenfranchises authors with fewer financial resources. It prompted me to write up a fuller explanation of why this worry is misplaced.
Opportunity for full participation in research by as wide a range of scholars as possible is, of course, central to our meritocratic notion of the scholarly endeavor. Perhaps the biggest impediment to such full participation — to getting to the point where one has a scholarly result to present to the world — is gaining access to the facilities for carrying out research in the first place, including access to the published literature. It makes little sense to worry about disenfranchisement from publishing research results if the alternative is disenfranchisement from the reading that would allow generating the results in the first place. For that reason, open access to the scholarly literature is inherently an enfranchising program.
It also bears mentioning that it is not only open-access journals that charge author-side fees, the kinds of fees that critics complain are disenfranchising. Many subscription journals charge quite substantial fees as well. For NIH-funded research, the average is $1250 per article, which is plenty big enough to give your average developing-country scientist pause. One would be hard-pressed to impugn open-access journals on these grounds without roping in many subscription journals as well.
That being said, of course we want everyone to have the opportunity to publish in the scholarly literature, even those with lesser means. And there is a simple mechanism to allow for this with open-access journals that charge article processing fees. Journals can, should, and commonly do waive fees for necessitous authors. The details of these waiver policies differ. (See here for the PLoS policy or here for BioMed Central.) But the effect is the same: authors unable to afford the fees can still publish in these journals. More importantly, they can read the articles published in the journals too.
Some worry that authors requiring fee waivers may be discriminated against in the editorial process. Editorial processes must, of course, be kept separate from the financial processes. Different groups separated by a Chinese wall can handle the two issues. Indeed, the question of whether a waiver will be requested needn’t even be raised until an editorial decision on a paper is finalized, eliminating any possibility of a conflict of interest. PLoS has an especially simple method for handling waivers. After a paper is accepted for publication, authors can request a waiver of the fee, which is always granted.
Of course, the waiver idea can’t possibly be controversial. It is the same approach that subscription journal publishers use to address the reader-side disenfranchisement argument. They point out that the World Health Organization’s Hinari program provides subsidized access to journals for scholars in a specified set of countries that have been deemed sufficiently impoverished. A similar eligibility criterion could be used for processing fee waivers. But an approach based on targeting individuals rather than countries has much to recommend it. It can be much better focused on the real problem. For instance, it can address authors in needy cohorts who happen to live in a country not on the approved list. There are unemployed scholars in first-world countries or faculty at small schools in developing countries, for example, for whom Hinari is no help, whereas a fee waiver allows them to fully participate in the open-access publishing milieu on both the reading and writing side.
[UPDATE 1/21/11: The recent news that publishers have withdrawn Bangladesh’s access through the HINARI program (because Bangladesh is “start[ing] to secure active sales”) makes regrettably clear the problem with this approach. Just because some researchers in Bangladesh may now fall within the scope of an institutional subscription, all are deprived access.]
The issue of fee waivers is important, and we should actively promote their availability. By way of example, many COPE-compliant open-access funds — including those at Harvard, Cornell, Dartmouth, MIT, and Columbia — will only cover fees for journals that have a waiver policy. Hopefully, this will provide some impetus for OA journals to institute reasonable waiver policies.
Ironically, Nature Publishing Group is entering the OA arena with Scientific Reports, a PLoS ONE competitor. Phil Davis reports that they are apparently not allowing for fee waivers, and points out that this could lead to a problem of adverse selection, where PLoS ONE ends up handling all of the fee-waived articles to their competitive disadvantage. On the other hand, if this turns out to be true, Scientific Reports will not be eligible for support from the COPE-compliant open-access funds as discussed above. There thus may be ways to mitigate the adverse selection problem.
With open access, we can enfranchise both the readers and the writers of the scholarly literature. We can, and we should.
A ray of sunshine in the open-access future
January 15th, 2011
![]() |
| Used by permission of PLoS |
I’m flying back from Berlin, where I gave talks at the Academic Publishing in Europe (APE) Conference and the Study of Open Access Publishing (SOAP) Symposium. Karmically, the SOAP Symposium was held in the very room, in Harnack Haus of the Max Planck Society, where the Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities was drafted in 2003. I’ll post links to those talks at this site when they become available. [Update 1/31/11: Links to the talks are now available at right.]
These several days of listening to presentations and talking with publishers, academics, and librarians have left me, I have to say, more optimistic about the potential future of open-access publishing than I’ve been in many years, maybe ever. Of course, that may not be saying much; I’ve never been very sanguine. But at the moment I’m marginally positive.
Over the past years, a transition path to relatively widespread open-access publishing has been obscure at best, and progress has been slow to nonexistent. Uptake, however measured, has been grudging, and author apathy overwhelming.
Especially problematic, but completely understandable, is authors’ relatively slow uptake of publishing in OA journals. Part, of course, is a numbers game: there are very few open-access journals of sufficient quality to provide more than a tiny fraction of the needed capacity, and little hope at the moment of remedying that situation given the lack of a viable revenue model for OA journals; it’s hard to imagine publishers starting a whole lot of OA journals if there’s no revenue model to keep them going in a sustainable, scalable manner. That’s the problem that COPE is attempting to address. In fact, that was the topic of my SOAP Symposium keynote.
Another contributing factor to authors’ ambivalence is their need to chase journal brand. Indeed, this is the main reason academics publish in journals — to get the imprimatur of the journal on their paper, since for better or worse (and, to my mind, mostly worse) that’s often what affects their career trajectory.
For a long time, I’ve assumed that a transition to a sizable role for OA publishing will require existing publishers to switch their existing journals to an OA publication-fee revenue model in order to cover enough of the scholarly fields, because the founding of enough new journals, whether by existing publishers or new ones, is a long and unlikely process, and won’t be able to address the brand development problem for even longer.
But recent developments may indicate a breakthrough from a surprising direction: PLoS ONE, a new kind of open-access mega-journal. This journal has a set of interlocking characteristics: broad scope (“primary research from any scientific discipline”), focused peer review for validity and soundness (but not field or predicted impact), reasonable publication fees, and post-publication article metrics and other services. Surprisingly (to me at least; I was frankly skeptical when PLoS ONE was launched) this model has shown tremendous popularity; submission growth has been geometric. Evidently, PLoS is able to provide a venue for verifying scientific validity over a huge range of fields and a huge number of articles, and make money doing it. PLoS ONE has become the largest peer-reviewed journal on earth, publishing almost 7,000 articles last year. It is single-handedly allowing PLoS to break even, subsidizing its higher-selectivity and field-focused journals.
Publishers have not failed to notice the dramatic success of PLoS ONE, and they are jumping on the bandwagon. SAGE announced SAGE Open, a mega-journal for “the social and behavioral sciences and humanities”, Nature Publishing Group is rolling out Scientific Reports (“all areas of the natural sciences— biology, chemistry, physics, and earth sciences”), and there are BMJ Open (“medical research”), AIP Advances (“applied research in the physical sciences”), and the Genetics Society of America’s G3: Genes, Genomes, Genetics (“high-quality foundational research, particularly research that generates large-scale datasets”). As Mark Patterson, Director of Publishing for PLoS, pointed out in his talk at APE, all of these journals take up the PLoS ONE approach: broad scope, open access with an article-processing-fee revenue model, peer review for validity but not predicted importance or impact, post-publication article metrics and services, scalability, and a strong brand.
(NPG has decided not to use their trademark Nature in the name of their mega-journal, presumably out of fear that they would dilute the brand of their other journals. I think they are missing a strategic opportunity to strongly brand the new journal, based on a misapprehension that people primarily associate brand imprimatur with publisher journal collections rather than individual journals. PLoS ONE has shown that a publisher with an excellent high-quality brand association can run a mega-journal without diluting the brand signal of its flagship journals. I’d guess that sooner or later, we’ll start seeing the journal referred to as Nature Scientific Reports, even if the name doesn’t officially change.)
It seems extraordinarily likely that other major publishers will move in this direction as well. (You heard it here first.) [Update 5/25/11: Elsevier is now advertising for a Scientific Editor for a journal called Cell Reports from Cell Press, which “will publish high quality papers across the broadest possible range of disciplines in biology. It is an open access, online-only journal with continuous publication.” Sounds like PLoS ONE and Scientific Reports to me.] [Update 7/7/11: Bloomsbury has announced yet another new megajournal QScience Connect under sponsorship of the Qatar Science Foundation “for all research that is considered to be valid, ethical and correct”. Notably, they claim to cover “all fields”, including the traditional physical and life sciences, but also math, computer science, law, the humanities, the social sciences, etc., etc.] [Update 1/12/12: Joining the other major scholarly publishers, Springer has launched SpringerPlus, covering “all disciplines of Science” with review based on scientific soundness alone.] [Update 1/17/12: I missed the December announcement of FEBS Open Bio, an OA megajournal published by Elsevier on behalf of the Federation of European Biochemical Societies. The journal covers “the molecular and cellular life sciences in both health and disease” and reviews based on “soundness”, not “eventual importance”. Interestingly, this new journal would seem to compete directly with Elsevier’s other OA megajournal, Cell Reports. How they’ll handle that remains to be seen.]
The mega-journal trend means that strong traditional publishers with name recognition are entering open-access publishing in a big way. They’ll be hard pressed to trot out their hackneyed canards (vanity press, disenfranchisement). And these journals will provide coverage of a huge swathe of academic fields. Between SAGE Open, PLoS ONE, and Scientific Reports, essentially all of the social sciences, humanities, and natural sciences are covered. In addition, the breadth of these journals means that they will be competing for the same pool of articles. Authors will have a choice between submitting papers in genetics, say, to PLoS ONE or G3, in physics to AIP Advances or Scientific Reports, and so forth. Publishers will have to compete in order to attract authors, either on price or publisher services or both. They’ll have to market these journals to authors, using their intellectual capital to convince authors that OA journals (at least their OA journals) are a Good Thing. As authors and promotion committees get used to using the new article-level metrics (as they already increasingly are, with download counts and h-index), journal brand name — whether of these mega-journals or traditional journals — will become less important, and authors will feel freer to publish in these and other OA journals, again based on publisher services rather than journal brand name.
As an aside, I note that PLoS has another nascent service, PLoS Hubs, that could interact synergistically with the mega-journal trend as well. Hubs provide the ability to build subcollections of articles from PLoS ONE or other journals based on various selection criteria. At the moment, the hubs are specified and curated by PLoS editors, but you could imagine opening up the service to hubs based on whatever selection criteria a self-proclaimed editor chooses. For instance, I could put together a subcollection of articles in my own field, computational linguistics, essentially generating a bespoke computational linguistics journal of articles already vetted for validity by PLoS reviewers, and for field by me, and perhaps by predicted impact by a cohort of post-reviewers I assemble. It provides a platform for the kind of ecology of overlay journals that has been talked about for many years, but with little in the way of success. (Faculty of 1000 is a notable case for the life sciences, but hasn’t been replicated elsewhere.) The ability to have their articles participate in such hubs would provide mega-journal authors with the ability to generate cachet from imprimatur without the access limitations of traditional field-focused journals.
Mega-journals could be the new new thing that makes open-access publishing viable at scope. If so, Public Library of Science would have cracked it — not through its flagship but self-consciously retro journals but through its unlikely innovation PLoS ONE.
[Thanks to Mark Patterson for his APE talk and for providing me copies of his slides. And for publishing the PLoS journals.]
Chicago Manual of Style on Open Access
December 20th, 2010
![]() |
| University of Chicago Library, from Carlos Jimenez via flickr, used by permission |
Who knew? The Chicago Manual of Style’s current edition (the 16th) includes for the first time a stance on open access (Section 4.62), and on Harvard-style OA policies in particular (Section 4.63).
Written by copyright lawyer William S. Strong of Kotin, Crabtree & Strong, LLP, the chapter comes down hard on academics’ attempts to use their own writings.
Section 4.62 on “Authors’ electronic use of their own works” claims that open access to articles in institutional repositories is “likely to diminish licensing revenues” (despite all evidence to the contrary). It concludes that “The fact that licensing revenue helps support the publication of important scholarly work seems to have escaped general notice.” As if.
Section 4.63 on “University licenses” seems to be particularly aimed at Harvard-style open-access policies, “under which they presumptively receive nonexclusive licenses of journal articles written by their faculty, with the right to post those articles on the Internet and to make and license ‘noncommercial’ uses. (Commonly, faculty are permitted but not encouraged to opt out of this arrangement on a case-by-case basis.)” Among the faults he alleges are that addenda “do[] not make clear what the author can, and cannot, do with derivative works that he or she creates” (because there is no limitation); that they “do[] not make clear whether what the author can distribute, display, and otherwise use is the author’s own manuscript or the finished, published work” (even though the Harvard addendum [paragraph 4a] is explicit about not distributing publishers’ versions); and that they “do[] not prevent the author from licensing the article to a competing journal” (except that journals won’t publish already published articles anyway). He frets about the vagueness of the term “noncommercial”, though the Harvard policies are explicit in stating that articles “are not sold for a profit” and agreements with publishers have further clarified the university’s intention not to sell the articles at all.
The Manual makes a recommendation to publishers to generate their own addenda “to use when presented with author requests for nonexclusive rights.” But why make the addendum conditions available only upon request? If the addendum-specified activities are allowable, why not just allow them in the publisher’s agreement from the get-go? In particular, how about a recommendation that publisher agreements allow authors of scholarly articles to post their final manuscript versions at their discretion, that is, allowing green OA?
Section 4.64 on “The NIH Public Access Policy” recommends that publishers “push for the maximum delay (i.e., twelve months) on public posting” if concerned about maximizing their revenues.
The most surprising thing about the new Manual sections is that a style manual is taking a stance on these intellectual property issues in the first place. The issues are obviously considerably more nuanced than Mr. Strong’s RIAA-like stance makes clear. Given that stance, you’d hardly know that the book is owned by a university (The University of Chicago, as stated in three copyright notices on each page) filled with faculty and students whose interests are not best served by this kind of short-term profit-maximizing attitude. Perhaps the editors might solicit a broader range of informed advice ahead of their next edition.
[Hat tip to Tom Dodson for bringing these sections to my attention.]




