If the map becomes the territory then we will be lost

That which computation sets out to map and model it eventually takes over. Google sets out to index all human knowledge and becomes the source and the arbiter of that knowledge: it became what people think. Facebook set out to map the connections between people – the social graph – and became the platform for those connections, irrevocably reshaping societal relationships. Like an air control system mistaking a flock of birds for a fleet of bombers, software is unable to distinguish between the model of the world and reality – and, once conditioned, neither are we.

James Bridle, New Dark Age, p.39.

I am here to bring your attention to two developments that are making me worried:

  1. The Social Graph of Scholarly Communications is becoming more tightly bound into institutional metrics that have an increasing influence on institutional funding
  2. The publishers of the Social Graph of Scholarship are beginning to enclose the Social Graph, excluding the infrastructure of libraries and other independent, non-profit organizations

Normally, I would separate these ideas into two dedicated posts, but in this case I want to bring them together because if these two trends converge, things will become very bad, very quickly.

Let me start with the first trend:

1. The social graph that binds

When I am asked to explain how to achieve a particular result within scholarly communication, more often than not, I find myself describing four potential options:

  1. a workflow of Elsevier products (BePress, SSRN, Scopus, SciVal, Pure)
  2. a workflow of Clarivate products (Web of Science, InCites, EndNote, Journal Citation Reports)
  3. a workflow of Springer-Nature products (Dimensions, Figshare, Altmetrics)
  4. a DIY workflow from a variety of independent sources (the library’s institutional repository, ORCiD, Open Science Framework)

Workflow is the new content.

That line – workflow is the new content – is from Lorcan Dempsey and it was brought to my attention by Roger Schonfeld. For Open Access Week, I gave a presentation on this idea of being mindful of workflow and tool choices, entitled A Field Guide to Scholarly Communications Ecosystems. The slides are below.

(My apologies for not sharing the text that goes with the slides. Since January of this year, I have been the Head of the Information Services Department at my place of work. In addition to this responsibility, much of my time this year has been spent covering the work of colleagues currently on leave. Finding time to write has been a challenge.)

In Ontario, each institution of higher education must enter into a ‘Strategic Mandate Agreement’ with its largest funding body, the provincial government. Universities are currently in the second iteration of these agreements and are preparing for the third round. These agreements are considered fraught by many, including Marc Spooner, a professor in the faculty of education at the University of Regina, who wrote the following in an opinion piece in University Affairs:

The agreement is designed to collect quantitative information grouped under the following broad themes: a) student experience; b) innovation in teaching and learning excellence; c) access and equity; d) research excellence and impact; and e) innovation, economic development and community engagement. The collection of system-wide data is not a bad idea on its own. For example, looking at metrics like student retention data between years one and two, proportion of expenditures on student services, graduation rates, data on the number and proportion of Indigenous students, first-generation students and students with disabilities, and graduate employment rates, all can be helpful.

Where the plan goes off-track is with the system-wide metrics used to assess research excellence and impact: 1) Tri-council funding (total and share by council); 2) number of papers (total and per full-time faculty); and 3) number of citations (total and per paper). A tabulation of our worth as scholars is simply not possible through narrowly conceived, quantified metrics that merely total up research grants, peer-reviewed publications and citations. Such an approach perversely de-incentivises time-consuming research, community-based research, Indigenous research, innovative lines of inquiry and alternative forms of scholarship. It effectively displaces research that “matters” with research that “counts” and puts a premium on doing simply what counts as fast as possible…

Even more alarming – and what is hardly being discussed – is how these damaging and limited terms of reference will be amplified when the agreement enters its third phase, SMA3, from 2020 to 2023. In this third phase, the actual funding allotments to universities will be tied to their performance on the agreement’s extremely deficient metrics.

“Ontario university strategic mandate agreements: a train wreck waiting to happen”, Marc Spooner, University Affairs, Jan 23 2018

The measure by which citation counts for each institution will be assessed has already been decided. The Ontario government has stated that it is going to use Elsevier’s Scopus (although I presume they really meant SciVal).

What could possibly go wrong? To answer that question, let’s look at the second trend: enclosure.

2. Enclosing the social graph

The law locks up the man or woman
Who steals the goose from off the common
But leaves the greater villain loose
Who steals the common from off the goose.

Anonymous, “The Goose and the Commons”

As someone who spends a great deal of time ensuring that the scholarship in the University of Windsor’s institutional repository meets the stringent restrictions set by publishers, it’s hard not to feel a slap in the face when reading “Springer Nature Syndicates Content to ResearchGate”.

ResearchGate has been accused of “massive infringement of peer-reviewed, published journal articles.”

They say that the networking site is illegally obtaining and distributing research papers protected by copyright law. They also suggest that the site is deliberately tricking researchers into uploading protected content.

Who is the “they” of the above quote? Why, they are the publishers: the American Chemical Society and Elsevier.

It is not uncommon to find selective enforcement of copyright within the scholarly communication landscape. Publishers have turned a blind eye to the copyright infringement of ResearchGate and Academia.edu for years, while targeting course reserve systems set up by libraries.

Any commercial system that is part of the scholarly communication workflow can be acquired for strategic purposes.

As I noted in my contribution to Grinding the Gears: Academic Librarians and Civic Responsibility, sometimes companies purchase competing companies as a means to control their development and even to shut their products down.

One of the least understood and thus least appreciated functions of calibre is that it uses the Open Publication Distribution System (OPDS) standard (opds-spec.org) to allow one to easily share e-books (at least those without Digital Rights Management software installed) to e-readers on the same local network. For example, on my iPod Touch, I have the e-reader program Stanza (itunes.apple.com/us/app/stanza/id284956128) installed and from it, I can access the calibre library catalogue on my laptop from within my house, since both are on the same local WiFi network. And so can anyone else in my family from their own mobile device. It’s worth noting that Stanza was bought by Amazon in 2011 and according to those who follow the digital e-reader market, it appears that Amazon may have done so solely for the purpose of stunting its development and sunsetting the software (Hoffelder, 2013).

“Grinding the Gears: Academic Librarians and Civic Responsibility”, Lisa Sloniowski, Mita Williams, Patti Ryan, Urban Library Journal, Vol. 19, No. 1 (2013). Special Issue: Libraries, Information and the Right to the City: Proceedings of the 2013 LACUNY Institute.

And sometimes companies acquire products to provide a tightly integrated suite of services and seamless workflow.

And, indeed, whatever model the university may select, if individual researchers determine that seamlessness is valuable to them, will they in turn license access to a complete end-to-end service for themselves or on behalf of their lab?  So, the university’s efforts to ensure a more competitive overall marketplace through componentization may ultimately serve only to marginalize it.

“Big Deal: Should Universities Outsource More Core Research Infrastructure?”, Roger C. Schonfeld, January 4, 2018

Elsevier bought BePress in August of 2017. In May of 2016, Elsevier acquired SSRN. BePress and SSRN are currently exploring further “potential areas of integration, including developing a single upload experience, conducting expanded research into rankings and download integration, as well as sending content from Digital Commons to SSRN.”

Now, let’s get to the recent development that has me nervous.

10.2 Requirements for Plan S compliant Open Access repositories

The repository must be registered in the Directory of Open Access Repositories (OpenDOAR) or in the process of being registered.

In addition, the following criteria for repositories are required:

  • Automated manuscript ingest facility
  • Full text stored in XML in JATS standard (or equivalent)
  • Quality assured metadata in standard interoperable format, including information on the DOI of the original publication, on the version deposited (AAM/VoR), on the open access status and the license of the deposited version. The metadata must fulfil the same quality criteria as Open Access journals and platforms (see above). In particular, metadata must include complete and reliable information on funding provided by cOAlition S funders. OpenAIRE compliance is strongly recommended.
  • Open API to allow others (including machines) to access the content
  • QA process to integrate full text with core abstract and indexing services (for example PubMed)
  • Continuous availability

The “automated manuscript ingest facility” requirement probably gives me the most pause. Automated means a direct pipeline from publisher to institutional repository, one that could be based on a publisher’s interpretation of fair use/fair dealing, and we don’t know what the ramifications of that decision-making might be. I’m feeling trepidation because I believe we are already experiencing the effects of a tighter integration between manuscript services and the IR.

Many publishers – including Wiley, Taylor and Francis, IEEE, and IOP – already use a third-party manuscript service called ScholarOne. ScholarOne integrates the iThenticate service, which produces reports of what percentage of a manuscript has already been published. Journal editors have the option to set the extent to which a paper can make use of a researcher’s prior work, including their thesis. Manuscripts that exceed these set thresholds can be automatically rejected without human intervention from the editor. We are only just starting to understand how this workflow is going to impact the willingness of young scholars to make their theses and dissertations open access.
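To make the worry concrete, here is a minimal sketch of what a threshold-only rejection rule could look like. This is not ScholarOne's or iThenticate's actual logic or API, which I have not seen; the `Submission` fields, the `triage` function, and the threshold values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Submission:
    title: str
    # Hypothetical field: percent of the text matching previously published work.
    overlap_percent: float
    # Hypothetical field: percent of the text matching the author's own thesis.
    thesis_overlap_percent: float


def triage(submission, max_overlap=30.0, max_thesis_overlap=20.0):
    """Apply editor-set thresholds (made-up defaults) with no human review."""
    if submission.overlap_percent > max_overlap:
        return "reject"
    if submission.thesis_overlap_percent > max_thesis_overlap:
        return "reject"
    return "send_to_editor"


# An article adapted from a thesis chapter that is already in the repository.
paper = Submission("Chapter 3 as article", overlap_percent=25.0,
                   thesis_overlap_percent=45.0)
print(triage(paper))
```

In this sketch, the article adapted from an open access thesis is rejected on the thesis threshold alone, before any editor has had the chance to judge whether the reuse was legitimate.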

It is also worth noting that ScholarOne is owned by Clarivate Analytics, the parent company of Web of Science, InCites, Journal Citation Reports, and others. On one hand, having a non-publisher act as a third party to the publishing process is probably ideal since it reduces the chances of a conflict of interest. On the other hand, I’m very unhappy with Clarivate Analytics’s product Kopernio, which provides “fast, one-click access to millions of research papers” and “integrates with Web of Science, Google Scholar, PubMed and 20,000 other sites” (including ResearchGate and Academia.edu, natch). There are prominent links to Kopernio within Web of Science that essentially position the product as a direct competitor to a university library’s link resolver service and, in doing so, remove the library from the scholarly workflow – other than the fact that the library pays for the product’s placement.

The winner takes it all

The genius — sometimes deliberate, sometimes accidental — of the enterprises now on such a steep ascent is that they have found their way through the looking-glass and emerged as something else. Their models are no longer models. The search engine is no longer a model of human knowledge, it is human knowledge. What began as a mapping of human meaning now defines human meaning, and has begun to control, rather than simply catalog or index, human thought. No one is at the controls. If enough drivers subscribe to a real-time map, traffic is controlled, with no central model except the traffic itself. The successful social network is no longer a model of the social graph, it is the social graph. This is why it is a winner-take-all game.

Childhood’s End, Edge, George Dyson [1.1.19]