Noting well

Scribble, scribble, scribble (Eh! Mr Gibbon?)

Last week I read an article that made me very uncomfortable. I had been diagnosed by the author and was found to be diseased.

The Twittering Machine is powered by an insight at once obvious and underexplored: we have, in the world of the social industry, become “scripturient—possessed by a violent desire to write, incessantly.” Our addiction to social media is, at its core, a compulsion to write. Through our comments, updates, DMs, and searches, we are volunteers in a great “collective writing experiment.” Those of us who don’t peck out status updates on our keyboards are not exempt. We participate too, “behind our backs as it were,” creating hidden (written) records of where we clicked, where we hovered, how far we scrolled, so that even reading, within the framework of the Twittering Machine, becomes a kind of writing.

Going Postal: A psychoanalytic reading of social media and the death drive, Max Read for Bookforum

The scripturient among us cannot stop writing even though social media brings no joy. Some of us opted for a lesser evil and have Waldenponded to the cozyweb:

Unlike the main public internet, which runs on the (human) protocol of “users” clicking on links on public pages/apps maintained by “publishers”, the cozyweb works on the (human) protocol of everybody cutting-and-pasting bits of text, images, URLs, and screenshots across live streams. Much of this content is poorly addressable, poorly searchable, and very vulnerable to bitrot. It lives in a high-gatekeeping slum-like space comprising slacks, messaging apps, private groups, storage services like dropbox, and of course, email.

from Cozyweb by Venkatesh Rao

In other words, some of us have opted to keep writing compulsively but mostly to ourselves.

I’ve found Notion to be a welcome respite from the public square of Twitter or even the water-cooler of Slack. While I used to plan trips on Pinterest, I now find myself saving inspirational images to Notion. Instead of relying on Facebook or Linkedin to catalog my connections, I’ve been building my own relationship tracker in Notion.

Like the living room, Notion appeals to both the introverted and extroverted sides of my personality. It’s a place where I can create and test things out in private. Then, when I’m craving some external validation, I can show off a part of my workspace to as many or as few people as I want. It’s a place where I can think out loud without worrying about the judgement of strangers or the tracking of ad targeting tools.

Notion is the living room of the cozyweb by Nick deWilde

Exhausted by my own doomscrolling, I recently pledged to myself to spend less time on social media. But I still had a scribbling habit that needed to be maintained. I found myself researching why so many of the few remaining bloggers that I knew were so obsessed with Notion and other tools that were unfamiliar to me.

It’s the worldwideweb. Let’s share what we know.

The tools of the notearazzi

Notion describes itself as ‘the all-in-one workspace’ for all of “your notes, tasks, and wikis”. That sounds more compelling than the way that I would describe it: Notion allows you to build workflows from documents using linked, invisible databases.

For example, here is a set of pages that can be arranged as a task board, a kanban board, a calendar, or a list, just by changing your view of the information at hand.

(In this way Notion reminds me of Drupal except all of the database scaffolding is invisible to the user.)
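
To make the “invisible database” idea concrete, here is a toy sketch of one set of records rendered as three different views. This is an illustration of the concept only, not of how Notion is actually implemented, and the records and properties are made up.

```python
# One collection of records; each "view" is just a different arrangement of them.
from collections import defaultdict

pages = [
    {"title": "Book flights", "status": "Doing", "due": "2020-07-03"},
    {"title": "Renew passport", "status": "To do", "due": "2020-07-10"},
    {"title": "Pack bags", "status": "To do", "due": "2020-07-03"},
]

# List view: the records in order.
for page in pages:
    print(page["title"])

# Board (kanban) view: the same records, grouped by status.
board = defaultdict(list)
for page in pages:
    board[page["status"]].append(page["title"])
print(dict(board))

# Calendar view: the same records again, grouped by due date.
calendar = defaultdict(list)
for page in pages:
    calendar[page["due"]].append(page["title"])
print(dict(calendar))
```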

There are other note taking tools that promise to revolutionize the work and the workflow of the user: Roam Research (that turns your “graph connected” notes into a ‘second brain’), RemNote (that turns your study notes into spaced-repetition flashcards), and Obsidian (that turns your markdown notes into a personal wiki / second brain on your computer).

And there is still Evernote.
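
The “graph connected” pitch behind tools like Roam and Obsidian mostly comes down to bidirectional linking: notes mention one another with [[wikilinks]], and the software derives a backlink index from those mentions. A minimal, purely illustrative sketch (the notes are invented):

```python
import re
from collections import defaultdict

notes = {
    "Digital gardens": "Related: [[Zettelkasten]] and the [[Memex]].",
    "Zettelkasten": "Luhmann's card system; see also [[Digital gardens]].",
    "Memex": "Vannevar Bush's imagined associative desk.",
}

WIKILINK = re.compile(r"\[\[([^\]]+)\]\]")

# Build the backlink index: for each note, which other notes point to it?
backlinks = defaultdict(set)
for title, body in notes.items():
    for target in WIKILINK.findall(body):
        backlinks[target].add(title)

print(dict(backlinks))
# "Digital gardens" shows up as linked from "Zettelkasten" even though
# it never declared that relationship itself.
```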

Personal Knowledge Management

These types of note-taking systems are also known as personal knowledge management or PKM.

https://mobile.twitter.com/Bopuc/status/1305469230725431296

The Digital Garden

From the above diagram, you can see that PKM systems are also called Digital Gardens. Patrick Tanguay wrote a short backgrounder on this concept with a great set of links to explore.

In short: brief notes from your own thinking, heavily linked back and forth, continually added to and edited.

The goal is to have a library of notes of your own thinking so you can build upon what you read and write, creating your own ideas, advancing your knowledge.

Digital Gardens, Patrick Tanguay

The word garden was chosen carefully to describe this concept. We find ourselves in a world in which almost all of our social media systems are algorithm-influenced streams. To find the contemplative space we need to think, we need to find a slower landscape.

Remember a couple months ago when I wrote about Mike Caulfield’s alternative to CRAAP called SIFT? Well, I’m invoking him again for his 2015 post called The Garden and the Stream: A Technopastoral.

I don’t want people to get hung up on the technology angle. I think sometimes people hear “Federated Thingamabob” and just sort of tune out thinking “Oh, he’s talking about a feature of Federated Thingamabob.” But I’m not. I’m really not. I’m talking about a different way to think your online activity, no matter what tool you use. And relevant to this conference, I’m talking about a different way of collaborating as well.

Without going too much into what my federated wiki journal is, just imagine that instead of blogging and tweeting your experience you wiki’d it. And over time the wiki became a representation of things you knew, connected to other people’s wikis about things they knew.

So when I see an article like this I think — Wow, I don’t have much in my wiki about gun control, this seems like a good start to build it out and I make a page.

The first thing I do is “de-stream” the article. The article is about Oregon, but I want to extract a reusable piece out of it in a way that it can be connected to many different things eventually. I want to make a home page for this idea or fact. My hub for thinking about this.

The Garden and the Stream: A Technopastoral, Mike Caulfield

I used to think of blog posts as part of a growing garden, but my framing has shifted and now I think of the blog as the headwaters of the first sluggish stream (and the beginning of the end of the web as we know it):

Whereas the garden is integrative, the Stream is self-assertive. It’s persuasion, it’s argument, it’s advocacy. It’s personal and personalized and immediate. It’s invigorating. And as we may see in a minute it’s also profoundly unsuited to some of the uses we put it to.

The stream is what I do on Twitter and blogging platforms. I take a fact and project it out as another brick in an argument or narrative or persona that I build over time, and recapitulate instead of iterate.

The Garden and the Stream: A Technopastoral, Mike Caulfield

Caulfield alludes to the associative power of links after he compares the original vision of Vannevar Bush’s MEMEX and the topology of the World Wide Web:

Each memex library contains your original materials and the materials of others. There’s no read-only version of the memex, because that would be silly. Anything you read you can link and annotate. Not reply to, mind you. Change. This will be important later.

Links are associative. This is a huge deal. Links are there not only as a quick way to get to source material. They aren’t a way to say, hey here’s the interesting thing of the day. They remind you of the questions you need to ask, of the connections that aren’t immediately evident.

Links are made by readers as well as writers. A stunning thing that we forget, but the link here is not part of the author’s intent, but of the reader’s analysis. The majority of links in the memex are made by readers, not writers. On the world wide web of course, only an author gets to determine links. And links inside the document say that there can only be one set of associations for the document, at least going forward.

The Garden and the Stream: A Technopastoral, Mike Caulfield

Mike Caulfield’s own digital garden was a personal wiki, and there are some reader/writers who have opted to go this route using Tiddlywiki or a variation.

There is no one way to grow your own digital garden. Gardens are personal and they grow to suit the space and time that you are able to give them. There are digital gardens that are wild and overgrown like a verdant English garden and then there are the closely controlled and manicured gardens known as BASB.

The Second Brain

BASB stands for Building A Second Brain. Unlike our own feeble wetware, these BASB systems exist so we do not forget passing notions. They are also promoted as environments that lend themselves to creative thinking because, just like our own minds, they encourage the generation of new thoughts by the association of disparate ideas from different fields, places, or times.

To be honest, during most of the time I spent researching this post, every time I read the phrase second brain I immediately dismissed it as glib marketing and not as a concept worth serious consideration. But then I watched a YouTube video of a medical student who had taken a $1500 course on building a second brain, and he could not stop singing its praises.

From that video, I learned that Second Brain building isn’t just making links between concepts and waiting for creativity to descend or a book to emerge. The activities it prescribes are framed more like a project management system, in which efforts are ultimately directed toward outcomes and outputs. That system is also known as PARA (Projects, Areas, Resources, Archives).

Image from: Building a Second Brain: The Illustrated Notes by Maggie Appleton

Not every building a second brain (BASB) system is built on the foundations of PARA. There are those who decide to populate their new Roam Research space using the Smart Note system or the Zettelkasten approach.

Zettelkasten

When I was doing research for my 2015 Access talk about index cards and bibliographic systems, I dimly remember coming across the note taking system of sociologist Niklas Luhmann, who turned a 90,000+ card zettelkasten into over 70 books. I distinctly remember coming across the system again when I was reading about Beck Trench’s Academic Workflow:

I use the Zettelkasten method of note-taking, by which I mean that I create notes that contain a single idea or point that is significant to me. These notes are usually linked to other notes, authors, and citations, allowing me to understand that single idea in the context of the larger literature that I’m exploring. I use the knowledge management software Tinderbox to write these notes and map their associations. I’ve created a series of videos that explain exactly how I do this. I also sync my Tinderbox zettels with DEVONthink using these scripts so that I can search my own notes alongside my articles to find connections I might otherwise miss.

Academic Workflow: Reading, Beck Trench

From what I can tell, many people’s first introduction to the zettelkasten method has been through this website or the 2017 book How to Take Smart Notes by Sönke Ahrens. I haven’t read the book yet but I was so intrigued that I have ordered a copy. From a review of the work:

The book is written in an essayistic and very readable style, humorous and anecdotal, which makes both the practical advice as well as the underlying philosophy very accessible and convincing. Ahrens offers a compelling meta-reflection on the pivotal role of writing in – and as – thinking, and as such, he also formulates a timely and important advocacy of the humanities. It is therefore regrettable that in his emphasis on proliferating personal productivity and ‘boosting’ written output with Luhmann’s slip box system, Ahrens neglects to critically reflect upon the luring dangers of academic careerism for truly original scholarship… The explosion of publishing outlets is in turn tightly connected with the increasing governmentalization and commodification of academic life (Miller 2015), and while Ahrens continually emphasizes the potential of increasing written output with Luhmann’s method, he unfortunately misses the opportunity to reflect on the very conditions of academic life that create a demand for a book like his own in the first place.

Book review: How to Take Smart Notes, Reviewed by Melanie Schiller, Journal of Writing Research (2017)

How might academic libraries figure into these systems?

While keeping in mind that the knowledge workers who commit strongly to a holistic note-taking system are a minority of our patrons, how can academic libraries support those students, faculty, and academic staff who use specialized note-taking software?

Personally, I think at a minimum, we must try to keep as much of our material as copy-able as possible. In other words, we should keep our investments in DRM-locked material as small as possible.

But I’ll boil it down to this. It came down to who had the power to change things. It came down to the right to make copies.

On the web, if you wanted to read something you had to read it on someone else’s server where you couldn’t rewrite it, and you couldn’t annotate it, you couldn’t copy it, and you couldn’t add links to it, you couldn’t curate it.

These are the verbs of gardening, and they didn’t exist on the early web.

The Garden and the Stream: A Technopastoral, Mike Caulfield

What might happen if we try on the idea that a library is a type of stock that both readers and writers can draw upon for their respective knowledge flows?

Stock and flow are just different ways of expressing garden and stream. Mike Caulfield looks at OER in this context and I found this framing very useful.

Everything else is either journal articles or blog posts making an argument about local subsidies. Replying to someone. Building rapport with their audience. Making a specific point about a specific policy. Embedded in specific conversations, specific contexts.

Everybody wants to play in the Stream, but no one wants to build the Garden.

Our traditional binary here is “open vs. closed”. But honestly that’s not the most interesting question to me anymore. I know why textbook companies are closed. They want to make money.

What is harder to understand is how in nearly 25 years of the web, when people have told us what they THINK about local subsidies approximately one kajillion times we can’t find one — ONE! — syllabus-ready treatment of the issue.

You want ethics of networked knowledge? Think about that for a minute — how much time we’ve all spent arguing, promoting our ideas, and how little time we’ve spent contributing to the general pool of knowledge.

Why? Because we’re infatuated with the stream, infatuated with our own voice, with the argument we’re in, the point we’re trying to make, the people in our circle we’re talking to.

The Garden and the Stream: A Technopastoral, Mike Caulfield

Conclusion

A scholar reads texts from the library and thoughtfully creates personal notes from their reading. Those notes grow, get connected to other notes, help generate new notes and associations, and, in time, help generate the scholar’s own text that — hopefully — will become part of that same library. “A scholar is just a library’s way of making another library” (Daniel C. Dennett, Consciousness Explained).

Once again, it makes me wonder whether our institutions should consider adopting the professional mission that Dan Chudnov made for himself in 2006: Help people build their own libraries.

Because those scholar’s notes? They are also a library.

The Provenance of Facts

Brian Feldman has a newsletter called BNet and on May 30th, he published an insightful and whimsical take on facts and Wikipedia called mysteries of the scatman.

The essay is an excellent reminder that if a fact without proper provenance makes its way into Wikipedia and is then published in a reputable source, it is nearly impossible to remove said fact from Wikipedia.

Both the Scatman John and “Maps” issues, however, point to a looming vulnerability in the system. What happens when facts added early on in Wikipedia’s life remain, and take on a life of their own? Neither of these supposed truths outlined above can be traced to any source outside of Wikipedia, and yet, because they initially appeared on Wikipedia and have been repeated elsewhere, they are now, for all intents and purposes, accepted as truth on Wikipedia. It’s twisty.

mysteries of the scatman

This is not a problem unique to Wikipedia. Last year I addressed a similar issue in an Information Literacy class for 4th year Political Science students, when I encouraged them to follow the citation pathways of the data that they plan to cite. I warned them not to fall for academic urban legends:

Spinach is not an exceptional nutritional source of iron. The leafy green has iron, yes, but not much more than you’d find in other green vegetables. And the plant contains oxalic acid, which inhibits iron absorption.

Why, then, do so many people believe spinach boasts such high iron levels? Scholars committed to unmasking spinach’s myths have long offered a story of academic sloppiness. German chemists in the 1930s misplaced a decimal point, the story goes. They thus overestimated the plant’s iron content tenfold.

But this story, it turns out, is apocryphal. It’s another myth, perpetuated by academic sloppiness of another kind. The German scientists never existed. Nor did the decimal point error occur. At least, we have no evidence of either. Because, you see, although academics often see themselves as debunkers, in skewering one myth they may fall victim to another.

In his article “Academic Urban Legends,” Ole Bjorn Rekdal, an associate professor of health and social sciences at Bergen University College in Norway, narrates the story of these twinned myths. His piece, published in the journal Social Studies of Science, argues that through chains of sloppy citations, “academic urban legends” are born. Following a line of lazily or fraudulently employed references, Rekdal shows how rumor can become acknowledged scientific truth, and how falsehood can become common knowledge.

“Academic Urban Legends”, Charlie Tyson, Inside Higher Ed, August 6, 2014

I’m in the process of working on an H5P learning object dedicated to how to calculate one’s h-index (the calculation itself is sketched in code after the quote below), and yet I’m conflicted about doing so. Using citations as a measure of an academic’s value is problematic for many reasons that go far beyond the occasional academic urban legend:

To weed out academic urban legends, Rekdal says editors “should crack down violently on every kind of abuse of academic citations, such as ornamental but meaningless citations to the classics, or exchanges in citation clubs where the members pump up each other’s impact factors and h-indexes.”

Yet even Rekdal – who debunks the debunkers – says his citation record isn’t flawless.

“I have to admit that I published an article two decades ago where I included an academically completely meaningless reference (without page numbers of course) to a paper written by a woman I was extremely in love with,” he said. “I am still a little ashamed of what I did. But on the other hand, the author of that paper has now been my wife for more than 20 years.”

“Academic Urban Legends”, Charlie Tyson, Inside Higher Ed, August 6, 2014
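
As for the h-index mentioned above, the calculation itself is almost trivially simple, which is part of what makes it so seductive as a metric. Here is a minimal sketch with made-up citation counts:

```python
def h_index(citations):
    """Largest h such that the author has h papers each cited at least h times."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h

# Hypothetical citation counts for nine papers.
print(h_index([31, 18, 12, 7, 6, 4, 4, 1, 0]))  # 5: five papers cited at least 5 times
```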

Considering dark deposit

I have a slight feeling of dread.

In the inbox of the email address associated with MPOW’s institutional repository are more than a dozen notifications that a faculty member has deposited their research work for inclusion. I should be happy about this. I should be delighted that a liaison librarian spoke highly enough of the importance of the institutional repository at a faculty departmental meeting and inspired a researcher to fill in a multitude of forms so their work can be made freely available to readers.

But I don’t feel good about this because a cursory look at the journals in which this faculty member has published suggests that we can include none of the material in our IR due to restrictive publisher terms.

This is not a post about the larger challenges of Open Access in the current scholarly landscape. This post is a consideration of a change of practice regarding IR deposit, partly inspired by the article, Opening Up Open Access Institutional Repositories to Demonstrate Value: Two Universities’ Pilots on Including Metadata-Only Records.

Institutional repository managers are continuously looking for new ways to demonstrate the value of their repositories. One way to do this is to create a more inclusive repository that provides reliable information about the research output produced by faculty affiliated with the institution.

Bjork, K., Cummings-Sauls, R., & Otto, R. (2019). Opening Up Open Access Institutional Repositories to Demonstrate Value: Two Universities’ Pilots on Including Metadata-Only Records. Journal of Librarianship and Scholarly Communication, 7(1). DOI: http://doi.org/10.7710/2162-3309.2220

I read the Opening Up… article with interest because a couple of years ago, when I was the liaison librarian for biology, I ran an informal pilot in which I tried to capture the corpus of the biology department. During this pilot, for articles from publishers who did not allow deposit of the publisher’s PDF and whose authors were not interested in depositing a manuscript version, I published the metadata of these works instead.

But part way through this pilot, I abandoned the practice. I did so for a number of reasons. One reason was that the addition of their work to the Institutional Repository did not seem to prompt faculty to start depositing their research of their own volition. This was not surprising, as BePress doesn’t allow for the integration of author profiles directly into its platform (one must purchase a separate product for author profiles and the ability to generate RSS feeds at the author level). So I was not particularly disappointed with this result. While administrators are increasingly interested in demonstrating research outputs at the department and institutional level, faculty can still be generalized as more invested in subject-based repositories.

But during this trial I uncovered a more troubling reason that suggested that uploading citations might be problematic. I came to understand that most document harvesting protocols – such as OAI-PMH and OpenAIRE – do not provide any means by which one can differentiate between metadata-only records and full-text records. Our library system harvests our IR and it assumes that every item in the IR has a full-text object associated with it. Other services that harvest our IR do the same. To visit the IR is to expect the full text.
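
To make the problem concrete, here is a minimal sketch of harvesting a repository over OAI-PMH with the standard oai_dc metadata format (the endpoint URL is hypothetical). The unqualified Dublin Core record that comes back looks the same whether or not a full-text file is actually attached.

```python
import requests
import xml.etree.ElementTree as ET

OAI_BASE = "https://ir.example.edu/do/oai/"  # hypothetical endpoint
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

resp = requests.get(OAI_BASE, params={"verb": "ListRecords", "metadataPrefix": "oai_dc"})
root = ET.fromstring(resp.content)

for record in root.findall(".//oai:record", NS):
    title = record.findtext(".//dc:title", default="", namespaces=NS)
    identifiers = [e.text for e in record.findall(".//dc:identifier", NS)]
    # Nothing in the record flags whether a full-text file is attached:
    # a metadata-only record and a full-text record can look identical
    # to the harvester.
    print(title, identifiers)
```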

But the reason that made me stop the experiment pretty much immediately was reading this little bit of hearsay on Twitter:

Google and Google Scholar are responsible for the vast majority of our IR’s traffic and use. In many disciplines, Green OA articles easily account for less than 25% of total faculty output. To publish citations when the full text of a pre-print manuscript is not made available to the librarian is ultimately going to test whether Google Scholar really does have a full-text threshold. And then what do we do when we find our work suddenly gone from search results?

Yet, the motivation to try to capture the whole of a faculty’s work still remains. An institutional repository should be a reflection of all the research and creative work of the institution that hosts it.

If an IR is not able to do this work, an institution is more likely to invest in a CRIS – a Current Research Information System – to represent the research outputs of the organization.

Remember when I wrote this in my post from March of this year?

When I am asked to explain how to achieve a particular result within scholarly communication, more often than not, I find myself describing four potential options:

– a workflow of Elsevier products (BePress, SSRN, Scopus, SciVal, Pure)

– a workflow of Clarivate products (Web of Science, InCites, Endnote, Journal Citation Reports)

– a workflow of Springer-Nature products (Dimensions, Figshare, Altmetrics)

– a DIY workflow from a variety of independent sources (the library’s institutional repository, ORCiD, Open Science Framework)

If the map becomes the territory then we will be lost

The marketplace for CRIS is no different.

But I think the investment in two separate products – a CRIS to capture the citations of a faculty’s research and creative output and an IR to capture the full text of the same – still seems a shame to pursue. Rather than invest a large sum of money for the quick win of a CRIS, we should invest those funds into an IR that can support data re-use, institutionally.

(What is the open version of the CRIS? To be honest, I don’t know this space very well. From what I know at the moment, I would suggest it might be the institutional repository + ORCiD and/or VIVO.)

I am imagining a scenario in which every article-level work that a faculty member of an institution has produced is captured in the institutional repository. Articles that are not allowed to be made open access are embargoed until they are in the public domain.
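
The mechanics of that scenario are simple enough to sketch. The field names below are hypothetical rather than any particular repository’s schema; the point is that the full text is held from day one, the metadata is always exposed, and the item’s visibility flips on its own once the embargo lapses.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class Deposit:
    title: str
    full_text_file: Optional[str]  # the file is held even while "dark"
    embargo_until: Optional[date]  # None means no embargo

    def is_open(self, today: Optional[date] = None) -> bool:
        """The full text becomes distributable once the embargo date passes."""
        today = today or date.today()
        return self.embargo_until is None or today >= self.embargo_until

d = Deposit("An example article", "article.pdf", embargo_until=date(2090, 1, 1))
print(d.is_open())  # False for now; flips to True without any re-deposit
```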

But to be honest, I’m a little spooked because I don’t see many other institutions engaging in this practice. Dark deposit does exist in the literature but it largely appears in the early years of the conversations around scholarly communications practice. The most widely cited article about the topic (from my reading, not from a proper literature review) is this 2011 article called The importance of dark deposit by Stuart Shieber. His blog is licensed as CC-BY, so I’m going to take advantage of this generosity and re-print the seven reasons why dark is better than missing:

  1. Posterity: Repositories have a role in providing access to scholarly articles of course. But an important part of the purpose of a repository is to collect the research output of the institution as broadly as possible. Consider the mission of a university archives, well described in this Harvard statement: “The Harvard University Archives (HUA) supports the University’s dual mission of education and research by striving to preserve and provide access to Harvard’s historical records; to gather an accurate, authentic, and complete record of the life of the University; and to promote the highest standards of management for Harvard’s current records.” Although the role of the university archives and the repository are different, that part about “gather[ing] an accurate, authentic, and complete record of the life of the University” reflects this role of the repository as well. Since at any given time some of the articles that make up that output will not be distributable, the broadest collection requires some portion of the collection to be dark.
  2. Change: The rights situation for any given article can change over time — especially over long time scales, librarian time scales — and having materials in the repository dark allows them to be distributed if and when the rights situation allows. An obvious case is articles under a publisher embargo. In that case, the date of the change is known, and repository software can typically handle the distributability change automatically. There are also changes that are more difficult to predict. For instance, if a publisher changes its distribution policies, or releases backfiles as part of a corporate change, this might allow distribution where not previously allowed. Having the materials dark means that the institution can take advantage of such changes in the rights situation without having to hunt down the articles at that (perhaps much) later date.
  3. Preservation: Dark materials can still be preserved. Preservation of digital objects is by and large an unknown prospect, but one thing we know is that the more venues and methods available for preservation, the more likely the materials will be preserved. Repositories provide yet another venue for preservation of their contents, including the dark part.
  4. Discoverability: Although the articles themselves can’t be distributed, their contents can be indexed to allow for the items in the repository to be more easily and accurately located. Articles deposited dark can be found based on searches that hit not only the title and abstract but the full text of the article. And it can be technologically possible to pass on this indexing power to other services indexing the repository, such as search engines.
  5. Messaging: When repositories allow both open and dark materials, the message to faculty and researchers can be made very simple: Always deposit. Everything can go in; the distribution decision can be made separately. If authors have to worry about rights when making the decision whether to deposit in the first place, the cognitive load may well lead them to just not deposit. Since the hardest part about running a successful repository is getting a hold of the articles themselves, anything that lowers that load is a good thing. This point has been made forcefully by Stevan Harnad. It is much easier to get faculty in the habit of depositing everything than in the habit of depositing articles subject to the exigencies of their rights situations.
  6. Availability: There are times when an author has distribution rights only to unavailable versions of an article. For instance, an author may have rights to distribute the author’s final manuscript, but not the publisher’s version. Or an art historian may not have cleared rights for online distribution of the figures in an article and may not be willing to distribute a redacted version of the article without the figures. The ability to deposit dark enables depositing in these cases too. The publisher’s version or unredacted version can be deposited dark.
  7. Education: Every time an author deposits an article dark is a learning moment reminding the author that distribution is important and distribution limitations are problematic.

There is an additional reason for pursuing a change of practice to dark deposit that I believe is very significant:

There are at least six types of university OA policy. Here we organize them by their methods for avoiding copyright troubles…

3. The policy seeks no rights at all, but requires deposit in the repository. If the institution already has permission to make the work OA, then it makes it OA from the moment of deposit. Otherwise the deposit will be “dark” (non-OA) (See p. 24) until the institution can obtain permission to make it OA. During the period of dark deposit, at least the metadata will be OA.

Good Practices For University Open-Access Policies, Stuart Shieber and Peter Suber, 2013

“At least the metadata will be OA” is a very good reason to do dark deposit. It might be reason enough. I share much of Ryan Regier’s enthusiasm for Open Citations, which he explains in his post, The longer Elsevier refuses to make their citations open, the clearer it becomes that their high profit model makes them anti-open.

Having a more complete picture of how much an article has been cited by other articles is an immediate clear benefit of Open Citations. Right now you can get a piece of that via the above tools I’ve listed and, maybe, a piece is all you need. If you’ve got an article that’s been cited 100s of times, likely you aren’t going to look through each of those citing articles. However, if you’ve got an article or a work that only been cited a handful of times, likely you will be much more aware of what those citing articles are saying about your article and how they are using your information.

Ryan Regier, The longer Elsevier refuses to make their citations open, the clearer it becomes that their high profit model makes them anti-open

Regier takes Elsevier to task, because Elsevier is one of the few major publishers remaining that refuses to make their citations OA.

I4OC requests that all scholarly publishers make references openly available by providing access to the reference lists they submit to Crossref. At present, most of the large publishers—including the American Physical Society, Cambridge University Press, PLOS, SAGE, Springer Nature, and Wiley—have opened their reference lists. As a result, half of the references deposited in Crossref are now freely available. We urge all publishers who have not yet opened their reference lists to do so now. This includes the American Chemical Society, Elsevier, IEEE, and Wolters Kluwer Health. By far the largest number of closed references can be found in journals published by Elsevier: of the approximately half a billion closed references stored in Crossref, 65% are from Elsevier journals. Opening these references would place the proportion of open references at nearly 83%.

Open citations: A letter from the scientometric community to scholarly publishers

There would be so much value unleashed if we could release the citations to our faculty’s research as open access.

Open Citations could lead to new ways of exploring and understanding the scholarly ecosystem. Some of these potential tools were explored by Aaron Tay in his post, More about open citations — Citation Gecko, Citation extraction from PDF & LOC-DB.
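
As a small illustration of what open references look like in practice, the public Crossref REST API exposes a work’s reference list when its publisher has deposited and opened it (the change I4OC asks for). A minimal sketch; the DOI used here is the Watts and Strogatz “small-world” paper that comes up later in this post:

```python
import requests

doi = "10.1038/30918"  # Watts & Strogatz, "Collective dynamics of 'small-world' networks"
message = requests.get(f"https://api.crossref.org/works/{doi}").json()["message"]

# The "reference" list is only present when the publisher has opened it.
for ref in message.get("reference", [])[:5]:
    print(ref.get("DOI"), ref.get("unstructured"))
```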

Furthermore, releasing citations as OA would enable them to be added to platforms such as Wikidata and available for visualization using the Scholia tool, pictured above.

So that’s where I’m at.

I want to change the practice at MPOW to include all published faculty research, scholarship, and creative work in the Institutional Repository and if we are unable to publish these works as open access in our IR, we will include it as embargoed, dark deposit until it is confidently in the public domain. I want the Institutional Repository to live up to its name and have all the published work of the Institution.

Is this a good idea, or no? Are there pitfalls that I have not foreseen? Is my reasoning shaky? Please let me know.

If the map becomes the territory then we will be lost

That which computation sets out to map and model it eventually takes over. Google sets out to index all human knowledge and becomes the source and the arbiter of that knowledge: it became what people think. Facebook set out to map the connections between people – the social graph – and became the platform for those connections, irrevocably reshaping societal relationships. Like an air control system mistaking a flock of birds for a fleet of bombers, software is unable to distinguish between the model of the world and reality – and, once conditioned, neither are we.

James Bridle, New Dark Age, p.39.

I am here to bring your attention to two developments that are making me worried:

  1. The Social Graph of Scholarly Communications is becoming more tightly bound into institutional metrics that have an increasing influence on institutional funding
  2. The publishers of the Social Graph of Scholarship are beginning to enclose the Social Graph, excluding the infrastructure of libraries and other independent, non-profit organizations

Normally, I would try to separate these ideas into two dedicated posts but in this case, I want to bring them together in writing because if these two trends converge, things will become very bad, very quickly.

Let me start with the first trend:

1. The social graph that binds

When I am asked to explain how to achieve a particular result within scholarly communication, more often than not, I find myself describing four potential options:

  1. a workflow of Elsevier products (BePress, SSRN, Scopus, SciVal, Pure)
  2. a workflow of Clarivate products (Web of Science, InCites, Endnote, Journal Citation Reports)
  3. a workflow of Springer-Nature products (Dimensions, Figshare, Altmetrics)
  4. a DIY workflow from a variety of independent sources (the library’s institutional repository, ORCiD, Open Science Framework)

Workflow is the new content.

That line – workflow is the new content – is from Lorcan Dempsey and it was brought to my attention by Roger Schonfeld. For Open Access Week, I gave a presentation on this idea of being mindful of workflow and tool choices, entitled A Field Guide to Scholarly Communications Ecosystems. The slides are below.

(My apologies for not sharing the text that goes with the slides. Since January of this year, I have been the Head of the Information Services Department at my place of work. In addition to this responsibility, much of my time this year has been spent covering the work of colleagues currently on leave. Finding time to write has been a challenge.)

In Ontario, each institution of higher education must negotiate a ‘Strategic Mandate Agreement’ with its largest funding body, the provincial government. Universities are currently in the second iteration of these types of agreements and are preparing for the third round. These agreements are considered fraught by many, including Marc Spooner, a professor in the faculty of education at the University of Regina, who wrote the following in an opinion piece in University Affairs:

The agreement is designed to collect quantitative information grouped under the following broad themes: a) student experience; b) innovation in teaching and learning excellence; c) access and equity; d) research excellence and impact; and e) innovation, economic development and community engagement. The collection of system-wide data is not a bad idea on its own. For example, looking at metrics like student retention data between years one and two, proportion of expenditures on student services, graduation rates, data on the number and proportion of Indigenous students, first-generation students and students with disabilities, and graduate employment rates, all can be helpful.

Where the plan goes off-track is with the system-wide metrics used to assess research excellence and impact: 1) Tri-council funding (total and share by council); 2) number of papers (total and per full-time faculty); and 3) number of citations (total and per paper). A tabulation of our worth as scholars is simply not possible through narrowly conceived, quantified metrics that merely total up research grants, peer-reviewed publications and citations. Such an approach perversely de-incentivises time-consuming research, community-based research, Indigenous research, innovative lines of inquiry and alternative forms of scholarship. It effectively displaces research that “matters” with research that “counts” and puts a premium on doing simply what counts as fast as possible…

Even more alarming – and what is hardly being discussed – is how these damaging and limited terms of reference will be amplified when the agreement enters its third phase, SMA3, from 2020 to 2023. In this third phase, the actual funding allotments to universities will be tied to their performance on the agreement’s extremely deficient metrics.

“Ontario university strategic mandate agreements: a train wreck waiting to happen”, Marc Spooner, University Affairs, January 23, 2018

The measure by which citation counts for each institution will be assessed has already been decided: the Ontario government has stated that it is going to use Elsevier’s Scopus (although I presume they really mean SciVal).

What could possibly go wrong? To answer that question, let’s look at the second trend: enclosure.

2. Enclosing the social graph

The law locks up the man or woman
Who steals the goose from off the common
But leaves the greater villain loose
Who steals the common from off the goose.

Anonymous, “The Goose and the Commons”

As someone who spends a great deal of time ensuring that the scholarship of the University of Windsor’s Institutional Repository meets the stringent restrictions set by publishers, it’s hard not to feel a slap in the face when reading Springer Nature Syndicates Content to ResearchGate.

ResearchGate has been accused of “massive infringement of peer-reviewed, published journal articles.”

They say that the networking site is illegally obtaining and distributing research papers protected by copyright law. They also suggest that the site is deliberately tricking researchers into uploading protected content.

Who is the “they” of the above quote? Why, “they” is the publishers: the American Chemical Society and Elsevier.

It is not uncommon to find selective enforcement of copyright within the scholarly communication landscape. Publishers have cast a blind eye to the copyright infringement of ResearchGate and Academia.edu for years, while targeting course reserve systems set up by libraries.

Any commercial system that is part of the scholarly communication workflow can be acquired for strategic purposes.

As I noted in my contribution to Grinding the Gears: Academic Librarians and Civic Responsibility, sometimes companies purchase competing companies as a means to control their development and even to shut their products down.

One of the least understood and thus least appreciated functions of calibre is that it uses the Open Publication Distribution System (OPDS) standard (opds-spec.org) to allow one to easily share e-books (at least those without Digital Rights Management software installed) to e-readers on the same local network. For example, on my iPod Touch, I have the e-reader program Stanza (itunes.apple.com/us/app/stanza/id284956128) installed and from it, I can access the calibre library catalogue on my laptop from within my house, since both are on the same local WiFi network. And so can anyone else in my family from their own mobile device. It’s worth noting that Stanza was bought by Amazon in 2011 and according to those who follow the digital e-reader market, it appears that Amazon may have done so solely for the purpose of stunting its development and sunsetting the software (Hoffelder,2013)

“Grinding the Gears: Academic Librarians and Civic Responsibility”, Lisa Sloniowski, Mita Williams, Patti Ryan, Urban Library Journal, Vol. 19, No. 1 (2013). Special Issue: Libraries, Information and the Right to the City: Proceedings of the 2013 LACUNY Institute.

And sometimes companies acquire products to provide a tightly integrated suite of services and seamless workflow.

And, indeed, whatever model the university may select, if individual researchers determine that seamlessness is valuable to them, will they in turn license access to a complete end-to-end service for themselves or on behalf of their lab?  So, the university’s efforts to ensure a more competitive overall marketplace through componentization may ultimately serve only to marginalize it.

“Big Deal: Should Universities Outsource More Core Research Infrastructure?”, Roger C. Schonfeld, January 4, 2018

Elsevier bought BePress in August of 2017. In May of 2016, Elsevier acquired SSRN. BePress and SSRN are currently exploring further “potential areas of integration, including developing a single upload experience, conducting expanded research into rankings and download integration, as well as sending content from Digital Commons to SSRN.”

Now, let’s get to the recent development that has me nervous.

10.2 Requirements for Plan S compliant Open Access repositories

The repository must be registered in the Directory of Open Access Repositories (OpenDOAR) or in the process of being registered.

In addition, the following criteria for repositories are required:

  • Automated manuscript ingest facility
  • Full text stored in XML in JATS standard (or equivalent)
  • Quality assured metadata in standard interoperable format, including information on the DOI of the original publication, on the version deposited (AAM/VoR), on the open access status and the license of the deposited version. The metadata must fulfil the same quality criteria as Open Access journals and platforms (see above). In particular, metadata must include complete and reliable information on funding provided by cOAlition S funders. OpenAIRE compliance is strongly recommended.
  • Open API to allow others (including machines) to access the content
  • QA process to integrate full text with core abstract and indexing services (for example PubMed)
  • Continuous availability

Automated manuscript ingest facility probably gives me the most pause. Automated means a direct pipeline from publisher to institutional repository that could be based on a publisher’s interpretation of fair use/fair dealing, and we don’t know what the ramifications of that decision-making might be. I’m feeling trepidation because I believe we are already experiencing the effects of a tighter integration between manuscript services and the IR.

Many publishers – including Wiley, Taylor and Francis, IEEE, and IOP – already use a third party manuscript service called ScholarOne. ScholarOne integrates the iThenticate service, which produces reports of what percentage of a manuscript has already been published. Journal editors have the option to set to what extent a paper can make use of a researcher’s prior work, including their thesis. Manuscripts that exceed these set thresholds can be automatically rejected without human intervention from the editor. We are only just starting to understand how this workflow is going to impact the willingness of young scholars to make their theses and dissertations open access.

It is also worth noting that ScholarOne is owned by Clarivate Analytics, the parent company of Web of Science, InCites, Journal Citation Reports, and others. On one hand, having a non-publisher act as a third party to the publishing process is probably ideal since it reduces the chances of a conflict of interest. On the other hand, I’m very unhappy with Clarivate Analytics’s product called Kopernio, which provides “fast, one-click access to millions of research papers” and integrates with Web of Science, Google Scholar, PubMed, and 20,000 other sites (including ResearchGate and Academia.edu, natch). There are prominent links to Kopernio within Web of Science that essentially position the product as a direct competitor to a university library’s link resolver service and, in doing so, remove the library from the scholarly workflow – other than the fact that the library pays for the product’s placement.

The winner takes it all

The genius — sometimes deliberate, sometimes accidental — of the enterprises now on such a steep ascent is that they have found their way through the looking-glass and emerged as something else. Their models are no longer models. The search engine is no longer a model of human knowledge, it is human knowledge. What began as a mapping of human meaning now defines human meaning, and has begun to control, rather than simply catalog or index, human thought. No one is at the controls. If enough drivers subscribe to a real-time map, traffic is controlled, with no central model except the traffic itself. The successful social network is no longer a model of the social graph, it is the social graph. This is why it is a winner-take-all game.

Childhood’s End, Edge, George Dyson [1.1.19]

Bret Victor, Bruno Latour, the citations that bring them together, and the networks that keep them apart

Occasionally I have the opportunity to give high school students an introduction to research in a university context. During this introduction I show them an example of a ‘scholarly paper’ so they can take in the visual cues that might help them recognize other scholarly papers in their future.

[Image: an example of a scholarly article]

After I point out the important features, I take the time to point out this piece of dynamic text on the page:

I know these citation counts come from CrossRef because I have an old screen capture that shows that the citation count section used to look like this:

I tell the students that this article has a unique identifier number called a DOI and that there is a system called CrossRef that tracks how many bibliographies this number appears in.

And then I scan the faces of the room and if I don’t see sufficient awe, I inform the class that a paper’s ability to express its own impact outside of itself is forking amazing.

The ability to make use of the CrossRef API is reserved for CrossRef members with paid memberships or those who pay for access.

This means that individual researchers cannot make use of the CrossRef API and embed their own citation counts without paying CrossRef.

Not even Bret Victor:

[Image: the end of Bret Victor’s CV]

The image above is from the end of Bret Victor’s CV.

The image below is from the top of Bret Victor’s CV, where he is described through the words of two notable others:

[Image: the top of Bret Victor’s CV]

I like to think that the library is a humane medium that helps thinkers see, understand, and create systems. As such, I think librarians have much to learn from Bret Victor.

Bret Victor designs interfaces and his thinking has been very influential to many. How can I express the extent of his influence to you?

Bret Victor chooses not to publish in academic journals but rather opts to publish his essays on his website worrydream.com. The videos of some of his talks are available on Vimeo.

Here are the citation counts to these works, according to Google Scholar:

[Images: Google Scholar citation counts for Bret Victor’s self-published essays and talks]

It is an accepted notion that the normative view of science expounded by Merton, provided a sociological interpretation of citation analysis in the late 1960s and 70s. According to his theory, a recognition of the previous work of scientists and of the originality of their work is an institutional form of awarding rewards for efforts. Citations are a means of providing such recognition and reward.

The above is the opening paragraph of “Why has Latour’s Theory of Citations Been Ignored by the Bibliometric Community? Discussion of Sociological Interpretation of Citation Analysis” by Terttu Luukkonen.

Latour’s views of citations are part of his research on the social construction of scientific facts and laboratories, science in the making as contrasted with ready made science, that is beliefs which are treated as scientific facts and are not questioned… In this phase, according to Latour, references in articles are among the resources that are under author’s command in their efforts at trying to “make their point firm” and to lend support to their knowledge claims. Other “allies” or resources are, for example, the editors of the journals which publish the articles, the referees of the journals, and the research funds which finance the pieces of research…

Latour’s theory has an advantage over that of Merton’s in that it can explain many of the findings made in the so-called citation content and context studies mentioned. These findings relate to the contents of citations, which are vastly different and vary from one situation to another; also the fact that the surrounding textual contexts in which they are used differ greatly. Such differences include whether citations are positive or negational, essential to the references text or perfunctory, whether they concern concepts or techniques or neither, whether they provide background reading, alert readers to new work, provide leads, etc.

The above passage is from page 29 of the article.

On page 31, you can find this passage:

The Latourian views have been largely ignored by the bibliometric community in their discussions about citations. The reasons why this is so are intriguing. An important conceptual reason is presumably the fact that in Latourian theory, the major [function] of references is to support the knowledge claims of the citing author. This explanation does not legitimate major uses of citation indexing, its use as a performance measure – as in the use of citation counts which presupposes that references indicate a positive assessment of the cited document — or as an indication of the development of specialties – as in co-citation analysis.

You may have heard of Bret Victor just earlier this week. His work is described in an article from The Atlantic called The Scientific Paper is Obsolete. Here’s What’s Next.

The article contains this passage:

What would you get if you designed the scientific paper from scratch today? A little while ago I spoke to Bret Victor, a researcher who worked at Apple on early user-interface prototypes for the iPad and now runs his own lab in Oakland, California, that studies the future of computing. Victor has long been convinced that scientists haven’t yet taken full advantage of the computer. “It’s not that different than looking at the printing press, and the evolution of the book,” he said. After Gutenberg, the printing press was mostly used to mimic the calligraphy in bibles. It took nearly 100 years of technical and conceptual improvements to invent the modern book. “There was this entire period where they had the new technology of printing, but they were just using it to emulate the old media.”

Victor gestured at what might be possible when he redesigned a journal article by Duncan Watts and Steven Strogatz, “Collective dynamics of ‘small-world’ networks.” He chose it both because it’s one of the most highly cited papers in all of science and because it’s a model of clear exposition. (Strogatz is best known for writing the beloved “Elements of Math” column for The New York Times.)

The Watts-Strogatz paper described its key findings the way most papers do, with text, pictures, and mathematical symbols. And like most papers, these findings were still hard to swallow, despite the lucid prose. The hardest parts were the ones that described procedures or algorithms, because these required the reader to “play computer” in their head, as Victor put it, that is, to strain to maintain a fragile mental picture of what was happening with each step of the algorithm.

Victor’s redesign interleaved the explanatory text with little interactive diagrams that illustrated each step. In his version, you could see the algorithm at work on an example. You could even control it yourself.

The article goes on to present two software-driven alternatives to academia’s paper-mimicking PDF practices: Mathematica notebooks (from the private company Wolfram Research) and open-source Jupyter notebooks.

Perhaps it was for length or other editorial reasons, but the article doesn’t go into Bret Victor’s own work on reactive documents, which is best introduced by his self-published essay called ‘Explorable Explanations’. There is a website from Nicky Case dedicated to collecting dynamic works inspired by Bret’s essay; Case has created some remarkable examples, including Parable of the Polygons and The Evolution of Trust.

Or maybe it’s not odd that his work wasn’t mentioned. From T. Luukkonen’s article on Latour’s theory of citations:

The more people believe in a statement and use it as an unquestioned fact, as a black box, the more it undergoes transformations. It may even undergo a process which Latour calls stylisation or erosion, but which Garfield called obliteration by incorporation, that is, a scientist’s work becomes so generic to the field, so integrated into its body of knowledge, that people neglect to cite it explicitly.

At the end of 2013, Bret Victor published a page of things that ‘Bret fell in love with this year’. The first item on his list was the paper Visualization and Cognition: Drawing Things Together [pdf] from French philosopher, anthropologist and sociologist, Bruno Latour.

On page five of this paper is this passage, which I came across again and again during my sabbatical when I was doing a lot of reading about maps:

One example will illustrate what I mean. La Pérouse travels through the Pacific for Louis XVI with the explicit mission of bringing back a better map. One day, landing on what he calls Sakhalin he meets with Chinese and tries to learn from them whether Sakhalin is an island or a peninsula. To his great surprise the Chinese understand geography quite well. An older man stands up and draws a map of his island on the sand with the scale and the details needed by La Pérouse. Another, who is younger, sees that the rising tide will soon erase the map and picks up one of La Pérouse’s notebooks to draw the map again with a pencil . . .

What are the differences between the savage geography and the civilized one? There is no need to bring a prescientific mind into the picture, nor any distinction between the close and open predicaments (Horton, 1977), nor primary and secondary theories (Horton, 1982), nor divisions between implicit and explicit, or concrete and abstract geography. The Chinese are quite able to think in terms of a map but also to talk about navigation on an equal footing with La Pérouse. Strictly speaking, the ability to draw and to visualize does not really make a difference either, since they all draw maps more or less based on the same principle of projection, first on sand, then on paper. So perhaps there is no difference after all and, geographies being equal, relativism is right. This, however, cannot be, because La Pérouse does something that is going to create an enormous difference between the Chinese and the European. What is, for the former, a drawing of no importance that the tide may erase, is for the latter the single object of his mission. What should be brought into the picture is how the picture is brought back. The Chinese does not have to keep track, since he can generate many maps at will, being born on this island and fated to die on it. La Pérouse is not going to stay for more than a night; he is not born here and will die far away. What is he doing, then? He is passing through all these places, in order to take something back to Versailles where many people expect his map to determine who was right and wrong about whether Sakhalin was an island, who will own this and that part of the world, and along which routes the next ships should sail.

Science requires a paper to be brought back from our endeavours.

I thought of Latour when I read this particular passage from The Atlantic article:

Pérez told me stories of scientists who sacrificed their academic careers to build software, because building software counted for so little in their field: The creator of matplotlib, probably the most widely used tool for generating plots in scientific papers, was a postdoc in neuroscience but had to leave academia for industry. The same thing happened to the creator of NumPy, a now-ubiquitous tool for numerical computing. Pérez himself said, “I did get straight-out blunt comments from many, many colleagues, and from senior people and mentors who said: Stop doing this, you’re wasting your career, you’re wasting your talent.” Unabashedly, he said, they’d tell him to “go back to physics and mathematics and writing papers.”

What else is software but writing on sand?

I wanted to highlight Bret Victor’s work to my fellow library workers for what I thought were several reasons. But the more I thought about it, the more reasons came to mind. I don’t want to try your patience any longer, so consider this the potential beginning of a short series of blog posts.

I’ll end this section with why I wrote about Bret Victor, Bruno Latour, and citations. It has to do with this website, Northwestern University’s Faculty Directory powered by Pure:

[Image: Northwestern University’s Faculty Directory, powered by Pure]

More and more of our academic institutions are making use of Pure and other similar CRISes that create profiles of people that are generated from the texts we write and the citations we make.

Despite Latour, we are still using citations as a performative measurement.

I think we need a more humane medium that helps thinkers see and understand the systems we work in.