How to get through a Scholarly split

Last month I gave a brief run-through of how libraries could use the Wikidata tool Scholia to demonstrate various qualities of a researcher's, or a group of researchers', scholarship. My goal was to show that the skilled labour of academic libraries could provide an alternative to expensive third-party services that do not serve disciplines like the law or the humanities very well.

What I did not mention in that post is that we are currently in a very particular time for Wikidata. Let me show you what I mean.

Below is a screen-capture of the Scholia co-author graph of Anneke Smit, one of the faculty at Windsor Law, taken on September 8, 2025 from https://scholia.toolforge.org/author/Q133742502#coauthors

The colours in the chart represent the stated gender in each author’s profile. Orange for female, blue for male, and white for not-stated.

Suppose we wanted to use this graph but we didn’t want to include any gender markers. One way we could do this is to take the code from Scholia and re-run it in the Wikidata Query Service (https://query.wikidata.org/), after commenting out the code that colour-codes each author.

Below is a screen-capture of the co-author graph of Anneke Smit (https://w.wiki/FH9D), taken on September 8, 2025 from the Wikidata Query Service at https://query.wikidata.org/

The good news is that we were able to remove the gender markers from the graph. The bad news is that we are now missing a set of authors. You may not have noticed, but this particular graph shows only the co-authors who wrote books, magazine articles, and newspaper articles with Smit.

What happened to all of the authors who co-wrote journal articles with Anneke?

Well, they are on another query service.

Below is a screen-capture of the co-author graph of Anneke Smit on September 8 2025 (https://w.wiki/FH9D), from the Scholarly Wikidata Query Service at https://query-scholarly.wikidata.org

Uh, that’s not what we want. These are the IDs of the authors we were looking for, but we’re missing their names. We’re also missing the set of co-authors that we found in our first query: the ones who co-wrote books. Let’s try again.

Below is a screen-capture of the co-author graph of Anneke Smit on September 8 2025 (https://w.wiki/FH9g), from the Full Wikidata Query Service at https://query-legacy-full.wikidata.org/
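
The three screenshots above come from running a co-author query on three different hosts. If you would rather make that comparison from a script than by hand, here is a minimal sketch. The hostnames are the ones named in this post; the "/sparql" path, the User-Agent string, and the function names are my assumptions for illustration, so check them against the Wikidata documentation before relying on them.

```python
import json
import urllib.parse
import urllib.request

# Hostnames are taken from this post; the "/sparql" path is an assumption.
ENDPOINTS = {
    "main": "https://query.wikidata.org/sparql",
    "scholarly": "https://query-scholarly.wikidata.org/sparql",
    "legacy-full": "https://query-legacy-full.wikidata.org/sparql",
}


def request_url(endpoint, query):
    """Build a GET URL that asks an endpoint to run `query` and return JSON."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    return endpoint + "?" + params


def fetch_bindings(endpoint, query):
    """Run the query over the network and return the list of result rows."""
    request = urllib.request.Request(
        request_url(endpoint, query),
        headers={
            # Hypothetical User-Agent; replace it with one that identifies you.
            "User-Agent": "coauthor-split-demo/0.1 (example script)",
            "Accept": "application/sparql-results+json",
        },
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)["results"]["bindings"]


def count_per_endpoint(query):
    """Return {endpoint name: number of rows} for the same query on each host."""
    return {name: len(fetch_bindings(url, query)) for name, url in ENDPOINTS.items()}
```

Calling count_per_endpoint with a co-author query should make the gap visible as different row counts per host, although I would not assume the exact numbers stay stable over time.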

Why are there three query services?

Well, there are three services because in May of 2025 there was a split in the Wikidata Query Service (WDQS) and WDQS graph.

Wikidata contains a lot of data. It has grown to a size that a single Wikidata Query Service instance can no longer handle it together with the amount of edits and queries it gets. In order to stabilize the Wikidata Query Service it is being split into two distinct query services. This page explains this graph split and what it means for you.

The details of the split

query.wikidata.org used to contain all the data that is in Wikidata. Going forward this is no longer the case. The full graph is split into two distinct graphs: the main graph and the scholarly graph. The split happened on 9 May 2025.

For the scholarly graph a second query service is running at https://query-scholarly.wikidata.org/. It contains the data from entities that match any of the following criteria:

The main graph is on query.wikidata and contains the data from all entities that do not match the above criteria….

Endpoints

These different Wikidata SPARQL query services and endpoints are available:

Because the full-legacy query service goes away in 2026, our co-author graph needs to be amended so that it can run and combine searches on both query services. And with help from the advice on this page, I was able to do this.

My query runs on the main Wikidata Query Service and can be found at https://w.wiki/FFdK. It creates the full graph that we need.
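
For readers who have not seen a federated query before, here is a rough sketch of the general shape. This is not the query behind the short link above; it is a simplified illustration that only lists co-authors and their names. It assumes that works link to authors through the author property (P50), that the scholarly endpoint can be reached from the main service with a SPARQL SERVICE clause at the URL shown, and that the usual wd:, wdt:, wikibase: and bd: prefixes are predefined. The function name is hypothetical.

```python
import re

# Assumed address of the scholarly endpoint (hostname from this post plus "/sparql").
SCHOLARLY_ENDPOINT = "https://query-scholarly.wikidata.org/sparql"


def build_coauthor_query(author_qid, scholarly_endpoint=SCHOLARLY_ENDPOINT):
    """Return SPARQL, meant to run on the main service, that also asks the scholarly graph."""
    if not re.fullmatch(r"Q[1-9][0-9]*", author_qid):
        raise ValueError(f"not a Wikidata item ID: {author_qid!r}")
    return f"""
SELECT DISTINCT ?coauthor ?coauthorLabel WHERE {{
  {{
    # Works held in the main graph (books, magazine and newspaper articles).
    ?work wdt:P50 wd:{author_qid} ;
          wdt:P50 ?coauthor .
  }} UNION {{
    # Works held in the scholarly graph (journal articles), fetched by federation.
    SERVICE <{scholarly_endpoint}> {{
      ?work wdt:P50 wd:{author_qid} ;
            wdt:P50 ?coauthor .
    }}
  }}
  # Drop the author herself from her own co-author list.
  FILTER(?coauthor != wd:{author_qid})
  # Author names live in the main graph, so labels are resolved here.
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
"""


if __name__ == "__main__":
    # Q133742502 is the author ID from the Scholia URL earlier in this post.
    print(build_coauthor_query("Q133742502"))
```

The UNION is what stitches the two halves back together: one branch finds co-authors through works in the main graph, the other asks the scholarly graph for the rest, and the label service then supplies the names that the scholarly service could not show on its own.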

Now that you know that the Wikidata service is close to its capacity with its current infrastructure, you might feel some hesitation to experiment with it as a potential infrastructure or service for yourself or your library. And I think that’s understandable. It certainly is a consideration of mine.

In my next post, I’ll describe a more future-proofed project that I’m working on that can replicate some of the functionality of the Wikidata Query service.

I am not abandoning Wikidata, and I continue to contribute to the site, but I want to be mindful that the service exists primarily to support Wikipedia. I would love to see if there is any way that GLAM institutions or higher education could help financially support the infrastructure that Wikidata will need in the next five years.

I would work towards this future if there’s a chance that this could happen.

4 Responses to “How to get through a Scholarly split”

  1. @MitaWilliams the ability to write SPARQL queries that federate across multiple endpoints is pretty rad. While it's a bit sad that Wikidata had to be split, I think it is setting a good example for a decentralized knowledge ecosystem.
