§ Occam’s Razor Blade
Just before the school year began, I gave a tour of the law library to a small group of students who were a week away from starting their first year at law school. Our tour group ran into another small tour group: my colleagues at the faculty were giving a retired judge a personal tour of the renovated building. This gentleman was an alumnus of the school and he kindly gave the students some advice for their years ahead.
As we stood among the library stacks, he told them to watch out for colleagues bearing razor blades and then gave me a knowing look. I nodded and smiled in understanding but the students were visibly confused. Seeing this, he then explained that in his day, when their professor would give them cases to read, there would be a mad rush to find the headnotes for the case. And sometimes, an unscrupulous student would cut out the headnotes to either gain exclusive access to them or to avoid the cost and bother of photocopying. Or both.
For those outside of legal research who are unfamiliar with the term headnote:
headnote (1855) A case summary that appears before a printed judicial opinion in a law report, addresses a point of law, and usu. includes the relevant facts bearing on that point of law. — Also termed syllabus; synopsis; reporter’s syllabus.
B.A. Garner, ed, Black’s Law Dictionary, 12th ed. (Thomson Reuters 2024), “headnote”
§ Headnotes reproduced verbatim and presented as ‘case summaries’
On November 12th, UBC law professor Benjamin Perrin reviewed Lexis+ AI unfavourably in an article published by the national magazine of the Canadian Bar Association.
Hoping for a better outcome, I asked Lexis+ AI to summarize the Supreme Court of Canada’s Reference re Senate Reform. Instead of generating an original summary, it simply copied verbatim the headnote from the case (including the Supreme Court Reports page numbers), offering no added value. Even worse, when I requested a shorter summary, bizarrely, Lexis+ AI provided another verbatim summary, but this time of an entirely unrelated case involving a construction dispute from Alberta. At this point, I decided to hold off on further tests until a LexisNexis faculty training session…
… When I asked for a summary of Reference re Senate Reform, Lexis+ AI once again reproduced the headnote verbatim. When I followed up and requested “a shorter summary,” it produced the verbatim headnote from the Supreme Court’s decision in Charkaoui v. Canada (Citizenship and Immigration).
For reasons that seem curious to me, Jeff Pfeifer, Chief Product Officer of LexisNexis North America & UK, put out a response on Nov 18th that was sent to Artificial Lawyer, who posted excerpts of it. Or perhaps it was posted on LinkedIn? Like others, I can’t find the original source. The full comments, supposedly, can be found at the end of this Legal Insider post.
I mention this because I found this comment from Pfeifer to be a bit of a stretch:
Professor Perrin suggested that users prefer AI-summarized cases to human-summarized cases. We appreciate his suggestion and will consider this in future product development.
When I read Perrin’s article, I was unable to find a passage that would support this claim. Was Pfeifer reading an AI-generated summary of Perrin’s article? Was an AI instructed to provide a response designed to deflect criticism?
§ Not keywords but Key People who help unlock knowledge
Here’s an information literacy hot take: The instructional worksheets that academic librarians hand to students to teach them how to develop research questions, extract keywords from those refined questions, and generate synonyms if those keywords prove insufficient have always been a mistake.
Instead, librarians should have encouraged students to act more like journalists. We should have suggested that they ask themselves, “Who could be a good source to help me answer my question?” It might be a professor who has written extensively in the field, but it also might be a reputable science communicator who might be able to better convey more information to someone new to the discipline. It might be a government agency or it might be a think tank advocating for policy change. It might be someone referenced in their textbook. It might be the person whose last name sounds like the one their instructor name-dropped in a recent lecture.
This activity would remind the students that knowledge is situational.
As I have noted previously, I like this definition of Situated Knowledge from The Oxford Dictionary of Human Geography:
The idea that all forms of knowledge reflect the particular conditions in which they are produced, and at some level reflect the social identities and social locations of knowledge producers. The term was coined by historian of science Donna Haraway in Simians, Cyborgs, and Women: the Reinvention of Nature (1991) to question what she regarded as two dangerous myths in Western societies. The first was that it is possible to be epistemologically objective, to somehow be a neutral mouthpiece for the world’s truths if one adopts the ‘right’ method of inquiry. The second myth was that science and scientists are uniquely and exclusively equipped to be objective. Haraway was not advocating relativism. Instead, she was calling for all knowledge producers to take full responsibility for their epistemic claims rather than pretending that ‘reality’ has definitively grounded these claims.
§ Not that one. Eugene Garfield from Web of Science
Large language models notoriously generate knowledge claims with missing, limited, or incorrect citations to the original sources of said knowledge. While there exists at least one very good explainer that tries to show how LLMs work, the systems are still considered a black box.
AI threatens to stuff all authored knowledge claims into a ‘black box’ so that they may emerge as nameless facts.
From T. Luukkonen’s Latour’s Theory of Citations:
The more people believe in a statement and use it as an unquestioned fact, as a black box, the more it undergoes transformations. It may even undergo a process which Latour calls stylisation or erosion, but which Garfield called obliteration by incorporation, that is, a scientist’s work becomes so generic to the field, so integrated into its body of knowledge that people neglect to cite it explicitly.
§ It is Time for Libraries to Claim Both Kinds of Power
One of the most influential ideas in librarianship, and one that has shaped my understanding of the field, I learned from Karen Coyle’s book, FRBR, Before and After. I have already written about this work in a previous blog post, so I will include a single pull-quote that I hope will do for my purposes here.
If one accepts Wilson’s statement that users wish to find the text that best suits their need, it would be hard to argue that libraries should not be trying to present the best texts to users. This, however, goes counter to the stated goal of the library catalog as that of bibliographic control, and when the topic of “best” is broached, one finds an element of neutrality fundamentalism that pervades some library thinking. This is of course irreconcilable with the fact that some of these same institutions pride themselves on their “readers’ services” that help readers find exactly the right book for them. The popularity of the readers’ advisory books of Nancy Pearl and social networks like Goodreads, where users share their evaluations of texts, show that there is a great interest on the part of library users and other readers to be pointed to “good books.” How users or reference librarians are supposed to identify the right books for them in a catalog that treats all resources neutrally is not addressed by cataloging theory.
Karen Coyle, FRBR, Before and After: A Look at Our Bibliographic Models, ALA Editions, 2016, p. 40.
I am bringing this up for two reasons. First, I’m still sore that academic libraries have not found ways to capture and re-use all the course reading selections made by their faculty over the years. I can use the Open Syllabus Project to find out some of the most popular titles recommended by, say, Education faculty from around the English-speaking world, but I can’t find and re-use the same information from my own institution.
But the real reason I want to bring this up is that Patrick Wilson’s Two Kinds of Power theory, which Coyle draws on, can help us understand why the move of search engines from providing a neutral list of links to a single highlighted best answer feels like a power grab.
Wilson takes up the question of the goals of what he calls “bibliography,” albeit applied to the bibliographical function of the library catalog. The message in the book, as I read it, is fairly straightforward once all of Wilson’s points and counterpoints are contemplated. He begins by stating something that seems obvious but is also generally missing from cataloging theory, which is that people read for a purpose, and that they come to the library looking for the best text (Wilson limits his argument to texts) for their purpose. This user need was not included in Cutter’s description of the catalog as an “efficient instrument.” By Wilson’s definition, Cutter (and the international principles that followed) dealt only with one catalog function: “bibliographic control.” Wilson suggests that in fact there are two such functions, which he calls “powers”: the first is the evaluatively neutral description of books, which was first defined by Cutter and is the role of descriptive cataloging, called “bibliographic control”; the second is the appraisal of texts, which facilitates the exploitation of the texts by the reader. This has traditionally been limited to the realm of scholarly bibliography or of “recommender” services.
This definition pits the library catalog against the tradition of bibliography, the latter being an analysis of the resources on a topic, organized in terms of the potential exploitation of the text: general works, foundational works, or works organized by school of thought. These address what he sees as the user’s goal, which is “the ability to make the best use of a body of writings.” The second power is, in Wilson’s view, the superior capability. He describes descriptive control somewhat sarcastically as “an ability to line up a population of writings in any arbitrary order, and make the population march to one’s command” (Wilson 1968)
Karen Coyle, FRBR, Before and After: A Look at Our Bibliographic Models, ALA Editions, 2016, p. 40.
§ Annual Reviews is pandering to librarians and I am here for it.
If you ask people why they like using LLMs or AI systems, a common answer is that they like their ability to summarize large amounts of text at a level that works for them.
I still can’t bring myself to use LLMs for this purpose. If you ask me to summarize something complicated, I’m still going to reach for Wikipedia or a literature review. When it’s the latter, I see if there’s a recent dissertation on the topic at hand and sometimes I visit Annual Reviews.
Katina Magazine is a new publication from Annual Reviews:
Katina is a digital publication that addresses the value of librarians to society and elevates their role as trusted stewards of knowledge. Named after Katina Strauch, the visionary founder of the Charleston Conference, it is written by and for the international communities of librarians, vendors, and publishers.
I like that this publication is named after a person.
§ Reality Hunger
Those who are new to this blog may not know that sometimes I create a collection of disparate texts and hope that the threads that connect the texts together are apparent to the reader. Sometimes I do this because my managerial responsibilities don’t leave me with the time or energy to put things in a cohesive essay, but sometimes I do this because I have a fondness for collages of text.
I recognize that this way of writing demands a lot of the reader. For example, I need to trust that you have clicked on the link above to read it in full in order to fully appreciate that I borrow from it to make my last point.
§ A plagiarism
[Bob Dylan’s] originality and his appropriations are as one.
The same might be said of all art. I realized this forcefully when one day I went looking for the John Donne passage quoted above. I know the lines, I confess, not from a college course but from the movie version of 84, Charing Cross Road with Anthony Hopkins and Anne Bancroft. I checked out 84, Charing Cross Road from the library in the hope of finding the Donne passage, but it wasn’t in the book. It’s alluded to in the play that was adapted from the book, but it isn’t reprinted. So I rented the movie again, and there was the passage, read in voice-over by Anthony Hopkins but without attribution. Unfortunately, the line was also abridged so that, when I finally turned to the Web, I found myself searching for the line “all mankind is of one volume” instead of “all mankind is of one author, and is one volume.”
My Internet search was initially no more successful than my library search. I had thought that summoning books from the vasty deep was a matter of a few keystrokes, but when I visited the website of the Yale library, I found that most of its books don’t yet exist as computer text. As a last-ditch effort I searched the seemingly more obscure phrase “every chapter must be so translated.” The passage I wanted finally came to me, as it turns out, not as part of a scholarly library collection but simply because someone who loves Donne had posted it on his homepage. The lines I sought were from Meditation 17 in Devotions upon Emergent Occasions, which happens to be the most famous thing Donne ever wrote, containing as it does the line “never send to know for whom the bell tolls; it tolls for thee.” My search had led me from a movie to a book to a play to a website and back to a book. Then again, those words may be as famous as they are only because Hemingway lifted them for his book title.
Literature has been in a plundered, fragmentary state for a long time.
The Ecstasy of Influence: A plagiarism by Jonathan Lethem, Harper’s Magazine, February 2007.
Those who produce large language models are trying to reduce all of mankind’s text into one volume.
But literature has been in a plundered, fragmentary state for a long time.