Two Short Papers on Peer-Produced Digital Libraries
Readers of this blog with an interest in open-access issues may enjoy a pair of short essays I recently posted on SSRN. They bring together a fair amount of the thinking I previously deployed in piecemeal fashion across six earlier posts on this blog. (My co-bloggers, of course, have written very perceptively in this area; you can find much more via our Open Access tag.) These are the first two entries in what I originally thought would be a trilogy on digital libraries; they will be followed later in the year by an article on Section 108 reform that I have been working on intermittently. (And it looks like the trilogy may become a tetralogy before I’m finished, with plans for a new piece on digital verification and authenticity beginning to take shape.) Summaries and links below the jump.
The first paper, Crowdsourcing and Open Access: Collaborative Techniques for Disseminating Legal Materials and Scholarship, looks at the state of play in the open-access world. It reviews a lot of projects (some new, like Google Scholar, others very well established, like LII) that are working to make primary legal source materials (statutes, regs, case law, etc.) freely available online. There’s a lot to like about many of these projects, but many of them suffer from similar design flaws, and on the whole they do not (yet!) represent true substitutes for expensive proprietary electronic databases. I then ask whether we can get closer to that goal by enlarging the pool of contributors to those projects via crowdsourcing. I look fairly closely at a couple of efforts that are underway to build digital libraries on the Internet using crowdsourced techniques, including Distributed Proofreaders and my personal favorite, Wikisource. The paper concludes with a look at the advantages and disadvantages of using crowdsourcing in this area. Here is the abstract of Crowdsourcing and Open Access:
This short essay surveys the state of open access to primary legal source materials (statutes, judicial opinions and the like) and legal scholarship. The ongoing digitization phenomenon (illustrated, although by no means typified, by massive scanning endeavors such as the Google Books project and the Library of Congress’s efforts to digitize United States historical documents) has made a wealth of information, including legal information, freely available online, and a number of open-access collections of legal source materials have been created. Many of these collections, however, suffer from similar flaws: they devote too much effort to collecting case law rather than other authorities, they overemphasize recent works (especially those originally created in digital form), they do not adequately hyperlink between related documents in the collection, their citator functions are haphazard and rudimentary, and they do not enable easy user authentication against official reference sources.
The essay explores whether some of these problems might be alleviated by enlarging the pool of contributors who are working to bring paper records into the digital era. The same “peer production” process that has allowed far-flung communities of volunteers to build large-scale informational goods like the Wikipedia encyclopedia or the Linux operating system might be harnessed to build a digital library. The essay critically reviews two projects that have sought to “crowdsource” proofreading and archiving of texts: Distributed Proofreaders, a project frequently held up as a model in the academic literature on peer production; and Wikisource, a sister site of Wikipedia that improves on Distributed Proofreaders in a number of ways. The essay concludes by offering a few illustrations meant to show the potential for using Wikisource as an open-access repository for primary source materials and scholarship, and considers some possible drawbacks of the crowdsourced approach.
Wikisource takes center stage in the second essay, currently captioned Rich Texts: Wikisource as an Open Access Repository for Law and the Humanities. If Crowdsourcing and Open Access is the “how” piece, Rich Texts is the “why” piece. My goal here is to highlight the artificial boundary that currently exists between various types of open-access repositories for legal information: some collect primary source materials, others collect research and scholarship, but none holds both. Bringing primary source texts and scholarly publications together in a common repository, I argue, would enrich both types of collections. There are both legal and cultural obstacles to creating combined repositories of this type, but I am hoping to help start some discussions around how to reduce or eliminate the most substantial problems. Here’s the abstract of Rich Texts:
Open access to research and scholarship, although well established in the sciences, remains an emerging phenomenon in the legal academy. In recent years, a number of open access repositories have been created to permit self-archiving of legal scholarship (either within or across institutional boundaries), and faculties at some leading research institutions have adopted policies supporting open access to their work. Although existing repositories for legal scholarship represent a clear improvement over proprietary, subscription-based repositories in some ways, their architecture, and the narrowly defined missions they have elected to pursue, limit their ability to illuminate the ongoing dialogue among texts that is a defining characteristic of scholarly discourse in law and the humanities. One of the wiki-based projects operated by the nonprofit Wikimedia Foundation — the Wikisource digital library — improves upon the shortcomings of existing open access repositories by bringing source texts and commentary together in a single place, with additional contextual materials hosted on other Wikimedia Foundation sites just a click away. These features of Wikisource, if more widely adopted, may improve academic discourse by highlighting conceptual interconnections among works, fostering interdisciplinary collaboration, and reducing the competitive advantages of proprietary, closed-access legal information services.
I’ll be happy to receive any feedback.