One single web page as the single identifier of every book, author or subject
I like the concept of “the web as common publication platform for libraries“, and “every book its own url“, as described by Owen Stephens in two blog posts:
“Its time to change library systems ”
I’d suggest what we really need to think about is a common ‘publication’ platform – a way of all of our systems outputting records in a way that can then be easily accessed by a variety of search products – whether our own local ones, remote union ones, or even ones run by individual users. I’d go further and argue that platform already exists – it is the web!
and “The Future is Analogue ”
If every book in your catalogue had it’s own URL – essentially it’s own address on your web, you would have, in a single step, enabled anyone in the world to add metadata to the book – without making any changes to the record in your catalogue.
This concept of identifying objects by URL:Unified Resource Locator (or maybe better URI: Unified Resource Identifier) is central to the Semantic Web, that uses RDF (resource Description Framework) as a metadata model.
As a matter of fact at ELAG 2008 I saw Jeroen Hoppenbrouwers (“Rethinking Subject Access “) explaining his idea of doing the same for Subject Headings using the Semantic Web concept of triplets. Every subject its own URL or web page. He said: “It is very easy. You can start doing this right away“.
To make the picture complete we only need the third essential component: every author his or her or its own URL!
This ideal situation would have to conform to the Open Access guidelines of course. One single web page serving as the single identifier of every book, author or subject, available for everyone to link their own holdings, subscriptions, local keywords and circulation data to.
In real life we see a number of current initiatives on the web by commercial organisations and non commercial groups, mainly in the area of “books” (or rather “publications”) and “authors”. “Subjects” apparently is a less appealing area to start something like this, because obviously stand-alone “subjects” without anything to link them to are nothing at all, whereas you always have “publications” and “authors”, even without “subjects”. The only project I know of is MACS (Multilingual Acces to Subjects), which is hosted on Jeroen Hoppenbrouwers’ domain.
For publications we have OCLC’s WorldCat, Librarything, Open Library, to name just a few. And of course these global initiatives have had their regional and local counterparts for many years already (Union Catalogues, Consortia models). But this is again a typical example of multiple parallel data stores of the same type of entities. The idea apparently is that you want to store everything in one single database aiming to be complete, instead of the ideal situation of single individual URI’s floating around anywhere on the web.
Ex Libris’ new Unified Resource Management development (URM, and yes: the title of this blog post is an ironic allusion to that acronym), although it promotes sharing of metadata, it does this within another separate system into which metadata from other systems can be copied.
Of course, the ideal picture sketched above is much too simple. We have to be sure which version of a publication, which author and which translation of a subject for instance we are dealing with. For publications this means that we need to implement FRBR (in short: an original publication/work and all of its manifestations), for authors we need author names thesauri, for subjects multilingual access.
I have tried to illustrate this in this simplified and incomplete diagram:
In this model libraries can use their local URI-objects representing holdings and copies for their acquisitions and circulation management, while the bibliographic metadata stay out there in the global, open area. Libraries (and individuals of course) can also attach local keywords to the global metadata, which in turn can become available globally (“social tagging”).
It is obvious that the current initiatives have dealt with these issues with various levels of success. Some examples to illustrate this:
- Work: Desiderius Erasmus – Encomium Moriae (Greek), Laus Stultitiae (Latin), Lof der Zotheid (Dutch), Praise of Folly (English)
- Author: David Mitchell
- Erasmus in WorldCat Identities (one ID, many forms)
- David Mitchell in WorldCat Identities (one id per author)
- David Mitchell in VIAF (one id per author)
- Erasmus in OpenLibrary (one id, one incomplete form)
- Erasmus in VIAF (one id, although from The Netherlands, preferred forms are Swedish, French and German)
- Erasmus in Librarything (no identifier, numerous forms and occurrences)
- David Mitchell in Librarything (one form, “David Mitchell is composed of at least 12 distinct authors“, no way to distinguish)
- David Mitchell in OpenLibrary (one id for multiple authors)
- Erasmus “Praise of folly” in Librarything (numerous entries for all different title variations)
- Erasmus “Praise of folly” in OpenLibrary (numerous entries for all different title variations)
These findings seem to indicate that some level of coordination (which the commercial initiatives apparently have implemented better than the non-commercial ones) is necessary in order to achieve the goal of “one URI for each object”.
Who wants to start?