Using discovery tools for presenting integrated information
Library discovery tools have been much discussed in recent years. In essence, a library discovery tool provides a centrally maintained, shared metadata index of scholarly material, a search system, and the option of adding a local metadata index. Academic libraries use it to provide a unified access platform for subscribed and open access databases and e-journals, as well as for their own local print and digital holdings.
I would like to put forward that, despite their shortcomings, library discovery tools can also be used for finding and presenting other scholarly information in the broadest sense. Libraries should look beyond the narrow focus on limitations and turn imperfection into benefits.
The two main points of discussion regarding discovery tools are the coverage of the central shared index and relevance ranking. For a number of practical, technical and competitive reasons, none of the commercial central indexes covers all the content that academic libraries may subscribe to. Relevance ranking of search results depends on so many factors that it is a science in itself to satisfy each and every end user, each with their own specific background and context. Discovery tool vendors spend a lot of energy on improving coverage and relevance ranking.
These two problems are the reason that not many academic libraries have been able to achieve the one-stop unified scholarly information portals for their staff and students that discovery tool providers promised them. In most cases the institutional discovery portal is just one of the solutions for finding scholarly publications that are offered by the library. A number of libraries are reconsidering their attitude towards discovery tools, or have even decided to renounce these tools altogether and focus on delivery instead, leaving discovery to external parties like Google Scholar.
I fully support the idea that libraries should reconsider their attitude towards discovery tools, but I would like to stress that they should do so from a much broader perspective than just the traditional library responsibility of providing access to scholarly publications. Libraries must not throw the baby out with the bathwater. They should realise that a discovery tool can be used as a platform for presenting connected scholarly information, for instance publications with related research project information and research datasets, based on linked open data principles. You could call this the “poor person’s linked open data platform”, because the library has already paid the license fee for the discovery platform and does not have to spend a lot of extra money on additional linked open data tools and facilities.
Of course this presupposes a number of things: the content to be connected should have identifiers, preferably in the form of URIs, and should be openly available for reuse, preferably via RDF. The discovery tools should be able to process URIs and RDF and present the resolved content in their user interfaces. We all know that this is not the case yet. Long term strategies are needed.
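As a rough illustration of what "processing URIs and RDF" could mean on the discovery tool side, the Python sketch below parses a few RDF statements in N-Triples form and looks up readable labels for the URIs that a publication record links to. It is a sketch under stated assumptions, not a description of how any existing discovery tool works: the example.org URIs, the sample data and the function names are all hypothetical, and the parser only handles the simple subset of N-Triples used here.

```python
import re

# Hypothetical RDF data in N-Triples form, as a content provider might publish it.
SAMPLE_NTRIPLES = """
<http://example.org/project/42> <http://www.w3.org/2000/01/rdf-schema#label> "Example research project" .
<http://example.org/dataset/7> <http://www.w3.org/2000/01/rdf-schema#label> "Example survey dataset" .
<http://example.org/publication/1> <http://purl.org/dc/terms/isPartOf> <http://example.org/project/42> .
<http://example.org/publication/1> <http://purl.org/dc/terms/references> <http://example.org/dataset/7> .
"""

RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Matches one triple whose object is either a URI or a plain string literal.
TRIPLE = re.compile(r'^<([^>]+)>\s+<([^>]+)>\s+(?:<([^>]+)>|"([^"]*)")\s*\.$')


def parse_ntriples(text):
    """Return (subject, predicate, object, object_is_uri) tuples."""
    triples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = TRIPLE.match(line)
        if match is None:
            continue  # skip anything outside the simple subset handled here
        subject, predicate, uri_obj, literal_obj = match.groups()
        if uri_obj is not None:
            triples.append((subject, predicate, uri_obj, True))
        else:
            triples.append((subject, predicate, literal_obj, False))
    return triples


def related_resources(record_uri, triples):
    """List (predicate, uri, label) for every URI the record links to."""
    labels = {s: o for s, p, o, is_uri in triples if p == RDFS_LABEL and not is_uri}
    return [
        (p, o, labels.get(o, o))  # fall back to the bare URI if no label is known
        for s, p, o, is_uri in triples
        if s == record_uri and is_uri
    ]


if __name__ == "__main__":
    for predicate, uri, label in related_resources(
        "http://example.org/publication/1", parse_ntriples(SAMPLE_NTRIPLES)
    ):
        print(f"{predicate} -> {label} ({uri})")
```

In a real setting the triples would come from dereferencing the URIs rather than from a string constant, and a full RDF library would replace the toy parser. The point is only that a record carrying URIs can be enriched with the labels of related projects and datasets in its display.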
Content providers must be convinced of the added value of adding identifiers and URIs to their metadata and providing RDF entry points. In the case of publishers of scholarly publications this means identifiers/URIs for the publications themselves, but also for authors, contributors, organisations, related research projects and datasets. A number of international associations and initiatives are already active in lobbying for these developments: OpenAIRE, Research Data Alliance, DataCite, the W3C Research Object for Scholarly Communication Community Group, etc. Universities themselves can contribute by adding URIs and RDF to their own institutional repositories and research information systems. Some universities are implementing special tools for providing integrated views on research information based on linked data, such as VIVO.
There are also many other interesting data sources that can be used to integrate information in discovery tools, for instance in the government and cultural heritage domains. Many institutions in these areas already provide linked open data entry points. And then there is Wikipedia with its linked open data interface DBpedia.
On the other side of the scale, discovery tool providers must be convinced of the added value of providing procedures for resolving URIs and processing RDF in order to integrate information from internal and external data sources into new knowledge. I don’t know of any plans for implementing linked open data features in any of the main commercial or open source discovery tools, except for Ex Libris’ Primo. OCLC provides a linked data section for each WorldCat search result, but that is mainly focused on publishing their own bibliographic metadata in linked data format, using links to external subject and author authority files. This is a positive development, but it does not amount to consuming and reusing external information in order to create new integrated knowledge beyond the bibliographic domain.
Through the joint IGeLU/ELUNA Linked Open Data Special Interest Working Group, the independent Ex Libris user groups have been communicating with Ex Libris strategy and technology management on the best ways to implement much-needed linked open data features in their products. The Primo discovery tool (with the Primo Central shared metadata index) is one of the main platforms in focus. Ex Libris is very keen on getting actual use cases and scenarios in order to identify priorities going forward. We have been providing these for some time now through publications, presentations at user group conferences, monthly calls and face-to-face meetings. Ex Libris is also exploring best practices for the technical infrastructure to be used and is planning pilots with selected customers.
The Austrian national library service OBVSG, for instance, has integrated Wikipedia/DBpedia information about authors into their Primo results.
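To give an idea of what such an author enrichment might involve, the Python sketch below builds a request URL for the public DBpedia SPARQL endpoint that asks for the abstract of an author's DBpedia resource. This is not a description of the OBVSG implementation, whose details are not given here. It assumes that the author record already carries a DBpedia resource URI, and the function names and the example resource are chosen purely for illustration. The code only constructs the URL; it does not send the request.

```python
from urllib.parse import urlencode

# Public DBpedia SPARQL endpoint.
DBPEDIA_SPARQL_ENDPOINT = "https://dbpedia.org/sparql"


def author_abstract_query(resource_uri, lang="en"):
    """Build a SPARQL query for the abstract of one DBpedia resource."""
    # Refuse characters that would break out of the <...> URI or the "..." literal.
    if any(ch in resource_uri for ch in '<>" \n\t'):
        raise ValueError("not a plain URI: %r" % resource_uri)
    if not lang.replace("-", "").isalnum():
        raise ValueError("not a plain language tag: %r" % lang)
    return (
        "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
        "SELECT ?abstract WHERE {\n"
        f"  <{resource_uri}> dbo:abstract ?abstract .\n"
        f'  FILTER (lang(?abstract) = "{lang}")\n'
        "}"
    )


def author_abstract_url(resource_uri, lang="en"):
    """Return a GET URL that passes the query in the standard 'query' parameter."""
    params = {"query": author_abstract_query(resource_uri, lang)}
    return DBPEDIA_SPARQL_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    # Illustrative resource URI; a discovery tool would take it from the author record.
    print(author_abstract_url("http://dbpedia.org/resource/Umberto_Eco"))
```

A discovery tool front end could request this URL with an Accept header for a JSON result format and show the returned abstract next to the search result. Changing the `lang` argument would fetch the abstract in another language where DBpedia has one.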
The Saxon State and University Library Dresden (SLUB) has implemented a multilingual semantic search tool for subjects based on DBpedia in their Primo installation.
At the University of Amsterdam I have myself been experimenting with linking publications from our institutional repository (UvA DARE) in Primo to related research project information. For now this has resulted in adding extra external links to that information in the Dutch national research portal NARCIS, because NARCIS doesn’t provide RDF yet. We are communicating with DANS, the NARCIS provider, about extending their linked open data features for this purpose.
Of course all these local implementations can serve as use cases for discovery tool providers.
So far I have only talked about using discovery tools as a platform for consuming, reusing and presenting external linked open data, but I can imagine that a discovery tool could also be used as a platform for publishing linked open data. It shouldn’t be too hard to add RDF output options alongside the existing output formats (HTML and the internal record format). That way libraries could have a full linked open data consumption and publishing workbench at their disposal at minimal cost. Library discovery tools would from then on be known as information discovery tools.
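A minimal Python sketch of such an extra RDF output is given below. It serialises a flat record as N-Triples using Dublin Core terms. The record layout, the example.org URIs and the function names are assumptions made for illustration; an actual discovery tool would map its own internal record format and would probably use a richer vocabulary.

```python
DCTERMS = "http://purl.org/dc/terms/"


def escape_literal(value):
    """Escape the characters that N-Triples does not allow raw in a literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def record_to_ntriples(record_uri, record):
    """Serialise a flat record (Dublin Core term name -> value or list of values)."""
    lines = []
    for field, value in record.items():
        predicate = f"<{DCTERMS}{field}>"
        values = value if isinstance(value, list) else [value]
        for item in values:
            if item.startswith(("http://", "https://")):
                obj = f"<{item}>"  # link to another resource
            else:
                obj = f'"{escape_literal(item)}"'  # plain text value
            lines.append(f"<{record_uri}> {predicate} {obj} .")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical record, keyed by Dublin Core term names.
    example_record = {
        "title": "An example article",
        "creator": ["http://example.org/person/1"],
        "isPartOf": "http://example.org/project/42",
    }
    print(record_to_ntriples("http://example.org/publication/1", example_record))
```

Values that look like HTTP URIs are written as links and everything else as text. That is a deliberate simplification, but it shows that the step from an internal record to publishable triples can be small once the record carries identifiers.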