Posted on August 3rd, 2015 7 comments
Interoperability in heterogeneous library data landscapes
Libraries have to deal with a highly opaque landscape of heterogeneous data sources, data types, data formats, data flows, data transformations and data redundancies, which I have earlier characterized as a “data maze”. The level and magnitude of this opacity and heterogeneity varies with the amount of content types and the number of services that the library is responsible for. Academic and national libraries are possibly dealing with more extensive mazes than small public or company libraries.
In general, libraries curate collections of things and also provide discovery and delivery services for these collections to the public. In order to successfully carry out these tasks they manage a lot of data. Data can be regarded as the signals between collections and services.
These collections and services are administered using dedicated systems with dedicated datastores. The data formats in these dedicated datastores are tailored to perform the dedicated services that these dedicated systems are designed for. In order to use the data for delivering services they were not designed for, it is common practice to deploy dedicated transformation procedures, either manual ones or as automated utilities. These transformation procedures function as translators of the signals in the form of data.
Here lies the origin of the data maze: an inextricably entangled mishmash of systems with explicit and implicit data redundancies, using a number of different data formats, in which some systems talk to each other in some way. This is not only confusing for end users but also for library system staff. End users lack clarity about which user interfaces to use, and miss relevant results from other sources and possibly related information. Libraries need licenses and expertise for ongoing administration, conversion and migration of multiple systems, and suffer unforeseen consequences of adjustments elsewhere.
To take the linguistic analogy further, systems make use of a specific language (data format) to code their signals in. This is all fine as long as they are only talking to themselves. But as soon as they want to talk to other systems that use a different language, translations are needed, as mentioned. Sometimes two systems use the same language (like MARC, DC, EAD), but this does not necessarily mean they can understand each other. There may be dialects (DANMARC, UNIMARC), local colloquialisms, differences in vocabularies and even alphabets (local fields, local codes, etc.). Some languages are only used by one system (like PNX for Primo). All languages describe things in their own vocabulary. In the systems and data universe there are not many loanwords or other mechanisms to make it clear that systems are talking about the same thing (no relations or linked data). And then there is syntax and grammar (such as subfields and cataloguing rules) that allow for lots of variations in formulations and formats.
The transformation utilities functioning as translators of the data signals suffer from a number of limitations. They translate between two specific languages or dialects only. And usually they are employed by only one system (proprietary utilities). So even if two systems speak the same language, they probably both need their own translator from a common source language. In many cases even two separate translators are needed if source and target system do not speak each other’s language or dialect. The source signals are translated to some common language which in turn is translated into the target language. This export-import scenario, which entails data redundancy across systems, is referred to as ETL (Extract Transform Load). Moreover, most translators only know a subset of the source and target language dependent on the data signals needed by the provided services. In some cases “data mappings” are used as conversion guides. This term does not really cover what is actually needed, as I have tried to demonstrate. It is not enough to show the paths between source and target signals. It is essential to add the selections and transformations needed as well. In order to make sense of the data maze you need a map, a dictionary and a guidebook.
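To make the difference between a bare "data mapping" and a full translation concrete, here is a minimal Python sketch. It assumes a flat, MARC-like source record; the field tags, target field names and helper functions are all hypothetical and only serve as illustration. The map says which source field feeds which target field, the selection drops everything that is not mapped, and the transformations clean up each value on the way.

```python
from typing import Callable, Dict

# Hypothetical source record in a flat MARC-like layout (tags are illustrative only).
source_record = {
    "245a": "Interoperability in libraries /",
    "100a": "Doe, Jane",
    "260c": "c2015.",
    "999x": "local-only note",
}

# The "map": which source field feeds which target field.
FIELD_MAP: Dict[str, str] = {"245a": "title", "100a": "creator", "260c": "date"}


def strip_punctuation(value: str) -> str:
    """Remove trailing cataloguing punctuation such as ' /' or '.'."""
    return value.rstrip(" /:;,.")


def year_only(value: str) -> str:
    """Keep only the first four digits found in the value."""
    digits = "".join(ch for ch in value if ch.isdigit())
    return digits[:4]


# The "guidebook": a transformation per target field (identity when absent).
TRANSFORMS: Dict[str, Callable[[str], str]] = {
    "title": strip_punctuation,
    "date": year_only,
}


def translate(record: Dict[str, str]) -> Dict[str, str]:
    """Apply selection, mapping and transformation to one source record."""
    target: Dict[str, str] = {}
    for source_field, value in record.items():
        target_field = FIELD_MAP.get(source_field)
        if target_field is None:
            continue  # selection: unmapped source fields are not carried over
        transform = TRANSFORMS.get(target_field, lambda v: v)
        target[target_field] = transform(value)
    return target


result = translate(source_record)
```

With only `FIELD_MAP` you would know the paths, but not that the local note is dropped or that the date is reduced to a year. That is the part a mapping alone does not document.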
To make things even more complicated, sometimes reading data signals is only possible with a passport or visa (authentication for access to closed data). Or even worse, when systems’ borders are completely closed and no access whatsoever is possible, not even with a passport. Usually, this last situation is referred to with the term “data silos”, but that is not the complete picture. If systems are fully open, but their data signals are coded by means of untranslatable languages or syntaxes, we are also dealing with silos.
Anyway, a lot of attention and maintenance is required to keep this Tower of Babel functioning. This practice is extremely resource-intensive, costly and vulnerable. Are there any solutions available to diminish maintenance, costs and vulnerability? Yes there are.
First of all, it is absolutely crucial to get acquainted with the maze. You need a map (or even an atlas) to be able to see which roads are there, which ones are inaccessible, what traffic is allowed, what shortcuts are possible, which systems can be pulled down and where new roads can be built. This role can be fulfilled by a Dataflow Repository, which presents an up-to-date overview of locations and flows of all content types and data elements in the landscape.
Secondly it is vital to be able to understand the signals. You need a dictionary to be able to interpret all signals, languages, syntaxes, vocabularies, etc. A Data Dictionary describing data elements, datastores, dataflows and data formats is the designated tool for this.
And finally it is essential to know which transformations are taking place en route. A guidebook should be incorporated in the repository, describing selections and transformations for every data flow.
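As a rough illustration of how map, dictionary and guidebook could live together, the following Python sketch models a tiny dataflow repository. The class, the datastore names and the descriptions are hypothetical assumptions, not an existing tool. Each dataflow records its source, target, format, data elements, selection and transformation, so that simple questions about the maze can be answered from the repository.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class Dataflow:
    source: str                # datastore the data is read from
    target: str                # datastore the data is written to
    data_format: str           # language used on the wire
    elements: Tuple[str, ...]  # data elements carried by this flow
    selection: str             # guidebook: which records are selected
    transformation: str        # guidebook: what happens en route


# Hypothetical landscape with three flows.
FLOWS: List[Dataflow] = [
    Dataflow("catalogue", "discovery_index", "MARC21", ("title", "creator"),
             "all published records", "MARC21 to internal index format"),
    Dataflow("repository", "discovery_index", "DC", ("title", "creator", "abstract"),
             "open access items only", "DC to internal index format"),
    Dataflow("catalogue", "union_catalogue", "MARC21", ("title",),
             "records with holdings", "none"),
]


def stores_holding(flows: List[Dataflow]) -> Dict[str, Set[str]]:
    """Map each data element to every datastore where a copy of it ends up."""
    locations: Dict[str, Set[str]] = defaultdict(set)
    for flow in flows:
        for element in flow.elements:
            locations[element].update({flow.source, flow.target})
    return dict(locations)


def flows_touching(flows: List[Dataflow], store: str) -> List[Dataflow]:
    """List the flows affected when the given datastore is changed or retired."""
    return [f for f in flows if store in (f.source, f.target)]


title_locations = stores_holding(FLOWS)["title"]
catalogue_flows = flows_touching(FLOWS, "catalogue")
```

`stores_holding` makes data redundancy visible, and `flows_touching` gives a first impression of the consequences of adjusting a single system.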
You could leave it there and be satisfied with these guiding tools to help you get around the existing data maze more efficiently, with all its ETL utilities and data redundancies. But there are other solutions that focus on actually tackling or even eliminating the translation problem. Basically we are looking at some type of Service Oriented Architecture (SOA) implementation. SOA is a rather broad concept, but it refers to an environment where individual components (“systems”) communicate with each other in a technology and vendor agnostic way using interoperable building blocks (“services”). In this definition “services” refer to reusable dataflows between systems, rather than to useful results for end users. I would prefer a definition of SOA to mean “a data and utilities architecture focused on delivering optimal end user services no matter what”.
Broadly speaking there are four main routes to establish a SOA-like condition, all of which can theoretically be implemented on a global, intermediate or local level.
- Single Store/Single Format: A single universal integrated datastore using a universal data format. No need for dataflows and translations. This would imply some sort of linked (open) data landscape with RDF as universal language and serving all systems and services. A solution like this would require all providers of relevant systems and databases to commit to a single universal storage format. Unrealistic in the short term indeed, but definitely something to aim for, starting at the local level.
- Multiple Stores/Shared Format: A heterogeneous system and datastore landscape with a universal communication language (a lingua franca, like English) for dataflows. No need for countless translators between individual systems. This universal format could be RDF in any serialization. A solution like this would require all providers of relevant systems and databases to commit to a universal exchange format. Already a bit less unrealistic.
- Shared Store/Shared Format: A heterogeneous system and datastore landscape with a central shared intermediate integrated datastore in a single shared format. Translations from different source formats to only one shared format. Dataflows run to and from the shared store only. For instance with RDF functioning as Esperanto, the artificial language which is actually sometimes used as “Interlingua” in machine translation. A solution like this does not require a universal exchange format, only a translator that understands and speaks all formats, which is the basis of all ETL tools. This is much more realistic, because system and vendor dependencies are minimized, except for variations in syntax and vocabularies. The platform itself can be completely independent.
- Multiple Stores/Single Translation Pool: or what is known as an Enterprise Service Bus (ESB). No translations are stored, no data is integrated. Simultaneous point-to-point translations between systems happen on the fly. This looks very much like the existing data maze, but with all translators sitting together in one cubicle. This solution is not a source of much relief, or as one large IT vendor puts it: “Using an ESB can become problematic if large volumes of data need to be sent via the bus as a large number of individual messages. ESBs should never replace traditional data integration like ETL tools. Data replication from one database to another can be resolved more efficiently using data integration, as it would only burden the ESB unnecessarily.”
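The Shared Store/Shared Format route can be sketched in a few lines of Python. This is a simplified illustration under my own assumptions: the shared format is a set of plain (subject, predicate, object) tuples standing in for RDF, and the record layouts, translator functions and `urn:example:` identifiers are hypothetical. The two counting functions assume that every format has to be exchanged with every other format in both directions.

```python
from typing import Callable, Dict, Set, Tuple

Triple = Tuple[str, str, str]


def point_to_point_translators(n_formats: int) -> int:
    """Directed translators needed when every format talks to every other one."""
    return n_formats * (n_formats - 1)


def hub_translators(n_formats: int) -> int:
    """Translators needed with one shared format: one in and one out per format."""
    return 2 * n_formats


def from_marc_like(record: Dict[str, str]) -> Set[Triple]:
    """Translate a hypothetical flat MARC-like record into shared triples."""
    subject = "urn:example:catalogue:" + record["001"]
    return {(subject, "title", record["245a"]), (subject, "creator", record["100a"])}


def from_dc_like(record: Dict[str, str]) -> Set[Triple]:
    """Translate a hypothetical Dublin Core-like record into shared triples."""
    subject = "urn:example:repository:" + record["identifier"]
    return {(subject, "title", record["title"]), (subject, "creator", record["creator"])}


# One translator per source format, all targeting the same shared format.
TRANSLATORS: Dict[str, Callable[[Dict[str, str]], Set[Triple]]] = {
    "marc": from_marc_like,
    "dc": from_dc_like,
}


def load(shared_store: Set[Triple], source_format: str, record: Dict[str, str]) -> None:
    """Translate a record and add the result to the shared store in place."""
    shared_store.update(TRANSLATORS[source_format](record))


store: Set[Triple] = set()
load(store, "marc", {"001": "42", "245a": "Data mazes", "100a": "Doe, Jane"})
load(store, "dc", {"identifier": "7", "title": "Linked data", "creator": "Roe, Richard"})

# Every front end now asks the same question in the same way, whatever the source.
titles = sorted(obj for (_, pred, obj) in store if pred == "title")
```

Under these assumptions five formats need 20 point-to-point translators but only 10 around a shared store, and adding a new source system means writing one new translator instead of one for every existing system.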
Surveying the possible routes out of the data maze, it seems that the first step should be employing the map, dictionary and guidebook concept of the dataflow repository, data dictionary and transformation descriptions. After that, the only feasible road in the short term is the intermediate integrated Shared Store/Shared Format solution.
Posted on August 26th, 2014 2 comments
On August 14 the IFLA 2014 Satellite Meeting ‘Linked Data in Libraries: Let’s make it happen!’ took place at the National Library of France in Paris. Rurik Greenall (who also wrote a very readable conference report) and I had the opportunity to present our paper ‘An unbroken chain: approaches to implementing Linked Open Data in libraries; comparing local, open-source, collaborative and commercial systems’. In this paper we do not go into reasons for libraries to implement linked open data, nor into detailed technical implementation options. Instead we focus on the strategies that libraries can adopt for the three objectives of linked open data: original cataloguing/creating of linked data, exposing legacy data as linked open data, and consuming external linked open data. Possible approaches are: local development, using Free and Open Source Software, participating in consortia or service centres, and relying on commercial vendors, or any combination of these. Our main conclusions and recommendations are: identify your business case, if you’re not big enough be part of some community, and take lifecycle planning seriously.
The other morning presentations provided some interesting examples of a number of approaches we described in our talk. Valentine Charles presented the work in the area of aggregating library and heritage data from a large number of heterogeneous sources in different languages by two European institutions that de facto function as large consortia or service centres for exposing and enriching data, Europeana and The European Library. Both platforms not only expose their aggregated content in web pages for human consumption but also as linked open data, besides other so-called machine readable formats. Moreover they enrich their aggregated content by consuming data from their own network of providers and from external sources, for instance multilingual “value vocabularies” like thesauri, authority lists and classifications. The idea is to use concepts/URIs together with display labels in multiple languages. For Europeana these sources currently are GeoNames, DBpedia and GEMET. Work is being done on including the Getty Art and Architecture Thesaurus (AAT), which was recently published as Linked Open Data. Besides using VIAF for person authorities, The European Library has started adding multilingual subject headings by integrating the Common European Research Classification Scheme, part of the CERIF format. The use of MACS (Multilingual Access to Subjects) as Linked Open Data is being investigated. This topic was also discussed during the informal networking breaks. Questions that were asked: is MACS valuable for libraries, who should be responsible for MACS, and how can administering MACS in a Linked Open Data environment best be organized? Personally I believe that a multilingual concept based subject authority file for libraries, archives, museums and related institutions is long overdue and will be extremely valuable, not only in Linked Open Data environments.
The importance of multilingual issues and the advantages that Linked Open Data can offer in this area were also demonstrated in the presentation about the Linked Open Authority Data project at the National Diet Library of Japan. The Web NDL Authorities are strongly connected to VIAF and LCSH among others.
The presentation of the Linked Open Data environment of the National Library of France BnF (http://data.bnf.fr) highlighted a very interesting collaboration between a large library with considerable resources in expertise, people and funding on the one hand, and the non-library commercial IT company Logilab. The result of this project is a very sophisticated local environment consisting of the aggregated data sources of the National Library and a dedicated application based on the free software tool Cubicweb. An interesting situation arose when the company Logilab itself asked if the developed applications could be released as Open Source by the National Library. The BnF representative Gildas Illien (also one of the organizers of the meeting together with Emmanuelle Bermes) replied with considerations about planning, support and scalability, which is completely understandable from the perspective of lifecycle planning.
With all these success stories about exposing and publishing Linked Open Data, the question always remains if the data is actually used by others. It is impossible to incorporate this in project planning and results evaluation. Regarding the BnF data this question was answered in the presentation about Linked Open Data in the book industry. The Electre and Antidot project uses linked open data from, among others, data.bnf.fr.
The afternoon presentations were focused on creating, maintaining and using various data models, controlled vocabularies and knowledge organisation systems (KOS) as Linked Open Data: the Europeana Data Model (EDM), UNIMARC, MODS. An interesting perspective was presented by Gordon Dunsire on versioning vocabularies in a linked data world. Vocabularies change over time, so an assignment of a URI of a certain vocabulary concept should always contain version information (like timestamps and/or version numbers) in order to be able to identify the intended meaning at the time of assigning.
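The versioning idea can be illustrated with a small Python sketch. This is my own simplified reading, not code from the presentation: the concept URI, the release history and the labels are all hypothetical. An assignment keeps the date on which the concept URI was assigned, and that date is used to look up the vocabulary version that was in force at the time.

```python
from bisect import bisect_right
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple


@dataclass(frozen=True)
class Assignment:
    record_id: str
    concept_uri: str
    assigned_on: date  # the version information travelling with the assignment


# Hypothetical release history of one concept: (release date, version, label),
# sorted by release date.
RELEASES: List[Tuple[date, str, str]] = [
    (date(2010, 1, 1), "1.0", "Computers"),
    (date(2013, 6, 1), "2.0", "Computing devices"),
]


def version_in_force(assigned_on: date,
                     releases: List[Tuple[date, str, str]] = RELEASES) -> Tuple[date, str, str]:
    """Return the latest release published on or before the assignment date."""
    release_dates = [release[0] for release in releases]
    index = bisect_right(release_dates, assigned_on) - 1
    if index < 0:
        raise ValueError("assignment predates the first release")
    return releases[index]


assignment = Assignment("rec-1", "http://example.org/vocab/c123", date(2012, 3, 5))
_, version, label = version_in_force(assignment.assigned_on)
```

Without the timestamp the same URI would silently pick up the label and meaning of whatever version happens to be current when the data is read.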
The meeting was concluded with a panel with representatives of three commercial companies involved in library systems and Linked Open Data developments: Ex Libris, OCLC and the afore-mentioned Logilab. The fact that this panel with commercial companies on library linked data took place was significant and important in itself, regardless of the statements that were made about the value and importance of Linked Open Data in library systems. After years of dedicated temporarily funded proof of concept projects this may be an indication that Linked Open Data in libraries is slowly becoming mainstream.
Posted on June 19th, 2014 2 comments
Lingering gold at ELAG 2014
Libraries tend to see themselves as intermediaries between information and the public, between creators and consumers of information. Looking back at the ELAG 2014 conference at the University of Bath however, I can’t get the image out of my head of libraries standing in the way between information and consumers. We’ve been talking about “inside out libraries”, “libraries everywhere”, “rethinking the library” and similar soundbites for some years now, but it looks like it’s been only talk and nothing more. A number of speakers at ELAG 2014 reported that researchers, students and other potential library visitors wanted the library to get out of their way and give them direct access to all data, files and objects. A couple of quotes:
- “We hide great objects behind search forms” (Peter Mayr, “EuropeanaBot”)
- “Give us everything” (Ben O’Steen, “The Mechanical Curator”).
[Lingering gold: data, objects]
In a cynical way this observation tightly fits this year’s conference theme “Lingering Gold”, which refers to the valuable information and objects hidden and locked away somewhere in physical and virtual local stores, waiting to be dug up and put to use. In her keynote talk, Stella Wisdom, digital curator at the British Library, gave an extensive overview of the digital content available there, and the tools and services employed to present it to the public. However, besides options for success, there are all kinds of pitfalls in attempting to bring local content to the world. In our performance “The Lord of the Strings”, Karen Coyle, Rurik Greenall, Martin Malmsten, Anders Söderbäck and I tried to illustrate that in an allegorical way, resulting in a ROADMAP containing guidelines for bringing local gold to the world.
In recent years it has become quite clear that data, dispersed and locked away in countless systems and silos, once liberated and connected can be a very valuable source of new information. This was very pertinently demonstrated by Stina Johansson in her presentation of visualization of research and related networks at Chalmers University using available data from a number of their information systems. Similar network visualizations are available in the VIVO open source linked data based research information tool, which was the topic of a preconference bootcamp which I helped organize (many thanks especially to Violeta Ilik, Gabriel Birke and Ted Lawless who did most of the work).
[Systems, apis, technology trap]
The point made here also implies that information systems actually function as roadblocks to full data access instead of as finding aids. I came to realize this some time ago, and my perception was definitely confirmed during ELAG 2014. In his lightning talk Rurik Greenall emphasized the fact that what we do in libraries and other institutions is actually technology driven. Systems define the way we work and what we publish. This should be the other way around. Even APIs, intended for access to data in systems without having to use end user system functions, are actually sub-systems, giving non-transparent views on the data. When Steve Meyer in his talk “Building useful and usable web services” said “data is the API” he was right in theory, yet in practice the reverse is not necessarily true. Also, APIs are meant to be used by developers in new systems. Non-tech end users have no use for them, as is illustrated by one of the main general reactions from researchers to the British Library Labs surveys, as reported by Ben O’Steen: “API? What’s that? I don’t care. Just give me the files.”
[Commercial vs open source]
This technology critique essentially applies to both commercial/proprietary and open source systems alike. However, it could be that open source environments are more favorable to open and findable data than proprietary ones. Felix Ostrowski talked about the reasons for and outcomes of the Regal project, moving the electronic objects repository of the State Library of Rheinland-Pfalz from an environment based on commercial software to one based on open source tools and linked data concepts. One of the side effects of this move was that complaints were received from researchers about their output being publicly available on the web. This shows that the new approach worked, that the old approach was effectively hiding information and that certain stakeholders are completely satisfied with that.
As an aside: one of the open source components of the new Regal environment is Fedora, only used for digital objects, not any metadata, which is exactly what is currently happening in the new repository project at the Library of the University of Amsterdam. A legitimate question asked by Felix: why use Fedora and not just the file system in this case?
All these observations also imply that, if libraries really want to disseminate and share their lingering gold with the world, alternative ways of exposing content are needed, instead of or besides the existing ones. Fortunately some libraries and individuals have been working on providing better direct access and even unguided and unsolicited publication of data and objects that might be available but not really findable with traditional library search tools. The above mentioned EuropeanaBot (and other twitter bots) and the British Library Labs’ Mechanical Curator are a case in point. Every hour EuropeanaBot sends a tweet about a random digital object, enriching it with extra information from Wikipedia and other sources.
In the case of the British Library Labs, Ben O’Steen described an experiment with free access to large amounts of data that by chance led to the observation that randomly excavated images from that vast amount of content drew people’s attention. As all content was in the public domain anyway, they asked themselves “what’s the harm in making it a bit more accessible?”. So the Mechanical Curator was born, with channels on tumblr, twitter and flickr.
Another alternative way to expose and share library content, a game, was presented by Ciaran Talbot and Kay Munro: LibraryGame. In brief, students are encouraged to use and visit the library and share library content with others by awarding them points and badges as members of an online community. The only two things students didn’t like about the name LibraryGame were “library” and “game”, so the name was changed to “BookedIn”.
No matter if you like bots and games or not, the important message here is that it is worthwhile exploring alternative ways by which people can find the content that libraries consider so valuable.
In the end, it’s people that libraries work for. At Utrecht University Library they realised that they needed simpler ways to make it possible for people to use their content, not only APIs. Marina Muilwijk described how they are experimenting with the Lean Startup method. In a continuous cycle of building, measuring and learning, simple applications are released to end users in order to test if they use them and how they react to them.
“Focus on the user” was also the theme of the workshop given by Ken Chad around the Jobs-to-be-done methodology.
Interestingly, “How people find” instead of: “How people search” was one of the perspectives of the Jisc “Spotlight on the Digital” project, presented by Owen Stephens in his lightning talk.
[Collections and findability]
Another perspective of that Jisc project was how to make collections discoverable. It turns out that collections as such are represented on the web quite well, whereas items in these collections aren’t.
Valentine Charles of The European Library demonstrated the benefits of collection level metadata for the discoverability of hidden content, using the CENDARI project as example.
What’s a library technology conference without linked data? Implicitly and explicitly the instrument of connecting data from different sources relates quite well to most of the topics presented around the theme of lingering gold, with or without the application of the official linked data rules. I have already mentioned most cases, I will only go into a couple of specific sessions here.
Niklas Lindström and Lina Westerling presented the developments with the new linked data based cataloguing system for the Swedish LIBRIS union catalogue. This approach is not simply a matter of exposing and consuming linked data, but in essence the reconstruction of existing workflows using a completely new architecture.
The data management and integration platform d:swarm, a joint open source project of SLUB State and University Library Dresden and the commercial company AvantgardeLabs was presented in a lightning talk by Jan Polowinski. This tool aims at harvesting and normalising data from various existing systems and datastores into an intermediate platform that in turn can be used for all kinds of existing and new front end systems and services. The concept looks very useful for library environments with a multitude of legacy systems. Some time ago I visited the d:swarm team in Dresden together with a group of developers from the KOBV library consortium in Berlin, two of whom (Julia Goltz and Viktoria Schubert) presented their own new K2 portal solution for the data integration challenge in a lightning talk.
Linked data is all about unique identifiers on the web. ORCiD, the recently popular global identifier for researchers and the topic of one of the workshops at last year’s ELAG, was explained by Tom Demeranville. As it happened, right after the conference it became clear that ORCiD had implemented the Turtle linked data format.
The problem of matching string based personal names from various data sources without matching identifiers was tackled in the workshop “Linking Data with sameAs” which I attended. Jane and Adrian Stevenson of the ArchivesHub UK showed us hands-on how to use tools like LOD-Refine and Silk for reconciling string value data fields and producing “sameAs” relationships/triples to be used in your local triple store. They have had substantial experience with this challenge in their Linking Lives project. I found the workshop very useful. One of the take-aways was that matching string data is hard work.
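To give an impression of why this is hard work, here is a minimal Python sketch of string based name matching that produces “sameAs” triples. It is not how LOD-Refine or Silk work internally; it only uses the standard library `difflib` module as a stand-in, and the `urn:example:` identifiers, the names and the threshold of 0.9 are assumptions for illustration. The only real identifier in it is the `owl:sameAs` property URI.

```python
import re
from difflib import SequenceMatcher
from typing import Dict, List, Tuple

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def normalise(name: str) -> str:
    """Lowercase, keep only alphabetic tokens and sort them.

    This makes 'Dickens, Charles, 1812-1870' and 'Charles Dickens' comparable.
    """
    tokens = re.findall(r"[^\W\d_]+", name.lower())
    return " ".join(sorted(tokens))


def similarity(a: str, b: str) -> float:
    """Similarity between 0.0 and 1.0 of the two normalised names."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def same_as_triples(local: Dict[str, str], external: Dict[str, str],
                    threshold: float = 0.9) -> List[Tuple[str, str, str]]:
    """Compare every local name with every external name and link close matches."""
    triples = []
    for local_uri, local_name in local.items():
        for external_uri, external_name in external.items():
            if similarity(local_name, external_name) >= threshold:
                triples.append((local_uri, OWL_SAME_AS, external_uri))
    return triples


local_names = {"urn:example:local:1": "Dickens, Charles, 1812-1870"}
external_names = {
    "urn:example:ext:a": "Charles Dickens",
    "urn:example:ext:b": "Charlotte Brontë",
}
links = same_as_triples(local_names, external_names)
```

A sketch like this will happily link two different people who share a name, which is exactly why the results need human review and why dedicated reconciliation tools offer far more control than a single threshold.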
Hard work also goes on in the caves and basements of the library world, as was demonstrated by Toke Eskildsen in his war stories of the Danish State Library with scanning companies, and by Eva Dahlbäck and Theodor Tolstoy in their account of using smartphones and RFID technology in fetching books from the stacks.
Once again I have to say that a number of unofficial sessions, at breakfast, dinner, in pubs and hotel bars, were much more informative than the official presentations. These open discussions in small groups, fostering free exchange of ideas without fear of embarrassment, while being triggered by the talks in the official programme, can simply not take place within a tight conference schedule. Nevertheless, ELAG is a conference small and informal enough to attract people inclined to these extracurricular activities. I thank everybody who engaged in this. You know who you are. Or check Rurik Greenall’s conference report, which is a very structured yet personal account of the event.
Lots of thanks to the dedicated and very helpful local organisation team of the Library of the University of Bath, who have done a wonderful job doing something completely new to them: organising an international conference.
Posted on November 11th, 2013 4 comments
Using discovery tools for presenting integrated information
There has been a lot of discussion in recent years about library discovery tools. Basically, a library discovery tool provides a centrally maintained shared scholarly material metadata index, a system for searching and an option for adding a local metadata index. Academic libraries use it for providing a unified access platform to subscribed and open access databases and ejournals as well as their own local print and digital holdings.
I would like to put forward that, despite their shortcomings, library discovery tools can also be used for finding and presenting other scholarly information in the broadest sense. Libraries should look beyond the narrow focus on limitations and turn imperfection into benefits.
The two main points of discussion regarding discovery tools are the coverage of the central shared index and relevance ranking. For a number of reasons of a practical, technical and competitive nature, none of the commercial central indexes cover all the content that academic libraries may subscribe to. Relevance ranking of search results depends on so many factors that it is a science in itself to satisfy each and every end user with their own specific background and context. Discovery tool vendors spend a lot of energy in improving coverage and relevance ranking.
These two problems are the reason that not many academic libraries have been able to achieve the one-stop unified scholarly information portals for their staff and students that discovery tool providers promised them. In most cases the institutional discovery portal is just one of the solutions for finding scholarly publications that are offered by the library. A number of libraries are reconsidering their attitude towards discovery tools, or have even decided to renounce these tools altogether and focus on delivery instead, leaving discovery to external parties like Google Scholar.
I fully support the idea that libraries should reconsider their attitude towards discovery tools, but I would like to stress that they should do so with a much broader perspective than just the traditional library responsibility of providing access to scholarly publications. Libraries must not throw away the baby with the bathwater. They should realise that a discovery tool can be used as a platform for presenting connected scholarly information, for instance publications with related research project information and research datasets, based on linked open data principles. You could call this the “poor person’s linked open data platform”, because the library has already paid the license fee for the discovery platform, and it does not have to spend a lot of extra money on additional linked open data tools and facilities.
Of course this presupposes a number of things: the content to be connected should have identifiers, preferably in the form of URIs, and should be openly available for reuse, preferably via RDF. The discovery tools should be able to process URIs and RDF and present the resolved content in their user interfaces. We all know that this is not the case yet. Long term strategies are needed.
Content providers must be convinced of the added value of adding identifiers and URIs to their metadata and providing RDF entry points. In the case of publishers of scholarly publications this means identifiers/URIs for the publications themselves, but also for authors, contributors, organisations, related research projects and datasets. A number of international associations and initiatives are already active in lobbying for these developments: OpenAIRE, Research Data Alliance, DataCite, the W3C Research Object for Scholarly Communication Community Group, etc. Universities themselves can contribute by adding URIs and RDF to their own institutional repositories and research information systems. Some universities are implementing special tools for providing integrated views on research information based on linked data, such as VIVO.
There are also many other interesting data sources that can be used to integrate information in discovery tools, for instance in the government and cultural heritage domain. Many institutions in these areas already provide linked open data entry points. And then there is Wikipedia with its linked open data interface DBpedia.
On the other side of the scale discovery tool providers must be convinced of the added value of providing procedures for resolving URIs and processing RDF in order to integrate information from internal and external data sources into new knowledge. I don’t know of any plans for implementing linked open data features in any of the main commercial or open source discovery tools, except for Ex Libris’ Primo. OCLC provides a linked data section for each WorldCat search result, but that is mainly focused on publishing their own bibliographic metadata in linked data format, using links to external subject and author authority files. This is a positive development, but it’s not consumption and reuse of external information in order to create new integrated knowledge beyond the bibliographic domain.
With the joint IGeLU/ELUNA Linked Open Data Special Interest Working Group the independent Ex Libris user groups have been communicating with Ex Libris strategy and technology management on the best ways to implement much needed linked open data features in their products. The Primo discovery tool (with the Primo Central shared metadata index) is one of the main platforms in focus. Ex Libris is very keen on getting actual use cases and scenarios in order to identify priorities in going forward. We have been providing these for some time now through publications, presentations at user group conferences, monthly calls and face to face meetings. Ex Libris is also exploring best practices for the technical infrastructure to be used and is planning pilots with selected customers.
The Austrian national library service OBVSG, for instance, has integrated Wikipedia/DBpedia information about authors in their Primo results.
The Saxon State and University Library Dresden (SLUB) has implemented a multilingual semantic search tool for subjects based on DBpedia in their Primo installation.
At the University of Amsterdam I have been experimenting myself with linking publications from our Institutional Repository (UvA DARE) in Primo with related research project information. This has for now resulted in adding extra external links to that information in the Dutch National Research portal NARCIS, because NARCIS doesn’t provide RDF yet. We are communicating with DANS, the NARCIS provider, about extending their linked open data features for this purpose.
Of course all these local implementations can serve as use cases for discovery tool providers.
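To make the consuming side a bit more concrete, here is a minimal sketch in Python of what "processing RDF" for an author enrichment could look like. It is not how any of the implementations above actually work. It assumes the RDF for an author URI has already been retrieved as N-Triples; the `example.org` URIs, the names and the helper functions `parse_ntriples` and `describe` are all invented for illustration, and the parser deliberately handles only the simplest triples.

```python
import re

# Hypothetical N-Triples, standing in for what resolving an author URI
# might return. All example.org URIs and literal values are invented.
SAMPLE_NTRIPLES = """\
<http://example.org/author/1> <http://www.w3.org/2000/01/rdf-schema#label> "Jane Example"@en .
<http://example.org/author/1> <http://www.w3.org/2000/01/rdf-schema#label> "Jane Voorbeeld"@nl .
<http://example.org/author/1> <http://xmlns.com/foaf/0.1/homepage> <http://example.org/jane> .
<http://example.org/author/2> <http://www.w3.org/2000/01/rdf-schema#label> "Someone Else"@en .
"""

# One simple triple per line: <subject> <predicate> and then either <uri>
# or "literal" with an optional @lang tag. Simplified on purpose: no blank
# nodes, no datatypes, no escaped quotes inside literals.
TRIPLE = re.compile(
    r'^<([^>]+)>\s+<([^>]+)>\s+(?:<([^>]+)>|"([^"]*)"(?:@([A-Za-z-]+))?)\s*\.\s*$'
)


def parse_ntriples(text):
    """Return a list of (subject, predicate, object, language) tuples."""
    triples = []
    for line in text.splitlines():
        match = TRIPLE.match(line.strip())
        if match:
            subj, pred, obj_uri, literal, lang = match.groups()
            obj = obj_uri if obj_uri is not None else literal
            triples.append((subj, pred, obj, lang))
    return triples


def describe(uri, triples, lang="en"):
    """Collect the values for one subject URI, grouped by predicate.

    Literals tagged with a language other than `lang` are skipped;
    URI objects and untagged literals are always kept.
    """
    result = {}
    for subj, pred, obj, obj_lang in triples:
        if subj != uri:
            continue
        if obj_lang is not None and obj_lang != lang:
            continue
        result.setdefault(pred, []).append(obj)
    return result


author_info = describe("http://example.org/author/1", parse_ntriples(SAMPLE_NTRIPLES))
print(author_info)
```

A discovery tool would then render the collected values next to the search result. In a real setting a full RDF library would be a safer choice than a hand-written parser.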
I have only talked about the options of using discovery tools as a platform for consuming, reusing and presenting external linked open data, but I can imagine that a discovery tool can also be used as a platform for publishing linked open data. It shouldn’t be too hard to add RDF as an extra output format alongside the existing HTML and internal record formats. That way libraries could have a full linked open data consumption and publishing workbench at their disposal at minimal cost. Library discovery tools would from then on be known as information discovery tools.
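As a rough illustration of what such an extra RDF output could involve, the sketch below turns a simple record into N-Triples using Dublin Core terms. The record structure, the `example.org` URIs and the function names are assumptions made for this example, not the internal format or API of any actual discovery tool.

```python
# Namespace of the Dublin Core terms vocabulary.
DCTERMS = "http://purl.org/dc/terms/"


def escape_literal(value):
    """Escape backslashes, double quotes and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def record_to_ntriples(record_uri, record):
    """Serialise a record as N-Triples, one line per field value.

    `record` maps Dublin Core term names (e.g. "title") to lists of values.
    Values that start with http:// or https:// are written as URIs,
    everything else as plain literals.
    """
    lines = []
    for field, values in record.items():
        for value in values:
            if value.startswith(("http://", "https://")):
                obj = f"<{value}>"
            else:
                obj = f'"{escape_literal(value)}"'
            lines.append(f"<{record_uri}> <{DCTERMS}{field}> {obj} .")
    return "\n".join(lines)


# Hypothetical record; the URIs and values are invented.
example_record = {
    "title": ["An example title"],
    "creator": ["http://example.org/author/1"],
}
print(record_to_ntriples("http://example.org/record/42", example_record))
```

Writing the creator as a URI rather than as a name string is what makes the published record linkable from elsewhere.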
Posted on December 21st, 2010 43 comments
Mobile services have to fulfill information needs here and now
Like many other libraries, the Library of the University of Amsterdam released a mobile web app this year. For background information about why and how we did it, have a look at the slideshow my colleague Roxana Popistasu and I gave at the IGeLU 2010 conference.
For now I want to have a closer look at the actual reception and use of our mobile library services and draw some conclusions for the future. I have expressed some expectations earlier about mobile library services in my post “Mobile library services”. In summary, I expected that the most valued mobile library services would be of a practical nature, directly tied to the circumstances of internet access ‘any time, anywhere’, and would not include reading and processing of electronic texts.
Let me emphasise that I define mobile devices as smart phones and similar small devices that can be carried around literally any time anywhere, and that need dedicated apps to be used on a small touchscreen. So I am not talking about tablets like the iPad, which are large enough to be used with standard applications and websites, just like netbooks.
As you can see, most, if not all of the services in the Library of the University of Amsterdam mobile app are of a practical nature: opening hours, locations, contact information, news. And of course there is a mobile catalogue. This is the general situation in mobile library land, as has been described by Aaron Tay in his blog post “What are mobile friendly library sites offering? A survey”.
In my view these practical services are not really library services. They are learning or study centre services at best. There is no difference with practical services offered by other organisations like municipal authorities or supermarkets. Nothing wrong with that of course, they are very useful, but I don’t consider these services to be core library services, which would involve enabling access to content.
Real mobile devices are simply too small to be used for reading and processing large bodies of scholarly text. This might be different for public libraries. Their customers may appreciate being able to read fiction on their smart phones, provided that publishers allow them to read ebooks via libraries at all.
Even a mobile library catalogue can be considered a practical service intended to fulfill practical needs of a physical nature, like finding and requesting print books and journals to be delivered to a specific location and renewing loans to avoid paying fines. Let’s face it: an Integrated Library System is basically nothing more than an inventory and logistics management system for physical objects.
Usage statistics of the Library of the University of Amsterdam mobile web app show that between the launch in April and November 2010 the number of unique visits hovers around 30 per day on average, with a couple of peaks (350) on two specific days in October. The full website shows around 6000 visits per day on normal weekdays.
For the mobile catalogue this is between 30 and 50 visits per day. The full OPAC shows around 3000 visits on normal weekdays.
In November we see a huge increase in usage. Our killer mobile app was introduced: an overview of currently available workstations per location. The number of unique visits rises to between 300 and 400 a day. The number of pageviews rises from under 100 per day to around 1000 on weekdays in November. The ‘available workstations’ service accounts for 80% of these. In December 2010, an exam period, these figures rise to around 2000 pageviews per day, with 90% for the ‘available workstations’ service.
We can safely conclude that our students are mainly using our mobile library app on their smart phones to locate the nearest available desktop PC.
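The arithmetic behind this conclusion can be worked through with the rounded figures quoted above. This is only a back-of-the-envelope sketch: the inputs are the approximate daily pageviews and percentages from this post, and the function name is made up for the example.

```python
# Approximate daily pageviews and 'available workstations' share (in percent),
# as quoted in the post for November and December 2010.
FIGURES = {
    "November 2010": (1000, 80),
    "December 2010": (2000, 90),
}


def split_pageviews(pageviews, workstation_percent):
    """Split daily pageviews into (workstation service, all other services)."""
    workstation = pageviews * workstation_percent // 100
    return workstation, pageviews - workstation


for month, (pageviews, percent) in FIGURES.items():
    workstation, other = split_pageviews(pageviews, percent)
    print(f"{month}: {workstation} workstation pageviews, {other} for everything else")
```

With these rounded inputs, the workstation service goes from roughly 800 to roughly 1800 pageviews per day, while all other services together stay at about 200 per day in both months.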
Mobile users expect services that are useful to them here and now.
What does this mean for core library services, aimed at giving access to content, on small mobile devices? I think that there is no future for providing mobile access on smart phones to traditional library content in digital form: electronic articles and ebooks. I agree with Aaron Tay when he says “I don’t believe there is any reason to think that it will necessarily lead to high demand for library mobile services” in his post “A few heretical thoughts about library tech trends”.
Rather, mobile services should provide information about specific subjects useful to people here and now.
In the near future anybody interested in a specific physical object or location will have access via their location aware smart phones and augmented reality to information of all kinds (text, images, sound, video, maps, statistics, etc.) from a number of sources: museums, archives, government agencies, maybe even libraries. To make this possible it is essential that all these organisations publish their information as linked open data. This means: under an open license, using a generic linked data standard like RDF.
I expect that consumers of this new type of mobile location based augmented linked information would appreciate some guidance in the possibly overwhelming information landscape, in the form of specific views, with preselection of information sources and their context taken into account.
There may be an opportunity here for libraries, especially public libraries, taking on a new coordinating role as information brokers at the intersection of a large number of different information providers. Of course if libraries want to achieve that, they need to look beyond their traditional scope and invest more in new information technologies, services and expertise.
The future of mobile information services lies in the combination of location awareness, augmented reality and linked open data. Maybe libraries can help.