Library2.0 and beyond
RSS icon Home icon
  • Local library data in the new global framework

    Posted on January 5th, 2012 Lukas Koster 34 comments

    2011 has in a sense been the year of library linked data. Not that libraries of all kinds are now publishing and consuming linked data in great numbers. No. But we have witnessed the publication of the final report of the W3C Library Linked Data Incubator Group, the Library of Congress announcement of the new Bibliographic Framework for the Digital Age based on Linked Data and RDF, the release by a number of large libraries and library consortia of their bibliographic metadata, many publications, sessions and presentations on the subject.

    All these events focus mainly on publishing library bibliographic metadata as linked open data. Personally I am not convinced that this is the most interesting type of data that libraries can provide. Bibliographic metadata as such describe publications, in the broadest sense, providing information about title, authors, subjects, editions, dates, urls, but also physical attributes like dimensions, number of pages, formats, etc. This type of information, in FRBR terms: Work, Expression and Manifestation metadata, is typically shared among a large number of libraries, publishers, booksellers, etc. ‘Shared’ in this case means ‘multiplied and redundantly stored in many different local systems‘. It doesn’t really make sense if all libraries in the world publish identical metadata side by side, does it?

    In essence only really unique data is worth publishing. You link to the rest.

    Currently, library data that is really unique and interesting is administrative information about holdings and circulation. After having found metadata about a potentially relevant publication it is very useful for someone to know how and where to get access to it, if it’s not freely available online. Do you need to go to a specific library location to get the physical item, or to have access to the online article? Do you have to be affiliated to a specific institution to be entitled to borrow or access it?

    Usage data about publications, both print and digital, can be very useful in establishing relevance and impact. This way information seekers can be supported in finding the best possible publications for their specific circumstances. There are some interesting projects dealing with circulation data already, such as the research project by Magnus Pfeffer and Kai Eckert as presented at the SWIB 11 conference, and the JISC funded Library Impact Data project at the University of Huddersfield. The Ex Libris bX service presents article recommendations based on SFX usage log analysis.

    The consequence of this assertion is that if libraries want to publish linked open data, they should focus on holdings and circulation data, and for the rest link to available bibliographic metadata as much as possible. It is to be expected that the Library of Congress’ New Bibliographic Framework will take care of that part one way or another.

    In order to achieve this libraries should join forces with each other and with publishers and aggregators to put their efforts into establishing shared global bibliographic metadata pools accessible through linked open data. We can think of already existing data sources like WorldCat, OpenLibrary, Summon, Primo Central and the like. We can only hope that commercial bibliographic metadata aggregators like OCLC, SerialsSolutions and Ex Libris will come to realise that it’s in everybody’s interest to contribute to the realisation of the new Bibliographic Framework. The recent disagreement between OCLC and the Swedish National Library seems to indicate that this may take some time. For a detailed analysis of this see the blog post ‘Can linked library data disrupt OCLC? Part one’.

     

    An interesting initiative in this respect is LibraryCloud, an open, multi-library data service that aggregates and delivers library metadata. And there is the HBZ LOBID project, which is targeted at ‘the conversion of existing bibliographic data and associated data to Linked Open Data‘.

    So what would the new bibliographic framework look like? If we take the FRBR model as a starting point, the new framework could look something like this. See also my slideshow “Linked Open Data for libraries”, slides 39-42.

    The basic metadata about a publication or a unit of content, on the FRBR Work level, would be an entry in a global datastore identified by a URI ( Uniform Resource Identifier). This datastore could for instance be WorldCat, or OpenLibrary, or even a publisher’s datastore. It doesn’t really matter. We don’t even have to assume it’s only one central datastore that contains all Work entries.

    The thing identified by the URI would have a text string field associated with it containing the original title, let’s say “The Da Vinci Code” as an example of a book. But also articles can and should be identified this way. The basic information we need to know about the Work would be attached to it using URIs to other things in the linked data web. A set of two things linked by a URI is called a ‘triple’. ‘Author’ could for instance be a link to OCLC’s VIAF (http://viaf.org/viaf/102403515 = Dan Brown), which would then constitute a triple. If there are more authors, you simply add a URI for every person or institution. Subjects could be links to DBPedia/Wikipedia, Freebase, the Library of Congress Authority files, etc. There could be some more basic information, maybe a year, or a URI to a source describing the background of the work.

    At the Expression level, a Dutch translation would have it’s own URI, stored in the same or another datastore. I could imagine that the publisher who commissioned the translation would maintain a datastore with this information. Attached to the Expression there would be the URI of the original Work, a URI pointing to the language, a URI identifying the translator and a text string contaning the Dutch title, among others.

    Every individual edition of the work could have it’s own Manifestation level URI, with a link to the Expression (in this case the Dutch translation), a publisher URI, a year, etc. For articles published according to the long standing tradition of peer reviewed journals, there would also be information about the journal. On this level there should also be URIs to the actual content when dealing with digital objects like articles, ebooks, etc., no matter if access is free or restricted.

    So far we have everything we need to know about publications “in the cloud”, or better: in a number of datastores available on a number of servers connected to the world wide web. This is more or less the situation described by OCLC’s Lorcan Dempsey in his recent post ‘Linking not typing … knowledge organization at the network level’. The only thing we need now is software to present all linked information to the user.

    No libraries in sight yet. For accessing freely available digital content on the web you actually don’t need a library, unless you need professional assistance finding the correct and relevant information. Here we have identified a possible role of librarians in this new networked information model.

    Now we have reached the interesting part: how to link local library data to this global shared model? We immediately discover that the original FRBR model is inadequate in this networked environment, because it implies a specific local library situation. Individual copies of a work (the Items) are directly linked to the Manifestation, because FRBR refers to the old local catalogue which describes only the works/publications one library actually owns.

    In the global shared library linked data network we need an extra explicit level to link physical Items owned by the library or online subscriptions of the library to the appropriate shared network level. I suggest to use the “Holding” level. A Holding would have it’s own URI and contain URIs of the Manifestation and of the Library. A specific Holding in this way would indicate that a specific library has one or more copies (Items) of a specific edition of a work (Manifestation), or offers access to an online digital article by way of a subscription.

     

    If a Holding refers to physical copies (print books or journal issues for instance) then we also need the Item level. An Item would have it’s own URI and the URI of the Holding. For each Item, extra information can be provided, for instance ‘availability’, ‘location’, etc. Local circulation administration data can be registered for all Holdings and Items. For online digital content we don’t need Items, only subscription information directly attached to the Holding.

    Local Holding and Item information can reside on local servers within the library’s domain or just as well on some external server ‘in the cloud’.

    It’s on the level of the Holding that usage statistics per library can be collected and aggregated, both for physical items and for digital material.

    Now, this networked linked library data model still allows libraries to present a local traditional catalogue type interface, showing only information about the library’s own print and digital holdings. What’s needed is software to do this using the local Holdings as entry level.

    But the nice thing about the model is that there will also be a lot of other options. It will also be possible to start at the other end and search all bibliographic metadata available in the shared global network, and then find the most appropriate library to get access to a specific publication, much like WorldCat does, but on an even larger scale.

    Another nice thing of using triples, URIs and linked data, is that it allows for adding all kinds of other, non-traditional bibliographic links to the old inward looking library world, making it into a flexible and open model, ready for future developments. It will for instance be possible for people to discover links to publications and library holdings from any other location on the web, for instance a Wikipedia page or a museum website. And the other way around, from an item in local library holdings to let’s say a recorded theatre performance on YouTube.

    When this new data and metadata framework will be in place, there will be two important issues to be solved:

    • Getting new software, systems and tools for both back end administrative functions and front end information finding needs. For this we need efforts from traditional library systems vendors but also from developers in libraries.
    • Establishing future roles for libraries, librarians and information professionals in the new framework. This may turn out to be the most important issue.
    Share
  • FRBR outside the box

    Posted on September 2nd, 2011 Lukas Koster 11 comments
    Shifting focus from information carriers back to information
    This blog post is based on a presentation I did at Datasalon 6 in Brussels, January 21, 2011.

    © TheArtGuy

    Library catalogues have traditionally been used to describe and register books and journals and other physical objects that together constitute the holdings of a library. In an integrated library system (ILS), the public catalogue is combined with acquisition and circulation modules to administer the purchases of book copies and journal subscriptions on one side, and the loans to customers on the other side. The “I” for “Integrated” in ILS stands for an internal integration of traditional library workflows. Integration from a back end view, not from a customer perspective.

    Because of the very nature of such a catalogue, namely the description of physical objects and the administration of processing them, there are no explicit relations between the different editions and translations of the same book, nor are there descriptions of individual journal articles. If you do a search on a specific person’s name, you may end up with a large number of result records, written by that person or someone with a similar name, or about that person, even with identical titles, without knowing if there is a relationship between them, and what that relationship might be. What’s certain is that you will not find journal articles written by or about that person. The same applies to a search on title. There is no way of telling if there is any relation between identical titles. A library catalogue user would have to look at specific metadata in the records (like MARC 76X-78X – Linking Entries534 – Original Version Note or 580 – Linking Entry Complexity Note), if available, to reach their own conclusions.

    Most libraries nowadays also purchase electronic versions of books and journals (ebooks and ejournals) and have free or paid subscriptions to online databases. Sometimes these digital items (ebooks, ejournals and databases) are also entered into the traditional library catalogues, but they are sometimes also made available through other library systems, like federated search tools, integrated discovery tools, A-Z lists, etc. All kinds of combinations occur.

    In traditional library catalogues digital items are treated exactly the same as their physical counterparts. They are all isolated individual items without relations. As Karen Coyle put it November 2010 at the SWIB10 conference: “The main goal of cataloguing today is to keep things apart” .
    Basically, integrated library systems and traditional catalogues are nothing more than inventory and logistics systems for physical objects, mainly focused on internal workflows. Unfortunately in newer end user interfaces like federated search and integrated discovery tools the user experience in this respect has in general been similar to that of traditional public catalogues.

    At some point in time during the rise of electronic online catalogues, apparently the lack of relations between different versions of the same original work became a problem. I’m not sure if it was library customers or librarians who started feeling the need to see these implicit connections made explicit. The fact is that IFLA (International Federation of Library Associations) started developing FRBR in 1998.

    FRBR (Functional Requirements for Bibliographic Records) is an attempt to provide a model for describing the relations between physical publications, editions, copies and their common denominator, the Work.

    FRBR Model © Library of Congress/Barbara Tillett

    FRBR Group 1 describes publications in terms of the entities Work, Expression, Manifestation and Item (WEMI).
    FRAD (Functional Requirements for Authority Data – ‘authors’) and FRSAD (Functional Requirements for Subject Authority Data – ‘subjects’) have been developed later on as alternatives for the FRBR Group 2 and 3 entities.

    Anne Frank's Diary

    As an example let’s have a look at The Diary of Anne Frank. The original handwritten diary may be regarded as the Work. There are numerous adaptations and translations (Expressions) of the original unfinished and unedited Work. Each of these Expressions can be published in the form of one or more prints, editions, etc. These are the Manifestations, especially if they have different ISBN’s. Finally a library can have one or more physical copies of a Manifestation, the Items.

    Some might even say the actual physical diary is the only existing Item embodying one specific (the first) Expression of the Work (Anne’s thoughts) and/or the only Manifestation of that Expression.

    Of course, this model, if implemented, would be an enormous improvement to the old public  catalogue situation. It makes it possible for library customers to have an automatic overview of all editions, translations, adaptations of one specific original work through the mechanism of Expressions and Manifestations. RDA (Resource Description and Access) is exactly doing this.
    However there are some significant drawbacks, because the FRBR model is an old model, based on the traditional way of library cataloguing of physical items (books, journals, and cd’s, dvd’s), etc. (Karen Coyle at SWIB10).

    • In the first place the FRBR model only shows the Works and related Manifestations and Expressions of physical copies (Items) that the library in question owns. Editions not in the possession of the library are ignored. This would be a bit different in a union catalogue of course, but then the model still only describes the holdings of the participating libraries.
    • Secondly, the focus on physical copies is also the reason that the original FRBR model does not have a place for journal titles as such, only for journal issues. So there will be as many entries for one journal as the library has issues of it.
    • Thirdly, it’s a hierarchical model, which incorporates only relations from the Work top down. There is no room for relations like: ‘similar works’, ‘other material on the same subject’, ‘influenced by’, etc.
    • In the fourth place, FRBR still does not look at content. It is document centric, instead of information centric. It does however have the option for describing parts of a Work, if they are considered separate entities/works, like journal articles or volumes of a trilogy.
    • Finally, the FRBR Item entity is only interesting in a storage and logistics environment for physical copies, such as the Circulation function in libraries, or the Sales function in bookstores. It has no relation to content whatsoever.

    FRBR definitely is a positive and necessary development, but it is just not good enough. Basically it still focuses on information carriers instead of information (it’s a set of rules for managing Bibliographic Records, not for describing Information). It is an introverted view of the world. This was OK as long as it was dictated by the prevailing technological, economical and social conditions.
    In a new networked digital information world libraries should shift their focus back to their original objective: being gateways to information as such. This entails replacing an introverted hierarchical model with an extroverted networked one, and moving away from describing static information aggregates in favour of units of content as primary objects.

    The linked data concept provides the framework of such a networked model. In this model anything can be related to anything, with explicit declarations of the nature of the relationship. In the example of the Diary of Anne Frank one could identify relations with movies and theater plays that are based on the diary, with people connected to the diary or with the background of World War 2, antisemitism, Amsterdam, etc.

    Unlinked data

    In traditional library catalogues defining relations with movies or theater plays is not possible from the description of the book. They could however be entered as a textual reference in the description of a movie, if for instance a DVD of that movie is catalogued. Relations to people, World War 2, antisemitism and Amsterdam would be described as textual or coded references to a short concept description, which in turn could provide lists of other catalogue items indexed with these subjects.
    In a networked linked data model these links could connect to information entities in their own right outside the local catalogue, containing descriptions and other material about the subject, and providing links to other related information entities.

    FRBR would still be a valuable part of such a universal networked model, as a subset for a specific purpose. In the context of physical information carriers it is a useful model, although with some missing features, as described above. It could be used in isolation, as originally designed, but if it’s an open model, it would also provide the missing links and options to describe and find related information.

    Also, the FRBR model is essential as a minimal condition for enabling links from library catalogue items to other entity types through the Work common denominator.

    In a completely digital information environment, the model could be simplified by getting rid of the Item entity. Nobody needs to keep track of available copies of online digital information, unless publishers want to enforce the old business models they have been using in order to keep making a profit. Ebooks for instance are essentially Expressions or Manifestations, depending on their nature, as I stated in my post ’Is an e-book a book?’.

    The FRBR model can be used and is used also in other subject areas, like music, theater performances, etc. The Work – Expression – Manifestation – Item hierarchy is applicable to a number of creative professions.

    The networked model provides the option of describing all traditional library objects, but also other and new ones and even objects that currently don’t exist, because it is an open and adaptable model.
    In the traditional library models it is for instance impossible, or at least very hard, to describe a story that continues through all volumes of a trilogy as a central thread, apart from and related to the descriptions of the three separate physical books and their own stories. In the Millennium trilogy by Stieg Larsson, Lisbeth Salander’s life story is the central thread, but it can’t be described as a separate “Work” in MARC/FRBR/RDA because it is not the main subject of one physical content carrier (unless we are dealing with an edition in one physical multi part volume). The three volumes will be described with the subjects ‘Missing girl mystery‘, ‘Sex trafficking‘ and ‘Illegal secret service unit‘ respectively.

    In an open networked information model on the contrary it would be entirely possible to describe such a ‘roaming story’.

    Millennium trilogy and FRBR

    New forms of information objects could appear in the form of new types of aggregates, other than books or journal articles, for instance consisting of text, images, statistics and video, optionally of a flexible nature (dynamic instead of static information objects).

    Existing library systems (ILS’s and Integrated Discovery tools  alike), using bibliographic metadata formats and frameworks like MARC, FRBR and RDA, can’t easily deal with new developments without some sort of workaround. Obviously this means that if libraries want to continue playing a role in the information gateway world, they need completely different systems and technology. Library system vendors should take note of this.

    Finally, instead of only describing information objects, libraries could take up a new role in creating new objects, in the form of subject based virtual information aggregates, like for instance the Anne Frank Timeline, or Qwiki.This would put libraries back in the center of the information access business.

    See also
    http://dynamicorange.com/2009/11/11/bringing-frbr-down-to-earth/
    http://www.slideshare.net/SarahBartlett/what-place-for-libraries-in-a-linked-data-world
    http://kcoyle.blogspot.com/2011/08/models-of-bibliographic-data.html

    Share
  • Missing links

    Posted on March 28th, 2011 Lukas Koster 1 comment

    The challenges of generating linked data from legacy databases

    © extranoise

     

    Some time ago I wrote a blog post about the linked data proof of concept project I am involved in, connecting bibliographic metadata from the OPAC of the Library of the University of Amsterdam with the theatre performances database maintained by the Theatre Institute of The Netherlands.
    I ended that post with a list of next steps to take:

    • select/adapt/create a vocabulary for the Production/Performance subject area
    • select/adapt/create vocabularies for Persons (FOAF?) and Subjects (SKOS?)
    • add internal relationships with the other entities (Play, Production, etc.) in the JSON structure (implement RDF in JSON)
    • Add RDF/XML as output option, besides JSON
    • add external relationships (to other linked data sources like DBPedia, etc.)
    • extend the number of possible URI formats (for Play, Production, etc.)
    • add content negotiation to serve both human and machine readable redirects
    • extend the options on the OPAC side
    • publish UBA bibliographic data as linked open data (probably an entirely new project)

    So, what have we achieved so far? I can be brief about all the ‘real’ linked data stuff (RDF, vocabularies, external links, content negotiation): we are not there yet. This will be dealt with in the next phase.
    Instead, we have focused on getting the simple JSON implementation right, both on the data publishing side and on the data using side. We have added more URIs and internal relationships, and we are using these in the OPAC user interface.
    But we have also encountered a number of crucial problems that are in my view inherent to the type of legacy data models used in libraries and cultural heritage institutions.

    Theatre Production data in the Library Catalogue

     

    Progress

    First let me describe the improvements we have added so far.

    The URI for ‘person<baseurl>/person/<personname> now also returns a link to all the ‘titles’ that person is connected to (not only with the ‘author’ role, but for all roles, like director, performer, etc.): <baseurl>/gettitles/<personname>. This link will return a set of URIs of the form <baseurl>/title/<personname>/<title>. The /<personname>/<title> bit is at the moment the only way that a more or less unique identifier can be constructed from the OPAC metadata for the ‘play’ in the TIN database. There are a number of really important problems related to this that I will discuss below.

    The URI:

    <baseurl>/person/Beckett, Samuel

    returns among others:

    /title/Beckett, Samuel/Waiting for Godot
    /title/Beckett, Samuel/En attendant Godot
    /title/Beckett, Samuel/Endgame
    etc.

    The URI for a ‘play<baseurl>/title/<personname>/<title> now returns a set of ‘production’ URIs of the form <baseurl>/production/<personname>/<title>/<openingdate>/<idnr>.
    The ‘production’ URI returns information about ‘theatre company’, ‘venue‘ and all persons connected to that production, including their URIs, and when available also a link to an image of a poster, and a video.

    The URI

    <baseurl>/title/Beckett, Samuel/Waiting for Godot

    returns:

    /production/Beckett, Samuel/Waiting for Godot/1988-07-28/5777
    /production/Beckett, Samuel/Waiting for Godot/1988-11-22/6750
    /production/Beckett, Samuel/Waiting for Godot/1992-04-16/10728
    /production/Beckett, Samuel/Waiting for Godot/1981-02-18/43032

    The last ‘production’ URI returns:

    “name”:”Beckett, Samuel”,
    “title”:”Waiting For Godot”,
    “opening”:”1981-02-18″,
    “people”:
    “description”:”Beckett, Samuel (auteur: toneelspel van)”,
    “uri”:”/person/Beckett, Samuel”

    “description”:”Hartnett, John (regie)”,
    “uri”:”/person/Hartnett, John”

    “description”:”Muller, Frans (decor: ontwerp)”,
    “uri”:”/person/Muller, Frans”

    “description”:”Newell, Kym (licht: ontwerp)”,
    “uri”:”/person/Newell, Kym”

    “description”:”Zaal, Kees (geluid)”,
    “uri”:”/person/Zaal, Kees”

    “description”:”Tolstoj, Alexander (uitvoerende: Lucky)”,
    “uri”:”/person/Tolstoj, Alexander”

    “description”:”Weeks, David (uitvoerende: Estragon)”,
    “uri”:”/person/Weeks, David”

    “description”:”Coburn, Grant (uitvoerende: Vladimir)”,
    “uri”:”/person/Coburn, Grant”

    “description”:”Evans, Rhys (uitvoerende: Pozzo)”,
    “uri”:”/person/Evans, Rhys”

    “description”:”Geiringer, Karl (uitvoerende: A Boy)”,
    “uri”:”/person/Geiringer, Karl”

    “description”:”Guidi, Peter (uitvoering muziek)”,
    “uri”:”/person/Guidi, Peter”

    “description”:”Kimmorley, Roxanne (uitvoering muziek)”,
    “uri”:”/person/Kimmorley, Roxanne”

    “description”:”Vries, Hessel de (uitvoering muziek)”,
    “uri”:”/person/Vries, Hessel de”

    “description”:”Phillips, Margot (uitvoering muziek)”,
    “uri”:”/person/Phillips, Margot”

     

    Challenges/problems

    Now, the problems (or challenges) that we are facing here are essential to the core concept of linked data:

    • we don’t have actual matching unique identifiers (URIs)
    • we don’t have explicit internal relations with a common entity in both sources
    • part of the data consists of literal strings in a specific language

    These three problems are interrelated, they are linked problems, so to speak.

     

    Missing identifiers

    To start with the identifiers. Of course we have internal system identifiers in our local Aleph catalogue database. Because we contribute to the Dutch Union Catalogue (originally a PICA system, now OCLC), our bibliographic records also have national Dutch PICA identifiers. And because the Dutch Union Catalogue records are copied to WorldCat, these records in WorldCat also have OCLC numbers.
    Also the Theatre Institute has internal system identifiers in their Adlib database. But at the moment we do not have a match between these separate internal identifier schemes. The Theatre Production database records are not in WorldCat because they’re not bibliographic records.
    We are more or less forced to use the string values of the title and author fields to construct a usable URI, on both sides. Clearly this is the basis of lots of errors, because of the great number of possible variations in author and title descriptions.
    But even if the Theatre Institute’s records were in the Union Catalogue or WorldCat as well, then we still would not have an automatic match without some kind of broker mechanism ascertaining that the library catalogue record describes the same thing as the theatre production database record. The same applies to the author, which of course should be a relation of the type “written by” between the play and a person record instead of string values. Both systems do have internal author or person authority files, but there is no direct matching. For authors this could theoretically be achieved by linking to an online person authority file like VIAF. But in the current situation this is not available.

     

    Missing relations

    This brings me to the second problem. The fact that we are using the string values of title instead of unique identifiers, means that we connect plays and productions with a specific title variety or language. In our current implementation this means that we are not able to link to all versions of one specific play.
    For instance, from our OPAC the following URIs are constructed (two in English, one in French, one in Dutch):

    /title/Beckett, Samuel/Waiting for Godot
    /title/Beckett, Samuel/Waiting for Godot : a tragicomedy in two acts
    /title/Beckett, Samuel/En attendant Godot : pièce en deux actes
    /title/Beckett, Samuel/Wachten op Godot

    In the Theatre Production database (two in English, four in Dutch, one in French, one in German):

    /title/Beckett, Samuel/Waiting for Godot
    /title/Beckett, Samuel/Waiting For Godot
    /title/Beckett, Samuel/Wachten op Godot
    /title/Beckett, Samuel/Wachtend op Godot
    /title/Beckett, Samuel/Wachten op Godot (De favorieten)
    /title/Beckett, Samuel/Wachten op Godot (eerste bedrijf)
    /title/Beckett, Samuel/En attendant Godot
    /title/Beckett, Samuel/Warten auf Godot

    Only the first and fourth URI from the OPAC will find corresponding titles in the Theatre Production database. The second and third one, using a subtitle within the main title, don’t even have equivalents. And only two of the eight entries from the Theatre Production database have a match in the catalogue.
    In a library catalogue environment we are used to this problem, because catalogues are used for describing physical objects in the form of editions and copies. Unfortunately, also the Theatre Production database just contains records describing productions of a specific ‘edition’ or translation of a play, with only the opening performance information attached.

    This is where I need to talk about FRBR. Basically in a library catalogue environment this means that we should describe the relations between the ‘work’ (original text), the ‘expression’ (the version or translation), the ‘manifestation’ (edition, format, etc.) and the ‘items’ (the physical copies). Via the relations with higher level expression and work, the physical copy could be linked to the unifying work level, and then ideally through some universally valid unique identifier to, in our case, the theatre plays.
    Although FRBR is a publication centered schema used only in libraries, the same concepts can be applied to theatre performances: the original work (which is the same as the work in a bibliographical sense) has expressions (adaptations, translations, etc.), which have manifestations (productions), and in the end the individual items (actual performances on a specific date, time and location).

    Linking library and theatre in theory through FRBR

    If both the library catalogue and the theatre production database were FRBRised, we could in theory link on the Work level and cover all individual versions. But we would still need a matching mechanism on that Work level of course.

    In reality however we can only try to link on the Manifestation level in an imperfect way.

    Linking library and theatre in reality

    At the moment, in our project, on the catalogue side we extract the title and author from the generated OPAC HTML. It could be an option to get available linking information form the internal MARC records (like the 240, 246, 765, 767, 775 tags), but that is not easy because of a number of reasons. Something similar could be done in the theatre production database, making implicit links explicit. But all this makes the effort to get something sensible out there much bigger.

     

    Literal strings

    The third problem, the literal strings in Dutch both in the library catalogue and in the theatre production database, prevents the effective use of the data in multilingual environments, equally in the traditional native interfaces and as linked data. Obviously for English speaking end users the Dutch terms mean nothing. And in a linked data environment the Dutch strings can’t easily be linked to other data, in Dutch, English, or any language, without unique identifiers.

     

    Implicit to explicit

    People calling on institutions to publish their data as linked open data tend to say it’s easy once you know how to do it . And of course it must be done. But if the published datasets have a flat internal structure designed to fulfill a specific local business objective, then they just don’t provide sufficient added value for third party use. In order to make your published open data useful for others, you have to make implicit relations explicit. And this requires something more than just making the data available in RDF ‘as is’, it requires a lot of processing.

    Share
  • Do we need mobile library services? Not really

    Posted on December 21st, 2010 Lukas Koster 43 comments

    Mobile services have to fulfill information needs here and now

    Any time anywhere © Simona K

    Like many other libraries, the Library of the University of Amsterdam released a mobile web app this year. For background information about why and how we did it, have a look at the slideshow my colleague Roxana Popistasu and I gave at the IGeLU 2010 conference.
    For now I want to have a closer look at the actual reception and use of our mobile library services and draw some conclusions for the future. I have expressed some expectations earlier about mobile library services in my post “Mobile library services”. In summary, I expected that the most valued mobile library services would be of a practical nature, directly tied to the circumstances of internet access ‘any time, anywhere’, and would not include reading and processing of electronic texts.

    Let me emphasise that I define mobile devices as smart phones and similar small devices that can be carried around literally any time anywhere, and that need dedicated apps to be used on a small touchscreen. So I am not talking about tablets like the iPad, which are large enough to be used with standard applications and websites, just like netbooks.

    As you can see, most, if not all of the services in the Library of the University of Amsterdam mobile app are of a practical nature: opening hours, locations, contact information, news. And of course there is a mobile catalogue. This is the general situation in mobile library land, as has been described by Aaron Tay in his blog post “What are mobile friendly library sites offering? A survey”.

    In my view these practical services are not really library services. They are learning or study centre services at best. There is no difference with practical services offered by other organisations like municipal authorities or supermarkets. Nothing wrong with that of course, they are very useful, but I don’t consider these services to be core library services, which would involve enabling access to content.
    Real mobile devices are simply to small to be used for reading and processing large bodies of scholarly text. This might be different for public libraries.Their customers may appreciate being able to read fiction on their smart phones, provided that publishers allow them to read ebooks via libraries at all.

    Even a mobile library catalogue can be considered a practical service intended to fulfill practical needs of a physical nature, like finding and requesting print books and journals to be delivered to a specific location and renewing loans to avoid paying fines. Let’s face it: an Integrated Library System is basically nothing more than an inventory and logistics management system for physical objects.

    Usage statistics of the Library of the University of Amsterdam mobile web app show that between the launch in April and November 2010 the number of unique visits evolves around 30 per day on average, with a couple of peaks (350) on two specific days in October. The full website shows around 6000 visits per day on normal weekdays.
    For the mobile catalogue this is between 30 and 50 visits per day. The full OPAC shows around 3000 visits on normal weekdays.

    In November we see a huge increase in usage. Our killer mobile app was introduced: an overview of currently available workstations per location. The number of unique visits rises to between 300 and 400 a day. The number of pageviews rises from under 100 per day to around 1000 on weekdays in November. The ‘available workstations’ service accounts for 80% of these. In December 2010, an exam period, these figures rise to around 2000 pageviews per day, with 90% for the ‘available workstations’ service.

    We can safely conclude that our students are mainly using our mobile library app on their smart phones to locate the nearest available desktop PC.

    Mobile users expect services that are useful to them here and now.

    What does this mean for core library services, aimed at giving access to content, on small mobile devices? I think that there is no future for providing mobile access on smart phones  to traditional library content in digital form: electronic articles and ebooks. I agree with Aaron Tay when he says “I don’t believe there is any reason to think that it will necessarily lead to high demand for library mobile services” in his post “A few heretical thoughts about library tech trends“.

    Rather, mobile services should provide information about specific subjects useful to people here and now.

    Anne Frank House AR example

    In the near future anybody interested in a specific physical object or location will have access via their location aware smart phones and augmented reality to information of all kinds (text, images, sound, video, maps, statistics, etc.) from a number of sources: museums, archives, government agencies, maybe even libraries. To make this possible it is essential that all these organisations publish their information as linked open data. This means: under an open license using a generic linked data protocol like RDF.

    I expect that consumers of this new type of mobile location based augmented linked information would appreciate some guidance in the possibly overwhelming information landscape, in the form of specific views, with preselection of information sources and their context taken into account.
    There may be an opportunity here for libraries, especially public libraries, taking on a new coordinating role as information brokers on the intersection of a large number of different information providers. Of course if libraires want to achieve that, they need to look beyond their traditional scope and invest more in new information technologies, services and expertise.

    The future of mobile information services lies in the combination of location awareness, augmented reality and linked open data. Maybe libraries can help.

    Share
  • Dutch Culture Link

    Posted on October 7th, 2010 Lukas Koster 6 comments

    Linking library and cultural heritage data

    Culture links © Scott Beale/Laughing Squid (http://laughingsquid.com/)

    Interested to publishing a test collection as linked open data to help @StichtingDEN with practical guide for heritage institutions?” That’s what my former colleague at the Library of the University of Amsterdam, now project manager at DEN (Digital Heritage Foundation The Netherlands), Marco Streefkerk asked me in April 2010.

    Was I interested? Of course I was. I had written a blog post “Linked data for libraries” almost a year before, and I had been very interested in the subject since then. Unfortunately in my day job at the Library of the University of Amsterdam (UBA) until very recently there was no opportunity to put my theoretical knowledge to practice. However, in the Library’s “Action plans 2010-2011” (January 2010), the Semantic Web is mentioned in the Innovation chapter as one of the areas with room for a small pilot involving linked data and RDF. I like to think it was me who managed to get it in there ;-)

    To come back to Marco’s question, I was at the time actually trying to think of a linked data/RDF test, and it so happened that I had talked to Ad Aerts of the Theater Institute of The Netherlands (TIN) about organising such a test the day before! So that’s what I told Marco. And things started from there.

    The first idea was to publish a small test set of records from one of the University Library’s own heritage collections. The goal from the point of view of DEN was to publish a short practical guide how to publish heritage collection as linked data, targeted at heritage institutions.
    But after some email discussions and meetings we decided to incorporate TIN in this test and apply both sides of the linked data concept: publish linked data and use linked data.
    Apart from a library catalogue, TIN also has a large database containing metadata on theater performances and a large collection of audiovisual material related to these performances. The plan was to publish the performance metadata and related digital material as linked data.
    The UBA would then use this TIN linked data in their traditional MARC based OPAC to enrich the plain bibliographic metadata if the OPAC search results related to theater plays.

    We decided to name our little proof of concept project “Dutch Culture Link”. The people involved for DEN are Marco Streefkerk, Annelies van Nispen and Monika Lechner. For TIN it’s Ad Aerts. For UBA: Roxana Popistasu and myself. Of these five people I knew four already face to face and one (Monika) on Twitter. I think this helps.

    To start with, we described the data model of the TIN Productions and Performances database (in terms of relationships or triples) as follows:

    • a Play is written by one or more Persons (as author)

      DCL data model

    • a Play can be ‘effectuated’ in one or more Productions
    • a Production can be ‘staged’ in one or more Performances
    • a Performance takes place in one Venue on a specific date and time
    • a Person can be producer of a Production
    • a Person can be director of a Production
    • a Person can play a character in a Production, or even in an individual Performance

    Besides the metadata TIN also has links from the database to digital collections (sound and video recordings, photographs, reviews). The model is strikingly similar to the bibliographic FRBR model. The Play is a FRBR Work, the Production is a FRBR Expression and/or Manifestation, the Performance is a FRBR Item.

    Now we knew who and what, but not yet how. We needed to know how to actually apply the theoretical concepts of linked data to our subject area. Questions we had were:

    • which ontology/vocabulary (‘data model’) do we need for publishing the production data?
    • how to format URIs (the linked data unique identifiers)
    • how do we implement RDF?
    • which publication techniques and platforms do we use?
    • which scripting languages can we use?
    • how do we find and get the published linked data?
    • how do we process and present the retrieved linked data?

    We definitely needed some practical hands-on tutorials or training. We could not find an institution organising practical linked data training courses in The Netherlands at short notice. Via Twitter Ian Davis referred us to their TALIS training options. Unfortunately, because we are only an informal proof of concept pilot project without any project funding, we were unable to proceed on this track.
    However, through a contact at The European Library we managed to enter two members of our project team as participants in the free Linked Data Workshop at DANS in The Hague, with Herbert Van De Sompel, Ivan Herman and Antoine Isaac as trainers. This workshop proved to be very useful. Unfortunately I could not attend myself.

    After the workshop we decided to adopt an “agile” aproach: just start and proceed with small steps. For the short term this meant on the TIN side: implementing a script that accesses the XML gateway of the Adlib system underlying the Theater Production Database and produces result in JSON format. The script accepts as input URIs of the form <baseurl>/person/<name>, <baseurl>/play/<person>/<title>, etc. For now only the <baseurl>/person/<name> works, but there are more to come.

    An example: the request <baseurl>/person/joost van den vondel gives the JSON result:

    jsonTIN({
    “key”:”vondel, joost van den”,
    “name”:”Vondel, Joost van den”,
    “birth.country”:”Duitsland”,
    “birth.date”:”17 november 1587*”,
    “birth.place”:”Keulen”,
    “death.date”:”5 februari 1679*”,
    “death.place”:”Amsterdam”
    })

    On the UBA side, if there is an author and/or title field in an individual OPAC result, a JavaScript addon to the Aleph OPAC HTML templates directs a query at the TIN linked data URL using one or both fields as input. The resulting JSON data from TIN is then processed and displayed. At the moment only the author field is used in the <baseurl>/person/<name> query. But here is more to come.

    UBA test OPAC with TIN data

    Next steps in this project:

    • select/adapt/create a vocabulary for the Production/Performance subject area
    • select/adapt/create vocabularies for Persons (FOAF?) and Subjects (SKOS?)
    • add internal relationships with the other entities (Play, Production, etc.) in the JSON structure (implement RDF in JSON)
    • Add RDF/XML as output option, besides JSON
    • add external relationships (to other linked data sources like DBPedia, etc.)
    • extend the number of possible URI formats (for Play, Production, etc.)
    • add content negotiation to serve both human and machine readable redirects
    • extend the options on the  OPAC side
    • publish UBA bibliographic data as linked open data (probably an entirely new project)

    The team will be blogging about project developments (in Dutch) on the DEN blog (addition July 7 2011: new DEN blog location).

    To be continued…

    Share
  • New users – new libraries – new librarians

    Posted on July 8th, 2010 Lukas Koster 2 comments

    Meeting new user expectations at ELAG 2010

    In the near future libraries and librarians will be very different from what they are now. That’s the overall impression I took away from the ELAG 2010 conference in Helsinki, June 8-11, 2010.  ELAG stands for “European Library Automation Group”, which is an indication of its age (34 years): “automation” was then what is now “ICT”. The meetings are characterised by a combination of plenary presentations and parallel workshops.

    This year’s theme was “Meeting new users’ expectations”, where the term “users” refers to “end users”, “customers” or “patrons”, as library customers are also called. When you hear the phrase “end user expectations” in relation to library technology you first of all think of front end functionality (user interfaces and services) and the changing experiences there. A number of presentations and workshops were indeed focused on user experience and user studies.
    Keywords: discovery, guidance, knowing/engaging users, relevance ranking, context.

    But a considerable number of sessions, maybe even the majority, were dedicated to backend technology and systems development.
    Keywords: webservices, API, REST, JSON, XML, Xpath, SOLR, data wells, aggregation, identifiers, FRBR,  linked data, RDF.

    It is becoming ever more obvious that improving libraries’ digital user experience cannot be accomplished without proper data infrastructures and information systems and services. This is directly related to the shift of existing library traditions to the new web experience, which was the leading topic of the presentation given by Rosemie Callewaert and myself: “Discovering the library collections”. We are experiencing a move from closed local physical collections to open networked digital information.

    First of all, library collections will be digital. If you don’t believe that, look at the music industry. The recording of stories started 5000 years ago already. The first music recordings only date from the 19th century.

    Next, collections will be networked, interlinked and virtual. Data, metadata, and digital objects will be fetched from all kinds of databases on the web, not only traditional bibliographic metadata from library catalogues, and mixed into new result sets, using mashup or linked data techniques.

    In this open digital environment, existing and new library systems and discovery tools simply cannot incorporate all possible data services available now and in the future. That is why libraries (or maybe we should start saying ‘information brokers’) MUST have ‘developer skills’ in one form or another. This can range from building your own data wells and discovery tools on one end to using existing online service builders for enriching third party frontends on the other, and everything in between, with different levels of skills required.

    Another inevitable development in this open information environment is “cooperation” in all kinds of areas with all kinds of partners in all kinds of forms. Cooperation in development, procurement, hosting and sharing of software (systems, services) and aggregation of data, with libraries, museums, archives, educational institutions, commercial partners, etc.

    Last but not least there is the question of the value of the physical library building in the digital age. A number of people stress the importance of libraries as places where students like to come to study. But being a learning center in my view is not part of the core business of a library, which is providing access to information. In pre-digital times it was obviously a natural and necessary thing to study information at the location of the physical collection. But this direct physical link between access to and processing of information does not exist anymore in an open digital information environment.

    Back to the ELAG 2010 theme “Meeting new users’ expectations”. In the last slide of our presentation we asked the question “Can LIBRARIES meet new user expectations?” Because we did not have time to discuss it then and there, I will answer it here: “No, not libraries as they are now!”.

    New users don’t expect libraries, they expect information services. Libraries were once the best way of providing access to information. Instead of taking the defensive position of trying to secure their survival as organisation (as is the natural aspiration of organisations) libraries should focus on finding new ways of achieving their original mission. This may even lead to the disappearance of libraries, or rather the replacement of the library organisation by other organisational structures. This may of course vary between types of libraries (public, academic, special, etc.).

    We may need to redefine the concept of library from “the location of a physical collection” to “a set of information services administered by a group of specialists”.

    To summarise: the new digital and networked nature of collections of information leads to a focus on new information services, supported by library staff with information and technology skills, in new organisational structures and in cooperation with other organisations.

    Share
  • User experience in public and academic libraries

    Posted on April 9th, 2010 Lukas Koster 5 comments

    User experience treasure map © Peter Morville

    A couple of recent events got me thinking about differences and similarities of public and academic libraries in the digital age. I used to think that current and future digital developments would in the end result in public and academic libraries moving closer to each other, but now I’m not so sure anymore. Let me explain how this happened.

    On April 1st I attended the UgameUlearn 2010 symposium organised by Delft University Library and DOK, “The library Concept Center“, a public library. Two of the speakers there were David Lee King, Digital Branch & Services Manager at Topeka & Shawnee County Public Library, and Michael Stephens, Assistant Professor Library and Information Science at Dominican University. Have a look at David’s and Michael’s slides.
    A week before that I had the opportunity to see Helene Blowers speak at DB Update 2010 about “Reality check 2.010 – 5 trends shaping libraries“. She is Digital Strategy Director for the Columbus Metropolitan Library and the inventor of 23 Things. Also John Blyberg was there, Assistant Director for Innovation and User Experience of Darien Library, another well known innovative public library. He presented SOPAC, “the Social OPAC”.

    © UgameUlearn - Geert van den Boogaard

    The central topic of the presentations of these four people, all working for, or mainly dealing with, public libraries, was and is “connecting with the public“, both in material and digital ways: by creating inviting, welcoming and collaborative spaces, both in physical library locations and online.
    What particularly hit me was the fact that David Lee King’s official title is “digital branch manager“, meaning that the library’s online activities, or web presence, constitute just another branch beside and equal to the physical branch locations, which has to be managed as one front office in a coherent way.

    Judging by all four people’s job descriptions and presentations, public libraries appear to be very much involved in improving user experience (UX) and web presence. How is this in academic libraries? I think it’s different. I started to think that university and public libraries are moving in different directions the day before UgameUlearn, when I participated in the final session of a brainstorming and discussion track about the future of the library, more particularly of the Library of the University of Amsterdam, the institution I work for. Some of our conclusions are:

    • 90% of the university library’s material will be digital
    • Physical library buildings will disappear
    • Library tasks and services will be more closely tied to the university’s core business, education and research.

    Only the Heritage and Special Collections departments will still have lots of physical books, journals and other objects, and become a separate museum-like entity. I think they should have a look at Michael Edson‘s UgameUlearn slides about the efforts that are being done at the Smithsonian Institute to engage their audience.

    The “digital branch” concept was not identified as a separate development in the future of the library discussion, probably because effectively we will see separate branches developing per subject area. Currently, at the Library of the University of Amsterdam, managing the Library’s web presence is the shared responsibility of the three main central divisions: Acquisition and Metadata Services, Public Services and Electronic Services, and a number of Faculty/Departmental Libraries. Still, the digital branch idea is an interesting concept to investigate, at least for the short term. But at university libraries a digital branch should extend its tasks beyond user experience alone.

    Anyway, the day after UgameUlearn Helene Blowers, David Lee King and Michael Stephens were guests in the live stream show “This week in libraries”, organised by Jaap van de Geer and Erik Boekesteijn, both DOK, and also two of the four UgameUlearn organisers.

    I took my chance and asked them the question: “What should a digital branch in an academic library look like? Different from public library?” If you’re interested, you will find the question and answers towards the very end of the stream.
    Helene, David and Michael agreed more or less that there probably isn’t much difference, that you should ask your community what they want, and that, just like with public libraries, every community has different needs. A clear distinction is that in universities there are three groups of customers: students, faculty and staff, with somewhat different needs. University libraries should at least support the learning process, for instance by creating spaces for collaboration.

    Since then I had the chance of giving this some more thought, and I came to the conclusion that there are probably more differences than similarities between public and academic libraries. Not in the least because Aad Janson of the Peace Palace Library commented (in Dutch) that their library is an academic library too but without students and scientists. This made me realise that there are several different types of libraries, distinguished by a number of important characteristics (audience, subscription, collection type, funding) that influence their position in digital developments. I have tried to compose a, definitely not complete, summary of these library types in a table. I guess the Peace Palace Library would qualify as a “research/scientific library” in this classification scheme.

    Library type
    Audience
    Subscription
    Collection
    Funding
    Public
    local community voluntary local, mostly physical subscriptions, public (local)
    National
    national, global voluntary national physical + digital public (national)
    Research/scientific
    specific professions, students voluntary local physical, remote digital pubic, private
    Museums/archives global community voluntary local physical + digital public, private
    Special explicitly defined voluntary, automatic local physical + digital public, private
    Company
    staff automatic local physical + digital, remote digital private
    Governmental bodies
    staff automatic local physical + digital, remote digital public
    International organisations
    staff automatic local physical + digital, remote digital public, private
    University/higher education students, faculty, staff automatic local physical + digital, remote digital public, private

    I presume that libraries that are dependent on voluntary subscriptions, like public and research/scientific libraries, will put more effort into improving “user experience”. This will be reinforced if the library’s collection is not a unique selling point, and funding is partly based on patron fees. Public libraries have to compete for customers (and not with their collections) and at the same time satisfy local city councils.

    On the other end of the scale we see university libraries that get there customers “into the bargain”, customers who need their affiliation to get access to restricted databases and e-journal articles. Contrary to public libraries, the collections of university/higher education libraries consist of more than the local catalogue: numerous local and remote repositories, databases, e-journals, etc. Consequently, these libraries will put relatively more effort into consolidating and linking all these databases, especially when they have the technical staff to do so. The contributions by academic libraries, and also some national and museum libraries, to linked data and mashup developments for instance seem to confirm this.

    Libraries between these two extremes will probably merge both approaches in various ways, depending on the actual mix of audience, subscription, collection and funding type.

    This is not to say that academic libraries are not interested in improving user experience at all. But it’s just different. Unlike public libraries, academic libraries don’t have to attract new customers with staff recommendations, themes of the month, etc., because students, teachers and researchers each have their own fairly well described subject areas. They just have to provide them with the right finding aids. And these finding aids do not necessarily have to be provided by the libraries themselves, as long as they do offer their patrons efficient delivery mechanisms.

    Of course all types of libraries can learn and benefit from each other’s work and even cooperate. After all, good data structures and relations are indispensable for an optimal user experience.

    Share
  • Mobile library services

    Posted on March 4th, 2010 Lukas Koster 3 comments

    Location aware services in a digital library world

    This is the third post in a series of three

    [1. Mainframe to mobile - 2. Mobile app or mobile web? - 3. Mobile library services]

    © Christchurch City Libraries

    While library systems technology and mobile apps architecture make up the technical and functional infrastructure of mobile web access, mobile library services are what it’s all about. What type of mobile services should libraries offer to their customers?

    As stated before, the two main features that distinguish mobile, handheld devices from other devices are:

    • web access any time anywhere
    • location awareness

    It seems obvious that libraries should take these two conditions into account when providing mobile services, not in the least the first one. I don’t think that mobile devices will completely replace other devices like pc’s and netbooks, like Google seems to think, but they will definitely be an important tool for lots of people, simply because they always carry a mobile phone with them. So in order to offer something extra, mobile applications should be focused on the situational circumstance of potential access to information any time anywhere, and make use of the location awareness of the device as much a possible. But does this also apply to services for library customers? That partly depends on the type of library (public, academic, special) and the physical and geographical structure of the library (one central location, branch locations).

    As a starting point we can say that mobile library services should cover the total range of online library services already offered through traditional web interfaces. However, mobile users may not want to use certain library services on their mobile devices. For instance, from an analysis of usage statistics of EBSCO Mobile at the Library of Texas A&M University, generously provided by Bennett Ponsford, it appears that although the number of searches in EBSCO mobile is increasing, only 1% of mobile searches leads to a fulltext download, against 77% of regular EBSCO searches. These findings suggest that library customers, at least academic ones, are willing to search for books and articles on their mobile devices, but will postpone actually using them until they are in a more convenient environment. Apparently small screens and/or mobile PDF readers are not very reader friendly in academic settings. This may be different for public library customers and e-books.

    So, libraries should concentrate on offering those mobile services that are wanted and will actually be used. In the beginning this may involve analysis of usage statistics and customer feedback to be able to determine the perfect mobile services suite for your library. Libraries should be prepared for “perpetual beta” and “agile development”.

    There are two main areas of information in which libraries can offer mobile services:

    Curtin library mobile

    • practical information
    • bibliographical information

    This is no different from other library information channels, like normal websites and printed guides and catalogues.

    Practical information may consist of contact address, email and telephone information, opening hours, staff information, rules and regulations of any kind, etc. In most cases this is information that does not change very often, so static information pages will be sufficient. However, especially with mobile devices who’s owners are on the move, providing dynamic up to date information will give an advantage. For instance: today’s and tomorrow’s opening hours, number of currently available public workstations per location, etc.

    The information provided will be even more precisely aimed at the user’s personal situation, if the “location awareness” feature is added to the “any time anywhere” feature, and up to date static and dynamic information for the locations in the immediate vicinity of the customer is shown first, using the device’s automatic geolocation properties. And all this gets better still if the library’s own information is mashed up with available online tools, like showing a location on Google Maps when selecting an address, and with the device’s tools, like making a phone call when clicking on a phone number.

    Bibliographical information should be handled somewhat differently. Searching library catalogues or online databases is in essence not location dependent. Online digital bibliographical metadata is available “in the cloud” any time anywhere. It’s not the discovery but the delivery that makes the difference. We have already seen that mobile academic library customers do not download fulltext articles to their mobile devices. But mobile customers will definitely be interested in the possibility of requesting a print item to be delivered to them in the nearest location. WorldCat Mobile, like “normal” WorldCat, for instance offers the option to select a library manually from a list in order to find the nearest location to obtain an item from. It would of course be nice if the delivery location would be automatically determined by the mobile request service, using the device’s location awareness and the current opening hours of the library branches.

    The funny thing here is that we have the paradoxical situation of state-of-the-art technology in a world of global online digital information being used to obtain “old fashioned” physical carriers of information (books) from the nearest physical location.

    Augmented reality, as a link between the physical and virtual world, may be a valuable extension of mobile services. A frequently mentioned example is scanning a book cover or a barcode with the camera of a mobile phone and locating the item on Amazon. It would be helpful if your phone could automatically find and request the item in the nearest library branch. Personally I am not convinced that this is very valuable. Typing in ISBN or book title will do the job just as fast. Moreover, bookshop staff may not appreciate this behaviour.

    A more common use of augmented reality would be to point the camera of your mobile device to a library building, after which a variety of information about the building is shown. The best known augmented reality app at the moment is Layar. This tool allows you to add a number of “layers”, with which you can for instance find the nearest ATM’s or museums, or Wikipedia information about physical objects or locations around you.

    Layar - LibraryThing Local

    There is also a LibraryThing Local layer for Layar, with which you can find

    information about all libraries, bookshops and book related events in the neighbourhood. It may even be possible to find a specific book in an open stack using this technology.

    All these extended mobile applications suggest that users of apps may not just be a specific group of people (like library customers), but that mobile users will be interested in all kinds of useful information about their current location. Library information may be only a part of that. Maybe mobile apps should be targeted at a more general audience and include related information from other sources, making use of the linked data concept.

    A search in a library catalog in this case may result in a list of books with links to related objects in a museum nearby or a historic location related to the subject of the book. Alternatively, an item in a museum website might have links to related literature in catalogs of nearby libraries. Anything is possible.

    The question that remains is: should libraries take care of providing these generic location based services, or will others do that?

    Share
  • Mobile app or mobile web?

    Posted on February 21st, 2010 Lukas Koster 11 comments

    Technology, users and business models

    This is the second post in a series of three

    [1. Mainframe to mobile - 2. Mobile app or mobile web? - 3. Mobile library services]

    © turkeychik

    Mobile access to information on the internet is the latest step in the development of information systems technology, as described in the previous post in this series. The two main features that distinguish mobile devices from other devices are:

    • Access to the web literally any time, anywhere
    • Location awareness using GPS or the mobile network

    Let’s focus on web access first. There are two main ways in which information providers can provide access to their data: by a mobile web browser or by apps.

    The easiest way to provide mobile access is: do nothing. Users of mobile internet devices can simply visit all existing websites with their mobile browser. However, in doing so they will experience a number of problems: performance is slow, pages are too large, navigation is difficult, certain parts of websites don’t work. These problems are caused by the very physical characteristics of mobile technology that make mobile internet access possible: the small size of devices and displays, the wireless network, the limited features of dedicated mobile operating systems and browsers.
    Fortunately, technological development is an interactive, reciprocal, cyclic process. Technology continuously needs to find solutions to problems that were caused by new uses of existing technology.

    Dumbed Down

    Many organisations have solved this problem by creating separate “dumbed down” mobile versions of their websites, containing mainly text only pages and textual links to their most important services and information. In the case of libraries for instance “locations and addresses“,  “opening hours“, etc. See this list of examples (with thanks to Aaron Tay). Another example is LibraryThing Mobile, which also has a catalog search option. In these cases you have to manually point your browser to the dedicated mobile URL, unless the webserver is configured to automatically recognise mobile browsers and redirect them to the mobile site.

    Of course this not the optimal solution for two reasons:

    • On the front end: as an information provider you are complete ignoring all graphical, dynamic, interactive and web 2.0 functionality on the end user side. This means actually going back to the early days of the world wide web of static text pages.
    • On the back end: duplicating system and content administration. In most cases it will come down to manually creating and editing HTML pages, because most website content management systems may not offer manual or automatic editing of pages for mobile access. Some systems offer automatic recognition of mobile browsers and display content in the appropriate format, like the WordPress plugin “WordPress Mobile Edition” that automatically shows a list of posts if mobile browsers are detected. This is what happens on this blog.

    SCCL App

    Because of this situation we are witnessing a re-enactment of the client-server alternative to static HTML that I described previously: mobile apps! “Apps” is short for “applications“, apparently everything needs to be short in the mobile online web 2.0 age. Apps are installed on mobile devices, they run locally making use of the hardware, operating system and user friendly interface of the device, and they only connect to the internet for retrieving data from a database system in the cloud (on a remote server).
    A disadvantage of this solution obviously is that you have to multiply development and maintenance in order to support all mobile platforms that your customers are using, or just support the most used platform (iPhone) and ignore the rest of your end users. Alternatively you can support one mobile platform with an app, and the rest with a mobile web site. Organisations have the choice of developing apps themselves from scratch, or using one of the commercial parties that offer library apps, such as Boopsie, Blackboard or the recently announced LibraryThing Anywhere, that is meant to offer both mobile web and apps for iPhone, Blackberry and Android.

    Some examples:

    An alternative solution to the client-server and “dumbed down” models would be to use the new HTML5 and CSS3 options to create websites that can easily be handled by all PC and mobile webbrowsers alike. HTML 5 has geolocation options, and browsers are made location aware this way too. The iWebKit Framework is a free and easy package to create web apps compatible for all mobile platforms. See this demo on PC, iPhone, Android, etc.
    Some say that HTML5/CSS3 will make apps disappear, but I suspect performance may still be a problem, due to slow connections. But it’s not only a technology issue. It’s also a matter of business models, as Owen Stephens and Till Kinstler pointed out.

    Apps can be distributed for free by organisations that want to draw traffic to their own data, ignoring the open web. This method fits their clasic business model, as Till remarked, mentioning the newspaper business as an example.
    But there is also another side to this: apps can be created by anybody, making use of APIs to online systems and databases, and be shared with others for free or for a small fee, as is the case with the iPhone Apps Store, the Android Market, the Nokia Ovi Store, or the newly announced Wholesale Applications Community (WAC). This model will never be possible with web based apps (like HTML5), because nobody has access to a system’s web server other than the system administrators. It is also much too complicated for developers and consumers of apps to host web apps on a server that mobile device users can connect too.
    And there is more: independent developers are more likely to look beyond the boundaries of the classic model of giving access to your own data only. Third party apps have the opportunity to connect data from a number of data sources in the cloud in order to satisfy mobile user needs better. To take the newspaper business example, I mentioned this in my post “Mobile reading“: general news apps vs dedicated newspaper apps. The rise of the open linked data movement will only boost the development and use of the mobile client server model.

    In my view there will be a hybrid situation: HTML5/CSS3 based web apps and local mobile apps will coexist, depending on developer, audience, and objectives.

    What services library mobile apps should offer, including location awareness and linking data, is the topic of another post.

    Share
  • Mainframe to mobile

    Posted on February 16th, 2010 Lukas Koster 11 comments

    The connection between information technology and library information systems

    This is the first post in a series of three

    [1. Mainframe to mobile - 2. Mobile app or mobile web? - 3. Mobile library services]

    The functions, services and audience of library information systems, as is the case with all information systems, have always been dependent on and determined by the existing level of information technology. Mobile devices are the latest step in this development.

    © sainz

    In the beginning there was a computer, a mainframe. The only way to communicate with it was to feed it punchcards with holes that represented characters.

    © Mirandala

    If you made a typo (puncho?), you were not informed until a day later when you collected the printout, and you could start again. System and data files could be stored externally on large tape reels or small tape cassettes, identical to music tapes. Tapes were also used for sharing and copying data between systems by means of physical transportation.

    © ajmexico

    Suddenly there was a human operable terminal, consisting of a monitor and keyboard, connected to the central computer. Now you could type in your code and save it as a file on the remote server (no local processing or storage at all). If you were lucky you had a full screen editor, if not there was the line editor. No graphics. Output and errors were shown on screen almost immediately, depending on the capacity of the CPU (central processing unit) and the number of other batch jobs in the queue. The computer was a multi-user time sharing device, a bit like the “cloud”, but every computer was a little cloud of its own.
    There was no email. There were no end users other than systems administrators, programmers and some staff. Communication with customers was carried out by sending them printouts on paper by snail mail.

    I guess this was the first time that some libraries, probably mainly in academic and scientific institutions, started creating digital catalogs, for staff use only of course.

    © n.kahlua72

    © RaeA

    Then came the PC (Personal Computer). Terminal and keyboard were now connected to the computer (or system unit) on your desk. You had the thing entirely to yourself! Input and output consisted of lines of text only, one colour (green or white on black), and still no graphics. Files could be stored on floppy disks, 5¼-inch magnetic things that you could twist and bend, but if you did that you lost your data. There was no internal storage. File sharing was accomplished by moving the floppy from one PC to another and/or copy files from one floppy to another (on the same floppy drive).

    © suburbanslice

    Later we got smaller disks, 3½-inch, in protective cases. The PC was mainly used for early word processing (WordStar, WordPerfect) and games. Finally there was a hard disk (as opposed to “floppy” disk) inside the PC system unit, which held the operating system (mainly MS-DOS), and on which you could store your files, which became larger. Time for stand-alone database applications (dBase).

    Client server GUI

    Then there was Windows, a mouse, and graphics. And of course the Internet! You could connect your PC to the Internet with a modem that occupied your telephone line and made phone calls impossible during your online session. At first there was Gopher, a kind of text based web.
    Then came the World Wide Web (web 0.0), consisting of static web pages with links to other static web pages that you could read on your PC. Not suitable for interactive systems. Libraries could publish addresses and opening hours.
    But fortunately we got client-server architecture, combining the best of both worlds. Powerful servers were good at processing, storing and sharing data. PC’s were good at presenting and collecting data in a “user friendly” graphical user interface (GUI), making use of local programming and scripting languages. So you had to install an application on the local PC which then connected to the remote server database engine. The only bad thing was that the application was tied to the specific PC, with local Windows configuration settings. And it was not possible to move the thing around.

    Now we had multi-user digital catalogs with a shared central database and remote access points with the client application installed, available to staff and customers.

    Luckily dynamic creation of HTML pages came along, so we were able to move the client part of client-server applications to the web as well. With web applications we were able to use the same applications anywhere on any computer linked to the world wide web. You only needed a browser to display the server side pages on the local PC.

    Now everybody could browse through the library catalog any time, anywhere (where there was a computer with an internet connection and a web browser). The library OPAC (Online Public Access Catalog) was born.

    Web OPAC

    The only disadvantage was that every page change had to be generated by the server again, so performance was not optimal.
    But that changed with browser based scripting technology like JavaScript, AJAX, Flash, etc. Application bits are sent to the local browser on the PC at runtime, to be executed there. So actually this is client server “on the fly”, without the need to install a specific application locally.

    © nxtiak

    In the meantime the portable PC appeared, system unit, monitor and keyboard all in one. At first you needed some physical power to move the thing around, but later we got laptops, notebooks, netbooks, getting smaller, lighter and more powerful all the time. And wifi of course, no need to plug the device in to the physical network anymore. And USB-sticks.

    Access to OPAC and online databases became available anytime, anywhere (where you carried your computer).

    The latest development of course is the rise of mobile phones with wireless web access, or rather mobile web devices which can also be used for making phone calls. Mobile devices are small and light enough to carry with you in your pocket all the time. It’s a tiny PC.

    Finally you can access library related information literally any time, anywhere, even in your bedroom and bathroom.

    Mobile library app

    It’s getting boring, but yes, there is a drawback. Web applications are not really accommodated for use in mobile browsers: pages are too large, browser technology is not really compatible, connections are too slow.

    Available options are:

    • creating a special “dumbed down” version of a website for use on mobile devices only: smaller text based pages with links
    • creating a new HTML5/CSS3 website, targeted at mobile devices and “traditional” PC’s alike
    • creating “apps”, to be installed on mobile devices and connect to a database system in the cloud; basically this is the old client-server model all over again.

    A comparison of mobile apps and mobile web architecture is the topic of another post.

    Share