Data. The final frontier.
  • Collection 2.0

    Posted on February 15th, 2009 Lukas Koster No comments

    Henk Ellerman of Groningen University Library writes about the “Collection in the digital age”, reacting to Mary Frances Casserly’s article “Developing a Concept of Collection for the Digital Age”. I haven’t read this 2002 article yet, but Henk Ellerman addresses the problem of finding a metaphor to describe collections that consist for a large part of resources available on the internet.
    Henk says:
    “…the collection (the one deemed relevant for… well whatever) is a subset that needs to be picked from the total set of available online resources.”
    “I find it quite remarkable that the new collection is seen as the result of a process of picking elements, a process similar to finding shells on a beach.”
    “What if we expand the notion of a collection in such a way that the sea becomes part of it?”
    “The main issue with any sensible collection is quality control. We don’t want ugly things in our collections.”
    “Then a collection is not a simple store of documents anymore, but a rather complex system of interrelated documents, controlled by a selected group of people.”
    “Librarians ‘just’ need to make the system searchable.”

    I have a couple of thoughts about collections myself that I would like to add to these.
    Originally, a collection is the total number of physical objects of a specific type that are in the possession of a person or an organisation. Merriam-Webster says: “an accumulation of objects gathered for study, comparison, or exhibition or as a hobby”. People can collect Barbie dolls or miniature cars as a hobby, or rare books or monkey skulls for scientific reasons.
    (By the way, individual collectors of rare books are often portrayed in movies as rich, old, eccentric people with a small but very valuable collection of very old books about topics such as satanism, who end up being killed in a horrible way and having their collections destroyed by fire, as I saw some time ago in Polanski’s “The Ninth Gate”.)

    When organisations have collections, it is almost always for study or exhibition, but also for practical reasons. We are talking mainly about museums and libraries. In the case of libraries there is a rough distinction between public libraries and libraries belonging to scientific and/or educational institutions. Let’s focus on educational libraries, or “university libraries”, to make the picture a bit simpler.
    University libraries have collected written and/or printed texts (books, journals, also containing images, maps, diagrams, etc.) in order to provide their staff and students with material for teaching and study. A library’s collection then describes all objects in the possession of the library. In the digital age, electronic journals and databases have been added to these collections, but in most cases this concerns only resources the library owns or pays to gain access to. The collection then becomes the totality of objects (physical or digital) that the library owns or is granted access to by means of a contract. Freely available resources are explicitly not counted here.

    Now, here we have to make an important distinction between a library’s total collection (“the collection”), meaning “all items the library owns or has access to”, and a collection on a specific topic or for a specific subject (“subject collections”), meaning “all items that have been selected by professionals to be part of the material that is necessary for studying a specific topic”, for instance “the University of Amsterdam library’s Chess collection”. In the past, people would have to go to a specific library to consult a specific collection on a specific subject.
    “The collection” is merely the sum of all the library’s “subject collections”, nothing more.

    Before we go to the collection in the digital age, an interesting intermediate question is: what is the position of interlibrary loan in the concept of collection? Are books from other libraries that are available to a specific library’s patrons to be considered part of that borrowing library’s collection? In the strict sense of the collection concept (“all items the library owns”), the answer is “no”. But if we expand the notion of collection to mean “everything a library has access to”, then the answer clearly would have to be “yes”.

    Now, in the digital age, the limitation that a collection’s objects should be available physically in a specific location disappears. This means that anything can be part of “the collection” of a specific library, including objects or texts that have not been judged as scientific before, like blog posts. This is the “sea” that Henk Ellerman is talking about. A subject collection is also not limited by physical borders anymore. Subject collections can contain material, physical and digital, from anywhere. In this case, there is obviously no reason that a subject collection should be a specific library’s subject collection. Key is “quality control”, or as Henk Ellerman puts it: “We don’t want ugly things in our collections”. Subject collections should be universal, global, virtual collections of physical and digital objects, “controlled by a selected group of people”.

    Now, the most important question: who decides who will be part of these selected groups of people? The answer to this question is still to be found. I guess we will see several types of “expert groups” emerge: coalitions between university libraries nationally or globally, but also between not-for-profit and commercial organisations, and of course also between individuals cooperating informally, as in the blogosphere or in Wikipedia.
    The collections that will be controlled by these coalitions will not have fixed boundaries, but will have more “professional” cores with several “less professional” spheres around them or intersecting with other collections.

    It is time we start building.

  • Unique authors

    Posted on February 4th, 2009 Lukas Koster No comments

    Jonathan Rochkind, in his post “How do name authorities work anyway?”, wonders if catalogers will confuse him with another writer of the same name who has an LC authority record, whereas he does not have one.

    I guess the relevance of this problem depends entirely on the question: do you think it’s important to know that an author of a specific work is the same as the author of another work? A former colleague of mine, whom I respect very much, used to say that it does not matter, as long as the correct name appears with the work in question. This was only six years ago, before the emergence of web 2.0 and library 2.0 type services. It is just like looking at a printed book: you read the author’s name, and if there is no further information on the back cover, or a list of publications by the author inside, then that’s all there is to it. In normal life, if you read a book or an article for pleasure, or even for business, study or research, that is no problem. No need for author authority records at all.

    However, the picture is completely different from the point of view of the authors, especially in the case of professional scientific and research staff, where the exact number of publications and citations is crucial. For these authors it is vital that the correct authority record is used for their publications. Here we definitely need authority records with unique identifiers. But of course there are many different systems in use: LC authority records, WorldCat Identities, national systems, etc., and they all use their own identifiers.
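
    To illustrate why the name alone is not enough when publications are counted, here is a minimal Python sketch. It is only an illustration under stated assumptions: the author names, titles and the "author:0001" style identifiers are made up and do not follow the format of LC, WorldCat Identities or any other real system.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Publication:
    title: str
    author_name: str          # the name as printed on the work
    author_id: Optional[str]  # hypothetical unique identifier, if assigned


# Made-up sample data: two different people who both publish as "J. Smith",
# and one of them also appears under the name form "Smith, J.".
publications = [
    Publication("Work A", "J. Smith", "author:0001"),
    Publication("Work B", "J. Smith", "author:0002"),
    Publication("Work C", "Smith, J.", "author:0001"),
]


def count_by_name(pubs):
    """Count publications per printed name (merges namesakes, splits name variants)."""
    return Counter(p.author_name for p in pubs)


def count_by_id(pubs):
    """Count publications per identifier, skipping works without one."""
    return Counter(p.author_id for p in pubs if p.author_id is not None)


if __name__ == "__main__":
    print(count_by_name(publications))
    print(count_by_id(publications))
```

    Counting by name credits "J. Smith" with two works that belong to two different people and misses the third work entirely, whereas counting by identifier gives each author the right total. The sketch says nothing about who assigns the identifiers, which is exactly where the systems differ.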

    There is the proposal to develop the UAI, Universal Author Identifier. This system depends on authors registering and maintaining their own personal information in a freely accessible web-based database. There was a pilot system for a while, but it is not clear if any results were reached.

    In The Netherlands a similar project on a national scale has led to a live implementation: the DAI, Digital Author Identifier. The DAI is based on the identifier used for authors in the OCLC-PICA Dutch National Union Catalog/Common Catalog system (“PPN”), and is assigned to every author who has been appointed to a position at a Dutch university or research institute or has some other relevant connection with one of these organisations. The DAI is used in the Dutch university repositories, the Dutch national Research Database and in the national integrated portal NARCIS.
    The difference with UAI is that DAI is assigned by catalogers in one of the participating organisations, whereas UAI depends on voluntary cooperation of the authors themselves.

    Of course a “universal author identifier” still does not solve Jonathan’s initial question: confusion is still possible if the authors do not have a clear interest in maintaining their information themselves.

    Another issue here, about which more can be said in a future post, is that for a truly universal system we should use URIs, as for unique works (see Owen Stephens’ post “The Future is Analogue”) and subject headings.