Posted on November 19th, 2009
About cataloging physical items or units of content
2009 is the year of the e-book, or perhaps better: of the e-book reader. This is an important distinction that I will explain below. E-books are becoming more popular because of the increasing availability of various cheap e-book readers.
But what is an e-book? Is it the same as a book? Some people say yes, some people say no. This question shouldn’t be so hard to answer, should it? We just have to define what a book is first. So, what is a book?
When people think of a book, they picture something like the archetypal book: printed, medium sized, hardcover, no illustrations on the front. The thing that you can actually hold in your hands and read.
But if they say: “This book was written by that author”, they don’t think that the author actually wrote that particular item they are holding in their hands. Now we already have two different meanings of the concept “book”: one is a tangible object, the other is the content that is made available in this tangible object by means of printed text.
Besides these conceptual levels, there are more ways by which books can be described, as shown by this incomplete list of examples:
Physical form: Historically there have been clay tablets, inscribed stones, handwritten scrolls, handwritten bound pages, printed pages. We also know different formats targeted at specific uses or audiences: audio books, braille books, pop up books.
Content: A book can contain text only, or images only (for instance a children’s picture book, or a book of photographs), or a combination of both.
Units: A book can consist of one “story” (for instance a novel), optionally subdivided into chapters, or be made up of several stories, or articles (like a text book about a certain subject). Chapters and stories can be written by the same or by several authors. A book can also contain two or more other books by the same author (“collected works”), etc.
Content type: A book can contain fiction, aimed at entertaining readers. Books can be purely administrative, like accounting books. There are religious books to be used in religious ceremonies (sometimes these are referred to as “THE book”). Some books are for studying and learning (“text books”, which may also contain images by the way). There are scientific books and instructional books (travel guides, cook books, manuals).
First, we need to see how all this fits together before we can answer the question “Is an e-book a book?” or, more precisely, “In which sense is an e-book a book?”. Fortunately there is already a conceptual model for bibliographic entities and the relationships between them that describes this: FRBR (Functional Requirements for Bibliographic Records), published by IFLA. The IFLA Final Report (2009 version) says it all, but there are also a couple of short summaries: Barbara Tillett’s (LoC) “What is FRBR?” and Jenn Riley’s “FRBR” blog post, and there is William Denton’s FRBR Blog for more information.
The FRBR model is targeted at libraries, and perhaps at publishers and booksellers too.
I will not go into the FRBR “Group 2” (persons and corporate bodies) and “Group 3” (subjects) entities here, but focus on the “Group 1” entities.
The FRBR “Group 1 entities” consist of Work, Expression, Manifestation and Item (also referred to as WEMI). FRBR entities not only apply to books or textual works, but also to movies, theater plays, music, etc.
The four entities are defined as follows:
- Work – a distinct intellectual or artistic creation
- Expression – the intellectual or artistic realization of a work
- Manifestation – the physical embodiment of an expression of a work
- Item – a single exemplar (or copy) of a manifestation
There are hierarchical relationships between the entities:
- A work (for instance a book) can have (“is realized through”) one or more expressions (for instance the original English text and the Dutch translation).
- Each expression can have (“is embodied in”) one or more manifestations (for instance a specific edition with an ISBN, or one or more works/expressions in a “collected works” edition).
- Each manifestation has (“is exemplified by”) one or more items, the things you can actually hold in your hands.
- A manifestation can also consist of several expressions, as in the “collected works” example.
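To make the hierarchy concrete, here is a minimal Python sketch of the four Group 1 entities and the links between them. The class and field names, the labels and the helper `manifestations_of` are assumptions chosen for illustration; FRBR itself is a conceptual model and does not prescribe any data structure.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """A single copy of a manifestation, e.g. one library's copy."""
    identifier: str  # hypothetical local identifier, such as a barcode


@dataclass
class Expression:
    """One realization of a work, e.g. the original text or a translation."""
    language: str


@dataclass
class Work:
    """A distinct intellectual or artistic creation."""
    title: str
    expressions: List[Expression] = field(default_factory=list)


@dataclass
class Manifestation:
    """An edition; it may embody several expressions ("collected works")."""
    label: str
    embodies: List[Expression] = field(default_factory=list)
    items: List[Item] = field(default_factory=list)


def manifestations_of(work: Work, editions: List[Manifestation]) -> List[Manifestation]:
    """Return the editions that embody at least one expression of the work."""
    return [m for m in editions if any(e in work.expressions for e in m.embodies)]


# One work realized through two expressions: the original and a translation.
english = Expression(language="en")
dutch = Expression(language="nl")
novel = Work(title="An example novel", expressions=[english, dutch])

# Each expression is embodied in its own edition (manifestation).
paperback = Manifestation(label="English paperback edition", embodies=[english])
translation = Manifestation(label="Dutch hardcover edition", embodies=[dutch])

# The paperback is exemplified by two copies on the shelf.
paperback.items.extend([Item("copy-1"), Item("copy-2")])
```

Because a manifestation holds a list of expressions, the “collected works” case is covered by putting expressions of different works into the same `embodies` list.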
Besides these hierarchical relationships between different entity types there are also recursive relationships between entities of the same type: hierarchical and other. Some examples:
- A work is part of another work (hierarchical), as in a series like Harry Potter
- A work is an adaptation of another work
- An expression is a sequel to another expression
- A manifestation is a facsimile of another manifestation
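Relationships between entities of the same type can be thought of as simple statements of the form “source, relation, target”. The sketch below is one possible way to record them in Python; the entity labels are invented, and only the relation types are taken from the examples above.

```python
from typing import List, NamedTuple


class Relationship(NamedTuple):
    entity_type: str  # "work", "expression" or "manifestation"
    source: str       # label of the entity the statement is about
    relation: str     # e.g. "is part of", "is an adaptation of"
    target: str       # label of the related entity of the same type


# Hypothetical labels, used only to illustrate the relation types.
relationships: List[Relationship] = [
    Relationship("work", "Volume 1", "is part of", "The series"),
    Relationship("work", "The film", "is an adaptation of", "The novel"),
    Relationship("manifestation", "2009 reprint", "is a facsimile of", "1909 edition"),
]


def related_to(target: str) -> List[Relationship]:
    """Return every relationship that points at the given entity."""
    return [r for r in relationships if r.target == target]
```

Asking for `related_to("The series")` returns the single “is part of” statement about “Volume 1”.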
So far so good. The FRBR conceptual model describes (or aims to describe) real world things and relationships on an abstract level. The model can be implemented in actual systems (both computerised and manual!). In these systems you are free to refer to the conceptual model entities (“work”, “expression”, “manifestation”, “item”) by names that are actually used in daily life. I think this is what Rob Styles is trying to do when he talks about “stories” and “editions” in his recent blog post “Bringing FRBR Down to Earth…”. I will define the “story” concept in a different way below.
Until now, catalogers and library systems have been targeted at describing the thing they have in their hands (or better, the items that make up the library’s collection). In FRBR terms this means that catalogs describe manifestations and items, not works and expressions (or only implicitly at best). In short, a bottom-up approach. This is understandable, because in the past there was nothing else to go by than the explicit manifestation information available on the physical item (author, title, ISBN, edition, publisher, etc.).
Of course, MARC21 provides some options to describe relationships with expressions and works and other manifestations, like the 250 – Edition Statement, the 490 – Series Statement and the 76X-78X – Linking Entries-General Information. But these fields can only be used if the information is known to the cataloger.
Also, in traditional catalogs, works that are distinct expressions in one manifestation (like articles, chapters, stories, poems) are not described separately, for the same reason: you only catalog the item you have before you. In the ideal world, or better in the new digital world, the unit to be cataloged or described should always be the work, which we may call “story”. In other words: we should catalog units of content (“stories”) instead of, or supplementary to, physical items.
Current library practice is that we catalog books and journals in the catalog and offer article descriptions through subscribed article metadata databases separately.
So, back to the e-book. Where does that fit in? An e-book could be considered nothing more than a manifestation and/or an item belonging to a certain work/expression, because an e-book can be everything a printed book is. As such it is equivalent to a braille or audio book. Some libraries treat e-books as something different, as works/expressions as such. They catalog e-books separately, just like all other items/manifestations are treated as separate works. There are even separate e-book overviews.
But there is more to it than that. The big difference with books until now is that an e-book is not inseparably linked to the physical carrier. A printed book can only be read if the reader has a physical copy (a FRBR item) consisting of bound paper pages containing the text printed on them with ink. The same applies to handwritten texts, scrolls, clay tablets, etc.
Even more so, the physical form, together with economic conditions and possibilities for distribution, often determines the actual manifestation of a book and a journal. A book (or volume) can only contain a certain number of pages in order to be manageable. There is also a cost consideration in the size and distribution of the items.
What we call an e-book is actually only a digital, abstract manifestation of a work/expression. In order to be able to read it you have to download it in a specific format (PDF, epub, etc.) onto a physical carrier (USB-stick, computer disk, etc.), and then you need a physical reading device with dedicated software (dedicated e-book readers like Kindle, a computer, a mobile phone, etc.).
Libraries do not have e-books as items, only as manifestations. These e-book manifestations can be available on an online server somewhere in whatever form, and can be made into an item on-the-fly, using a specific format on-the-fly, choosing a physical carrier on-the-fly. What’s more, the content of e-books can also be selected out of several works/expressions on-the-fly, this way creating manifestations or even expressions on demand.
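The idea of an item that only comes into existence when a reader asks for it can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the class names, the `make_item` function, the example.org address and the list of formats are not taken from FRBR or from any library system.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class EbookManifestation:
    """An e-book as held by the library: content on a server, no items yet."""
    title: str
    access_address: str  # hypothetical URL of the hosted content
    available_formats: Tuple[str, ...]


@dataclass(frozen=True)
class GeneratedItem:
    """A copy that exists only once a reader has asked for it."""
    manifestation: EbookManifestation
    file_format: str
    carrier: str


def make_item(manifestation: EbookManifestation, file_format: str, carrier: str) -> GeneratedItem:
    """Create an item on the fly in the requested format, for the chosen carrier."""
    if file_format not in manifestation.available_formats:
        raise ValueError(f"{file_format} is not offered for {manifestation.title}")
    return GeneratedItem(manifestation, file_format, carrier)


ebook = EbookManifestation(
    title="An example e-book",
    access_address="https://example.org/ebooks/1",
    available_formats=("PDF", "EPUB"),
)

# The same manifestation yields different items depending on the reader's choices.
copy_for_reader = make_item(ebook, "EPUB", "e-book reader")
copy_for_laptop = make_item(ebook, "PDF", "computer disk")
```

Note that format and carrier are properties of the generated item here, not of the manifestation, which is exactly the tension with the FRBR attribute list discussed next.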
Now, is the FRBR conceptual model suited for describing e-books? If we treat e-books as manifestations without items (like we handle e-journals in our catalogs), how do we proceed? The FRBR Manifestation entity has, among others, these attributes:
- form of carrier
- extent of the carrier
- physical medium
- system requirements (electronic resource)
- file characteristics (electronic resource)
- mode of access (remote access electronic resource)
- access address (remote access electronic resource)
But we have just seen that in the case of e-books these are features of the items generated on-the-fly, which are not known in advance. Does this mean that we have to describe as manifestations all possible physical forms that one e-book can take? This would also mean that an e-book as such should be described on the level of a FRBR Expression. This may be correct in some cases (the creation of aggregated content on-the-fly), but not in all, for example where an e-book is simply another manifestation, like a braille or audio book.
Does FRBR need an extra level? I am not sure. Let’s look briefly at how e-journals are handled. As far as I can see, journal and e-journal issues are described as separate manifestations of journals and e-journals (with a “part-of” relationship to the higher level). These issue manifestations are treated as aggregates that contain articles, that are also described as manifestations with a “part-of” relationship to the issue. In MARC21 this is handled by the 773 Host Item Entry tag.
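As an illustration of the “part-of” link, the sketch below models two records as plain Python dictionaries keyed by MARC tag. The record contents and control numbers are invented, and the dictionary layout is a simplification rather than a real MARC serialization; only the use of 773 with subfields $t (title), $g (related parts) and $w (record control number) follows MARC21.

```python
# Each record is a plain dict; a real system would use a MARC library instead.
issue_record = {
    "001": "issue-0001",  # hypothetical control number
    "245": {"a": "Example Journal, vol. 1, no. 2"},
}

article_record = {
    "001": "article-0042",
    "245": {"a": "An example article"},
    "773": {
        "t": "Example Journal",          # title of the host item
        "g": "Vol. 1, no. 2, p. 10-20",  # related parts: where in the host
        "w": "issue-0001",               # record control number of the host
    },
}

# A tiny in-memory catalog, indexed by control number.
catalog = {r["001"]: r for r in (issue_record, article_record)}


def host_of(record, records_by_id):
    """Follow the 773 $w link from a component part to its host record."""
    link = record.get("773", {}).get("w")
    return records_by_id.get(link)
```

Calling `host_of(article_record, catalog)` returns the issue record, while the issue itself has no 773 field and therefore no host in this small example.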
I am not sure if and how different physical formats (PDF, HTML) for articles in e-journals are handled. The obvious difference with e-books is that the described unit is the article (or “story”, as the unit of content), which can be downloaded as separate items. The e-journal articles are ideally also identified by unique identifiers (DOIs).
What does this mean for e-books? I think we can treat an e-book as either an expression or a manifestation, depending on the nature of the specific e-book in question. For the e-book manifestation we would only need to register the mode of access, access address and manifestation identifier attributes, preferably in the form of a URI.
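A minimal description of an e-book manifestation along these lines could look like the Python sketch below. The class name, the example.org addresses and the rather crude `is_web_uri` check (which only accepts URIs with a scheme and a host, so it would reject a `urn:` identifier) are all assumptions for the sake of the example.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


def is_web_uri(value: str) -> bool:
    """Rough check: the value has both a scheme and a network location."""
    parts = urlparse(value)
    return bool(parts.scheme and parts.netloc)


@dataclass
class EbookManifestationRecord:
    identifier: str      # manifestation identifier, here required to be a URI
    mode_of_access: str  # e.g. "World Wide Web"
    access_address: str  # where the content can be retrieved

    def __post_init__(self) -> None:
        # Reject records whose identifier or address is not a usable URI.
        for name in ("identifier", "access_address"):
            if not is_web_uri(getattr(self, name)):
                raise ValueError(f"{name} should be an absolute URI")


record = EbookManifestationRecord(
    identifier="https://example.org/id/manifestation/123",
    mode_of_access="World Wide Web",
    access_address="https://example.org/ebooks/123",
)
```

Format, carrier and system requirements are deliberately absent, since they only become known when an item is generated.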
I also think we should use the possibilities of the FRBR model to start describing, cataloging and identifying the “stories” (chapters, articles, etc.) that make up books and e-books separately, as units of content in their own right. People are interested in the content, the “stories”, not the physical items or artificial digital aggregate units like e-books or e-journals.
In this sense, the “e-journal” is an archaic concept, where the limitations of the physical journal are translated as such to the digital world. There is no real need to bundle articles in electronic form into one electronic issue of an e-journal that is published at regular intervals in time. Electronic articles can be published individually immediately after peer review and approval. Published articles can be aggregated in one or more virtual online serials.
As with ISBNs and ISSNs, we need an identifier for the units of content other than journal articles. As a matter of fact, there already is one, the DOI:
“A DOI name can be used to identify any resource involved in an intellectual property transaction. Intellectual property includes both physical and digital manifestations, performances and abstract works. An entity can be identified at any arbitrary level of granularity.” (see http://www.doi.org/faq.html#2). Thanks to Owen Stephens for pointing this out to me in a Twitter discussion with Inga Overkamp.
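Because a DOI name can be turned into a resolvable link, it works as an identifier at any level of granularity, including a single chapter or “story”. Here is a small Python sketch of that step; the DOI shown is made up, the pattern is only a rough syntax check (a prefix starting with “10.”, a slash, and a suffix), and the function name is an assumption. The resolver address https://doi.org/ is the public DOI proxy.

```python
import re
from urllib.parse import quote

# Rough shape of a DOI name: prefix starting with "10.", a slash, then a suffix.
DOI_PATTERN = re.compile(r"^10\.\d+(\.\d+)*/\S+$")


def doi_to_url(doi: str) -> str:
    """Turn a DOI name into a link via the public doi.org resolver."""
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a DOI name: {doi!r}")
    # Percent-encode characters that are not safe in a URL path, keeping "/".
    return "https://doi.org/" + quote(doi, safe="/")


# Hypothetical DOI for one chapter ("story") of a book.
chapter_link = doi_to_url("10.1234/example.chapter-3")
```

The same function works whether the DOI identifies a whole book, one chapter or a single article, which is the point of identifying units of content rather than physical items.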
I may be wrong about all this. I am open to comments and suggestions.