22 thoughts on “Who needs MARC?”

  1. So what alternative do you suggest?
“WDC” is a nice Peter-like answer, but it does not help very much in the real world. Or am I missing the point?

RDA is certainly not the answer. It’s no more than a very complicated set of rules for librarians about how to describe resources, not about the format to describe the resource in. Better than AACR2 or Fobid, but not an answer to your question.

As far as I can tell there is no other internationally widely used standard format for describing bibliographic objects that could replace MARC at this moment. MARC has its flaws, but so does any other format.

I do support “WDC”, please make your own choices for internal storage, but we also need some workable standard. MARC supplies just that.

  2. Bas, I think we actually agree! “We don’t care” what internal format we use, but we need a standard format for exchanging data. At the moment that standard is MARC, with its flaws. It is the best we have now, but only for data exchange!

  3. Of course one could make use of OAI-PMH conversion, as the XC project showed.

    If MARC served a retrieval purpose, and only that purpose, it would make sense to normalise. Luckily for us, it doesn’t, which is why the unnormalised data remains available for identification, e.g. when matching legacy records that lack a unified identifier. Of course this is only true if you look solely at the MARC Bib record, which is just one-fifth of the complete format; much of the normalisation takes place in the other parts.

    As MARC relates to Z39.50, so does MARCXML (or MODS, or DCMI) relate to SRU/SRW. To compare MARC to SRU borders on the absurd.

    > MARC is NOT a data storage format. In my opinion MARC is not even an exchange format, but merely a presentation format.

    MARC is anything but a presentation format; it is first and foremost a format for storing structured bibliographic data, and a means of exchange. How much of the presentation layer (ISBD) is visible depends on the software you use.

    > But I think MARC was invented by old school cataloguers who did not have a clue about data normalisation at all.

    It was invented by Henriette Avram, who knew nothing about cataloguing and whose assignment was to create a low-storage solution for the tons of data at the Library of Congress.

    > Maybe the projected new standard for resource description and access RDA will be the solution, but that may take a while yet.

    RDA is a set of description rules, exactly like the AACR you criticised earlier. The difference is that RDA gives the cataloguer context-sensitive advice in an online environment. RDA can be used without ISBD presentation, with MARC, MODS, EAD, VRA, CDWA or DCMI, depending on the specificity the institution requires. In other words, there are a lot of choices, and we make those choices because WDC (We Do Care).

    1. Thank you for mentioning the XC project’s OAI-PMH implementation (OAI Toolkit). It is very dear to me, as the creator of the first versions of that tool. OAI Toolkit converts MARC to MARCXML and stores that format. While creating the tool, we converted and loaded millions of MARC records from various great US university and school libraries for testing purposes. I should say that a sizeable percentage of the records (somewhere above 10%) were somehow invalid: they did not fit the MARC standard’s simple formatting rules. The greater part of these errors was detected in the Leader. I don’t know the reason, and even the librarians did not know why those errors occurred (like strange characters in the last two positions of the Leader, where the standardized value should be ’00’). My mere guess is that the ILSes use them in a creative way for their own internal (and secret?) purposes.
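      To illustrate the kind of check involved: the MARC21 Leader is a fixed 24-character field, and its last two positions (22-23, the tail of the entry map) are defined as ’00’. A minimal sketch of such a validity check might look like this (my own illustrative code, not taken from OAI Toolkit):

      ```python
      def check_leader(leader: str) -> list:
          """Flag the Leader problems described above.

          A sketch, not a full MARC21 Leader validator: it only checks
          the length and the fixed entry-map tail mentioned in the comment.
          """
          problems = []
          if len(leader) != 24:
              problems.append("Leader must be exactly 24 characters")
          elif leader[22:24] != "00":
              # MARC21 fixes positions 22-23 at '0' '0'; anything else is
              # the kind of "creative" vendor use mentioned above.
              problems.append(
                  "positions 22-23 are %r, expected '00'" % leader[22:24]
              )
          return problems

      # A well-formed Leader from a typical bib record:
      print(check_leader("00714cam a2200205 a 4500"))  # -> []
      ```

      Records with vendor-mangled tails would come back with a non-empty problem list instead of an empty one.
      
      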

      Anyway: MARC is a good exchange standard, but the critiques in the post regarding normalisation are relevant. You are right that MARC Authority controls Authority Records and Authority Names, but inside the bib standard there is no reference back to it, and the meanings of the subfields are not defined (e.g. “$t – Title of a work” – in which language? in which normalized form? etc.).

      I can offer another critique: the emergence of regional/national MARC standards, which are not 100% compatible. I know HUNMARC, which is the Hungarian MARC, and I have run into compatibility problems with OCLC MARC (Marc4j has problems with the OCLC Leader). I don’t know the real reason, but I guess that plain MARC was not enough for these communities, and they wanted to extend it with new or modified features.

      You can read Roy Tennant’s witty article about the main problems with MARC: “MARC must die”
      http://roytennant.com/column/?fetch=data/58.xml

      Best wishes!
      Péter

  4. @ Peter Schouten
    A short reaction: I never compared MARC to SRU! It surprises me that you have concluded that. I merely mentioned SRU as a useful protocol to PRESENT, for instance, MARC records (MARCXML, if you want to be precise) after converting from whatever data storage format is available. Just like Z39.50, of course.

    To my eyes MARC is NOT the best storage format there is, as data model experts will confirm. At best it is an exchange format, and that is fine with me, but even as an exchange format it is not perfect, as I have tried to show.
    I agree that MARC is structured, only not structured enough for the present needs of linking to other online tools. I do not doubt that Henriette Avram did a very good job in the time that she created MARC, and it may have served its purpose well for a number of years.

    I did not actually criticise AACR2 at all. I just mentioned it as a set of cataloguing rules.

    And I must object to the suggestion that I don’t care! Of course I care, otherwise I would not have written this.
    What the post is about, if you read carefully, is that I don’t care in which format the data is stored, as long as it is stored in such a way that we can retrieve it in any format we need. And I do not think MARC is such a way.

    1. @ Péter Király
      Thanks for the link to Roy Tennant’s article. Someone else told me about it too; I had not seen it before.

  5. I do agree with Peter Schouten about two things. MARC is not a presentation format, and MARC was invented to create a low-storage solution. It really was a low-storage solution, especially in the ISO 2709 packaging in which it used to be exchanged on tape. (I spent too much time in the past struggling with it.)

    MARC is an exchange format, and not a very good one, as explained in Roy’s article. It can easily be created from a normalised data model. No problem.
    (You can do it in many different ways, however, which is a bit of a problem :))

    The big problem is that a new, better structured exchange format is hard to define, since many organizations cannot convert to it. They have MARC as their native storage format. It is easy to convert to MARC. It is impossible to convert from MARC to something better.

    Now we have MARCXML, and anybody who knows a little bit about XML laughs out loud when looking at it. If you don’t know the history of MARC, you wonder how anyone could even think of considering a schema like this.
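    For readers who have never seen it: MARCXML is essentially the flat tag/indicator/subfield structure of MARC wrapped one-for-one in XML, which is what draws the laughter. A minimal, hypothetical fragment, parsed with Python’s standard library (the record content is invented for illustration):

    ```python
    import xml.etree.ElementTree as ET

    # A minimal, hypothetical MARCXML record: numeric tags, one-character
    # indicators and subfield codes carried as attributes around flat strings.
    MARCXML = """<record xmlns="http://www.loc.gov/MARC21/slim">
      <leader>00714cam a2200205 a 4500</leader>
      <datafield tag="245" ind1="1" ind2="0">
        <subfield code="a">Who needs MARC?</subfield>
      </datafield>
    </record>"""

    NS = "{http://www.loc.gov/MARC21/slim}"
    root = ET.fromstring(MARCXML)
    # To get at "the title" you must know that it hides in 245 $a:
    title = root.find(NS + "datafield[@tag='245']/" + NS + "subfield[@code='a']").text
    print(title)  # -> Who needs MARC?
    ```

    Note that nothing in the schema itself says what tag 245 or code "a" mean; all the semantics live outside the XML, in the MARC documentation.
    
    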

    MARC21 is widely accepted, and that is the only good reason to use it.

  6. The main role of MARC21 today is to allow commercial systems vendors to create a single system that can be used by any library. It has value because it is a standard — and to replace it, we will need another standard. I think that the new standard should focus on data elements, not record format. If we standardize the data elements, then any record format can be used. MARC standardizes both (and not necessarily very well by today’s technology) in a way that they cannot be easily separated.

  7. As a cataloger who deals with MARC on a daily basis (sometimes I even think in MARC code), I completely agree that it is outdated and needs to be replaced. But the problem lies in the fact that there is not yet any agreement about a replacement. Produce a replacement that is flexible, easy to use, fits the cataloging standards that we already have (so that legacy records are not lost), AND IS WIDELY ACCEPTED, and I’m sure you would find many takers.

    Until that comes along, MARC is still the best we’ve got.

  8. I understand the difference between MARC and RDA. As a cataloger, though, what I see in RDA is such minimal information as to be totally worthless for distinguishing authors with the same names or distinguishing variant editions. To me RDA is a lot like the World Wide Web: it will offer much more information, which will require a lot more time to get to what you want to see.

  9. As someone who has been working with MARC data since 1974 (I started my career as an OCLC trainer for SOLINET in the days of one format, Books, and a LOT fewer tags) and who knew or met many of the key players in the development of MARC, I just have to say that it is very easy to judge MARC from today’s perspective. However, it isn’t fair to do that. If you weren’t working during that era, you might find it hard to imagine developing in an environment in which computing power was so low and storage costs were so high that we actually looked for ways to reclaim critical storage by deleting periods at the end of fields. 😉

    MARC was succinct. Tags were short and fixed length. Fixed fields and subfields were encoded for maximum meaning in minimal space.

    MARC is a communications format designed to support the communication of cataloging information between what were, at the time, a very, very small number of entities – mostly LC and other national libraries, a few universities that were doing local development, and a nascent bibliographic utility industry — all of which was geared to supporting printing catalog cards, since online systems were still decades away. MARC had to support the cataloging rules (pre-AACR2) and the realities of printed cards (thus names in inverted form and data to comply with LC filing rules). The uptake of MARC for internal storage by ILS vendors was largely driven by the customer libraries’ RFPs and by the reality that computing power to do data conversion wasn’t always available. It also vastly pre-dated any form of keyword indexing, thus not allowing for normalized data.

    It was also driven by the realities that catalogers found that MARC-speak facilitated communication among themselves; communication that might not have been possible if all the local systems had stored bibliographic data in WDC. Everyone was learning this new stuff all at once and trying to learn from the originators and each other. If LC’s 100 had been my “Author” data element, your “Main entry-personal name” and OCLC’s “WDC-1”, we would have had a much harder time sharing what we were learning and helping each other create all this standardized data. Referencing the 100 field cut short a lot of “what are you talking about?” in communications. And remember, most of this communication was in person and in print format that took a lot longer to reach an audience, since it pre-dated email, WWW, blogs, and Twitter by 30 years!

    I fully agree that MARC has been very badly managed regarding the 773 field. I’ve felt this since the early 90s, when we started loading journal article citation data and tried to create a workable TOC index to all our data — parsing that field is just ridiculous, and it would have been so much easier with some standard encoding much earlier.

    Having said all this, I’d fully support a new communications format and discussion of how we get there. I just wish we could do it without making MARC the enemy – it has been a reliable workhorse that has been absolutely key to our getting where we are in both systems and magnitude of bibliographic data. I do regret that the good features of other MARCs worldwide were not incorporated. And not just worldwide. The WLN system (long since absorbed by OCLC) had a 3rd indicator – which was wonderful for handling non-filing indicators for the subfield t of 7XX fields. A loss we constantly decry when trying to integrate 245 fields with titles in author added entries.

  10. When I decided, triggered by a conversation with some colleagues, to write a blog post about something that has been bothering me off and on since I first started working with library systems in 2003, I did not expect it to be picked up so widely. It has been cited and linked to in many different places. But the most surprising part to me is that MARC generates so many diverse, even emotional reactions.

    It looks like a classic case of “If you’re not with us, you’re against us”. But I would like to try to reconcile the two opposing parties anyway.

    I have received a couple of other reactions outside of this blog as well, both online (via Twitter, among others) and offline, and I have had some conversations about the MARC and PICA+ formats with colleagues. All this has led me to refine my opinion slightly.

    First, a couple of experienced cataloger colleagues have convinced me that PICA+ is not always better than MARC. Like always, both have their pros and cons, as undoubtedly will apply to the MARC-MAB relationship as well.

    Second, I must thank my valued IGeLU colleague Michele Newberry for her very clear description of the historical circumstances of the birth of MARC and her experience of many years.
    Michele says that it is not fair to judge MARC from today’s perspective only. And I have to admit that she is right. In the early days of MARC it was a very useful and efficient tool in the circumstances of those days.
    She also emphasises that MARC was intended as a communications format, which was subsequently used as storage format by ILS vendors.
    Michele concludes by saying that she would “fully support a new communications format and discussion of how we get there”. “I just wish we could do it without making MARC the enemy”.
    I hereby apologise to everybody who I may have offended by my attack on MARC. I have no intention of discrediting its usefulness in the past and present.

    However I stick with my opinion that in a digital web environment it is not an efficient storage and exchange format for bibliographic metadata anymore. We should aim at bringing about a new general efficient and flexible bibliographic format suitable for future developments. As far as I’m concerned this could be MARC22.
    MARC can be adapted; this has been done before. As an example: in 2003 a proposal was made to replace the free-text 773 $g subfield by either a new subfield 773 $q or a new tag 363, with subfields for all the levels contained in the 773 $g free-text string. Both options are available now. The 363 tag appears to have been accepted as “Normalized Date and Sequential Designation (R)”. But as long as this new field is not used, it has no value. I expect that AACR2 still does not require using 363, but I am not an expert. OCLC recently stated that implementation of 363 is “under consideration”.
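    To see why the free-text 773 $g is such a burden compared to a normalized field like 363: any parser has to guess at citation conventions, and each pattern you write only covers one of them. A hypothetical sketch (the pattern and example string are mine, not from any standard):

    ```python
    import re

    # One common "Vol. X, no. Y (date), p. Z" shape of a 773 $g string.
    # Real data uses many other conventions, so this regex is deliberately
    # incomplete: it demonstrates the problem, it does not solve it.
    PATTERN = re.compile(
        r"Vol\.\s*(?P<volume>\d+),\s*no\.\s*(?P<issue>\d+)\s*"
        r"\((?P<date>[^)]+)\),\s*p\.\s*(?P<pages>[\d-]+)"
    )

    def parse_773g(text):
        """Return volume/issue/date/pages if the string matches this one
        pattern; None otherwise (which real-world data often will)."""
        m = PATTERN.search(text)
        return m.groupdict() if m else None

    print(parse_773g("Vol. 24, no. 4 (Apr. 1998), p. 12-19"))
    # -> {'volume': '24', 'issue': '4', 'date': 'Apr. 1998', 'pages': '12-19'}
    ```

    With 363 (or 773 $q), each level would arrive in its own subfield and no guessing would be needed at all.
    
    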

    Someone pointed out to me recently that all the problems associated with a massive migration by libraries around the world from MARC to something else, whatever it may be, will be avoided if we all migrate to one of the new SaaS models of cataloguing (Ex Libris URM, OCLC WorldCat Web, etc.). We will see…

  11. Sometimes we need to make bold steps to move ahead, such as Google is trying by moving the SMTP (e-mail) protocol forward, rethinking it anno 2009 with the introduction of Google Wave.

  12. Since my “MARC Must Die” Library Journal column was mentioned here I want to point you to a much longer follow-up piece I wrote for Library Hi Tech. My author’s copy is available at http://roytennant.com/metadata.pdf and it describes the world I would really like to see, which is what Lukas talks about here — “We Don’t Care”.
