Karen M. Drabenstott
Associate Professor
School of Information
University of Michigan
Ann Arbor, MI 48109-1092 USA
Voice: (734) 763-3581
Fax: (734) 764-2475
karen.drabenstott@umich.edu
New topic -- "The Paper Library: Beyond the Automated Card Catalog"
What a way to end our third year and begin our fourth -- with a splendid discussion hosted by Anne Abate. Many thanks to Anne and the many CRISTAL-ED members who participated in our discussion on "What's in a Name?" and provided their insight and gave clarification to the many issues surrounding this topic. And thanks, Anne, for a terrific job at guest editing. How appropriate that this is Anne's fourth time serving as guest editor!
L. Hunter Kevil will serve as our second guest editor in CRISTAL-ED's fourth year. He is the Head of the Serials Department at Ellis Library, the University of Missouri-Columbia. He has worked for over 10 years as a librarian and has designed two acquisitions modules. He has also worked for Geac Computers and Data Research Associates. In an earlier life, Kevil was a lecturer in the School of French, University of Warwick (Coventry, U.K.). Kevil has a doctorate in French literature and master's degrees in management and library science.
Please welcome Kevil as our guest editor and join in the discussion.
L. Hunter Kevil
Head, Serials Department
Ellis Library
University of Missouri-Columbia
mulkevil@showme.missouri.edu
I am concerned that, without reform and adaptation to contemporary technology, libraries as we know them will become increasingly marginalized and eventually little more than book museums. If we librarians are to influence our future, we must seriously and creatively examine our most deeply held assumptions about libraries. I am fully aware that to say the sky may fall down soon is not news, nor is it widely accepted. The comments below will evoke in many a sense of skepticism I share. But I do believe the dictum that the unexamined life is not worth living applies as much to our professional life as to our personal. In this spirit I offer a few thoughts and invite discussion on the direction libraries should take.
There are four major problems with contemporary library practice.
(Note 1) There is an extensive literature describing the problems users experience with library catalogues. Examples include Christine Borgman's articles, "Why are Online Catalogs Hard to Use?" and "Why are Online Catalogs Still Hard to Use?" in Journal of the American Society for Information Science, 37:6 (1986) and 47:7 (1996). This idea has not been generally accepted within the library profession. Perhaps after performing countless searches we can no longer see with fresh eyes the limitations of the typical catalogue. Recently a list of humorous questions actually asked at the reference desk has circulated on the Internet. It is interesting to ask why these "howlers" are so funny and which of our cherished assumptions lead us to consider library users naive for asking them. One of these questions is: "Do you have a list of all the books in the English language?" I laughed too but on reflection found this a very reasonable question. There are many uses for such a list. Why have libraries not yet created one?
I of course do not claim to have any ultimate answers. Since our problems are much more social and political than technical, discussions such as those in this forum are well worth the effort. Perhaps we could start our discussion with the question of how to gain a customer focus and change our data structures and our concepts of organization of materials to take full advantage of contemporary database management technology. Without proactive change the danger is that at some point a critical mass of users will become very familiar with handling digital materials and very intolerant of the anomalies and archaisms of our card-based automated catalogues. Unless libraries transform their local catalogues and assume a leadership role in the emerging knowledge management systems of their parent organizations, our future may not be pleasant. In short, rather than shoehorn the future to fit past practice and "catalogue the Internet," I suggest libraries should "internet" their local catalogues and create an open, shared, easy-to-use, co-operatively maintained database management system.
As a search tool the card catalogue is conceptually a set of codified controls over access points and form of entry. The object is to create consistent, usable "headings" at the top of cards so that they can be filed in expected sequences in the author, title, subject, and shelflist catalogues. This system relies on staff and users to compare the text strings of the headings with something else, a citation, a book in hand, another card. In time special concepts and practices incomprehensible to users were developed, such as uniform entries and titles proper.
This system still governs the operation of our automated catalogues. It appears we cannot get beyond the mental model of catalogue cards organized in filing order by their headings. (I still have these mental images.) Libraries have used potentially liberating new technology to do little more than what card catalogues could do. Our automated systems are based on a file structure of MARC bibliographic records with indexes of text strings just like headings on cards. MARC authority records perform the same function as their card predecessors. The tools given users have not changed much. The old practices and compromises remain in place, except that now they appear to be obstacles.
Our emphasis has always been on treatment of separate bibliographic items ("books"). This means that materials such as serials or conference proceedings, sometimes even books in series or sets, can still be very hard to locate. Our system is one of extraordinary complexity. Changed cataloguing rules and local practice, inconsistency of interpretation, human error, the inability of library staff to perform all desirable record maintenance, all have created an edifice ready to crumble of its own weight. New search types, such as author-title, keyword, and ISBN, sometimes enable the persistent user to get through the tight control of access points. Yet despite all this expensive work, our catalogues are no easier to use than before and libraries still must maintain large reference staffs. But many users ask whether the persistence required is worth the effort. It is no wonder that despite some very good efforts with Z39.50, machine searching and processing have not progressed as far as needed.
I want now to look at a possible new way to organize access to our paper resources. My suggestions concern the very simplified logical structure of a possible system independently of any actual implementation. Relational databases have been around more than 25 years. Many of us are familiar with them. This model is interesting for libraries because along with the data structure of tables and relationships we get a built-in query language and something akin to authority control.
In the relational model data are stored in grid-like two-dimensional tables. Each table represents a simple concept or object. The columns in a table represent characteristics ("fields") and the rows represent field values from actual instances. A field called the "key" uniquely identifies each row. The key is a unique ID similar to an OCLC number. It is assigned by the system. Let us consider a very simplified model of three tables: authors, titles, books. The authors table could include several fields, including surname, given name, dates, and key. The titles table could include fields for title and key and the books table fields such as ISBN, publisher catalogue number, and key. To establish a relationship between two tables, we enter a key from one into the rows of the other. For example, to find an author's books, we create a "query," which is simply a scan through the books table looking for rows identified by the presence of that author's key or unique ID. In this one-to-many relationship, one author can be associated with many books. Many-to-many relationships are possible by creating intermediate tables. (See note 2 below.)
(Note 2) A basic explanation of how this works is available in Microsoft's short document, "Designing a Database -- Understanding Relational Design." A very good six-page introduction to databases for library use is provided by Biblio Tech Review. The "books" table above is of course a gross simplification, shorthand for a complex structure including the concepts of work, instances of a work, containing, contained-in, and related-to. It may well turn out that object database management systems are better suited than relational databases for the relationships that would be required to handle items in series and articles. (Articles are contained in issues, which are contained in one or more volumes, which are contained in a periodical of a certain title and also a subscription, etc.)
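The three-table model described above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses SQLite through Python's standard sqlite3 module for convenience, the table and column names (author_id, title_id, book_id and so on) are hypothetical, and the rows are placeholder data rather than real bibliographic records.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (
        author_id  INTEGER PRIMARY KEY,   -- system-assigned key
        surname    TEXT NOT NULL,
        given_name TEXT,
        dates      TEXT
    );
    CREATE TABLE titles (
        title_id INTEGER PRIMARY KEY,     -- system-assigned key
        title    TEXT NOT NULL
    );
    CREATE TABLE books (
        book_id   INTEGER PRIMARY KEY,    -- system-assigned key
        isbn      TEXT,
        author_id INTEGER REFERENCES authors(author_id),
        title_id  INTEGER REFERENCES titles(title_id)
    );
""")

# Placeholder rows, not real bibliographic data.
conn.execute(
    "INSERT INTO authors (author_id, surname, given_name) VALUES (1, 'Example', 'Ann')"
)
conn.executemany(
    "INSERT INTO titles (title_id, title) VALUES (?, ?)",
    [(1, "First Sample Title"), (2, "Second Sample Title")],
)
conn.executemany(
    "INSERT INTO books (isbn, author_id, title_id) VALUES (?, ?, ?)",
    [("placeholder-isbn-1", 1, 1), ("placeholder-isbn-2", 1, 2)],
)


def books_by_author(conn, author_id):
    """Return (title, isbn) pairs for every book row carrying this author's key."""
    return conn.execute(
        """
        SELECT titles.title, books.isbn
        FROM books
        JOIN titles ON titles.title_id = books.title_id
        WHERE books.author_id = ?
        ORDER BY titles.title
        """,
        (author_id,),
    ).fetchall()


result = books_by_author(conn, 1)
for title, isbn in result:
    print(title, isbn)
```

The query never compares name strings; it follows the author's key. A many-to-many relationship (several authors per book) would replace the author_id column with an intermediate table of (book_id, author_id) pairs.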
There are many advantages for libraries. Very complex queries are possible and the user can control which fields are returned by the system and the order in which they are displayed. Frequent queries can be saved and easily modified. Each data element is stored only once; unlike with MARC records, there is minimal redundancy. There is no longer a need for library-maintained unique headings, as the system has been designed to handle multiple identical values such as titles. Global updating is built in. Control of headings or entries and authority control in the usual sense are no longer needed. Relational databases support verification of data entry, so that an invalid author ID cannot be entered in the books table, as well as several other kinds of security and integrity and enforcement of "business rules" (library procedures). Relational databases excel in supporting queries, particularly when there are simple data types, such as text fields.
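Two of these advantages, verification of data entry and global updating, can be illustrated with a short sketch. It again assumes SQLite via Python's sqlite3 module, with hypothetical table names and placeholder rows; other database systems enforce the same ideas with their own syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE authors (
        author_id INTEGER PRIMARY KEY,
        surname   TEXT NOT NULL
    );
    CREATE TABLE books (
        book_id   INTEGER PRIMARY KEY,
        title     TEXT NOT NULL,
        author_id INTEGER NOT NULL REFERENCES authors(author_id)
    );
""")

# Placeholder data; the surname is deliberately misspelled.
conn.execute("INSERT INTO authors (author_id, surname) VALUES (1, 'Exmaple')")
conn.executemany(
    "INSERT INTO books (title, author_id) VALUES (?, 1)",
    [("Sample Title A",), ("Sample Title B",)],
)

# Verification of data entry: author 999 does not exist, so the row is refused.
try:
    conn.execute("INSERT INTO books (title, author_id) VALUES ('Orphan Title', 999)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Global updating: the name is stored once, so one correction reaches every book.
conn.execute("UPDATE authors SET surname = 'Example' WHERE author_id = 1")
surnames = conn.execute(
    "SELECT DISTINCT authors.surname FROM books JOIN authors USING (author_id)"
).fetchall()

print(rejected, surnames)
```

Because the books rows hold only the author's key, correcting the name in one place is the whole of the "global update"; no heading strings have to be found and changed record by record.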
The important concepts here are relationships by means of unique IDs and a data-centered view rather than a process-centered one. Existing library procedures rely on cataloguing, intensive "preprocessing" of data for an item by the Library of Congress, bibliographic utilities, and the staff of every library holding that item. The approach explored here assumes as a goal a shared system of well designed tables and relationships, as well as sophisticated tools to handle controlled vocabulary terms such as subjects and tools to handle books whose full text has been indexed. This system would then rely on preprocessing of items only to the extent needed to enter once "clean" data logically organized. The system could exist "on top of" our local MARC databases, with the addition of a few fields in the MARC records and support of a master look-up table with many mirror sites. The general emphasis would be on "post-processing," enabling the user and in particular user clients and agents to perform virtually unlimited searching.
Let me conclude by listing just a few of the many requirements libraries should demand of any new system.
User control of searching. The user should be able to search all data, not just what is in the indexes. This implies that data elements should make sense to the user, that all data should be accurate and "clean," not transformed or altered by libraries. Accurate data implies automatic updates. It also implies that labels, operations, and displays should be readily comprehensible to the majority of users. Users for instance should not have to know the distinction between "monographs" and serials. Libraries should concentrate effort on development of standards for a broad variety of different search tools, for machines as well as humans. Customizable and personal agent-based front-ends and user interfaces are urgently needed.
The system should be fully useful for acquisitions, serials, inter-library lending, and document delivery. Several things follow: the system should accommodate prepublication titles and a publisher's catalogue number or other ID similar to the Digital Object Identifier. Item control should be extended to the article level.
Sharing and cooperation are built into the design of the system. This implies a national or worldwide search by ID, which would enable users for the first time to determine positively that a given library does not hold a given book, serial, or article. These IDs would help prevent the creation of duplicate records, something current systems cannot do well.
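A search by ID of the kind described here might look like the following sketch. It assumes a single shared holdings table keyed by a library code and an item ID; the codes (LIB-A, LIB-B), the item numbers, and the function name holds are all hypothetical, and SQLite via Python's sqlite3 module stands in for whatever shared system would actually be used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE holdings (
        library_id TEXT    NOT NULL,    -- hypothetical library code
        item_id    INTEGER NOT NULL,    -- shared unique ID of the book, serial, or article
        PRIMARY KEY (library_id, item_id)
    );
""")
conn.executemany(
    "INSERT INTO holdings (library_id, item_id) VALUES (?, ?)",
    [("LIB-A", 1001), ("LIB-B", 1001), ("LIB-A", 1002)],
)


def holds(conn, library_id, item_id):
    """True if the library holds the item, False if it positively does not."""
    row = conn.execute(
        "SELECT 1 FROM holdings WHERE library_id = ? AND item_id = ?",
        (library_id, item_id),
    ).fetchone()
    return row is not None


# The composite key also prevents a duplicate record for the same holding:
# this repeated insert is silently ignored rather than stored twice.
conn.execute("INSERT OR IGNORE INTO holdings VALUES ('LIB-A', 1001)")
total = conn.execute("SELECT COUNT(*) FROM holdings").fetchone()[0]

print(holds(conn, "LIB-A", 1001), holds(conn, "LIB-B", 1002), total)
```

A negative answer is only as trustworthy as the shared table is complete, which is why the IDs and the cooperative maintenance go together.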
The system should increase library productivity. One way to do this is to standardize the objects libraries work on so that human editing and interpreting of local text fields is minimized or eliminated. Right now every library reviews the same Library of Congress cataloguing copy for misspellings and other errors. If this duplication of effort could be eliminated, a lot of library staff would be freed to do some of the work required by the new system. The proposed new standard for holdings, Z39.74, being voted on now by NISO members, still recommends a set of user-unfriendly, manually maintained local codes.
It should be possible to construct the new system automatically from existing MARC records with a minimum of manual work. The hard part here is deciding on the structure of the tables of the new system and the mapping of MARC fields for data importing. Once this has been accomplished, internet spiders could read through LC and local catalogues, populating the tables with data. Each cooperating library could link its call numbers and other local information. These tables of local data would have to be maintained by libraries in a distributed arrangement. The processing or "cataloguing" of new paper works could be shared with publishers and distributors. Libraries could continue to use their MARC catalogues as long as they wished. Users in principle would be free to use the old or the new system. Libraries could use any savings realized from streamlined work processes to perform this work and to digitize their unique and important items. There is much that is not in MARC records, including periodical articles and chapters and contributions in books. The example of Internet search engines shows that a new national catalogue could be created. If libraries do not start to do this, I fear a competitor, such as a company in the business of outsourcing libraries, will.
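The mapping step might be sketched as follows. This is an assumption-laden illustration, not a conversion tool: each record is a plain dictionary standing in for an already parsed MARC record, only three fields are mapped (100 $a personal name, 245 $a title, 020 $a ISBN), the target table and column names are hypothetical, and the sample rows are placeholders.

```python
import sqlite3

# Assumed mapping from MARC (tag, subfield) to a hypothetical (table, column).
MARC_TO_COLUMN = {
    ("100", "a"): ("authors", "name"),  # main entry, personal name
    ("245", "a"): ("books", "title"),   # title statement
    ("020", "a"): ("books", "isbn"),    # ISBN
}


def build_tables(conn):
    conn.executescript("""
        CREATE TABLE authors (
            author_id INTEGER PRIMARY KEY,
            name      TEXT UNIQUE NOT NULL
        );
        CREATE TABLE books (
            book_id   INTEGER PRIMARY KEY,
            title     TEXT,
            isbn      TEXT,
            author_id INTEGER REFERENCES authors(author_id)
        );
    """)


def import_record(conn, record):
    """Load one simplified record; unmapped fields are skipped."""
    values = {MARC_TO_COLUMN[k]: v for k, v in record.items() if k in MARC_TO_COLUMN}
    author_id = None
    name = values.get(("authors", "name"))
    if name is not None:
        # Store each distinct name once, then look up its system-assigned key.
        conn.execute("INSERT OR IGNORE INTO authors (name) VALUES (?)", (name,))
        author_id = conn.execute(
            "SELECT author_id FROM authors WHERE name = ?", (name,)
        ).fetchone()[0]
    conn.execute(
        "INSERT INTO books (title, isbn, author_id) VALUES (?, ?, ?)",
        (values.get(("books", "title")), values.get(("books", "isbn")), author_id),
    )


# Placeholder records; ("260", "b") is present but unmapped, so it is ignored.
sample_records = [
    {("100", "a"): "Example, Ann", ("245", "a"): "Sample Title A",
     ("020", "a"): "placeholder-isbn-1", ("260", "b"): "Sample Press"},
    {("100", "a"): "Example, Ann", ("245", "a"): "Sample Title B"},
]

conn = sqlite3.connect(":memory:")
build_tables(conn)
for rec in sample_records:
    import_record(conn, rec)

print(conn.execute("SELECT COUNT(*) FROM authors").fetchone()[0])
```

Matching authors by exact name string is the weakest assumption here: two forms of one name would become two authors, and two people with one name would be merged. That is the part of the import where the design decisions and the residual manual work would concentrate.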
Tom Wilson
T.D.Wilson@sheffield.ac.uk
You may join the discussion and look over the list of past and future topics.