Characteristics of Duplicate Records in OCLC's Online Union Catalog

Abstract

Duplicate records in the Online Union Catalog of the OCLC Online Computer Library Center, Inc., were analyzed. Bibliographic elements comprise information found in one or more fields of a bibliographic record; e.g., the author element comprises the main and added author entry fields. Bibliographic element mismatches in duplicate record pairs were considered relative to the number of records in which each element was present. When a single element differed in a duplicate record pair, that element was most often publication date. This finding shows that a difference in the date of publication is not a reliable indicator of bibliographic uniqueness. General cataloging and data entry patterns such as variations in title transcription and form of name, typographical errors, mistagged fields, misplaced subfield codes, omissions, and inconsistencies between fixed and variable fields often caused records that were duplicates to appear different. These factors can make it extremely difficult for catalogers to retrieve existing bibliographic records and thus avoid creating duplicate records. They also prevent duplicate detection algorithms used for tape-loading records from achieving desired results. An awareness of particularly problematic bibliographic elements and general factors contributing to the creation of duplicate records should help catalogers identify and accept existing records more often. This awareness should also help to direct system designers in their development of more sensitive algorithms to be used for tape loading. The resulting general reduction in the number of duplicate records in union catalogs will be a major step toward increased cataloger productivity, user satisfaction, and overall online database quality.
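The idea of weighing element mismatches against the number of records in which each element is present can be illustrated with a minimal sketch. This is not the authors' procedure: the function names, the whitespace and case normalization, the rule that an element counts as "present" when at least one record in the pair carries it, and the sample records are all assumptions made for illustration.

```python
from collections import Counter


def normalize(value):
    """Lowercase and collapse whitespace; a missing value stays None.

    Hypothetical normalization, chosen only so trivial spacing or
    capitalization differences do not count as mismatches.
    """
    if value is None:
        return None
    return " ".join(value.lower().split())


def mismatch_rates(pairs, elements):
    """Return, per element, mismatches divided by pairs where it is present.

    Assumption: an element is "present" in a pair when at least one of the
    two records has a value for it.
    """
    present = Counter()
    mismatched = Counter()
    for first, second in pairs:
        for element in elements:
            a = normalize(first.get(element))
            b = normalize(second.get(element))
            if a is None and b is None:
                continue  # element absent from both records: not counted
            present[element] += 1
            if a != b:
                mismatched[element] += 1
    return {
        element: mismatched[element] / present[element]
        for element in elements
        if present[element]
    }


# Invented duplicate pairs, for illustration only.
pairs = [
    ({"title": "Moby Dick", "date": "1851"},
     {"title": "moby  dick", "date": "1852"}),
    ({"title": "Walden", "date": "1854", "author": "Thoreau, Henry David"},
     {"title": "Walden", "date": "1854", "author": "Thoreau, H. D."}),
]
rates = mismatch_rates(pairs, ["title", "date", "author"])
```

Dividing by the number of pairs in which an element is present, rather than by all pairs, keeps rarely recorded elements from looking artificially reliable.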

Keywords

duplicate records, OCLC, cataloging, bibliographic records

Citation

Edward T. O'Neill, Sally A. Rogers, and Michael W. Oskins, "Characteristics of Duplicate Records in OCLC's Online Union Catalog," Library Resources & Technical Services 37, no. 3 (1993): 59-71.