A systematic analysis of duplicate records in Scopus

Juan-Carlos Valderrama-Zurián; Remedios Aguilar-Moya; David Melero-Fuentes; Rafael Aleixandre-Benavent

Journal of Informetrics

In recent years, the Web of Science Core Collection and Scopus have become primary data sources for studies that evaluate scientific research. Such studies require that duplicate records be excluded to avoid errors of overrepresentation. To this end, we identify duplicate records in Scopus and examine their origins. The methodology comprises three steps: identifying journals with duplicate records in Scopus, selecting and downloading the bibliographic records of those journals, and identifying and analyzing the duplicate records. Duplicate records arise when Scopus incorrectly maps articles published in one journal both to that journal and to a different journal from the same publisher, and when there are journal title changes, orthographic differences in the presentation of a journal name, or journal name variants. In these last three cases, one bibliographic record of each duplicate is mapped to the Medline coverage of Scopus. The identified duplicates, and the significant differences in the number of citations received by duplicate articles, may therefore influence bibliometric studies. Rigorous quality control guidelines for database managers and editors are thus needed to prevent the creation of duplicates.
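
As an illustration of the duplicate-identification step, the following minimal sketch groups bibliographic records that share a normalized article title and publication year, then reports the journal names and citation counts attached to each group. It is not the procedure used in the study: the matching key (title plus year), the field names (`id`, `title`, `year`, `journal`, `citations`), and the sample records are all assumptions made for the example.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(text):
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    cleaned = re.sub(r"[^a-z0-9]+", " ", without_accents.lower())
    return cleaned.strip()


def find_duplicate_groups(records):
    """Return groups of records sharing a normalized title and year.

    Assumption: title plus year is a good enough matching key for a sketch;
    a real analysis would likely also compare DOIs, authors, and pages.
    """
    groups = defaultdict(list)
    for record in records:
        key = (normalize(record["title"]), record["year"])
        groups[key].append(record)
    return [group for group in groups.values() if len(group) > 1]


# Hypothetical records, invented for illustration only.
records = [
    {"id": "A1", "title": "Effects of X on Y", "year": 2015,
     "journal": "Revista de Ejemplo", "citations": 12},
    {"id": "A2", "title": "Effects of X on Y.", "year": 2015,
     "journal": "Rev. Ejemplo", "citations": 3},
    {"id": "B1", "title": "A different study", "year": 2015,
     "journal": "Revista de Ejemplo", "citations": 7},
]

duplicate_groups = find_duplicate_groups(records)
for group in duplicate_groups:
    ids = [r["id"] for r in group]
    journals = sorted({r["journal"] for r in group})
    citation_counts = [r["citations"] for r in group]
    print(ids, journals, citation_counts)
```

In this sketch, a group whose records carry different journal names corresponds to the journal name variant case described above, and differing citation counts within a group show how a duplicate can split the citations of a single article.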