Quality matters: A new approach for detecting quality problems in web archives

Abstract

Since the practice of web archiving, or the act of preserving websites as historical, legal, and informational records, became more commonplace in the 2000s, web archives have become valuable sources for historical research. Unfortunately, many archived websites are of low quality and are missing crucial elements. In this paper, we examine the issue of quality and focus on visual correspondence: the similarity in appearance between the original website and its archived counterpart. We examine how the visual correspondence of an archived website can be measured using image similarity measures. Our results indicate that the Structural Similarity Index metric (SSIM) successfully measured visual correspondence. If applied to the Quality Assurance process of an institution, this similarity metric could help web archivists quickly detect quality problems in their web archives and fix them, leading to higher-quality web archives.
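
To illustrate the idea of scoring visual correspondence, the following is a minimal sketch of an SSIM-style comparison between a screenshot of the original page and a screenshot of its archived copy. It is not the implementation used in the paper. It assumes both screenshots have already been converted to equal-sized grayscale images and flattened into lists of pixel values in the range 0 to 255, and it computes SSIM once over the whole image, whereas standard SSIM averages the score over local sliding windows. The function name `global_ssim` and the sample pixel data are hypothetical.

```python
def global_ssim(original, archived, data_range=255.0):
    """Simplified single-window SSIM for two flat grayscale pixel lists.

    Assumption: both inputs are the same length and hold values in
    [0, data_range]. Standard SSIM averages over local windows instead.
    """
    n = len(original)
    if n == 0 or n != len(archived):
        raise ValueError("images must be non-empty and the same size")

    # Stabilising constants from the usual SSIM definition.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2

    # Mean luminance of each image.
    mu_x = sum(original) / n
    mu_y = sum(archived) / n

    # Variances and covariance (population form).
    var_x = sum((p - mu_x) * (p - mu_x) for p in original) / n
    var_y = sum((q - mu_y) * (q - mu_y) for q in archived) / n
    cov_xy = sum((p - mu_x) * (q - mu_y)
                 for p, q in zip(original, archived)) / n

    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x * mu_x + mu_y * mu_y + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Hypothetical 2x2 "screenshots": the archived copy has lost its dark
# elements and renders as a blank white page.
original_page = [0, 255, 0, 255]
blank_archive = [255, 255, 255, 255]

same_score = global_ssim(original_page, original_page)
blank_score = global_ssim(original_page, blank_archive)
```

In this sketch an identical pair scores approximately 1, while the blank archived copy scores close to 0. In a Quality Assurance workflow, a low score like this could flag the capture for manual review.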

Date
Sep 24, 2020 13:00 ET — 13:30 ET
Brenda Reyes Ayala
Assistant Professor, University of Alberta

Brenda Reyes Ayala is an Assistant Professor at the School of Library and Information Studies at the University of Alberta. Her research interests include Web Archiving, Digital Humanities, and Information Retrieval.