
This project examines the validation of quality in large-scale digitization, specifically in the HathiTrust Digital Library. The two-year research project is investigating possible methods for detecting and measuring errors and other quality issues within mass-digitized literature. It is also analyzing the potential impact of the errors found on educational and scholarly use within a representative set of use cases: reading online, printing copies, mining texts, and managing print collections.
The findings of this study will make a significant contribution to the field of information quality, and will inform digital repositories about assessing the quality of objects they have committed to preserving on a large scale. Understanding how to judge the quality of the HathiTrust digital deposits will help libraries make decisions about re-digitization of materials and about managing collections of print volumes with secure and useable copies held in digital repositories. The ability to assess and document the quality of volumes will pave the way for certification of these volumes in relation to specific uses, enhancing the decision-making capabilities of users and stakeholders when selecting a volume or set of volumes for particular purposes.
Ongoing mass digitization of books and serials is generating vast digital collections and transforming education and research at all levels. However, these efforts have also raised questions about the value of the digital copies produced by such large-scale projects. For digital repositories and their communities of users to trust that deposited objects have the capacity to meet the uses envisioned for them, repositories must validate the quality and fitness for use of the objects they preserve. This project addresses some questions concerning the value of digital copies and takes a major step toward automating quality review and sharing the characteristics of digitized books and journals.
$674,722
11/01/2010
10/31/2012