$1 million in grant funding for groundbreaking research on data curation

A team of researchers from the University of Michigan School of Information (UMSI) and the Inter-university Consortium for Political and Social Research (ICPSR) has received grants totaling nearly $1 million from the Institute of Museum and Library Services and the National Science Foundation to study the impact of data curation on reuse.

The aim is to better align curatorial actions with the needs of data reusers.

Funding agencies increasingly are requiring researchers to share and archive their data, which enables other researchers to pursue new research questions while also providing the transparency necessary for evaluation, replication and verification of results.

But not all data needs to be kept forever, says Libby Hemphill, associate professor of Information at UMSI and research associate professor at ICPSR and one of the principal investigators on the study. 

“The care and feeding of data long term is expensive, so we want to make sure that we make good investments,” explains Hemphill. “But we don’t know how much it really costs to prepare data for reuse. That’s the question that initially sparked this research proposal.”

This joint project between UMSI and ICPSR, called Measuring Impacts of Curatorial Actions (MICA), will take a two-pronged approach to this question.

The first prong, made possible in part by a $480,637 IMLS grant (#LG-37-19-0134-19), examines the practice of curating data. 

“ICPSR and the School of Information have always been interested in promoting greater access to research data,” says Elizabeth Yakel, Professor and Senior Associate Dean for Academic Affairs at UMSI. 

“One of the most costly things about making data available is the work of curators,” says Yakel. “We will be studying curatorial metrics to better understand what actions have the most impact on successful reuse and making reuse easier for researchers in the future.”

A $498,643 NSF grant (#1930645) will help support the development of evidence-based data sharing policies for funders, archives and researchers. This award supports the team’s efforts to use text mining and machine learning to uncover implicit references to data and better capture the full scope of data reuse. 

“There are many different reasons why we ask people to share data,” says Hemphill. “Some of them are about transparency in science, some of them are about replication, and some are about reuse generally. You may want to do different things to the data depending on what your goal is.”

Right now, Hemphill says, researchers do not have a good way to align data sharing policies with those goals. “We just have a general ‘share your data’ prescription from funders, but no way to measure impact of that sharing.”

The research team will develop quantitative measures that help show the impact and efficacy of different kinds of data curation activities over time. 

“ICPSR is uniquely well suited to this study,” says UMSI assistant professor and co-principal investigator Andrea Thomer. “They have long employed a dedicated team of data curators; they host a huge collection of datasets; and they also maintain records of how each dataset was curated, and if and when it was reused. By comparing specific curatorial actions to evidence of reuse, we can build statistical models and metrics that show the impact and value of specific curatorial actions.”

“Few repositories have the staff, collections, and records that ICPSR has, which makes us uniquely well positioned to identify important relationships between specific curatorial tasks and reuse.”

People often underestimate just how expensive it is to care for data long term, explains Hemphill.

“The scale of these budgets signals to the research field that this matters in a way that is really important.”

About the researchers

Libby Hemphill is an Associate Professor in the School of Information and Director of the Resource Center for Minority Data at ICPSR.

Andrea Thomer is an Assistant Professor in the School of Information.

Amy Pienta, Research Scientist, is Director of Business and Collection Development at ICPSR.

Dharma Akmon is Director of Project Management and User Support and an assistant research scientist at ICPSR.

Elizabeth Yakel is a Professor and Associate Dean for Academic Affairs in UMSI.

About The Institute of Museum and Library Services

The Institute of Museum and Library Services is the primary source of federal support for the nation's libraries and museums. We advance, support, and empower America’s museums, libraries, and related organizations through grantmaking, research, and policy development. Our vision is a nation where museums and libraries work together to transform the lives of individuals and communities. To learn more, visit www.imls.gov and follow us on Facebook and Twitter.

About the National Science Foundation

The National Science Foundation (NSF) is an independent federal agency that supports fundamental research and education across all fields of science and engineering. In fiscal year (FY) 2019, its budget is $8.1 billion. NSF funds reach all 50 states through grants to nearly 2,000 colleges, universities and other institutions. Each year, NSF receives more than 50,000 competitive proposals for funding and makes about 12,000 new funding awards.

 - Jessica Webster, UMSI PR Specialist

Posted October 21, 2019