Faces of UMSI: Shiyan Yan

When PhD student Shiyan Yan enrolled at UMSI in 2013, he wasn’t expecting to learn about North American birds.

Now, after two years of working in collaboration with The Cornell Lab of Ornithology’s eBird project, he can easily identify common species such as blue jays, pigeons and cardinals. But he doesn’t need to memorize every type of bird—“I have the data,” he says.

His main project at UMSI, with associate professor Carl Lagoze, his advisor, involves developing ways to evaluate the citizen science data from the eBird project, which uses crowdsourcing for data collection. Birdwatchers from across North America—and, as it has expanded, from other continents—submit information about the birds they’ve seen, which the Cornell lab uses for research and tracking.

“It’s very exciting. You have data from all over the world. But it also raises more problems of reliability,” Shiyan says.

While citizen science has many positive qualities, like involving the population and gathering extensive data, the reliability of data can be a problem: When non-professional volunteers are gathering data, they sometimes make mistakes, such as reporting a rare bird they thought they saw when it was actually a more common one, Shiyan explains.

Another problem is irregular participation. On cold days, for example, fewer birdwatchers may go outside, leading to fewer reports of birds, while reports may increase on holidays and weekends, when people have more free time. Scientists must find a way to control for these errors and variations, Shiyan says, which is where his work comes in. He works with Lagoze to explore data cleaning and other methods of data quality control.

While Shiyan didn’t know much about birds before coming to Michigan, he had been interested in crowdsourcing and data mining since his undergraduate career at Peking University, where he joined a data lab as an upperclassman and participated in social media data research and analysis.

In addition to the eBird project, he has also worked on improving bibliometrics systems, which “evaluate a scholar according to breadth of research or interdisciplinarity,” he explains.

These qualities are hard to judge because people have developed so many competing systems of evaluation that no one system is dominant, he says. In response, he developed a system to evaluate the evaluations; the related paper, which he coauthored with Lagoze, won the Outstanding Student Paper award at the 2015 conference of the International Society for Scientometrics and Informetrics.

Shiyan hopes to graduate in 2018 and go into academia, where he looks forward to the freedom to research any topic and collaborate with all sorts of researchers. “I’m always seeking collaborators, whether in the university or outside the university,” he says.