MADS capstone team works toward more nuanced predictive tools for healthcare
Four UMSI Master of Applied Data Science (MADS) grads are helping clinicians embrace more complex, nuanced predictive models that could outperform the current simple models used by many healthcare systems.
For a MADS capstone course, Michelle LeBlanc, David Franks, Henry Luong and Tyvand McKee used their healthcare backgrounds to identify and address a problem that they had seen firsthand.
“The real world isn’t simple”
Healthcare professionals have traditionally restricted themselves to simple predictive models, says David Franks, because they are more intuitively accessible when talking with clinicians.
“Simple models tend to be very easy to understand; every feature that we’re looking to use to predict something is assumed to have one linear relationship with the chance of the outcome,” says David, a business intelligence developer at Parkview Health. “So I can say, for example, this patient’s length of stay was 0.4 days longer because their blood pressure was 20 points higher.
“The problem there is that the real world isn’t simple, and so we have models that are much more complicated but can capture these very intricate relationships between individual features and the outcome, and even how features interact together.”
These complex predictive tools, called black box models, are created directly from data by an algorithm. Humans — even those who design black box models — often have little insight into the potentially thousands of steps between data and prediction. That opacity can create an explainability gap between data scientists and clinicians, even though complex models often deliver better predictive performance.
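The linear interpretation David describes can be made concrete in a few lines. The sketch below is purely illustrative — the features and numbers are hypothetical, not the team's actual model — but it shows why simple models are easy to explain: ordinary least squares assigns one coefficient per feature, so a 20-point blood-pressure difference always maps to the same fixed change in predicted length of stay.

```python
import numpy as np

# Hypothetical, noise-free synthetic data: length of stay (days) depends
# linearly on blood pressure and age. Real clinical data would be messier.
rng = np.random.default_rng(0)
bp = rng.uniform(100, 160, size=200)   # systolic blood pressure
age = rng.uniform(20, 90, size=200)
los = 0.02 * bp + 0.05 * age + 1.0     # ground-truth linear relationship

# Fit an ordinary-least-squares linear model (intercept via a column of ones).
X = np.column_stack([bp, age, np.ones_like(bp)])
coefs, *_ = np.linalg.lstsq(X, los, rcond=None)

# coefs[0] is days of stay per point of blood pressure; a 20-point rise
# therefore adds a fixed 20 * coefs[0] days — the kind of one-sentence
# explanation a clinician can act on.
effect_of_20_points = 20 * coefs[0]
```

A black box model has no such single coefficient to quote, which is exactly the communication problem the team set out to address.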
So when MADS capstone instructors launched a team-matching thread on Slack about a month before the start of the course, David quickly connected with Michelle over their shared interest in addressing real issues they’d encountered in healthcare.
“We threw around a bunch of ideas, but the one that got us the most excited was this idea of trying to drag the healthcare industry, kicking and screaming, away from this strong preference for very simple predictive models into the future of machine learning,” Michelle says. During this time, Michelle was lead analyst at Vizient, a vendor of data tools with about 750 client hospitals.
Michelle received approval to acquire real data from hospitals for her team to work with.
“It took three weeks, including talking to a lawyer, but we were able to get approved to use this dataset if we signed data use agreements, which we did,” she says.
With real-world data in hand, the team welcomed its third member, Henry, with his background in clinical research. Michelle’s former colleague Tyvand joined shortly thereafter, bringing his mechanical engineering and business development skills.
“Can we develop complex models that beat the simple models currently used by a lot of healthcare systems?” David asks. “That’s what Henry, Tyvand and Michelle have all been working on quite a bit, building complex black box models that outperform existing ones in use. And then I’ve been working on modules in Python for explainability so that we can say we can improve performance and potentially even improve explainability, no matter what model type is used.”
Leveraging every step
Capstone instructors provided students extra time to connect with teammates, conceptualize projects and gather data before the project started.
“I think we were already pretty far ahead at that point, having everything in hand,” David says. For the first month of the two-month course, Michelle, Henry and Tyvand worked on retraining simple predictive models used in healthcare to perform as well as they possibly could. Then they began developing the complex models.
Meanwhile, David worked on a parallel task, constructing explainability modules in Python, building understanding into how such modules work, what they say and how they can be interpreted.
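The team’s modules themselves aren’t public, so the following is only a hedged sketch of the general idea. Permutation importance is one standard model-agnostic explainability technique: it treats the predictor as a black box and measures how much prediction error grows when each feature’s values are shuffled, breaking that feature’s relationship with the outcome. The toy model and data here are invented for illustration.

```python
import numpy as np

def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Model-agnostic importance: rise in MSE when each feature is shuffled."""
    rng = np.random.default_rng(seed)
    baseline = np.mean((model(X) - y) ** 2)          # error with intact data
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        scores = []
        for _ in range(n_repeats):
            Xp = X.copy()
            rng.shuffle(Xp[:, j])                    # destroy feature j's signal
            scores.append(np.mean((model(Xp) - y) ** 2))
        importances[j] = np.mean(scores) - baseline  # MSE increase = importance
    return importances

# Toy "black box": the outcome depends strongly on feature 0, barely on feature 1.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 2))
y = 3.0 * X[:, 0] + 0.1 * X[:, 1]
model = lambda X: 3.0 * X[:, 0] + 0.1 * X[:, 1]     # stand-in opaque predictor

imp = permutation_importance(model, X, y)
# imp[0] should dwarf imp[1]: shuffling the dominant feature hurts far more.
```

Because the technique only calls the model’s predict function, the same module can rank features for a logistic regression or a neural network alike — which is what lets explainability travel with the model, whatever its type.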
“In addition, I’ve also taken a qualitative approach to part of the project,” David says. “Since I have access to a lot of clinicians, administrators and other analysts and operational staff, I’ve worked within Parkview to get a survey out about how their understanding of a predictive model impacts how much they will use or trust it.
“Over the next two weeks, I’ve got six or seven sit-down interviews to talk through survey results with the people who filled them out just to get a nice proof of concept that complex models can outperform simple models, and to support our other statement, that explainability matters in the healthcare domain.”
The team says that as they have worked through their process, input from the four capstone instructors — Elle O’Brien, Winston Featherly-Bean, Neha Bhomia and Kirtana Choragudi — has been ongoing and invaluable. With each instructor offering weekly office hours, students can ask questions and bounce ideas off an instructor at least every other day.
Additionally, students have an open communication channel with course instructors via Slack if they need to reach them outside of office hours, Henry says.
“We always have the opportunity to speak out on whatever issues we have,” he continues. “They’ve been very helpful, very knowledgeable.”
Presenting to the world while protecting healthcare data
Elle O’Brien, the MADS faculty member leading the capstone course (SIADS 697) this term, shared in a Q&A about the course that the final result of teams’ work would be a blog post or research paper that’s ready to share with the world. However, working with protected, real-world healthcare data comes with some constraints for this capstone team.
“Unfortunately, working with a dataset that has those protections in place, we can’t just put it out there and try to get our blog posts published on Medium or something,” Michelle says. “In this form we would have to change out the dataset, but I think that’s something that we’ll probably do.”
To keep those protections in place while fulfilling course requirements, the team submitted a private GitHub repository along with their final research paper. Michelle also purchased an older set of healthcare data from a governmental agency, which the team will swap in for the proprietary dataset if they decide to publish their work on a broader platform.
Complexity meets explainability
The team’s report, “Explainability and Complex Predictive Model Adoption in Healthcare,” makes the point that complex models are sometimes the best option. Their report concludes that the crucial explainability factor can be added to black box models, “allowing healthcare developers to choose the best algorithm based on only performance and resource restrictions.”
“Our main goal here is actually to open up the black box for healthcare providers and say, OK, this is why this AI is suggesting this, this is what it’s focusing on,” Tyvand says, “and in doing so hopefully we’ll be able to push them away from just using linear models and simpler things and allow them to embrace more complex models like neural networks.”
As excited as the team is about what they’ve achieved with their complex models, they remain cognizant that simple models should not be dismissed entirely.
“Just because these more complex systems exist doesn’t always mean that they’re the best tools for the job,” Tyvand says. “We’ve had success with our more complex models outperforming some of these simple models sometimes, and sometimes we’ve not. I think that just has to do with how the data is structured.”
“This capstone was an excellent opportunity to compile all the skills we learned throughout this program,” Henry says. “We were able to effectively and objectively collaborate with each other while keeping the spirit of creativity for each individual within the team. Overall, we were delighted with the quantitative and qualitative results and seek to build upon our goal to incorporate all healthcare datasets for future predictive analysis.”
Michelle, David, Henry and Tyvand graduated with the MADS program’s first graduating class in August 2021.