University of Michigan School of Information
Safe Online Communities | Intersectional Ageism: AI in Focus

Monday, 03/03/2025
By Noor Hindi

University of Michigan School of Information faculty and PhD students are advancing the field of artificial intelligence through innovative research and impactful contributions. Here are some of their recent publications.
Publications
New Opportunities, Risks, and Harm of Generative AI for Fostering Safe Online Communities
GROUP '25: Companion Proceedings of the 2025 ACM International Conference on Supporting Group Work, January 2025
Guo Freeman, Douglas Zytko, Afsaneh Razi, Cliff Lampe, Heloisa Candello, Timo Jakobi, Konstantin Aal
Recently, there has been a growing trend of using generative AI systems and tools for fostering and protecting online collaborative communities. Yet, existing AI tools may introduce new risks and even harm to diverse communities’ online safety. How to better maximize the novel opportunities of AI and mitigate its emerging risks and harms for our future online safety is a critically needed discussion for the HCI community. Featuring experts from both industry and academia, this panel aims to promote interdisciplinary, community-wide discussions and collective reflections on important questions and considerations at the unique intersection of AI and online communities, including but not limited to: how the design of AI systems may discourage existing online harm but also invite new online harm in various online spaces; how different populations, cultures, and communities may perceive and experience AI’s new roles in their online safety; and what new strategies, principles, and directions can be envisioned and identified to better design future AI technologies to protect rather than harm various online communities.
Intersectional Ageism: How Black Older Adults Envision A Future With Conversational Technologies
National Library of Medicine, December 2024
Robin Brewer, Christina Harrington, Courtney Heldreth
Dominant narratives of aging often center whiteness. However, older age can intersect with other identities that experience structural inequities and exacerbate ageism. In this study, we explore what it means to design technologies that are fair for historically minoritized older adults. We conducted design workshops with 16 Black older adults, asking how they envision older age and race represented in one form of AI system - conversational technologies. We highlight concerns around sharing age-related data, potential ageist outcomes that intersect with racism, and desires for authentically representing older age and race. We discuss these findings with a lens on how ageism discourse can exclude certain groups and how we might mitigate age-related AI harms in the technology design process by incorporating other values such as authenticity.
Examining the Relationship between Socioeconomic Status and Beliefs about Large Language Models in an Undergraduate Programming Course
SIGCSE Virtual 2024: Proceedings of the 2024 on ACM Virtual Global Computing Education Conference, December 2024
Amy Pang, Aadarsh Padiyath, Diego Viramontes Vargas, Barbara Jane Ericson
Research on students' use of large language models (LLMs) in academic settings has increased recently, focusing on usage patterns, tasks, and instructor policies. However, there is limited research on the relationships between students' socioeconomic backgrounds, perceptions, and usage of these resources. As socioeconomic factors may shape students' approach to learning, it is important to understand their impact on students' perceptions and attitudes towards emerging technologies like LLMs. Thus, we analyzed a quantitative and internally consistent student survey (N=144) and qualitative interview (N=2) responses of students taking an undergraduate-level programming course at a public university for correlations between socioeconomic background, attitudes towards LLMs, and LLM usage. Regression analysis found a significant positive association between socioeconomic status (SES) and belief that LLM use will lead to career success. Qualitative interviews suggested low-SES students perceived LLMs as helpful tools for debugging and learning concepts, but not as a significant factor in long-term career success. Rather, programming knowledge itself was still paramount for career success. Our findings contribute to our understanding of the complex influences social and cultural factors have on students' perceptions and attitudes towards LLMs.
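The kind of analysis described here can be pictured with a minimal regression sketch. The variables, scales, and simulated responses below are hypothetical stand-ins, not the study's survey instrument or data; only the sample size mirrors the abstract.

    # Minimal sketch of an OLS regression relating a socioeconomic-status index to
    # agreement with "LLM use will lead to career success" (all data hypothetical).
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n = 144                                   # matches the survey size reported above
    ses_index = rng.normal(0, 1, n)           # hypothetical composite SES score
    belief = 3 + 0.4 * ses_index + rng.normal(0, 1, n)  # hypothetical Likert-style belief score

    X = sm.add_constant(ses_index)            # add intercept term
    model = sm.OLS(belief, X).fit()
    print(model.params)                       # a positive slope would mirror the reported association
    print(model.pvalues)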
A Human–Security Robot Interaction Literature Review
ACM Transactions on Human-Robot Interaction, December 2024
As advances in robotics continue, security robots are increasingly integrated into public and private security, enhancing protection in locations such as streets, parks, and shopping malls. To be effective, security robots must interact with civilians and security personnel, underscoring the need to enhance our knowledge of their interactions with humans. To investigate this issue, the authors systematically reviewed 47 studies on human interaction with security robots, covering 2003 to 2023. Papers in this domain have significantly increased over the last 7 years. The article provides three contributions. First, it comprehensively summarizes existing literature on human interaction with security robots. Second, it employs the Human–Robot Integrative Framework (HRIF) to categorize this literature into three main thrusts: human, robot, and context. The framework is leveraged to derive insights into the methodologies, tasks, predictors, and outcomes studied. Last, the article synthesizes and discusses the findings from the reviewed literature, identifying avenues for future research in this domain.
Pre-prints, Working Papers, Articles, Reports, Workshops and Talks
Reasoning-Enhanced Self-Training for Long-Form Personalized Text Generation
arXiv, January 2025
Alireza Salemi, Cheng Li, Mingyang Zhang, Qiaozhu Mei, Weize Kong, Tao Chen, Zhuowan Li, Michael Bendersky, Hamed Zamani
Personalized text generation requires a unique ability of large language models (LLMs) to learn from context that they often do not encounter during their standard training. One way to encourage LLMs to better use personalized context for generating outputs that better align with the user’s expectations is to instruct them to reason over the user’s past preferences, background knowledge, or writing style. To achieve this, we propose Reasoning-Enhanced Self-Training for Personalized Text Generation (REST-PG), a framework that trains LLMs to reason over personal data during response generation. REST-PG first generates reasoning paths to train the LLM’s reasoning abilities and then employs Expectation-Maximization Reinforced Self-Training to iteratively train the LLM based on its own high-reward outputs. We evaluate REST-PG on the LongLaMP benchmark, consisting of four diverse personalized long-form text generation tasks. Our experiments demonstrate that REST-PG achieves significant improvements over state-of-the-art baselines, with an average relative performance gain of 14.5% on the benchmark.
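A toy skeleton can make the self-training loop concrete. This is a heavily simplified sketch of a ReST-style iterate, sample, filter, and fine-tune cycle in the spirit of REST-PG; the generate, reward, and finetune functions are placeholders, not the authors' implementation.

    # Toy skeleton of a self-training loop: sample candidate outputs conditioned on
    # a user profile, keep the high-reward ones, then fine-tune on them.
    import random

    def generate(model, prompt, profile, n=4):
        # Stand-in for LLM sampling conditioned on the user's personal context.
        return [f"{model}:{prompt}|{profile}|draft{i}" for i in range(n)]

    def reward(output, profile):
        # Stand-in for a personalization reward (e.g., similarity to the user's style).
        return random.random()

    def finetune(model, examples):
        # Stand-in for supervised fine-tuning on the retained (prompt, output) pairs.
        return f"{model}+sft({len(examples)})"

    model = "base-llm"
    data = [("write a short bio", "likes hiking, terse style")] * 8
    for step in range(3):
        kept = []
        for prompt, profile in data:
            candidates = generate(model, prompt, profile)
            best = max(candidates, key=lambda c: reward(c, profile))
            kept.append((prompt, best))        # keep only the top-reward generation
        model = finetune(model, kept)          # iterate training on the model's own best outputs
    print(model)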
Learning Implicit Social Navigation Behavior using Deep Inverse Reinforcement Learning
arXiv, January 2025
Tribhi Kathuria, Ke Liu, Junwoo Jang, X. Jessie Yang, Maani Ghaffari
This paper reports on learning a reward map for social navigation in dynamic environments where the robot can reason about its path at any time, given agents’ trajectories and scene geometry. Humans navigating in dense and dynamic indoor environments often work with several implied social rules. A rule-based approach fails to model all possible interactions between humans, robots, and scenes. We propose a novel Smooth Maximum Entropy Deep Inverse Reinforcement Learning (S-MEDIRL) algorithm that can extrapolate beyond expert demonstrations to better encode scene navigability from few-shot demonstrations. The agent learns to predict cost maps by reasoning over trajectory data and scene geometry. The agent then samples a trajectory that is executed using a local crowd navigation controller. We present results in a photo-realistic simulation environment, with a robot and a human navigating a narrow crossing scenario. The robot implicitly learns to exhibit social behaviors such as yielding to oncoming traffic and avoiding deadlocks. We compare the proposed approach to the popular model-based crowd navigation algorithm ORCA and a rule-based agent that exhibits yielding. The code and dataset are publicly available at https://github.com/UMich-CURLY/habicrowd.
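For readers unfamiliar with inverse reinforcement learning, the sketch below shows the core maximum-entropy IRL idea that S-MEDIRL builds on: adjust a reward function until the learner's expected state visitation matches the expert's. It uses a tiny gridworld with a linear, per-state reward rather than the paper's deep network, smoothing, or crowd simulation; all numbers are toy choices.

    # Simplified maximum-entropy IRL on a tiny gridworld, illustrating the principle
    # behind MaxEnt (deep) IRL: fit a reward so that expected state visitation under
    # the learned policy matches the expert's state visitation.
    import numpy as np

    H, W, T = 4, 4, 8                        # grid height/width, demo horizon
    S = H * W
    A = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right

    def step(s, a):
        r, c = divmod(s, W)
        r = min(max(r + a[0], 0), H - 1)
        c = min(max(c + a[1], 0), W - 1)
        return r * W + c

    # Toy expert demos: always head toward the bottom-right corner (state S-1).
    demos = []
    for start in range(S):
        s, traj = start, []
        for _ in range(T):
            traj.append(s)
            s = step(s, max(A, key=lambda a: -abs(step(s, a) - (S - 1))))
        demos.append(traj)

    expert_svf = np.zeros(S)
    for traj in demos:
        for s in traj:
            expert_svf[s] += 1
    expert_svf /= len(demos)

    theta = np.zeros(S)                       # linear reward, one weight per state
    for it in range(200):
        # Soft value iteration under the current reward.
        V = np.zeros(S)
        for _ in range(T):
            Q = np.array([[theta[step(s, a)] + V[step(s, a)] for a in A] for s in range(S)])
            m = Q.max(axis=1, keepdims=True)
            V = (m + np.log(np.exp(Q - m).sum(axis=1, keepdims=True))).ravel()
        policy = np.exp(Q - V[:, None])
        policy /= policy.sum(axis=1, keepdims=True)

        # Expected state visitation under that policy (uniform start states).
        d = np.full(S, 1.0 / S)
        svf = d.copy()
        for _ in range(T - 1):
            d_next = np.zeros(S)
            for s in range(S):
                for ai, a in enumerate(A):
                    d_next[step(s, a)] += d[s] * policy[s, ai]
            d = d_next
            svf += d
        theta += 0.05 * (expert_svf - svf)    # gradient of the MaxEnt objective

    print(theta.reshape(H, W).round(2))       # learned reward should concentrate near the goal corner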
Machine Unlearning Doesn't Do What You Think: Lessons for Generative AI Policy, Research, and Practice
arXiv, December 2024
A. Feder Cooper, Christopher A. Choquette-Choo, Miranda Bogen, Matthew Jagielski, Katja Filippova, Ken Ziyu Liu, Alexandra Chouldechova, Jamie Hayes, Yangsibo Huang, Niloofar Mireshghallah, Ilia Shumailov, Eleni Triantafillou, Peter Kairouz, Nicole Mitchell, Percy Liang, Daniel E. Ho, Yejin Choi, Sanmi Koyejo, Fernando Delgado, James Grimmelmann, Vitaly Shmatikov, Christopher De Sa, Solon Barocas, Amy Cyphert, Mark Lemley, danah boyd, Jennifer Wortman Vaughan, Miles Brundage, David Bau, Seth Neel, Abigail Z. Jacobs, Andreas Terzis, Hanna Wallach, Nicolas Papernot, Katherine Lee
We articulate fundamental mismatches between technical methods for machine unlearning in Generative AI, and documented aspirations for broader impact that these methods could have for law and policy. These aspirations are both numerous and varied, motivated by issues that pertain to privacy, copyright, safety, and more. For example, unlearning is often invoked as a solution for removing the effects of targeted information from a generative-AI model's parameters, e.g., a particular individual's personal data or in-copyright expression of Spiderman that was included in the model's training data. Unlearning is also proposed as a way to prevent a model from generating targeted types of information in its outputs, e.g., generations that closely resemble a particular individual's data or reflect the concept of "Spiderman." Both of these goals--the targeted removal of information from a model and the targeted suppression of information from a model's outputs--present various technical and substantive challenges. We provide a framework for thinking rigorously about these challenges, which enables us to be clear about why unlearning is not a general-purpose solution for circumscribing generative-AI model behavior in service of broader positive impact. We aim for conceptual clarity and to encourage more thoughtful communication among machine learning (ML), law, and policy experts who seek to develop and apply technical methods for compliance with policy objectives.
How Different AI Chatbots Behave? Benchmarking Large Language Models in Behavioral Economics Games
arXiv, December 2024
Yutong Xie, Yiyao Liu, Zhuang Ma, Lin Shi, Xiyuan Wang, Walter Yuan, Matthew O. Jackson, Qiaozhu Mei
The deployment of large language models (LLMs) in diverse applications requires a thorough understanding of their decision-making strategies and behavioral patterns. As a supplement to a recent study on the behavioral Turing test [7], this paper presents a comprehensive analysis of five leading LLM-based chatbot families as they navigate a series of behavioral economics games. By benchmarking these AI chatbots, we aim to uncover and document both common and distinct behavioral patterns across a range of scenarios. The findings provide valuable insights into the strategic preferences of each LLM, highlighting potential implications for their deployment in critical decision-making roles.
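A benchmark of this kind boils down to posing a game to a chatbot and parsing its decision. The sketch below illustrates that shape with a dictator game; ask_chatbot is a placeholder for whichever chat API is under test, not the paper's harness, and the canned reply is invented.

    # Sketch of posing a dictator game to a chat model and parsing its allocation.
    import re

    DICTATOR_PROMPT = (
        "You are playing a dictator game. You have $100 to split between yourself "
        "and an anonymous stranger. Reply with the dollar amount you keep."
    )

    def ask_chatbot(prompt):
        # Placeholder response; swap in a real chat-completion call here.
        return "I will keep $60 and give $40 to the stranger."

    def parse_amount_kept(reply):
        match = re.search(r"\$?(\d+)", reply)
        return int(match.group(1)) if match else None

    replies = [ask_chatbot(DICTATOR_PROMPT) for _ in range(5)]   # repeat to observe behavioral variance
    allocations = [parse_amount_kept(r) for r in replies]
    print(allocations)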
Evaluating Generative AI Systems is a Social Science Measurement Challenge
arXiv, November 2024
Hanna Wallach, Meera Desai, Nicholas Pangakis, A. Feder Cooper, Angelina Wang, Solon Barocas, Alexandra Chouldechova, Chad Atalla, Su Lin Blodgett, Emily Corvi, P. Alex Dow, Jean Garcia-Gathright, Alexandra Olteanu, Stefanie Reed, Emily Sheng, Dan Vann, Jennifer Wortman Vaughan, Matthew Vogel, Hannah Washington, Abigail Z. Jacobs
Across academia, industry, and government, there is an increasing awareness that the measurement tasks involved in evaluating generative AI (GenAI) systems are especially difficult. We argue that these measurement tasks are highly reminiscent of measurement tasks found throughout the social sciences. With this in mind, we present a framework, grounded in measurement theory from the social sciences, for measuring concepts related to the capabilities, impacts, opportunities, and risks of GenAI systems. The framework distinguishes between four levels: the background concept, the systematized concept, the measurement instrument(s), and the instance-level measurements themselves. This four-level approach differs from the way measurement is typically done in ML, where researchers and practitioners appear to jump straight from background concepts to measurement instruments, with little to no explicit systematization in between. As well as surfacing assumptions, thereby making it easier to understand exactly what the resulting measurements do and do not mean, this framework has two important implications for evaluating evaluations: First, it can enable stakeholders from different worlds to participate in conceptual debates, broadening the expertise involved in evaluating GenAI systems. Second, it brings rigor to operational debates by offering a set of lenses for interrogating the validity of measurement instruments and their resulting measurements.
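The four levels can be pictured as a chain from contested concept to concrete scores. The sketch below is one hypothetical way to encode that chain as a data structure; the "toxicity" example and field names are illustrative choices, not drawn from the paper.

    # Minimal data-structure sketch of the four measurement levels described above.
    from dataclasses import dataclass, field

    @dataclass
    class MeasurementChain:
        background_concept: str            # the broad, contested idea (e.g., "toxicity")
        systematized_concept: str          # the explicit working definition adopted
        instruments: list                  # annotation guidelines, classifiers, benchmarks
        instance_measurements: list = field(default_factory=list)  # per-output scores

    toxicity = MeasurementChain(
        background_concept="toxicity",
        systematized_concept="language a reasonable reader would find demeaning to a group",
        instruments=["annotation guideline v2 (hypothetical)", "toxicity classifier (hypothetical)"],
    )
    toxicity.instance_measurements += [0.1, 0.7, 0.3]   # scores attached to individual outputs
    print(toxicity)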
Bridging AI and Science: Implications from a Large-Scale Literature Analysis of AI4Science
arXiv, November 2024
Yutong Xie, Yijun Pan, Hua Xu, Qiaozhu Mei
Artificial Intelligence has proven to be a transformative tool for advancing scientific research across a wide range of disciplines. However, a significant gap still exists between AI and scientific communities, limiting the full potential of AI methods in driving broad scientific discovery. Existing efforts in bridging this gap have often relied on qualitative examination of small samples of literature, offering a limited perspective on the broader AI4Science landscape. In this work, we present a large-scale analysis of the AI4Science literature, starting by using large language models to identify scientific problems and AI methods in publications from top science and AI venues. Leveraging this new dataset, we quantitatively highlight key disparities between AI methods and scientific problems in this integrated space, revealing substantial opportunities for deeper AI integration across scientific disciplines. Furthermore, we explore the potential and challenges of facilitating collaboration between AI and scientific communities through the lens of link prediction. Our findings and tools aim to promote more impactful interdisciplinary collaborations and accelerate scientific discovery through deeper and broader AI integration.
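The link-prediction framing can be illustrated with a toy bipartite graph of scientific problems and AI methods, scoring missing problem-method pairs by shared neighbors. The paper labels below are invented placeholders, and the scoring rule is a generic common-neighbors heuristic rather than the authors' method.

    # Toy link prediction over a bipartite problem-method co-occurrence graph.
    from collections import defaultdict
    from itertools import product

    # Hypothetical extractions: each entry links one scientific problem to one AI method.
    papers = [
        ("protein folding", "graph neural network"),
        ("protein folding", "transformer"),
        ("weather forecasting", "graph neural network"),
        ("weather forecasting", "diffusion model"),
    ]

    methods_of = defaultdict(set)
    problems_of = defaultdict(set)
    for problem, method in papers:
        methods_of[problem].add(method)
        problems_of[method].add(problem)

    def score(problem, method):
        # Common-neighbor score: problems that already use this method and share
        # other methods with `problem` make the missing link more plausible.
        return sum(len(methods_of[problem] & methods_of[p]) for p in problems_of[method] if p != problem)

    candidates = [(p, m) for p, m in product(methods_of, problems_of) if m not in methods_of[p]]
    for p, m in sorted(candidates, key=lambda pm: -score(*pm)):
        print(p, "<->", m, "score:", score(p, m))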
EDBooks: AI-Enhanced Interactive Narratives for Programming Education
arXiv, November 2024
Steve Oney, Yue Shen, Fei Wu, Young Suh Hong, Ziang Wang, Yamini Khajekar, Jiacheng Zhang, April Yi Wang
Large Language Models (LLMs) have shown promise as valuable teaching tools, with the potential to give every student a personalized tutor. However, one challenge with using LLMs to learn new concepts is that when learning a topic in an unfamiliar domain, it can be difficult to know what questions to ask. Further, language models do not always encourage “active learning,” where students can test and assess their understanding. In this paper, we propose ways to combine large language models with “traditional” learning materials (like e-books) to give readers the benefits of working with LLMs (the ability to ask personally interesting questions and receive personalized answers) combined with the benefits of a traditional e-book (having a structure and content that is pedagogically sound). This work shows one way that LLMs have the potential to improve learning materials and make personalized programming education more accessible to a broader audience.
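One minimal way to picture combining structured material with personalized Q&A is to ground an LLM prompt in a fixed e-book passage. The sketch below assumes a placeholder ask_llm function and invented section text; it is not the EDBooks implementation.

    # Sketch of answering a student's question using only a fixed e-book section,
    # so the LLM's explanation stays consistent with what the student just read.
    SECTION = (
        "A Python list is a mutable, ordered sequence. Appending is amortized O(1), "
        "while inserting at the front is O(n) because later elements must shift."
    )

    def ask_llm(prompt):
        # Placeholder for a chat-completion call to whichever model backs the book.
        return "Because every existing element has to move one slot to the right."

    def grounded_answer(section_text, student_question):
        prompt = (
            "Answer using only the e-book section below.\n\n"
            f"Section:\n{section_text}\n\nStudent question: {student_question}"
        )
        return ask_llm(prompt)

    print(grounded_answer(SECTION, "Why is inserting at the front of a list slow?"))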
Keep up with research from UMSI experts by subscribing to our free research roundup newsletter!