Social media, bias and POPQUORN: UMSI Research Roundup

Social Media, Bias and POPQUORN. Check out UMSI faculty and PhD student publications.

Wednesday, 07/05/2023

University of Michigan School of Information faculty and PhD students are creating and sharing knowledge that helps build a better world. Here are some of their recent publications.

Enhancing Nonverbal Communication Through Virtual Human Technology: Protocol for a Mixed Methods Study

JMIR Research Protocols, June 2023

Analay Perez, Michael D Fetters, John W Creswell, Mark Scerbo, Frederick W Kron, Richard Gonzalez, Lawrence An, Masahito Jimbo, Predrag Klasnja, Timothy C Guetterman

Background: Communication is a critical component of the patient-provider relationship; however, limited research exists on the role of nonverbal communication. Virtual human training is an informatics-based educational strategy that offers various benefits in communication skill training directed at providers. Recent informatics-based interventions aimed at improving communication have mainly focused on verbal communication, yet research is needed to better understand how virtual humans can improve verbal and nonverbal communication and further elucidate the patient-provider dyad.

Objective: The purpose of this study is to enhance a conceptual model that incorporates technology to examine verbal and nonverbal components of communication and develop a nonverbal assessment that will be included in the virtual simulation for further testing.

Methods: This study will consist of a multistage mixed methods design, including convergent and exploratory sequential components. A convergent mixed methods study will be conducted to examine the mediating effects of nonverbal communication. Quantitative (eg, MPathic game scores, Kinect nonverbal data, objective structured clinical examination communication score, and Roter Interaction Analysis System and Facial Action Coding System coding of video) and qualitative data (eg, video recordings of MPathic–virtual reality [VR] interventions and student reflections) will be collected simultaneously. Data will be merged to determine the most crucial components of nonverbal behavior in human-computer interaction. An exploratory sequential design will proceed, consisting of a grounded theory qualitative phase. Using theoretical, purposeful sampling, interviews will be conducted with oncology providers probing intentional nonverbal behaviors. The qualitative findings will aid the development of a nonverbal communication model that will be included in a virtual human. The subsequent quantitative strand will incorporate and validate a new automated nonverbal communication behavior assessment into the virtual human simulation, MPathic-VR, by assessing interrater reliability, code interactions, and dyadic data analysis by comparing Kinect responses (system recorded) to manually scored records for specific nonverbal behaviors. Data will be integrated using building integration to develop the automated nonverbal communication behavior assessment and conduct a quality check of these nonverbal features.

Results: Secondary data from the MPathic-VR randomized controlled trial data set (210 medical students and 840 video recordings of interactions) were analyzed in the first part of this study. Results showed differential experiences by performance in the intervention group. Following the analysis of the convergent design, participants consisting of medical providers (n=30) will be recruited for the qualitative phase of the subsequent exploratory sequential design. We plan to complete data collection by July 2023 to analyze and integrate these findings.

Conclusions: The results from this study contribute to the improvement of patient-provider communication, both verbal and nonverbal, including the dissemination of health information and health outcomes for patients. Further, this research aims to transfer to various topical areas, including medication safety, informed consent processes, patient instructions, and treatment adherence between patients and providers.

Pulling through together: social media response trajectories in disaster-stricken communities

Journal of Computational Social Science, June 2023

Danaja Maldeniya, Munmun De Choudhury, David Garcia, Daniel Romero

Disasters are extraordinary shocks that disrupt every aspect of community life. Lives are lost, infrastructure is destroyed, the social fabric is torn apart, and people are left with physical and psychological trauma. In the aftermath of a disaster, communities begin the collective process of healing, grieving losses, repairing damage, and adapting to a new reality. Previous work has suggested the existence of a series of prototypical stages through which such community responses evolve. As social media have become more widely used, affected communities have increasingly adopted them to express, navigate, and build their response due to the greater visibility and speed of interaction that these platforms afford. In this study, we ask if the behavior of disaster-struck communities on social media follows prototypical patterns and what relationship, if any, these patterns may have with those established for offline behavior in previous work. Building on theoretical models of disaster response, we investigate whether, in the short term, community responses on social media in the aftermath of disasters follow a prototypical trajectory. We conduct our analysis using computational methods to model over 200 disaster-stricken U.S. communities. Community responses are measured in a range of domains, including psychological, social, and sense-making, and as multidimensional time series derived from the linguistic markers in tweets from those communities. We find that community responses on Twitter demonstrate similar response patterns across numerous social, aspirational, and physical dynamics. Additionally, through cluster analysis, we demonstrate that a minority of communities are characterized by more intense and enduring emotional coping strategies and sense-making. In this investigation of the relationship between community response and intrinsic properties of disasters, we reveal that the severity of the impact makes the deviant trajectory more likely, while the type and duration of a disaster are not associated with it.

An Empirical Analysis of Racial Categories in the Algorithmic Fairness Literature

FAccT ’23, June 2023

Amina A. Abdu, Irene V. Pasquetto, Abigail Z. Jacobs

Recent work in algorithmic fairness has highlighted the challenge of defining racial categories for the purposes of anti-discrimination. These challenges are not new but have previously fallen to the state, which enacts race through government statistics, policies, and evidentiary standards in anti-discrimination law. Drawing on the history of state race-making, we examine how longstanding questions about the nature of race and discrimination appear within the algorithmic fairness literature. Through a content analysis of 60 papers published at FAccT between 2018 and 2020, we analyze how race is conceptualized and formalized in algorithmic fairness frameworks. We note that differing notions of race are adopted inconsistently, at times even within a single analysis. We also explore the institutional influences and values associated with these choices. While we find that categories used in algorithmic fairness work often echo legal frameworks, we demonstrate that values from academic computer science play an equally important role in the construction of racial categories. Finally, we examine the reasoning behind different operationalizations of race, finding that few papers explicitly describe their choices and even fewer justify them. We argue that the construction of racial categories is a value-laden process with significant social and political consequences for the project of algorithmic fairness. The widespread lack of justification around the operationalization of race reflects institutional norms that allow these political decisions to remain obscured within the backstage of knowledge production.

The Role of Relevance in Fair Ranking

Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’23), July 2023

Aparna Balagopalan, Abigail Z. Jacobs, Asia J. Biega

Online platforms mediate access to opportunity: relevance-based rankings create and constrain options by allocating exposure to job openings and job candidates in hiring platforms, or sellers in a marketplace. In order to do so responsibly, these socially consequential systems employ various fairness measures and interventions, many of which seek to allocate exposure based on worthiness. Because these constructs are typically not directly observable, platforms must instead resort to using proxy scores such as relevance and infer them from behavioral signals such as searcher clicks. Yet, it remains an open question whether relevance fulfills its role as such a worthiness score in high-stakes fair rankings. In this paper, we combine perspectives and tools from the social sciences, information retrieval, and fairness in machine learning to derive a set of desired criteria that relevance scores should satisfy in order to meaningfully guide fairness interventions. We then empirically show that not all of these criteria are met in a case study of relevance inferred from biased user click data. We assess the impact of these violations on the estimated system fairness and analyze whether existing fairness interventions may mitigate the identified issues. Our analyses and results surface the pressing need for new approaches to relevance collection and generation that are suitable for use in fair ranking.

High-Effort Crowds: Limited Liability via Tournaments

WWW '23: Proceedings of the ACM Web Conference 2023, April 2023

Yichi Zhang, Grant Schoenebeck

We consider the crowdsourcing setting where, in response to the assigned tasks, agents strategically decide both how much effort to exert (from a continuum) and whether to manipulate their reports. The goal is to design payment mechanisms that (1) satisfy limited liability (all payments are non-negative), (2) reduce the principal’s cost of budget, (3) incentivize effort and (4) incentivize truthful responses. In our framework, the payment mechanism composes a performance measurement, which noisily evaluates agents’ effort based on their reports, and a payment function, which converts the scores output by the performance measurement to payments.

Previous literature suggests applying a peer prediction mechanism combined with a linear payment function. This method can achieve either (1), (3) and (4), or (2), (3) and (4) in the binary effort setting. In this paper, we suggest using a rank-order payment function (tournament). Assuming Gaussian noise, we analytically optimize the rank-order payment function, and identify a sufficient statistic, sensitivity, which serves as a metric for optimizing the performance measurements. This helps us obtain (1), (2) and (3) simultaneously. Additionally, we show that adding noise to agents’ scores can preserve the truthfulness of the performance measurements under the non-linear tournament, which gives us all four objectives. Our real-data estimated agent-based model experiments show that our method can greatly reduce the payment of effort elicitation while preserving the truthfulness of the performance measurement. In addition, we empirically evaluate several commonly used performance measurements in terms of their sensitivities and strategic robustness.

Multitask Peer Prediction With Task-dependent Strategies

WWW '23: Proceedings of the ACM Web Conference 2023, April 2023

Yichi Zhang, Grant Schoenebeck

Peer prediction aims to incentivize truthful reports from agents whose reports cannot be assessed with any objective ground truth information. In the multi-task setting where each agent is asked multiple questions, a sequence of mechanisms has been proposed that are truthful (truth-telling is guaranteed to be an equilibrium) or, even better, informed truthful (truth-telling is guaranteed to be one of the best-paid equilibria). However, these guarantees assume agents’ strategies are restricted to be task-independent: an agent’s report on a task is not affected by her information about other tasks. We provide the first discussion on how to design (informed) truthful mechanisms for task-dependent strategies, which allow an agent to report based on all her information on the assigned tasks. We call such stronger mechanisms (informed) omni-truthful. In particular, we propose the joint-disjoint task framework, a new paradigm which builds upon the previous penalty-bonus task framework. First, we show a natural reduction from mechanisms in the penalty-bonus task framework to mechanisms in the joint-disjoint task framework that maps every truthful mechanism to an omni-truthful mechanism. Such a reduction is non-trivial as we show that current penalty-bonus task mechanisms are not, in general, omni-truthful. Second, for a stronger truthful guarantee, we design the matching agreement (MA) mechanism which is informed omni-truthful. Finally, for the MA mechanism in the detail-free setting where no prior knowledge is assumed, we show how many tasks are required to (approximately) retain the truthful guarantees.

Bias as Boundary Object: Unpacking The Politics Of An Austerity Algorithm Using Bias Frameworks

FAccT '23: Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency, June 2023

Gabriel Grill, Fabian Fischen, Florian Cech

Whether bias is an appropriate lens for analysis and critique remains a subject of debate among scholars. This paper contributes to this conversation by unpacking the use of bias in a critical analysis of a controversial austerity algorithm introduced by the Austrian public employment service in 2018. It was envisioned to classify the unemployed into three risk categories based on predicted prospects for re-employment. The system promised to increase efficiency and effectiveness of counseling while objectifying a new austerity support measure allocation scheme. This approach was intended to cut spending for those deemed at highest risk of long-term unemployment. Our in-depth analysis, based on internal documentation not available to the public, systematically traces and categorizes various problematic biases to illustrate harms to job seekers and challenge promises used to justify the adoption of the system. The classification is guided by a long-established bias framework for computer systems developed by Friedman and Nissenbaum, which provides three sensitizing basic categories. We identified in our analysis "technical biases," like issues around measurement, rigidity, and coarseness of variables, "emergent biases," such as disruptive events that change the labor market, and, finally, "preexisting biases," like the use of variables that act as proxies for inequality. Grounded in our case study, we argue that articulated biases can be strategically used as boundary objects to enable different actors to critically debate and challenge problematic systems without prior consensus building. We unpack benefits and risks of using bias classification frameworks to guide analysis. They have recently received increased scholarly attention and thereby may influence the identification and construction of biases. By comparing four bias frameworks and drawing on our case study, we illustrate how they are political by prioritizing certain aspects in analysis while disregarding others. Furthermore, we discuss how they vary in their granularity and how this can influence analysis. We also problematize how these frameworks tend to favor explanations for bias that center the algorithm instead of social structures. We discuss several recommendations to make bias analyses more emancipatory, arguing that biases should be seen as starting points for reflection on harmful impacts, questioning the framing imposed by the imagined "unbiased" center that the bias is supposed to distort, and seeking out deeper explanations and histories that also center bigger social structures, power dynamics, and marginalized perspectives. Finally, we reflect on the risk that these frameworks may stabilize problematic notions of bias, for example, when they become a standard or enshrined in law.

When Do Annotator Demographics Matter? Measuring the Influence of Annotator Demographics with the POPQUORN Dataset

Proceedings of the 17th Linguistic Annotation Workshop (LAW-XVII) at ACL, June 2023

Jiaxin Pei, David Jurgens

Annotators are not fungible. Their demographics, life experiences, and backgrounds all contribute to how they label data. However, NLP has only recently considered how annotator identity might influence their decisions. Here, we present POPQUORN (the Potato-Prolific dataset for Question-Answering, Offensiveness, text Rewriting and politeness rating with demographic Nuance). POPQUORN contains 45,000 annotations from 1,484 annotators, drawn from a sample representative of the US population in sex, age, and race. Through a series of analyses, we show that annotators’ background plays a significant role in their judgments. Further, our work shows that backgrounds not previously considered in NLP (e.g., education) are meaningful and should be considered. Our study suggests that understanding the background of annotators and collecting labels from a demographically balanced pool of crowd workers is important to reduce the bias of datasets. The dataset, annotator background, and annotation interface are available at https://github.com/Jiaxin-Pei/potato-prolific-dataset.

The Stability of Cable and Broadcast News Intermedia Agenda Setting Across the COVID-19 Issue Attention Cycle

Political Communication, June 2023

Ceren Budak, Natalie Jomini Stroud, Ashley Muddiman, Caroline C. Murray, Yujin Kim

In today’s fragmented media environment, it is unclear whether the correspondence between media agendas that characterizes intermedia agenda setting persists. Through a combination of manual and computerized content analysis of 486,068 paragraphs of COVID-19 coverage across 4,589 cable and broadcast news transcripts, we analyze second- and third-level attribute agenda setting, both in terms of central themes and aspects. Through the lens of the issue attention cycle, we assess whether relationships among media agendas change over time. The results show that even in a fragmented media environment, there is considerable evidence of intermedia agenda setting. The attribute agendas were largely similar across outlets despite the similarity slightly decreasing over time. The findings suggest that there was only modest evidence for the prominent perception of fragmented coverage for cable and broadcast news networks’ attribute agendas concerning the COVID-19 pandemic.

Envisioning Equitable Speech Technologies for Black Older Adults

FAccT '23: Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency, June 2023

Courtney Heldreth, Robin Brewer, Christina Harrington

There is increasing concern that how researchers currently define and measure fairness is inadequate. Recent calls push to move beyond traditional concepts of fairness and consider related constructs through qualitative and community-based approaches, particularly for underrepresented communities most at risk of AI harm. In one such context, previous research has identified that voice technologies are unfair due to racial and age disparities. This paper uses voice technologies as a case study to unpack how Black older adults value and envision fair and equitable AI systems. We conducted design workshops and interviews with 16 Black older adults, exploring how participants envisioned voice technologies that better understand cultural context and mitigate cultural dissonance. Our findings identify tensions between what it means to have fair, inclusive, and representative voice technologies. This research raises questions about how and whether researchers can model cultural representation with large language models.

Effect-Invariant Mechanisms for Policy Generalization

arXiv, June 2023

Sorawit Saengkyongam, Niklas Pfister, Predrag Klasnja, Susan Murphy, Jonas Peters

Policy learning is an important component of many real-world learning systems. A major challenge in policy learning is how to adapt efficiently to unseen environments or tasks. Recently, it has been suggested to exploit invariant conditional distributions to learn models that generalize better to unseen environments. However, assuming invariance of entire conditional distributions (which we call full invariance) may be too strong of an assumption in practice. In this paper, we introduce a relaxation of full invariance called effect-invariance (e-invariance for short) and prove that it is sufficient, under suitable assumptions, for zero-shot policy generalization. We also discuss an extension that exploits e-invariance when we have a small sample from the test environment, enabling few-shot policy generalization. Our work does not assume an underlying causal graph or that the data are generated by a structural causal model; instead, we develop testing procedures to test e-invariance directly from data. We present empirical results using simulated data and a mobile health intervention dataset to demonstrate the effectiveness of our approach.

XRSpotlight: Example-based Programming of XR Interactions using a Rule-based Approach

Proceedings of the ACM on Human-Computer Interaction, Volume 7, Issue EICS, June 2023

Vittoria Frau, Lucio Davide Spano, Valentino Artizzu, Michael Nebeling

Research on enabling novice AR/VR developers has emphasized the need to lower the technical barriers to entry. This is often achieved by providing new authoring tools that provide simpler means to implement XR interactions through abstraction. However, novices are then bound by the ceiling of each tool and may not form the correct mental model of how interactions are implemented. We present XRSpotlight, a system that supports novices by curating a list of the XR interactions defined in a Unity scene and presenting them as rules in natural language. Our approach is based on a model abstraction that unifies existing XR toolkit implementations. Using our model, XRSpotlight can find incomplete specifications of interactions, suggest similar interactions, and copy-paste interactions from examples using different toolkits. We assess the validity of our model with professional VR developers and demonstrate that XRSpotlight helps novices understand how XR interactions are implemented in examples and apply this knowledge in their projects.

Ethno-biomathematics: A Decolonial Approach to Mathematics at the Intersection of Human and Nonhuman Design

Ubiratan D’Ambrosio and Mathematics Education, part of the Advances in Mathematics Education book series, June 2023

Ron Eglash

