405: “Is technology killing privacy?” with Florian Schaub
Information Changes Everything
News and research from the world of information science
Presented by the University of Michigan School of Information (UMSI)
Episode
405
Released
June 18, 2024
Recorded
2021
Guests
Florian Schaub, associate professor at the University of Michigan School of Information and the U-M College of Engineering
Summary
In this episode of “Information Changes Everything,” we hear from Florian Schaub, an associate professor at the University of Michigan School of Information and the U-M College of Engineering. Schaub discusses the intricate relationship between privacy and technology, focusing on how companies like Google and Facebook track user actions and the implications of this data collection on our daily lives. He also explores potential solutions to better inform and protect consumers.
Resources and links mentioned
- Full video of Florian Schaub’s talk on YouTube
- Charles Severance: “Anyone can learn Python and also be a cab driver, so why not?”
- In due course: How an experimental intro class shaped UMSI students’ paths
- The New York Times is fighting off Wordle look-alikes with copyright takedown notices | The Associated Press
- umsi.info/events
- UMSI on social media: X | Instagram | LinkedIn | Facebook
Reach out to us at [email protected].
Timestamps
Intro (0:00)
Information news from UMSI (1:18)
Hear excerpts from Florian Schaub’s 2021 UMSI Mini Class, “Is technology killing privacy?” (2:20)
Next time: Enhancing remote work sustainability through human-centered data science with Xuan Lu (21:54)
Outro (22:34)
Subscribe
Subscribe to “Information Changes Everything” on your favorite podcast platform for more intriguing discussions and expert insights.
About us
The “Information Changes Everything” podcast is a service of the University of Michigan School of Information, leaders and best in research and education for applied data science, information analysis, user experience, data analytics, digital curation, libraries, health informatics and the full field of information science. Visit us at si.umich.edu.
Questions or comments
If you have questions, comments, or topics you'd like us to cover, please reach out to us at [email protected].
Florian Schaub (00:00):
Is technology killing privacy? I would say only if we let it. There's a lot we can do and there's a lot we can do both in terms of how we design technology as well as what public policy requires from companies to make sure privacy and technology can coexist.
Kate Atkins, host (00:17):
That was Florian Schaub, an associate professor at UMSI and at the U-M College of Engineering. And this is “Information Changes Everything,” where we put the spotlight on news and research from the world of information science. You're going to hear from experts, students, researchers, and other people making a real difference. As always, we're presented by the University of Michigan School of Information, UMSI for short. Learn more about us at si.umich.edu. I'm your host, Kate Atkins. Today we'll hear more from Florian Schaub as he looks at the relationship between privacy and technology. From social media oversharing to the invisible data trails we leave online, this talk reveals how companies like Google and Facebook track our actions and how this data collection impacts our daily lives. Schaub also addresses potential solutions like innovative approaches to better inform and protect consumers. Before we jump in, a few other people and projects that you should know about.
(01:24):
More than 3 million people worldwide have taken the Python for Everybody course through online learning platform Coursera. The course was created by UMSI clinical professor Charles Severance, who has worked for decades to make programming easy to learn. | Master's students at UMSI have developed digital solutions to enhance surgical training through an innovative collaboration with the U-M Clinical Simulation Center. Some say the project significantly impacted their career paths. | Finally, not satisfied with just one Wordle a day? Many are feeding their puzzle addiction with Wordle clones, but now AP News says The New York Times is sending takedown notices to Wordle-like sites, claiming copyright infringement. For more on all of these stories, check out si.umich.edu or click the link in our show notes. Now back to Florian Schaub.
Florian Schaub (02:24):
When we think about privacy and technology, privacy and technology have really been intertwined for a long time. Samuel Warren and Louis Brandeis, who would end up becoming a Supreme Court justice, wrote a very influential law review article in 1890, which established the right to be let alone. And the technology that spurred them to think about privacy and the need for privacy protections was the Kodak camera. First time that you could actually take snap photos of people as they walked past rather than people having to pose for longer times to actually get a sharp picture. In the sixties, Alan Westin started formulating what we now call information privacy, and this notion that privacy is about individuals controlling and managing and being able to delete information about themselves and deciding who has that information and his thinking was spurred by the development of the mainframe computers.
(03:20):
So while, when we talk about privacy today, we are often thinking of social media and other such technologies, really technology has always been driving privacy, and privacy and technology have always been in some kind of tension. If we look at privacy, there are kinds of tensions that we see. On the one hand, we have government interests in terms of safety, law enforcement, national security. We have business interests: companies are trying to mine data about individuals to better sell products to them. We also have just personal interests in sharing data, right? And we undermine our own privacy sometimes because we want to socialize, because using the technology is convenient, because we want to present ourselves online in certain ways, and then technological advancement keeps pushing and accelerating these tensions further and further. When we think about privacy, a lot of people think about social media and how people overshare on social media, but it's often not just what you post or share yourself, it's also about the traces you leave online or leave in general when you use technology.
(04:25):
And that starts with very basic and benign things like websites wanting to better understand where are their visitors coming from, what are they doing on the website, where are maybe fall off points where people fail to complete the transaction. However, one of the problems when it comes to privacy is that these analytic services are actually being provided by two companies primarily, namely Google and Facebook. This is a great study by Princeton researchers. In 2016, they looked at the top 1 million websites and they found that 75% of them had Google Analytics tracking code on them and 25% had Facebook analytics tracking code on them. And while it's convenient for website providers to use some of these, the services that Google and Facebook offer here, it also means that Google and Facebook get a pretty comprehensive picture of where you're going when you're going from one website to the other.
(05:19):
And the same is true when you use mobile apps. Many apps have tracking code embedded as an additional revenue source for the app providers, and that's not just the case for free apps. Even paid apps often have the same tracking code and are still tracking you in the same way. This type of tracking, consumer tracking, doesn't just occur online or on your phone. It also occurs in stores. You walk around and you go shopping. There might actually be Bluetooth beacons installed in the store that keep track of which aisles you are going down, what you're looking at. Many of you might have loyalty cards for the grocery stores you go to. Those are nice because they give you discounts, but they also give the store permission to basically keep track of your shopping profiles and better understand who you are as a consumer. And as we move towards smart homes and smart cities, the potential for data collection gets more and more.
(06:14):
The problem is that often while we might think we are using technology, we actually end up being the product. For example, we see this when you actually want to buy ads. So you can tailor based on location, based on age, gender, but then there are also all kinds of detailed targeting options you can set. So I want to target households who are in the top 10% of income or I want to target people who are currently living away from family or, maybe closer to all our hearts, people who are Michigan fans. All of this information that's being collected allows Facebook and other companies to create very detailed profiles of who you are and then turn around and sell access to those profiles to marketers and companies. Beyond these direct interactions, there's also a whole economy of data brokers, which basically exist for the whole purpose of monetizing data
(07:05):
they can get their hands on. For example, they sell data enrichment services. So if a company has, let's say, your phone number or your email address but doesn't know anything else about you, they can go to a data broker and use the data enrichment service and say like, Hey, what do you know about this email address? Or what do you know about this phone number? And that way they possibly get your name, but more interesting is to then get your actual address, understand where you live, understand whether you may be in the market for a car right now, whether you are pregnant, all kinds of things that are being kept track of and that then might make it easier to target advertising to you. Shoshana Zuboff calls this surveillance capitalism. So these companies, while they create services and technologies that are seemingly technologies, really exist to monetize data and do that at a large scale.
(07:57):
One question we can ask is, well, what's the problem with that? We live in a capitalist society. I actually like it if I see advertisement that's relevant to me. What are the problems here? There are many, but it starts with something simple such as unwanted disclosure. If your information is out there, companies for example can track your location. That information can also be sold and may be available to people you might not know or might not want to have access to that data. That data can be used for surveillance by the government, for example, to surveil and arrest immigrants or by police to monitor and try to identify protesters. It can also be exploited to exert political influence. So some of you might've heard about the Cambridge Analytica Facebook scandal a couple of years ago where basically this marketing company, Cambridge Analytica, gained access to 50 million Facebook profiles and the connections and then used this information in political campaigns to basically target political advertising and different types of political messages depending on who you are.
(09:01):
Data is also used for discrimination based on gender or age. People might see different ads and might be offered different prices. It also leads to racism. There are really interesting studies that show that if you type a Black-sounding name into a search engine, you're much more likely to get offers for bail bonds and other services compared to a white-sounding name. And we also see that some of these privacy harms particularly affect lower income people. Virginia Eubanks in her book Automating Inequality talks about how people who are reliant on government benefits, for example, and similar services are made to reveal much more about their personal affairs and effects than other people and are basically subject to continuous scrutiny and monitoring. I think one of the really sometimes difficult to grasp privacy harms is the absence of opportunity. You might never know that you didn't get this job interview because somehow a social media post from 10 years ago suggested that you might be a flaky employee, because how the data flows and how it's used often makes it very opaque when decisions are being made about you.
(10:12):
And that's one of the key issues. When we look at how consumers think about this, in general, people are concerned about data use by companies and by governments, but at the same time, they also feel like they don't have control over how companies use the data, so there's concern but they don't know what to do about it. And sometimes this is called the privacy paradox in the literature. So this notion that people say they're concerned about privacy, but then they still use social media and they post on social media and buy smart speakers and other things. I like to think of it more as the fallacy of revealed preferences. So in economics, revealed preferences is this notion that we can best measure preferences, like true preferences, by looking at people's behaviors, and there's some value to that. But when it comes to privacy, it's unfortunately not quite as easy.
(11:01):
In my own research, we've been studying exposure sensitive populations in particular. So these are populations that have a heightened interest in protecting their privacy, and we try to understand are they able to protect their privacy? And what we find is that that's often not the case. So we did research with undocumented immigrants and found that they use technologies the same way as all of us do. They're aware of some of the risks, but really struggle to protect their privacy as they use technologies. More recently, we conducted a study with sex workers in Europe where we found that while sex workers embrace technology for their work, they also use technology for personal reasons, but really struggled to keep their work identity and their personal identity separate. It's things like social media services recommending clients as potential contacts, revealing clear names to clients or revealing their work identity to their friends and family.
(12:01):
In other work, we've been talking to older adults about their privacy and security needs and behaviors. And what we find is that concerns about privacy sometimes lead to a fear of adopting technology. People are interested in using technology but are worried about the potential privacy implications and security implications and don't do it because of that. But at the same time, we also see increased risk taking because the way they think about risks might not be aligned with the actual risks that exist. And we find the same thing at the other end of the spectrum looking at young children. So I'm working with child psychologists and pediatric researchers here at Michigan. We've been trying to understand how children think about data flows and what we see is that they actually have a really good understanding of how recommendations work. Why am I seeing a certain video on YouTube for example, or where do these Netflix recommendations come from?
(12:56):
But what they struggle with is that many of their explanations are based on what they see right in front of them, visual cues in the interface. But often these things happen invisibly in the background. These invisible practices are often unclear to children or difficult for them to understand. So there are lots of reasons why people struggle to protect their privacy. And what it comes down to is that we as humans are just not very good at making consistent decisions, particularly when we face uncertainty, uncertainty both in terms of, well, what are actually the data practices that a company engages in, but also what are the risks associated with that? We need context to make decisions and in technological contexts, sometimes we might be lacking the right information to understand what constitutes the context and how that works. And at the same time, there are lots of psychological tricks that can be exploited to get us to reveal more information than we might want to.
(13:49):
The first aspect, when it comes to uncertainty, is that there are actually documents where you can learn about the data practices of a company: privacy policies. And usually at this point, if you were in person, I would ask like, oh, how many of you have read privacy policies in the last month? Maybe one or two hands go up, maybe a few more depending on what the audience is. But most people don't read privacy policies. Now I would actually argue that's a good thing. Privacy policies should be long and they should probably be even longer. They should look more like SEC filings where companies in detail need to describe their practices and what they are doing with the data of the customers. But we need to stop pretending that these are documents for consumers, right? No one reads these privacy policies and right now they don't actually serve the consumers well and they don't serve regulators or oversight authorities well either because they're way too vague to really understand what's going on.
(14:40):
And all of this leads to what Nora Draper and Joe Turow called digital resignation. So people are concerned about their privacy, but they also want to enjoy the technology. So what they end up doing is they kind of close their eyes, cross their fingers and hope for the best and hope that ultimately the companies are looking out for their interests, which unfortunately is not really the case. In addition to that, privacy behavior can also be easily swayed, and with simple tricks, people can be made to disclose much more information than they might want to. And more recently, this whole field has been associated with the term dark patterns, basically design patterns in interfaces and interactions that are meant to trick people or sway people into doing things that might go against their actual interests. So is technology killing privacy? What we've talked about so far doesn't look too promising, but there's some hope.
(15:36):
First, we are actually hearing much more about privacy issues and they've really come into the spotlight, I'd say in the last five years. We're also seeing much more regulation focused on privacy. Europe's General Data Protection Regulation was the most heavily lobbied piece of legislation in the history of the European Union, and it still ended up being a very pro-consumer law that actually increased and improved privacy protections for Europeans but, because of its extraterritorial applicability, also around the world. In the US, we saw California adopt the first comprehensive privacy law in 2018, and it just went into effect last year. And as part of the last election in 2020, Californians adopted a proposition called the California Privacy Rights Act that further increases privacy protections in California. Brazil and other countries have been adopting comprehensive privacy laws. Maybe I should explain what that means. It means the comprehensive privacy law is not just constrained to a specific context such as HIPAA for healthcare on the federal level or FERPA for education data, but instead just applies to the processing of personally identifiable information regardless of what the context or the domain is in which that data is being processed.
(16:59):
So a lot of our work has looked at, well, what can we actually do here? There are three important things that help people make better decisions when it comes to privacy and make it easier for people to manage their privacy. One is the information we give to people has to be relevant to what they're currently doing. When you go to a news website, you might want to know, well, are you tracking what I'm reading? Who is the data being shared with? Who else has access to that? Finding that information in the privacy policy is actually very, very difficult because there are all kinds of other things in the privacy policy. Basically any potential interaction you might ever have with a company, it's going to be listed in the privacy policy. So it becomes really difficult to really understand what's happening in a particular moment. What we've shown in our work is that actually one way forward is to focus on unexpected or surprising practices.
(17:45):
So in this particular study, we looked at fitness trackers and we showed our participants a fitness tracker, a Fitbit watch, and then gave them a list of data practices. Some were real, some were fake, and we asked them, do you think this product engages in this practice? And by doing this, we found for some practices, such as the collection of step data, almost every one of our participants expected that that's the case. That's why you would buy a fitness tracker, right? Like you want to count your steps. However, then for other aspects like the collection of location data, only 31% of participants expected this, likely because this data collection doesn't actually happen on the watch. It happens in the mobile app, the mobile companion app that comes with the watch and lives on the phone, right? Now, if we're thinking about, well, what are the things we need to most urgently inform people about, the location collection is probably the practice we might want to highlight or potentially also ask for opt-in consent instead of just collecting the data by default.
(18:49):
It's also important that information is understandable. I keep harping on the privacy policies and how they're difficult to understand or too long, but they don't have to be. The Gramm-Leach-Bliley Act, for example, which regulates the financial industry, requires that companies provide concise privacy notices that follow this two page format. And while there are issues with this format, for sure, it's very concise. It's down to the point and it makes information and choices available to consumers in a consistent way. We're thinking a lot about how we can better integrate privacy information and controls directly into user experiences rather than having them hidden away in the privacy policy or overwhelming people with lots of text or choices. And we kind of think along four dimensions when we think about the design of privacy interfaces. It's when do you show it to your users?
(19:39):
What's the channel? Can you actually show it on the same device? Usually that's the case for a phone or a computer, but maybe not so useful for a smart thermostat, right? You don't want to read a privacy notice on your thermostat. So in those cases, we might need secondary channels like sending an email or leveraging a mobile app that might exist on a phone. And then how can we really engage people with this information in a meaningful way? And beyond trying to establish these best practices and guidelines and communicating those to industry, we are also thinking about, well, how can we help consumers in the moment? So by reducing the threshold for engaging with privacy settings and making otherwise invisible information salient and visible, we can get people's attention and can get them to think about and reflect on their privacy settings.
(20:28):
We can go even a bit further and analyze the app code to figure out what the application might be using this information for. In other work, we analyze privacy policies using deep learning techniques and natural language processing techniques to build a chat bot. So Pribot, you can go to Pribot.org and try this out. You can ask questions about a company, and then the system answers based on information from the privacy policy for the company, and sometimes it can directly give short answers or it gives excerpts from the privacy policy. In similar work, we looked at links in privacy policies to try to identify potential privacy controls and opt-out links, and we extracted them from the privacy policy and built a browser extension that you can install called Optout Easy. And then when you go to a website, it shows you all the available opt-outs in an easily accessible interface without having to look for them in the privacy policy. So with that, I'm at the end, and I started with this question, is technology killing privacy? I would say only if we let it. There's a lot we can do and there's a lot we can do both in terms of how we design technology as well as what public policy requires from companies to make sure privacy and technology can coexist.
Kate Atkins, host (21:45):
You can watch the full talk by clicking the link in our show notes. To learn more about upcoming events like this, visit us at umsi.info/events and tune in next time to hear from UMSI research fellow Xuan Lu during a 2023 talk at UMSI focused on sustainable remote work.
Xuan Lu (22:12):
Since the shift to remote work, emotional issues have become more significant due to things like isolation. However, at virtual workplaces without face-to-face interactions, emotions are much harder to observe. However, as remote workers are using online platforms to communicate in their daily work, is it feasible to use the platforms as a channel to monitor their emotions at scale?
Kate Atkins, host (22:41):
That's in our next episode. Before we go, did you know that we post the latest on information science, news and research every day? Keep your finger on the pulse by following UMSI on X, Instagram, LinkedIn, and Facebook. Search for University of Michigan School of Information, or click the links in our show notes. The University of Michigan School of Information creates and shares knowledge so that people like you will use information with technology to build a better world. Don't forget to subscribe to “Information Changes Everything” on your favorite podcasting platform, and if you've got questions, comments, or episode ideas, send us an email at [email protected]. From all of us at the University of Michigan School of Information, thanks for listening.