Digital Detox #2: Ethics and Data Privacy

Reading Time: 9 minutes

Okay, it’s time to roll up our sleeves for the big guns: ethics and data.

It probably won’t surprise you to hear that much of what you engage with online, from Google to Facebook and back again, is mining your data and using analytics to tailor advertising for your personal eyeballs. Yes, your phone is listening to you; yes, your browser is tracking you; yes, your store loyalty card follows your buying habits like they’re the most fascinating telenovela ever made. You know all this. And you know that every week we seem to have a major data breach at some major corporation, and you know that Facebook is peddling disinformation, and you know that every online service you use for free is selling your data to someone else for big money.

And yet, what exactly are you supposed to do? I recently left Facebook, partly because I find its ethics so upsetting and partly because so too are the politics of my distant relatives, and it’s actually really hard to function without Facebook in a small centre, where very few businesses have any web presence beyond a Facebook page. This, of course, is Facebook’s intention: it wants to be your walled garden, the only path to connecting online. And I guess you could live without store loyalty cards, but it’s a hard pill to swallow when you have to pay higher prices for the privilege, and this price disparity also targets people who can’t afford to make a different choice. You could never use a Google product, unless you have kids in the K-12 system and you’ve already signed off on their use of Chromebooks and G Suite.

These systems were not designed to be opted-out from.

I’m not interested in answers to the question of what we do to protect our data that rely on unplugging and disconnecting, because I don’t think those things are realistic expectations. And I struggle with the expectation that the end user is exclusively responsible for their own data privacy when Terms of Service agreements are inscrutable unless you have a law degree and appropriate corporate regulation is so much more effective (like, what is government for, exactly?). That’s not to say it’s not wise to be as data-aware as you can be (check out this phenomenal Data Detox that walks you through the steps of being more mindful about who knows what about you), because knowledge is power and it’s important to know what data you’re giving away, as best you can. But I am more interested in talking about the larger ethics of the companies involved in data collection, and by extension the choices we do make around data, particularly when we engage with the data of other people. And I want to talk about the data mining we may be less aware of: EdTech’s penchant for swiping data is every bit as menacing as Facebook’s and Google’s.

“But students don’t care about privacy!”

Here in BC, we have some of the strictest privacy laws in North America, and there’s no question that they limit some of what we can do with technology in the classroom. The shorthand for our privacy laws is basically that student data (names, grades, student numbers, assignments) cannot be stored on servers subject to the PATRIOT Act (e.g., servers located in the USA), which reduces the companies we can engage with. There are workarounds (we can have students sign waivers, for example), and many companies have servers in Canada now to accommodate this need. And BCCampus leads the charge in making sure we have, where possible, made-in-BC solutions to many problems. But I still spend a lot of time talking to people about what is and is not FIPPA-compliant… and finding out about a lot of sketchy things instructors are doing (don’t worry, I’m not outing any of you here, so let’s call it technologist-client confidentiality, but I do hope you keep reading).

In some cases, non-compliance is just because people don’t know. I’ve been teaching in BC for a decade now, and everything I know about FIPPA is self-taught; most of us don’t seem to be well-briefed on our obligations around data privacy when we are hired, and I speak from the privileged perspective of being hired directly into a full-time position. I’m certain the training for precarious and non-permanent employees is even less. In some cases, instructors have a vague idea of FIPPA, but don’t know how to find out if a tool is compliant or what questions they’re expected to ask, and so when pressed for time and without other options, they use the tool anyway. And surprisingly often, I hear from instructors that they aren’t worried about FIPPA-compliance, because students don’t care about privacy, anyway.

In the first case, I provide information; in the second, I provide alternatives; in the third case, I push back. I’ve never seen evidence to support the claim that students (or that amorphous group we call “young people”) care less about data privacy than other groups. And even if that were true, I increasingly believe that insofar as post-secondary education prepares students for life, it’s our job to make students aware of policies like FIPPA and the reasons for regulation of access to their data. Because the truth is, I don’t think any of us truly gets the implications of poor data practices until someone explains them to us. Or until it’s too late.

The Shocker: Big EdTech is Mining Your Data, Too

Okay, maybe not a shocker. But it took me a long time to accept that educational technology is more a Silicon Valley space than an academic one — in my previous life as a literature scholar, I never anticipated having strong opinions about venture capitalism, but here we are. For-profit educational technology companies can make money in a limited number of ways: they can sell you or your institutions one-shot products, they can offer a subscription model that is institution-paid or student-paid, or they can sell the data they collect. Or some combination of the above. Many of the agreements our institutions sign with these companies give explicit rights to use student data for things like “targeted marketing,” and opt-outs are complicated and Byzantine. I want to take a minute to look at some of the data practices of the best-known players in this space, and spend some time thinking about how we might choose differently when we choose technologies for ourselves (and what we ask others to opt into).

Turnitin 

If you’ve ever had more than a fifteen-minute conversation with me, you probably already know my thoughts on Turnitin; I have absolutely no chill: I loathe Turnitin. (I’m not the only one.) If you’re unfamiliar with this particular piece of tech, Turnitin is a plagiarism detection program: students opt (or in many jurisdictions, are required) to run their assignments through Turnitin to “prove” their academic integrity prior to submission. Philosophically, I think it’s a pretty flawed system: nothing says, “I value academic integrity!” like assuming students are acting in bad faith and therefore handing their intellectual property over to a for-profit business. I know there are folks who use it and appreciate what it offers, but beyond my general frustration with the idea of Turnitin as a tool, its data practices need careful consideration.

Turnitin has access to wide swaths of student data in the form of essays and assignments, which they mine in order to be able to compare submissions to their database to assess whether student work has been copied. Their business model relies on receiving student intellectual property for free — students, of course, are not compensated for providing the content for their database — and has expanded to include a Revision Assistant tool for students that is also built from this massive amount of student data. Revision Assistant is, in essence, a machine-taught tool to improve writing based on the vast swaths of student writing Turnitin can analyze. Are students fully informed about where their data is going, in this context, and who is profiting from it? Increasingly, we’re seeing student groups advocate for more transparency in the use of Turnitin, and for opt-out policies to be made more explicit. Instructors can ask to have the work their students submit deleted from the Turnitin database… but they have to know to ask. Most don’t.

Turnitin has always downplayed the data mining they do, but it is the backbone of their ability to offer their service. It’s also what makes them attractive to venture capitalists. In March, Turnitin was acquired by a VC firm for $1.75B, which gives you a sense of what all that uncompensated student intellectual property and mined data is worth. 

TopHat

Polling software is just fun. It’s also really pedagogically useful — instructors can check in on student understanding of key points or collect questions to address later in class, and quick quizzes can give students a formative opportunity to self-check. This isn’t a new desire: “clickers” for polling were the first piece of cutting-edge classroom technology I ever got the opportunity to pilot, and that was back in 2003. TopHat is software that does a little more than polling, but that is its core functionality (it also can be used to monitor attendance, which is something else I have feelings about, but I’ll spare you them today).

You may have noticed in the comments to the last post that TopHat’s predatory business practices came up: they boast a free version for instructors to use, but once students are engaged with the software it can become difficult to tell what is a free service and what is a paid one, which preys on anxious and stressed students who may then pay when they don’t actually have to.

But TopHat also gets to acquire lots and lots of student data through its classroom resources, and like any private player in this space, it is loath to disclose what it does with it. The CEO of TopHat likes to talk about how much data they have access to, and that they can drill down into it enough to analyze individual student study habits. That’s not “exciting,” that’s alarming. And Jason Rhinelander has done the work of reading through the End User License Agreement for TopHat, which includes gems like these: students cannot link to TopHat in an article critical of its use, students are responsible for any data breaches that occur, and there are no opt-outs for the collection of personal data beyond opting out of the service altogether. Yikes. What position does that put a student in if an instructor decides to make its use mandatory?

Pearson

Pearson really wants student data. Student data is the One Ring, and Pearson is Gollum. Me, I’m more like the Samwise Gamgee to your Frodo in this conversation, and we’re going to take a little stroll to Mount Doom. Watch your fingers. Have I tortured this Lord of the Rings bit enough?

Pearson is a textbook company, sure. As we talked about last day, it’s also a creator of homework systems or courseware, a layer of learning tool that gets between instructors and students (and absorbs a massive amount of data at the same time). It also owns many of the major standardized testing suites and it builds entire online degree programs. In all honesty, Pearson could feature in every single Digital Detox post and we’d never cover all the content they manage and mine. And Pearson has exclusive contracts with universities and colleges all over the world, sometimes achieved without a competitive bid process, as Pearson uses the goodwill and name recognition it developed in the textbook space to move into the big business of student data. In 2012, Pearson executives boasted that they have more access to student data in K-12 than anyone in the world.

In the higher education space, Pearson is the biggest player, and they have some incredible access to student data, including everything from financial aid applications to interim and final grades. They say they don’t sell student data, but they also publicly refused to sign the Student Privacy Pledge. And last year, the inevitable happened: a data breach, exposing data from 13,000 institutions and one million college students. The attack occurred in November of 2018, but Pearson waited to inform the FBI until the following March, and end users were not notified until August. While Pearson asserts that the breach was “limited” to first and last name, date of birth, and email address (enough to do a fair amount of damage!), it impacted data collected as early as 2001. The rollout of the disclosure (and the disappearance of the statement from their website) suggests that the top priority in this instance wasn’t ever student data, but brand management.

Academia.edu

I was so excited when I learned about Academia.edu in the first year of my professional life. I even sometimes remembered to update it! Pitched as the social network for academics, it’s a place where grad students and academics alike can maintain a profile, upload articles and conference papers, and search and follow work in related areas. And if that’s all it was, it would be a dream.

Unfortunately, like any other social media you engage with, the product is you and your data. In addition to their free service, they offer a paid premium package that acts a bit like a high school drag rag: click here to see which academics are talking about you! And predatory conferences and phoney journals troll the uploaded content on Academia.edu for people to pitch their wares to (sometimes with hilarious results: an editorial I once wrote called “The Unbearable Blind Spots of Comics Scholarship” netted me regular invitations to publish in “top” fake ophthalmology journals, because I guess bots don’t really do metaphor). Academia.edu is a for-profit company backed by venture capital, and the only thing it really has to sell is your data.

The saddest part of Academia.edu’s rise as a content repository is that most institutions now operate some kind of digital repository, often through the library, where faculty and graduate students can freely showcase their work without fear of how that work is being stripped, stored, and sold. It doesn’t have the social networking function, but it is another way of thinking about a low-effort solution for distributing research more widely. 

And then what?

So what’s the moral here? I guess it’s that I think we all need to be more careful not only with our data, but with what we ask other people to do with their own data, particularly when there’s a power imbalance. Can you really opt out of submitting your paper to Turnitin, or will your professor assume that means you’re guilty? If your doctoral supervisor is really pushing you to maximize your Academia.edu profile, are you in a position to say no? We need to have more information about what companies are doing with our data if we’re going to be able to make good decisions about their use. It’s increasingly difficult for me to suggest that anyone should trust in what a for-profit EdTech (or any tech!) company offers them. And yet, I acknowledge that it’s hard to disentangle ourselves from these systems.

I’m interested to know your thoughts. Here are today’s prompts:

  • Did you learn anything today about how data is used that will change your own practice?
  • What questions do you have about the tools you’re required to use for work or school? Does a tool being mandated change your perceptions of it?
  • What do you do to protect your data?

And of course, please comment on these ideas or anything else that got you thinking today. 

21 Responses

  • I hadn’t really thought about tools like TopHat mining data – which seems a bit of an oversight now I consider it!

    If using a tool is compulsory then I find it disquieting that it will be gathering data on the students (even to the point of the study habits of an individual)

    • Hi Donna 🙂
      Most of the cloud-based services that are used in your institution collect data (actually, so do the ones on campus). Most software applications have always had some kind of audit trail for debugging and support, for example. The key thing is who can access this data and what’s it used for? I used to spend a significant portion of my time poring over all the various contracts in fine detail to make sure we limited what data could be used, what for and how long it was kept for. I spent a lot of quality time with Legal, Info Sec, and Data Protection 🙂 It’s so important that Unis have people who are employed specifically to do this work. But I also think there’s an issue in here. I was able to negotiate the kinds of terms I wanted because I was essentially bargaining against a contract with a big name institution. Would it be the same in other smaller, less “prestigious” institutions, or even with great internal expertise will they get ridden roughshod over? I’d be interested in any experiences others have had here….

  • I guess it’s nice to confirm my phone is listening to me and I’m not losing my mind?

    What comes up for me a lot is how little I care for my own data and how much I care to protect my child’s data until he can choose. I also think our learning on this is still very much evolving and how hard it is to pry info off the internets/databases when we make realisations. I know quite a few people who have decided to leave Facebook but the internet is forever so is hard stopping the stream enough?

    • Oof, this hits home! You know my own social media practices for Groot are wildly different from my own. Again, this comes down to agency, power, and choice, which is why I want so much for us to be more careful with how we mandate the way others use technology that we know wants their data. I’m coming to see it as a fiduciary duty, frankly.

  • It’s a small (but important) thing – at my institution, we have a policy forbidding the use of tools that charge students for things that are used for assessment. Which means the university has to provide licenses for tools that are necessary. Which means we need to set up contracts with each vendor. Which means we, as a university, get the opportunity to formalize privacy, data provenance, data retention, etc. with each vendor – basically forcing them to go through a pretty rigorous review process before a contract can be signed. This includes Top Hat (we were the first institution to set up a campus license for Top Hat so it’s available at no cost to students), Pearson, etc.

    It can be a huge pain in the ass for us and the vendor, but it means for core tools that students and instructors don’t have to worry about the privacy aspects of these tools as much. (of course, they should still be aware and be mindful of data and privacy, but the university is doing the leg work for them on this front).

    • Small but critical thing! I love to hear that. What is very clear from that deep-dive article from Politico (linked above) is that the policies Pearson has around privacy vary wildly by how much push-back they get. Some institutions are able to safeguard quite a lot more data than others. I wonder, too, what the impact of size of institution is here. I’m also interested in Pearson’s intimacy with for-profit colleges in this vein. To what extent are harms redoubled if the institution isn’t watching out?

      Wouldn’t it be cool to see one of those college rankings that listed institutions by data protection policies?!

  • My key takeaways from today are two fold:
    1. Sam is the acknowledged hero and who really wants to be Frodo? Even Tolkien himself said Sam was the hero. So for the author to say she is the hero, it tells me that this article is well thought out and researched. But seriously, I would rather be Fangorn or any Ent before Frodo. Personally I think many of us take the Entish approach to things and basically stay out of what is going on, even when it is affecting us greatly.
    2. As soon as a tool is mandated, I instantly take a step back and view it more critically. It might be my cultural background but I bristle whenever anything is mandated out of the blue without what I consider transparency in the decision making. I try to avoid solely using any proprietary system in my teaching. Giving students choice allows them to find the path of learning that works best for them. This ties into my philosophy of education where I feel people are responsible for their own learning. Plus I basically don’t trust people or organizations whose main motive is financial profit.

    • It is so satisfying when a joke lands! Thank you for that. 🙂

      I am pleased to hear that a mandate gives you pause — I’m not sure it is always thus, and I worry about the legitimacy that is given to these tools when they are mandated. I mean, it’s literal legitimacy of course, but I also think it has a tendency to shut down further questions, in the, “Well, they wouldn’t tell us it was okay if it wasn’t” school.

  • Great post, Brenna. I really like the metaphor of Facebook as a “walled garden.” I want to leave too, but being physically so distant from family and friends, it’s difficult to be virtually distant as well. I often feel foolish because I talk a big game to students and colleagues about ensuring privacy, but I’ve been fairly lackadaisical with my own personal information.

    Have you read The Age of Surveillance Capitalism? She takes the “If you’re not the customer, you’re the product” adage even further where you’re not even the product, just the discarded husk of your extracted data.

    And thanks for the plug for library Institutional Repositories!

    • I have not read that but it sounds like I should! I’ve been thinking about ways to extend the Detox through the academic year, and drawing these conversations into a kind of reading circle for our Community of Practice series that starts in February. So I’m grateful for the suggestion.

      And you know I <3 the librarians and all they do. Y'all are on the ramparts for this conversation.

  • Thank you for this great post. It has me thinking of ever-changing ethical issues related to how academics acquire, store and disseminate research data—ethical guidelines and approval processes—and how these need to be considered and reconsidered in ongoing ways as they relate to classroom teaching also. I’m also reminded of this article on the ‘capitalist creep on campus’ and see this as another front: https://theconversation.com/capitalist-creep-on-campus-the-largest-quietest-privatisation-in-uk-history-its-why-were-striking-126554 Loving this detox!

  • At the WCOL conference this fall, I attended a presentation about the “parent” companies that own EdTech tools, behind the scenes, and the implication of such.

    For example, EdTech tool X used for quizzes is owned by company A, who is owned by company B, who is owned by Google, Amazon or Microsoft (with infinite combinations of this chain). It seemed like almost every tool discussed, open source or otherwise, at some point had data sent to one of these companies through this chain of ownership. I have seen this specifically with a tool that we considered once. There is some digging to do, when looking at tools.

    Another issue is that some companies might even operate on Canadian servers (to avoid the Patriot Act) but you find out that there is a back up server in the USA which is subject to the law. The layers can be thick…

    This is a question/thought that I have when looking at using EdTech tools:

    Do we have a specific problem/need that this tool can help with, or are we searching for a problem as an excuse to use a shiny, new tool!??

    If it is the latter, stop, run and don’t look back.

    • When I was more active in the book blogosphere, we always had this push to boycott Amazon — for good reasons! — but almost all of bookish internet interacts, at the very least, with Amazon’s servers. That’s why I’m so resistant to the idea of blaming end users. We so rarely have all the information.

      And I agree with you re: tools in search of an application. Much of EdTech would be weeded out with your rule!

  • Really great post!! It was interesting to consider how this tied into the previous post concerning the mandatory use of online assignment programs/websites. I hadn’t considered that being a large source of data mining. Thanks for bringing these things to light – it can be easy to ignore these topics since they’re difficult to personally address. Talking about them in a virtual “group setting” like this blog is helpful. 🙂
