Whither Comes the Data: Current Uses of AI and Data Set Training in Higher Ed
The ChatGPT handwringing of late has bothered me, not least because it is cloaked in a kind of shock, like the domain of higher education has suddenly been sullied by this profane technology. But babes, it was always already here.
Many faculty are learning about the impacts of artificial intelligence on their own practice as they begin, for the first time, to grapple with the obvious ways in which ChatGPT will demand changes. But our students have been impacted for much, much longer. I want to talk today about some ways AI is already impacting access and learning for our students, and then I want to talk a bit about where those data sets are coming from in the first place.
But first — there’s a lesson here in not waiting to care about a tool or technology until it impacts you personally, and the lesson is called (in the words of Cory Doctorow) the Shitty Technology Adoption Curve.
The Shitty Technology Adoption Curve
The point of the Shitty Technology Adoption Curve is to normalize technological oppression, one group at a time. 20 years ago, if you were eating your dinner under the unblinking eye of a video-camera, it was because you were in a supermax prison. Now, thanks to ‘luxury surveillance,’ you can get the same experience in your middle-class home with your Google, Apple or Amazon ‘smart’ camera. Those cameras climbed the curve, going from prisons to schools to workplaces to homes.
Cory Doctorow, “The Shitty Technology Adoption Curve Reached Its Apogee.”
Doctorow’s point is that by the time a crappy workplace surveillance tool ends up on the work computer of a white-collar middle manager, it’s been tested and normalized in places where folks had a lot less power to resist. And we see this in education all the time. You can’t tell me that the same tools we use to watch what students are doing — like the Microsoft Habits infrastructure, for example, which purports to draw conclusions about our students based on data points like when they engage with course materials or submit assignments — won’t be used to watch faculty too, especially those who are precariously employed. There’s a self-serving argument for keeping sketchy technology away from our students, and I think we see it play out with AI. Machine learning isn’t new to the university, but it’s front-page news in the Chronicle because it’s now an issue for people with institutional power.
I don’t especially want to read ChatGPT essays either, but it’s not the thin end of the wedge. It’s a chicken that has come home to roost. (It’s my essay and I’ll mix my metaphors if I want to; eat that, algorithmic prose analysis.)
Artificial Intelligence and Our Students: The Right Now
Many institutions are already using machine learning-based tools in an evaluative capacity with students. This can range from the subtle, like using learning analytics to draw conclusions about a learner’s capacity for success, to the overt, like using AI tools to determine acceptance and placement of students. I’m not going to argue that there is never a place for these tools (although I really do think humans are better at this work, not least because we tend to assume algorithms are somehow more neutral than they are, which is a dangerous fallacy). What I am going to argue is that it’s disingenuous to panic about AI in our universities now, when we have largely accepted it as a convenience up to this point. We have to care about shitty technology before it finds its way to trouble the most privileged among us.
One example I like to consider because of its ubiquity is a tool called ACCUPLACER. Many institutions use some or all of the ACCUPLACER suite of products to assess competencies in different areas, perhaps most commonly to assess students for English proficiency. WritePlacer is the test for writing competency, and it is scored by a proprietary tool called “IntelliMetric, an artificial intelligence-based, writing-sample scoring tool.” IntelliMetric has been around since 1997, so a lot of students have been evaluated by this tool — 100 billion essays have been marked by it, according to its makers. There are certainly studies that show equivalency between human graders and IntelliMetric, but there’s also significant reason to wonder how automatic grading tools deal with the inherent biases of writing instruction more generally. Algorithms are garbage-in, garbage-out: if the history of writing assessment is plagued with bias, so too must be the data sets. Are there guardrails in place to protect against this? One 2012 study showed that ACCUPLACER failed to accurately predict placement for women and racialized students, and while new versions of ACCUPLACER have been released since, it is concerning to think about the disruptions and delays to the pathways of students who were incorrectly evaluated.
In general, there’s evidence that AI tools — because they look for structure and pattern, but not meaning — are easy to game, as projects like the BABEL Generator have shown. Effectively, well-structured gibberish can pass these tests when AI evaluates it, but not when humans do. Interestingly, this is basically what AI produces at this stage, too: it generates well-formed but largely meaningless prose. The flip side is also true: folks who write in non-standard English patterns are disproportionately penalized.
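To make the gaming concrete, here is a toy sketch, entirely my own invention and nothing like IntelliMetric’s actual (proprietary) model, of what a surface-feature scorer looks like. Every feature, weight, and sample essay below is hypothetical; the point is only that a scorer built on structure and pattern can rank confident gibberish above plain, sincere prose.

```python
# Toy illustration only: a hypothetical surface-feature essay "scorer."
# Real products are proprietary and far more complex; this sketch just
# shows why pattern-based scoring can reward well-formed gibberish.

import re

TRANSITIONS = {"however", "therefore", "moreover", "furthermore", "consequently"}

def surface_score(essay: str) -> float:
    words = re.findall(r"[a-zA-Z']+", essay.lower())
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    if not words or not sentences:
        return 0.0
    avg_sentence_len = len(words) / len(sentences)          # "syntactic maturity"
    lexical_diversity = len(set(words)) / len(words)        # vocabulary variety
    transition_count = sum(w in TRANSITIONS for w in words) # "coherence" markers
    # A weighted sum of features that never checks whether the essay means anything.
    return 2.0 * min(avg_sentence_len / 20, 1) + 3.0 * lexical_diversity + 1.5 * transition_count

gibberish = ("The epistemology of the classroom, however, recapitulates the "
             "ontology of assessment; therefore, pedagogy interrogates the "
             "paradigm. Moreover, the curriculum destabilizes the hegemonic "
             "framework of the rubric, and consequently meaning proliferates.")
plain = ("I liked this book. It made me think about my grandmother. "
         "She read to me when I was small.")

print(f"gibberish: {surface_score(gibberish):.2f}")
print(f"plain but sincere: {surface_score(plain):.2f}")
```

Run it and the jargon-stuffed paragraph outscores the honest one, which is roughly the trick the BABEL Generator pulls at scale.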
The underlying problem here is the allegiance to the algorithm and the assumption that such tools are unbiased. We know that is not true, and that machine learning in particular amplifies bias. And yet we let machine learning tools determine which students are at risk and tell us who our students are. But these are tools of convenience and scale, not teaching and learning: they exist to compensate for the conditions of teaching and learning we have decided are acceptable, precarious and outsized conditions that make real relationships difficult to develop.
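As a sketch of what I mean by outsourcing judgement: below is a hypothetical “at risk” flag of the kind a learning analytics dashboard might surface. The field names, thresholds, and logic are all invented for illustration; no vendor’s real model is this simple, but the shape of the problem is the same: the tool only ever sees proxies, never the student.

```python
# Hypothetical sketch of an "at risk" flag built from engagement proxies.
# Not any vendor's actual model; the thresholds below are made up to show
# how the judgement gets outsourced to clicks and timestamps.

from dataclasses import dataclass

@dataclass
class EngagementRecord:
    logins_per_week: float        # LMS sign-ins
    median_submission_hour: int   # 0-23, when assignments usually arrive
    days_since_last_view: int     # last time course materials were opened

def at_risk(record: EngagementRecord) -> bool:
    """Flag a student as 'at risk' from engagement proxies alone."""
    score = 0
    if record.logins_per_week < 3:
        score += 1
    if record.median_submission_hour >= 23:   # late-night submitters penalized,
        score += 1                            # e.g. students working evening jobs
    if record.days_since_last_view > 7:
        score += 1
    return score >= 2

# A student who works evenings, downloads readings once a week, and submits
# after their shift gets flagged, whatever their actual learning looks like.
print(at_risk(EngagementRecord(logins_per_week=1.5,
                               median_submission_hour=23,
                               days_since_last_view=3)))
```

The example student at the bottom gets flagged regardless of how their learning is actually going, and that flag is exactly the kind of “judgement” we then act on.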
Where’d All That Data Come From, Anyway?
There are much smarter people than me thinking through the copyright ramifications of these AI tools and the data sets they are trained on, but the question is an interesting one for academic integrity on our campuses and the cultures we wish to build. What goes into the AI data set, and do the creators have any claim to compensation for this use of their work? If we want to call ChatGPT-generated essays plagiarism, what about the materials those essays are based on?
I wonder a lot about the data sets that underpin ChatGPT because I wonder a lot about consent and content and how the two are interrelated. The classic example of this for me is Turnitin, whose massive valuation is at least in part based on the huge amount of student data it has amassed for free and without any meaningful discussion of copyright, intellectual property, or the value of student work. Students upload their work to Turnitin at the requirement of their instructors, and while they have to click a consent tickybox, there’s no real discussion of what that means for their work, or who their work enriches. Higher ed institutions have been complicit in allowing these companies to build their fortunes on the work of our students, based on the idea that student work is valueless. I don’t know about you, but I kind of object to that notion; I sort of think student work… matters. And if a corporation wants to extract value from it, maybe it should pay students for it.
But also — Turnitin isn’t just Turnitin. Turnitin is also a tool called Revision Assistant, which is designed to help students with their writing at the composition stage. Revision Assistant describes itself as “an online platform that uses state-of-the-art machine learning technology to analyze student writing and provide instant, holistic and sentence-level feedback and other guidance.” Based, I presume, on the data set Turnitin has collected from a generation of students at no cost. (There’s another question here about what kind of writing AI can train students to do, since what it produces itself is so vapid, but let’s park that for now.) To what extent has higher education, then, built the data sets that artificial intelligence writing tools are based on? After all, Grammarly offers a similar machine learning-powered writing assistant, and even Course Hero has a product called QuillBot that does the same. All these tools are used extensively (legitimately or not) across higher education, collecting masses of student writing and repackaging it to “assist” students with that writing. And often, students have limited opportunity to opt out.
I guess where I wind up in this mess is here: why do we wait to have the conversation with students about artificial intelligence and writing until it affects the faculty member or becomes a concern for institutional reputation and academic integrity? And what is academic integrity, anyway, if it doesn’t also encompass the practices we subject our students to? Why weren’t we interested in having this conversation when students were first subjected to the decision-making of AI, or to the data scraping of the tools we force students to use for our convenience? It’s nice to start the conversation now, but we’re all a bit late.
We’ve touched on the conversation of inequity and algorithmic processes like machine learning here today, but next week we’re going to jump in with both feet to explore racist data sets and the danger of performative equity that AI tools — especially image generation — might offer. We’re also getting closer to our first live session on January 27th, which anyone can register for here. We use this time to discuss the ideas of the Detox and build a community around critical analysis of these tools (and sometimes we plan our resistance). Also, this is as good a time as any to remind you that the Detox accepts guest submissions, a good choice for those who have comments longer or more involved than a typical blog comment. Though those are welcome too! See below.
So excited for this to start! Might be fun to tease out the historical analogies for LLMs
https://dynomight.net/llms/
“Gull-wing and scissor doors”? “Site-built homes and pre-manufactured homes”?
Bonus points for the site icon ;-)
All credit to Nicole Singular, whose visual genius knows no bounds. I love the site icon so much, I can’t tell you. Might get it as a tattoo.
Liked the feet vs. segways one. No one would opt to segway for eternity (would they?).
Thanks for this, Brenna. I couldn’t agree more. When all the panic started around Chat-GPT I thought it was odd because Chat-GPT isn’t new in any meaningful sense. I’ve been considering writing a piece about Chat-GPT panic and the ways in which we already rely on tech to ‘manage’ students at scale, and the ways we were further encouraged to do so during the emergency-remote teaching part of the pandemic. Ultimately, I think Chat-GPT is a sophisticated bullshit generator at the moment (though it may well learn to do more in future). And the fact that higher ed is in a panic may tell us something about the ways we have been teaching students to write and to formulate ideas. I think a lot about Amy Tan’s piece “Mother Tongue” and about how I was trained (and my students have been trained, and Chat-GPT has been trained) to write the 5-paragraph-essay. I don’t want to say that academic styles of writing have no place, but there are so many more ways to meaningfully express oneself, and style isn’t substance. Tan’s piece is here: https://www.umsl.edu/~alexanderjm/Mother%20Tongue%20by%20Tan.pdf
Lastly, this also made me think of Sarah Andersen’s recent piece about how her copyrighted artwork has been fed into an algorithmic learning system without her knowledge or consent. She argues that this, too, isn’t new. She has dealt with what she calls a “shadow self” online for years now, a shadow self created by the alt-right to harass her. Her piece is here in case anyone is interested: https://www.nytimes.com/2022/12/31/opinion/sarah-andersen-how-algorithim-took-my-work.html
What Andersen argues is very similar to your points about student work here. The idea that intellectual property is taken without knowledge or consent (or at best with coerced consent) to ‘improve’ these programs, which can then be used to harass or oppress the very people whose work they are built on.
This isn’t new. But it is high time we talked about it!!
I promise we will talk a lot about the five paragraph essay soon! I think that to the extent that we have pushed students into patterned writing, we’re seeing ChatGPT replicate forms we’re only starting to extract ourselves from…
Another seed for building these data sets was all those cute “Which Harry Potter character are you?” polls that used to be all over Facebook. Although, I suspect, they were mostly used to create digital fingerprints of individuals for the better-targeted marketing tools used by Meta.
OH MAN JON every time I see another one of those “post a picture of yourself 20 years ago and now” memes I want to screeeeeeeeeeeeeeeeam. We are building the data sets all the time. Willingly. Gleefully!
Very interesting post Brenna! What a difficult area for us to navigate!
Although I have not personally used AI with students in the ways you describe, I have definitely used data that is collected. But I use the data as a starting point to query further rather than to form conclusions, which I think better supports the student. As an example, if a student is struggling, I have looked at Moodle to see how many times they attempted quizzes and what resources they accessed, to help start a conversation. From my perspective, this data allows me to ask better questions and support in the right way (maybe they missed seeing a resource, or perhaps they tried a quiz multiple times, getting the same kinds of questions wrong each time).
I wonder if you see a positive way forward with AI?
I think there’s a huge difference between using the data and outsourcing the judgement, something we’ll talk more about in a couple of weeks. Your example has you starting with an observation rooted in relationality, and using the tools at your disposal to collect more information. That’s way different, to me, than the computer spitting out a judgement about a student being “at risk” that we then act on. In general, it’s that outsourcing of judgement — in the form of evaluation or just vibes — that worries me.