
Focusing on GenAI detection is a no-win approach for instructors

As GenAI continues to evolve faster than the tools to detect it, capturing all student use of AI is nearly impossible.

Loleen Berdahl

December 11, 2024

Posted in The Skills Agenda


The use of GenAI in university classrooms and coursework is an evolving and sometimes contentious issue. Some instructors ban its use outright, considering any use of GenAI to be a form of academic dishonesty. Other instructors embrace its use fully, requiring students to engage with GenAI in their class assignments. Still others avoid the matter entirely by returning to exclusively in-class exams and assessments.

Instructors who want to retain out-of-class assignments while prohibiting students from using AI in their courses face the challenge of correctly identifying and penalizing AI usage. While detection tools and techniques are available, they present instructors with important practical, ethical and workload challenges that must be considered.

The practical limitations of GenAI detection tools

After testing several detection software programs, Australian researchers Daniel Lee and Edward Palmer concluded, “we should assume students will be able to break any AI-detection tools, regardless of their sophistication.” Similarly, business professor Ethan Mollick writes in Co-Intelligence: Living and Working with AI, “there is no way to detect whether or not a piece of text is AI-generated. A couple of rounds of prompting remove the ability of any detection system to identify AI writing. … Unless you are doing in-class assignments, there is no accurate way of detecting whether work is human-created” (emphasis in original).

To be sure, AI detection will catch some students. The students most likely to be caught are those with the least knowledge and/or resources to outsmart detection, meaning some students get away with AI use while others are penalized. Leon Furze writes, “GenAI detection tools privilege students who are English first language, have access to paid large language models/applications, and are more digitally literate” (emphasis in original). The result is potential equity issues with respect to which students are and are not penalized for GenAI use. In the U.S., the Modern Language Association and the Conference on College Composition and Communication jointly urged caution with detection tools to avoid perpetuating inequities: “We urge caution and reflection about the use of AI text detection tools. Any use of them should consider their flaws and the possible effect of false accusations on students, including negative effects that may disproportionately affect marginalized groups.”

The ethical questions of GenAI detection tools

Efforts to detect AI use pose ethical questions for instructors and universities. While some universities use AI-detection software, many discourage or prohibit its use because of the risks of both false positives and false negatives, and the inequities it can create. For example, the University of British Columbia writes:

“The use of applications to detect AI-generated content is strongly discouraged at this time, due to concerns about effectiveness, accuracy, bias, privacy, and intellectual property. Despite a number of AI writing detectors being available, recent research has found that all fail to perform accurately enough, and accuracy is further reduced when simple obfuscation techniques are used.”

Beyond AI-detection software, some faculty use a detection method known as “white texting” or “Trojan horse detection” by including in their assignment details small white-font instructions invisible to the human eye that direct AI to include clues (such as mention of a particular false term) in the student’s assignment. Sarah Elaine Eaton, whose research focuses on ethics in higher education, considers this practice unethical. She writes, “Deceptive assessment is a breach of academic integrity on the part of the educator. If we want students to act with integrity, then it is up to educators to model ethical behaviour themselves” (emphasis in original).

Drawing on Dr. Eaton’s work, Marc Watson writes, “We should set clear guidelines for the use of AI in our courses and likewise hold students accountable when they violate those standards, but that only works when we act ethically. Otherwise, what’s the point? When you resort to deception to try and catch students cheating, you’ve compromised the values of honesty and transparency that come implicitly attached to our profession.”

The workload limitations of GenAI detection

Focusing on AI detection takes up instructor time and increases workloads and stress. As Mr. Furze explains:

Unlike plagiarism tools, generative AI tools do not give a clear cut result. The percentage likelihood of AI generated content is less accurate than plagiarism detection, more open to interpretation, and therefore requires more consideration on the educator’s part. It requires more nuanced and potentially more stressful conversations between the educator and the student, and the potential for much more kickback from the students and many more appeals. …The added time and stress of using generative AI detection tools is a burden on educators who are already in an industry with a high risk of burnout and attrition. (emphasis in original)

Policing student use of GenAI presents a difficult challenge for instructors, with high emotional labour costs. Instructors are put in the position of being suspicious of student work and, as Elizabeth Steere writes, “an accusation based on a false positive can irrevocably damage trust between an instructor and student.”

Even when AI detection is accurate, it can create workload stresses if and when students push back against its use. Dr. Eaton argues that unless instructors ensure transparency by clearly stating the use of detection tools in the course syllabus, charges of academic dishonesty are likely to be subject to student challenge: “Students can appeal any misconduct case brought forward with the use of deceptive or undisclosed assessment tools or technology (and quite frankly, they would probably win the appeal)” (emphasis in original).

So now what?

Given the limits of detection approaches, what are instructors to do? For some, as noted earlier, the solution is a return to in-class exams and assessments, practices that have their own challenges with respect to equity and inclusion. For others, the solution is to find ways to engage with GenAI and to help students embrace transparency in its use. As Dr. Steere writes, “Perhaps as we work to integrate it meaningfully and ethically, and teach students how to cite its use, students will feel less compelled to disguise AI-generated writing.”

But for many instructors, neither of these models is appealing. For instructors who plan to use detection tools, it is necessary to be transparent with students about the use of the tools, thoughtful about the biases inherent in them and sensitive to the student stress associated with accusations of academic dishonesty. Other instructors will find it helpful to instead have open dialogue with students about why they prohibit GenAI use in their classes, how they define academic integrity, and how they plan to enforce it. (See Gianluca Agostinelli’s tips for fostering critical thinking and original writing in an AI-context.)

GenAI has disrupted higher education. Responses to it will keep evolving in the years ahead as instructors continue to rethink and reimagine pedagogies and assessments. The key, as the Modern Language Association and the Conference on College Composition and Communication write, is for “educators to respond out of a sense of our own strengths rather than operating out of fear.”

Continuing the Skills Agenda conversation

What is your own approach to GenAI use in your classroom? I would love to hear your thoughts in the comments below. And for faculty involved in social science and humanities graduate education, please check out my free Substack newsletter, Reimagining Graduate Education, which I produce with co-authors Lisa Young and Jonathan Malloy.

I look forward to hearing from you. Until next time, stay well, my colleagues.

Loleen Berdahl

Loleen Berdahl is a 3M National Teaching Fellow, an award-winning university instructor, the executive director of the Johnson Shoyama Graduate School of Public Policy (Universities of Saskatchewan and Regina), and professor and former head of political studies at the University of Saskatchewan. Since 2016, Dr. Berdahl has spoken about student skills training and professional development at conferences and university campuses across Canada. Her research on these topics was funded by the Social Sciences and Humanities Research Council Insight Grant program. Her most recent book is For the Public Good: Reimagining Arts Graduate Programs in Canadian Universities (with Jonathan Malloy and Lisa Young, University of Alberta Press 2024).

3 Comments

As an experiment I tried using several online AI-detection tools on a text consisting of an original introductory paragraph by myself followed by a somewhat longer passage I had generated using AI. One of the detection tools identified my passage as “likely AI-generated” but gave the AI-generated passage a pass. ¯\_(ツ)_/¯

The same thing happened to me. I put my own writing through an AI detection tool and it flagged it as 89% likely AI generated.

Addressing AI in the classroom: I discuss GenAI tools with students, both their strengths and weaknesses, and go on to emphasise academic integrity above all else.
Assessing student work: I use practical assessments such as presentations and laboratory exams as higher weighted assessments. I try to tailor lower weighted written assessments to be personal/specific enough that submission of AI generated text only is more likely to be insufficient.
