Teaching Innovation Awards finalist

The University of the West Indies

Biological and Chemical Sciences
Prof. Avril Williams

Overview

Rationale & Goals

The emergence of generative AI presented a critical challenge in my Inorganic Chemistry courses. I set out to prevent AI from becoming a crutch that undermines deep conceptual learning, and instead to promote its use for skills acquisition and learning gain by encouraging judgement and conceptual understanding over rote answer retrieval. I aimed to move beyond simplistic "academic integrity" warnings (cheat versus don't cheat) towards genuine digital and AI literacy, in which students are taught the critical skill of calibrating their trust in AI outputs to improve mastery.

Implementation

The innovation was embedded into the course assessment structure. Over a six-week period, students completed five short in-class quizzes in a two-stage process: an unaided attempt, in which students first answered the questions using only their own knowledge, followed by an AI-assisted attempt, in which they immediately revisited the same quiz using a generative AI tool of their choice. The core innovation was the analytical framework applied afterwards. By categorising every answer change between the two attempts, I calculated four metacognitive AI-interaction metrics for each student:

  • Confirmation Rate (correct→correct, validating correct knowledge)
  • Correction Rate (incorrect→correct, fixing errors)
  • Stubborn Misconception Rate (incorrect→incorrect, persisting in error despite AI)
  • Undermining Rate (correct→incorrect, replacing correct answers with AI errors)

This methodology converted passive AI use, previously visible only as a simple score change, into a measurable behavioural profile revealing how each student engaged with AI. The quiz sequence was followed by two traditional, unaided in-class tests to measure genuine retention and learning gain (skills acquisition).
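For colleagues who want to reproduce the analysis, the categorisation reduces to a simple transition count between the two attempts. The sketch below is a minimal illustration, assuming each attempt is recorded as a per-question list of correct/incorrect flags; the function and class names are illustrative, not part of the original study.

```python
from dataclasses import dataclass

@dataclass
class InteractionProfile:
    confirmation: float  # correct -> correct: AI validated existing knowledge
    correction: float    # incorrect -> correct: AI fixed an error
    stubborn: float      # incorrect -> incorrect: misconception persisted despite AI
    undermining: float   # correct -> incorrect: AI replaced a correct answer

def profile_student(unaided: list[bool], assisted: list[bool]) -> InteractionProfile:
    """Categorise every answer transition between the unaided and
    AI-assisted attempts, expressing each category as a rate."""
    if len(unaided) != len(assisted) or not unaided:
        raise ValueError("attempts must be non-empty and the same length")
    n = len(unaided)
    pairs = list(zip(unaided, assisted))
    return InteractionProfile(
        confirmation=sum(a and b for a, b in pairs) / n,
        correction=sum((not a) and b for a, b in pairs) / n,
        stubborn=sum((not a) and (not b) for a, b in pairs) / n,
        undermining=sum(a and (not b) for a, b in pairs) / n,
    )

# Example: one ten-question quiz (True = correct answer)
unaided  = [True, True, False, False, True, False, True, False, True, True]
assisted = [True, True, True,  False, True, False, False, False, True, True]
print(profile_student(unaided, assisted))
# InteractionProfile(confirmation=0.5, correction=0.1, stubborn=0.3, undermining=0.1)
```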

Evidence of Impact

Quantitative analysis revealed actionable insights. A student's "AI Boost" (the quiz score increase from AI assistance) showed essentially no correlation with final test performance (r = -0.18), demonstrating that immediate AI help does not equate to learning. The metacognitive metrics, however, were strong predictors. A high stubborn misconception rate correlated strongly with poor test scores (r = -0.63), flagging students whose core misunderstandings persisted despite AI. Effective learning was linked to behaviours such as a high confirmation rate (r = +0.58), highlighting students who used AI effectively to validate their own knowledge. These metrics provided evidence of impact on skills acquisition: what mattered was the quality of AI interaction, not the mere fact of it. By making their own interaction patterns an object of study, students transitioned from viewing AI as an oracle to treating it as a fallible tool requiring critical evaluation. The metrics, especially the undermining rate, made concrete the cost of overreliance and the ethical lesson of maintaining agency over one's own reasoning.
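The correlations reported above are ordinary Pearson coefficients between each student's metric rates and their final unaided test score. A minimal sketch of that calculation follows, assuming the per-student data are available as parallel arrays; the values below are invented purely for illustration and are not the study's data.

```python
import numpy as np
from scipy.stats import pearsonr

# Illustrative only: in practice these arrays would come from the
# quiz-transition analysis and the two unaided in-class tests.
stubborn_rate     = np.array([0.05, 0.10, 0.30, 0.25, 0.00, 0.40, 0.15, 0.35])
confirmation_rate = np.array([0.60, 0.55, 0.30, 0.35, 0.70, 0.20, 0.50, 0.25])
test_score        = np.array([82, 75, 55, 60, 90, 48, 70, 52])

for name, metric in [("stubborn misconception", stubborn_rate),
                     ("confirmation", confirmation_rate)]:
    r, p = pearsonr(metric, test_score)
    print(f"{name} rate vs final test score: r = {r:+.2f} (p = {p:.3f})")
```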

Wider Application

This framework provides a transferable model for any educational context where AI is present. It offers colleagues a practical methodology for anchoring discussions of AI Ethical Literacy in concrete, analysable student behaviours rather than abstract policy. For skills acquisition, it shifts the focus from output policing to process optimisation, allowing STEM educators to design interventions that guide students toward the AI interaction patterns, such as correction and confirmation, that correlate with deep learning. This turns a widespread pedagogical challenge into a scalable strategy for enhancing critical thinking and durable knowledge acquisition in the age of AI.