Obstacle or opportunity? The impact of ChatGPT on assessment in STEM higher education

Yasmin Wong
February 16, 2023
A robot sitting at a desk with a pen and paper, holding a hand up to its face as if thinking.

ChatGPT is the generative-AI tool that has taken the world by storm. Individuals from various industries have been discussing the potential impact of the tool on their professions, and the education sector is no exception. Some are expressing concern over its impact on assessments, while others view it as an opportunity for innovative assessment design. In this article, we break down the abilities of ChatGPT, and how educators from different scientific disciplines see it playing a role in assessment.

What is ChatGPT?

ChatGPT is a generative AI language model developed by OpenAI. The user-friendly chatbot utilises advanced machine learning to generate human-like responses to given prompts. According to OpenAI:

“The dialogue format makes it possible for ChatGPT to answer follow-up questions, admit its mistakes, challenge incorrect premises and reject inappropriate questions” - A snippet taken from OpenAI’s homepage

The tool became publicly available in November 2022 and has since gained a huge amount of attention. It isn’t the only generative AI resource available, but it is one of the few that is free to use, which is likely one of the reasons it’s become so widely used in such a short space of time. Since its release, however, similar chat-like tools have started to appear - such as JasperChat from Jasper and Bard from Google. 

How well can ChatGPT answer scientific assessment questions?

At first glance, it’s reasonable to be worried about how ChatGPT impacts education, especially when it comes to assessments. When you search “ChatGPT” + “Assessment” on Google, you’ll find over 5 million results, the majority of which are articles or blog posts asking the same question: is this the end of assessment as we know it?

At the moment - no. Not entirely. As evidenced by three educators who have reviewed the abilities of ChatGPT in responding to science assessment questions, ChatGPT is impressive, but not invincible. 

Dr. Suzanne Fergus - an educator, learning coach and associate professor at the University of Hertfordshire - demonstrated that the AI’s ability to complete Chemistry assessments was limited beyond ‘describe’ and ‘discuss’ questions. In her blog, she reveals it was able to generate convincing answers to these short-text questions, but the quality of the answers varied, with more than one containing an error. It was also unable to answer questions that required the drawing or analysis of an image.

It was, however, able to include academic references when asked but, as Dr. Fergus demonstrated, the quality of the references was heavily dependent on the quality of the prompt given. Its answers also did not produce a high Turnitin similarity percentage.

“Application and interpretation of knowledge is not well processed by ChatGPT. Using problem solving, data interpretation or case-study based questions are ways to redesign assessment beyond knowledge-based questions. This disruptive technology will help educators to question ‘why are we doing things this way?’. That has to be a good thing in my book!”
Dr. Suzanne Fergus, Department of Clinical, Pharmaceutical and Biological Science, University of Hertfordshire

Prof. Philip Moriarty - University of Nottingham educator and active science communicator - completed a similar review using physics questions from GCSE to undergraduate assessments. Like Suzanne, he discussed the limitations of ChatGPT when it came to image analysis, but did ask it to answer several questions regarding both concepts and calculations in physics. ChatGPT was able to produce very articulate answers to physics questions, with all the correct-sounding terms and logic, but in the wrong order, or with the wrong conclusion.

He demonstrated several examples of ChatGPT being unable to link physics principles with the mathematical calculations. It even produced impressive Python code, but still fell into a common pitfall that catches many undergraduate students. You can watch his review on YouTube and read a follow-up article on his blog.

Prof. David Smith from Sheffield Hallam University - whose blog shares extensive examples of good practice in assessment and questioning - put ChatGPT to work answering written assessment-style Bioscience questions. The questions, which David specifically designed to be ‘UnGoogleable’, included traditional essay-style, short-answer and problem-solving questions. Used as prompts, they drew very reasonable answers from the generative AI.

Prof. Smith even agreed that the answers, although lacking depth and understanding, were passable, awarding many of them a 2:2 or low 2:1. He also involved colleagues to double-mark and moderate the grades the AI achieved. It did, however, struggle when questions required rationale or reflection.

Is this the end of assessment as we know it?

The common theme running through many of the conversations academics are having, and through the three reviews mentioned above, is opportunity. New technology can bring about positive changes, allowing for more creative approaches that enhance both the learning experience for students and the teaching experience for educators.

Recent TIAward winner and innovator in education and assessment, Dr. Alison Hill, offered us her thoughts on how this technology impacts assessment design. Her thoughts echoed those of the three reviews above: although ChatGPT is impressive, its capacity is limited, and that gives us time to consider how we can use it to our advantage as educators. She suggests that AI could be used to help take or summarise meeting notes, and in future maybe even alleviate the marking load.

She also shares ideas on how ChatGPT can be used in teaching right now: asking students to critically analyse the answers ChatGPT produces to essay questions, and tasking them with critiquing and improving those answers, gives them a way to demonstrate critical and analytical thinking skills.

The future?

Even if ChatGPT were closed to public use tomorrow, the rise of generative AI technology would continue. And as I write this article, progress in this field is being made at a staggering rate. Right now, it can produce impressive written answers to prompts, do some basic calculations, and produce nicely written code. It is limited in that it cannot draw or interpret diagrams, graphs or chemical structures, and it struggles to offer in-depth analysis.

However, even in a year or two, the knowledge and abilities of generative AI like ChatGPT may have evolved so much that it can do all of these things and more. As each of the educators featured here has expressed, this does seem to be an opportunity to explore how we can assess and use these tools to our advantage.

Interested in discussing more topics about the use of digital tools in assessment? Join our Community Groups to meet fellow university educators and explore how to enhance assessment and deliver effective feedback.