Enterprise AI Analysis
Harnessing Pre-Course Aptitude Tests to Predict Performance on an Introductory Programming Assessment in Higher Education
Authors: OLIVER KERR and LINDEN J. BALL, School of Engineering and Computing, University of Lancashire, Preston, United Kingdom of Great Britain and Northern Ireland; NICKY DANINO, School of Computing and Creative Industries, Leeds Trinity University, Leeds, United Kingdom of Great Britain and Northern Ireland
When learning to program, students' efforts can be hampered by a variety of misconceptions pertaining to fundamental programming concepts, which can prevent them from developing appropriate Mental Models (MMs) of these concepts. This can create a barrier to learning and subsequently impact students' confidence. As such, it is necessary to identify students who are likely to require support with learning to program at the earliest opportunity. This investigation utilises data collected from 285 first-year computer science undergraduate university students to examine the potential for using a pre-course aptitude test to predict the results of students' first introductory programming assessment, thereby providing an indication of which students would benefit most from additional support from the outset. The aptitude test, which was developed as part of this investigation, collates information on students' backgrounds and prior experiences, their perceived levels of confidence and their likelihood of holding appropriate mental models for several core programming concepts. The data collected using the aptitude test were subsequently used to train a variety of regression and classification models to explore their potential for predicting students' assessment results. This culminated in the selection of a Random Forest Regressor and a Random Forest Classifier to be refined using Sequential Feature Selection and then finally validated against a holdout test set to assess the generalisability of these models. The Random Forest Classifier achieved a good level of performance during training (AUC=0.8688, F1=0.8353, accuracy=0.7450). However, this was seen to reduce when evaluated on the holdout test set (AUC=0.7670, F1=0.7020, accuracy=0.7020), demonstrating a moderate degree of overfitting, likely due to an imbalance in the classes being predicted and the limited amount of data available.
In contrast, the Random Forest Regressor exhibited a generally consistent level of performance between training (RMSE=0.1616, MAE=0.1209) and testing (RMSE=0.1713, MAE=0.1396). Although there is still a sizeable margin of error, the results suggest that the Random Forest Regressor is not overfitting the data and has the potential to be used as a guide for identifying students who would benefit from additional support. This work contributes a novel, pre-course, aptitude-testing approach that integrates students' mental models, background factors and perceived levels of confidence to enable early identification of students who may require additional support through the prediction of introductory programming assessment results. As such, these findings provide a foundation for future work to develop targeted support interventions that can be integrated into introductory programming modules.
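The overfitting argument in the abstract rests on the gap between training and test metrics. The short Python sketch below makes that arithmetic explicit using only the figures quoted above; the dictionary layout and the function name `train_test_gaps` are illustrative choices, not part of the study.

```python
# Metrics quoted in the abstract: (training value, holdout test value).
reported = {
    "classifier": {
        "AUC": (0.8688, 0.7670),
        "F1": (0.8353, 0.7020),
        "accuracy": (0.7450, 0.7020),
    },
    "regressor": {
        "RMSE": (0.1616, 0.1713),
        "MAE": (0.1209, 0.1396),
    },
}


def train_test_gaps(metrics):
    """Return (test - train) for each metric, rounded to 4 decimal places."""
    return {name: round(test - train, 4) for name, (train, test) in metrics.items()}


classifier_gaps = train_test_gaps(reported["classifier"])
regressor_gaps = train_test_gaps(reported["regressor"])

# For the classifier, higher is better, so negative gaps mean worse test performance.
print("Classifier gaps:", classifier_gaps)
# For the regressor, lower is better, so positive gaps mean worse test performance.
print("Regressor gaps:", regressor_gaps)
```

On these figures the classifier's AUC falls by roughly 0.10 and its F1 by roughly 0.13 between training and testing, whereas the regressor's RMSE rises by only about 0.01, which is the contrast the abstract describes.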
Executive Impact
Identifying students who are likely to struggle before teaching begins could allow support to be targeted earlier, with the potential to reduce failure rates and improve learning outcomes in introductory programming.
Deep Analysis & Enterprise Applications
When students attempt to learn to program at a university level, their efforts can be hampered by a variety of misconceptions pertaining to fundamental programming concepts. These misconceptions can create a barrier to learning and subsequently impact students' confidence. Learning to program is a slow and complex task, with Winslow suggesting it takes approximately 10 years to turn a novice programmer into an expert. Understanding these difficulties early is crucial for effective intervention.
Students' interpretations of fundamental programming concepts are described as their 'Mental Models' (MMs). In programming, concepts are precisely defined, but novice programmers often misinterpret them. Inaccurate MMs, especially at the start of a course, make learning difficult and can lead to various misconceptions (e.g., parallelism bug, egocentrism bug). Identifying these MMs proactively is key to addressing potential learning barriers.
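To make the parallelism bug concrete, the sketch below shows the kind of program on which it typically appears. The parallelism bug is the belief that several lines of a program can be active at the same time, so a novice may expect a condition to "fire" later, once it becomes true. The program, variable names and values here are illustrative only and are not items from the Programming Checkup.

```python
def run_program():
    """Run a tiny program and return everything it would have printed."""
    output = []
    size = 5
    # This condition is checked exactly once, at this point, while size is 5.
    if size == 10:
        output.append("Hello")
    # A student holding the parallelism misconception may expect "Hello" to
    # appear when size reaches 10 inside the loop, as if the `if` statement
    # above kept watching the variable.
    for size in range(1, 11):
        output.append(size)
    return output


result = run_program()
print(result)
```

Because statements execute one at a time and in order, "Hello" never appears in the output, even though `size` does eventually equal 10.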
The aptitude test (Programming Checkup) integrates student background factors, self-efficacy, and Mental Model (MM) diagnostic questions. It aims to assess prior programming experience, motivation, mathematics background, and perceived confidence, and to identify common misconceptions. The test uses language-independent pseudocode and open-ended questions to deduce logical understanding rather than syntax knowledge, making it a supportive diagnostic tool.
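Before responses of this kind can be used for modelling, they need to be turned into numeric features. The sketch below shows one possible encoding. Every field name, category list and scale in it is a hypothetical assumption made for illustration, and scoring each MM question as simply correct or incorrect is a simplification; the source does not describe the study's actual encoding.

```python
from dataclasses import dataclass

# Hypothetical ordered categories for mathematics background.
MATHS_LEVELS = ["none", "GCSE", "A-level"]


@dataclass
class CheckupResponse:
    """One student's answers (all fields are illustrative assumptions)."""
    has_prior_programming: bool
    maths_level: str          # one of MATHS_LEVELS
    confidence: int           # self-rated, assumed 1-5 scale
    mm_answers_correct: dict  # concept name -> True if answer fits an appropriate MM


def encode(response):
    """Convert a response into a flat dictionary of numeric features."""
    features = {
        "prior_programming": int(response.has_prior_programming),
        "maths_level": MATHS_LEVELS.index(response.maths_level),
        "confidence": response.confidence,
    }
    # One binary feature per mental-model question, in a stable order.
    for concept, correct in sorted(response.mm_answers_correct.items()):
        features[f"mm_{concept}"] = int(correct)
    return features


example = CheckupResponse(
    has_prior_programming=True,
    maths_level="A-level",
    confidence=4,
    mm_answers_correct={"assignment": True, "loops": False},
)
encoded = encode(example)
print(encoded)
```

Keeping background, confidence and MM features under distinct name prefixes makes it straightforward to train models on different feature combinations later.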
The study utilized data from 285 first-year computer science students to train and validate regression and classification models. Random Forest Regressor and Classifier were selected for their consistent performance across various feature combinations (background factors, confidence, MMs). Sequential Feature Selection and GridSearchCV were used for refinement, and the refined models were then evaluated against a holdout test set to assess their generalisability and obtain an unbiased estimate of real-world performance.
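The workflow described above can be sketched with scikit-learn roughly as follows. This is a minimal illustration under stated assumptions, not the study's code: the data are synthetic, and the 80/20 split, the number of selected features, the hyperparameter grid and the order of the two refinement steps are all assumptions chosen to keep the example small.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the aptitude-test features and assessment scores.
rng = np.random.default_rng(0)
X = rng.random((285, 8))
y = np.clip(0.5 * X[:, 0] + 0.3 * X[:, 1] + 0.1 * rng.normal(size=285) + 0.1, 0, 1)

# Hold out a test set that is never used during refinement (assumed 80/20 split).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Step 1: forward Sequential Feature Selection on the training data only.
selector = SequentialFeatureSelector(
    RandomForestRegressor(n_estimators=20, random_state=0),
    n_features_to_select=3,  # assumed value
    direction="forward",
    cv=3,
)
selector.fit(X_train, y_train)
X_train_sel = selector.transform(X_train)
X_test_sel = selector.transform(X_test)

# Step 2: hyperparameter tuning with GridSearchCV (assumed grid).
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [3, None]},
    scoring="neg_root_mean_squared_error",
    cv=3,
)
search.fit(X_train_sel, y_train)

# Step 3: evaluate the refined model once on the holdout test set.
y_pred = search.predict(X_test_sel)
rmse = float(np.sqrt(mean_squared_error(y_test, y_pred)))
mae = float(mean_absolute_error(y_test, y_pred))
print("Selected feature mask:", selector.get_support())
print(f"Holdout RMSE={rmse:.4f}, MAE={mae:.4f}")
```

Fitting the selector and the grid search on the training portion only is what keeps the holdout estimate unbiased; the classifier variant would follow the same pattern with `RandomForestClassifier` and a classification scoring metric.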
Data Processing Flow for Predictive Model
| Metric (holdout test set) | Random Forest Regressor | Random Forest Classifier |
|---|---|---|
| RMSE | 0.1713 | N/A |
| MAE | 0.1396 | N/A |
| AUC | N/A | 0.7670 |
| F1 Score | N/A | 0.7020 |
| Accuracy | N/A | 0.7020 |

Note: Regression models do not have AUC, F1, or Accuracy. Classification models do not have RMSE or MAE.
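
For readers less familiar with the metrics in the table, the sketch below computes each of them with scikit-learn on a handful of made-up values. The numbers are toy data chosen for easy checking, and the 0.5 decision threshold is an assumption; none of it comes from the study.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    mean_absolute_error,
    mean_squared_error,
    roc_auc_score,
)

# Regression: true vs. predicted assessment scores (toy values on a 0-1 scale).
score_true = np.array([0.2, 0.5, 0.8, 0.6])
score_pred = np.array([0.3, 0.5, 0.6, 0.7])
mae = float(mean_absolute_error(score_true, score_pred))
rmse = float(np.sqrt(mean_squared_error(score_true, score_pred)))

# Classification: true class labels and predicted probabilities of class 1.
label_true = np.array([0, 0, 1, 1, 1])
prob_pred = np.array([0.3, 0.6, 0.45, 0.8, 0.7])
label_pred = (prob_pred >= 0.5).astype(int)  # assumed 0.5 decision threshold

auc = float(roc_auc_score(label_true, prob_pred))  # uses probabilities
f1 = float(f1_score(label_true, label_pred))       # uses thresholded labels
accuracy = float(accuracy_score(label_true, label_pred))

print(f"MAE={mae:.4f} RMSE={rmse:.4f}")
print(f"AUC={auc:.4f} F1={f1:.4f} accuracy={accuracy:.4f}")
```

RMSE penalises large errors more heavily than MAE, which is why it is never smaller than MAE on the same predictions. AUC is computed from the predicted probabilities, while F1 and accuracy depend on the chosen threshold.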
Early Intervention for Programming Success
Problem: High failure rates (historically 33%, recently 25-28%) in introductory programming courses, exacerbated by misconceptions that form early and become entrenched before assessment. Lecturers are often unaware of student struggles until it is too late.
Solution: Develop a pre-course aptitude test ('Programming Checkup') to assess students' prior experience, confidence, and mental models of core programming concepts. Use this data to train predictive models (Random Forest Regressor and Classifier) to identify students likely to require additional support BEFORE teaching begins.
Outcome: The Random Forest Regressor showed consistent performance between training and testing (Test RMSE=0.1713, MAE=0.1396), indicating potential to guide early identification for targeted support. The Classifier's performance dropped on the holdout test set (Test AUC=0.7670, F1=0.7020, Accuracy=0.7020, compared with a training AUC of 0.8688), indicating moderate overfitting, likely due to class imbalance and the limited amount of data. As a diagnostic tool, this approach could enable proactive interventions, potentially reducing failure rates and improving learning outcomes by addressing misconceptions early.
Implementation Roadmap
A phased approach to integrating pre-course aptitude testing for maximum impact.
01. Pre-Course Assessment
Administer the Programming Checkup to incoming students to gather data on backgrounds, confidence, and mental models.
02. Predictive Modeling
Utilize trained Random Forest models to predict individual student assessment performance and identify those at risk.
03. Targeted Intervention Design
Develop and tailor support interventions (e.g., supplementary tuition, scaffolded exercises) based on predictive insights and specific misconceptions.
04. Early Course Integration
Implement interventions from the very outset of the introductory programming module, ideally within the first week.
05. Continuous Monitoring & Refinement
Monitor student progress, gather feedback, and continually refine the Programming Checkup and predictive models for improved accuracy and generalisability.
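Step 02 of the roadmap, turning predicted scores into a list of students to approach, could be sketched as follows. The pass mark of 0.4, the safety margin of 0.17 (loosely motivated by the reported test RMSE of 0.1713) and the function name `flag_for_support` are all assumptions for illustration; the source does not specify how a threshold should be chosen.

```python
def flag_for_support(predicted_scores, pass_mark=0.4, margin=0.17):
    """Return student IDs whose predicted score falls below pass_mark + margin.

    predicted_scores maps a student ID to a predicted score on a 0-1 scale.
    The margin widens the net to allow for prediction error (assumed value).
    Results are ordered from lowest to highest predicted score.
    """
    threshold = pass_mark + margin
    flagged = [
        (score, student_id)
        for student_id, score in predicted_scores.items()
        if score < threshold
    ]
    return [student_id for _, student_id in sorted(flagged)]


# Hypothetical predictions from a trained regressor.
predictions = {"s1": 0.35, "s2": 0.55, "s3": 0.72, "s4": 0.48}
at_risk = flag_for_support(predictions)
print(at_risk)
```

Given the sizeable margin of error reported for the regressor, a list like this is better treated as a guide for offering support than as a definitive judgement about any individual student.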