Enterprise AI Analysis

Harnessing Pre-Course Aptitude Tests to Predict Performance on an Introductory Programming Assessment in Higher Education

Authors: OLIVER KERR and LINDEN J. BALL, School of Engineering and Computing, University of Lancashire, Preston, United Kingdom of Great Britain and Northern Ireland; NICKY DANINO, School of Computing and Creative Industries, Leeds Trinity University, Leeds, United Kingdom of Great Britain and Northern Ireland

When learning to program, students' efforts can be hampered by a variety of misconceptions pertaining to fundamental programming concepts, which can prevent them from developing appropriate Mental Models (MMs) of these concepts. This can create a barrier to learning and subsequently impact students' confidence. As such, it is necessary to identify students who are likely to require support with learning to program at the earliest opportunity. This investigation utilises data collected from 285 first-year computer science undergraduate university students to examine the potential for using a pre-course aptitude test to predict the results of students' first introductory programming assessment, thereby providing an indication of which students would benefit most from additional support from the outset. The aptitude test, which was developed as part of this investigation, collates information on students' backgrounds and prior experiences, their perceived levels of confidence and their likelihood of holding appropriate mental models for several core programming concepts. The data collected using the aptitude test were subsequently used to train a variety of regression and classification models to explore their potential for predicting students' assessment results. This culminated in the selection of a Random Forest Regressor and a Random Forest Classifier to be refined using Sequential Feature Selection and then finally validated against a holdout test set to assess the generalisability of these models. The Random Forest Classifier achieved a good level of performance during training (AUC=0.8688, F1=0.8353, accuracy=0.7450). However, this was seen to reduce when evaluated on the holdout test set (AUC=0.7670, F1=0.7020, accuracy=0.7020), demonstrating a moderate degree of overfitting, likely due to an imbalance in the classes being predicted and the limited amount of data available.
In contrast, the Random Forest Regressor exhibited a generally consistent level of performance between training (RMSE=0.1616, MAE=0.1209) and testing (RMSE=0.1713, MAE=0.1396). Although there is still a sizeable margin of error, the results suggest that the Random Forest Regressor is not overfitting the data and has the potential to be used as a guide for identifying students who would benefit from additional support. This work contributes a novel, pre-course, aptitude-testing approach that integrates students' mental models, background factors and perceived levels of confidence to enable early identification of students who may require additional support through the prediction of introductory programming assessment results. As such, these findings provide a foundation for future work to develop targeted support interventions that can be integrated into introductory programming modules.

Executive Impact

Understanding student performance early can lead to significant improvements in computer science education, reducing failure rates and enhancing learning outcomes.

Classifier Test Accuracy: 0.7020

Regressor Test RMSE: 0.1713

Classifier Test AUC: 0.7670

Deep Analysis & Enterprise Applications

When students attempt to learn to program at a university level, their efforts can be hampered by a variety of misconceptions pertaining to fundamental programming concepts. These misconceptions can create a barrier to learning and subsequently impact students' confidence. Learning to program is a slow and complex task, with Winslow suggesting it takes approximately 10 years to turn a novice programmer into an expert. Understanding these difficulties early is crucial for effective intervention.

Students' interpretations of fundamental programming concepts are described as their 'Mental Models' (MMs). In programming, concepts are precisely defined, but novice programmers often misinterpret them. Inaccurate MMs, especially at the start of a course, make learning difficult and can lead to various misconceptions (e.g., parallelism bug, egocentrism bug). Identifying these MMs proactively is key to addressing potential learning barriers.

The aptitude test (Programming Checkup) integrates student background factors, self-efficacy, and Mental Model (MM) diagnostic questions. It aims to assess prior programming experience, motivation, mathematics background and perceived confidence, and to identify common misconceptions. The test uses language-independent pseudocode and open-ended questions to probe logical understanding rather than syntax knowledge, which makes it a supportive diagnostic tool rather than a test of a particular language.

The study utilized data from 285 first-year computer science students to train and validate regression and classification models. A Random Forest Regressor and a Random Forest Classifier were selected for their consistent performance across various feature combinations (background factors, confidence, MMs). Sequential Feature Selection and GridSearchCV were used for refinement, and the refined models were then evaluated against a holdout test set to assess generalisability and obtain an unbiased estimate of real-world performance.
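
The refinement steps described above can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data are synthetic, and the feature count, parameter grid, fold count and split ratio are all invented for the example. Only the regressor is shown; the classifier would follow the same pattern.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data (NOT the study's data): 285 rows, 6 features in [0, 1].
rng = np.random.default_rng(0)
X = rng.random((285, 6))
# Hypothetical normalised assessment score driven by two features plus noise.
y = np.clip(0.5 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(0, 0.1, 285), 0, 1)

# Hold back a test set that is never used for selection or tuning.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Forward Sequential Feature Selection, scored by cross-validated RMSE.
selector = SequentialFeatureSelector(
    RandomForestRegressor(n_estimators=20, random_state=0),
    n_features_to_select=3,  # assumed value, chosen only for illustration
    direction="forward",
    scoring="neg_root_mean_squared_error",
    cv=3,
)
selector.fit(X_train, y_train)
X_train_sel = selector.transform(X_train)
X_test_sel = selector.transform(X_test)

# Hyperparameter search on the selected features (assumed, deliberately small grid).
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid={"n_estimators": [25, 50], "max_depth": [3, None]},
    scoring="neg_root_mean_squared_error",
    cv=3,
)
search.fit(X_train_sel, y_train)

# Final check on the untouched holdout set.
pred = search.predict(X_test_sel)
rmse = mean_squared_error(y_test, pred) ** 0.5
mae = mean_absolute_error(y_test, pred)
print(f"holdout RMSE={rmse:.4f}  MAE={mae:.4f}")
```

Because the features are selected before the grid search, the cross-validated scores inside the search are slightly optimistic. The holdout figures are unaffected, since the test rows play no part in either step.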

Data Processing Flow for Predictive Model

Extract Raw Data (Qualtrics) → Code Misconceptions → Perform BKT Calculations → Numerical Encode Features → Normalize Data → Binarize Assessment Results
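
Several of the steps in this flow are simple transformations, sketched below. The sketch assumes that "BKT" refers to Bayesian Knowledge Tracing, which the page does not spell out, and it uses the standard single-skill update. The slip, guess and learn probabilities, the ordinal coding of experience and the 0.4 pass mark are hypothetical values chosen for illustration, not figures from the study.

```python
def bkt_update(p_known, correct, p_slip=0.1, p_guess=0.2, p_learn=0.15):
    """One Bayesian Knowledge Tracing step (parameter values are assumptions).

    First compute the posterior probability that the concept is known given
    the observed answer, then apply the learning transition.
    """
    if correct:
        num = p_known * (1 - p_slip)
        den = num + (1 - p_known) * p_guess
    else:
        num = p_known * p_slip
        den = num + (1 - p_known) * (1 - p_guess)
    posterior = num / den
    return posterior + (1 - posterior) * p_learn


# Hypothetical ordinal coding for a categorical background question.
EXPERIENCE_LEVELS = {"none": 0, "some": 1, "extensive": 2}


def min_max(values):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def binarize(scores, pass_mark=0.4):
    """Map normalised scores to 1 (pass) or 0 (fail); pass_mark is assumed."""
    return [1 if s >= pass_mark else 0 for s in scores]


# Example: trace one concept over three coded answers, then encode and scale.
p = 0.3
for answer_correct in [True, False, True]:
    p = bkt_update(p, answer_correct)

encoded = [EXPERIENCE_LEVELS[x] for x in ["none", "extensive", "some"]]
print(round(p, 3), min_max(encoded), binarize([0.39, 0.4, 0.8]))
```

A correct answer raises the estimated probability that a concept is understood and an incorrect answer lowers it, so the final estimate summarises a student's answers on that concept as a single numeric feature.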

Model Performance Comparison (Holdout Test Set)

Metric	Random Forest Regressor	Random Forest Classifier
RMSE	0.1713	N/A
MAE	0.1396	N/A
AUC	N/A	0.7670
F1 Score	N/A	0.7020
Accuracy	N/A	0.7020
Note: AUC, F1 and accuracy are classification metrics, so they are reported only for the classifier. RMSE and MAE are regression metrics, so they are reported only for the regressor.
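
For readers unfamiliar with the metrics in this table, the following plain-Python sketch shows how each one is computed. The toy inputs are invented for the example and are unrelated to the study's data; AUC is computed in its pairwise form, as the share of positive/negative pairs that the model ranks correctly.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: penalises large errors more heavily."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error: the average size of the error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def accuracy(y_true, y_pred):
    """Share of predictions that match the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(y_true, scores, positive=1):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy regression example (scores on a 0-1 scale).
print(rmse([0.5, 0.7], [0.4, 0.9]), mae([0.5, 0.7], [0.4, 0.9]))

# Toy classification example: true labels, hard predictions, predicted scores.
labels = [1, 1, 0, 0]
print(accuracy(labels, [1, 0, 0, 1]), f1_score(labels, [1, 0, 0, 1]),
      auc(labels, [0.9, 0.4, 0.3, 0.6]))
```

RMSE is never smaller than MAE on the same predictions, which is consistent with the regressor's reported test values (0.1713 and 0.1396).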

Early Intervention for Programming Success

Problem: Introductory programming courses have high failure rates (historically 33%, recently 25-28%), exacerbated by misconceptions that form early and become entrenched before assessment. Lecturers are often unaware of student struggles until it is too late.

Solution: Develop a pre-course aptitude test ('Programming Checkup') to assess students' prior experience, confidence, and mental models of core programming concepts. Use this data to train predictive models (Random Forest Regressor and Classifier) to identify students likely to require additional support BEFORE teaching begins.

Outcome: The Random Forest Regressor performed consistently between training and testing (Test RMSE=0.1713, MAE=0.1396), indicating potential to guide early identification of students for targeted support. The Classifier generalised less well (Test AUC=0.7670, Accuracy=0.7020), showing a moderate degree of overfitting that is likely due to class imbalance and limited data. Used as a diagnostic tool, the approach enables proactive interventions that could reduce failure rates and improve learning outcomes by addressing misconceptions early.
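
One way to turn regressor output into a support list is to flag any student whose predicted score falls below the pass mark plus a safety margin, so that the model's error is less likely to hide a borderline student. The sketch below is illustrative only: it assumes scores on a 0-1 scale, a hypothetical pass mark of 0.4, and a margin of 0.14 taken from the reported test MAE. The student IDs and predictions are invented.

```python
def flag_for_support(predictions, pass_mark=0.4, margin=0.14):
    """Return the IDs of students predicted to score below pass_mark + margin.

    predictions: dict mapping student ID to predicted score on a 0-1 scale.
    pass_mark and margin are assumptions, not values from the study.
    """
    threshold = pass_mark + margin
    return sorted(sid for sid, score in predictions.items() if score < threshold)


# Hypothetical predictions for four students.
predicted = {"s01": 0.35, "s02": 0.50, "s03": 0.72, "s04": 0.55}

print(flag_for_support(predicted))              # margin applied
print(flag_for_support(predicted, margin=0.0))  # strict pass-mark cut-off
```

A wider margin flags more students who would have passed anyway, while a narrower one risks missing students who go on to fail. The right trade-off depends on how much support capacity is available.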

Implementation Roadmap

A phased approach to integrating pre-course aptitude testing for maximum impact.

01. Pre-Course Assessment

Administer Programming Checkup to incoming students to gather data on backgrounds, confidence, and mental models.

02. Predictive Modeling

Utilize trained Random Forest models to predict individual student assessment performance and identify those at risk.

03. Targeted Intervention Design

Develop and tailor support interventions (e.g., supplementary tuition, scaffolded exercises) based on predictive insights and specific misconceptions.

04. Early Course Integration

Implement interventions from the very outset of the introductory programming module, ideally within the first week.

05. Continuous Monitoring & Refinement

Monitor student progress, gather feedback, and continually refine the Programming Checkup and predictive models for improved accuracy and generalisability.
