Enterprise AI Analysis
ChatGPT as a tool for reviewing multiple-choice questions in the health sector
This study evaluates the capacity of ChatGPT-4 to enhance the quality of multiple-choice questions (MCQs) in medical education, comparing AI-revised versions with faculty-authored originals against 38 rigorous criteria. Findings reveal AI's proficiency in structural clarity but limitations in higher-order cognitive assessment, underscoring the critical role of human expertise and prompt engineering.
Authors: Tatiane Iembo, Helena Landim Gonçalves Cristóvão, Patrícia Carla Zanelatto Gonçalves, Wagner Ricardo Montor, Patrícia Silva Fucuta, Toufic Anbar Neto, Júlio César André & Milton Arruda Martins
Source: Scientific Reports (2026), Published Online: 13 May 2026
Executive Impact: AI in Medical Education Assessment
Addressing persistent challenges in MCQ quality, this analysis demonstrates how AI can augment human efforts in assessment development, particularly in high-stakes environments like the Progress Test. Understanding AI's specific strengths and weaknesses is key to its effective deployment.
Deep Analysis & Enterprise Applications
After a standardization step, the study authors found a statistically significant increase in the number of criteria met by ChatGPT-4-reviewed MCQs, which indicates that the model can refine existing questions. External evaluators, who rated the items without standardization, found no significant difference.
| Aspect | Faculty-Authored MCQs | ChatGPT-4 Reviewed MCQs |
|---|---|---|
| Structural Clarity | | |
| Higher-Order Thinking | | |
| Prompt Engineering | | |
| Review Consistency | | |
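The comparison described above the table rests on counting, for each question, how many construction criteria the original and the revised version meet, and then testing the paired counts. The sketch below shows one way to do that with an exact two-sided sign test. The source does not name the statistical test the authors used, so the choice of a sign test is an assumption, and the function names `criteria_met` and `sign_test_p` are hypothetical.

```python
from math import comb


def criteria_met(checks: list[bool]) -> int:
    """Count how many construction criteria a single item satisfies."""
    return sum(checks)


def sign_test_p(original: list[int], revised: list[int]) -> float:
    """Exact two-sided sign test on paired per-item scores.

    Assumption: the study's actual test is not stated in the text;
    the sign test is used here only as a simple paired comparison.
    Pairs with identical scores (ties) are dropped.
    """
    if len(original) != len(revised):
        raise ValueError("scores must be paired item by item")
    diffs = [r - o for o, r in zip(original, revised) if r != o]
    n = len(diffs)
    if n == 0:
        return 1.0  # no item changed, so there is no evidence either way
    k = sum(d > 0 for d in diffs)  # items where the revised version scored higher
    # Probability of a split at least this lopsided under a fair coin, one tail.
    tail = sum(comb(n, i) for i in range(min(k, n - k) + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

Calling `sign_test_p` with two equal-length lists of per-question counts returns a p-value. Any scores passed in while trying this out are illustrative inputs, not the study's data.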
ChatGPT-4's Role in MCQ Review Process
ChatGPT-4 was accessed through its website, and each question was submitted individually with a single, comprehensive prompt. The prompt instructed the model to act as a "medical undergraduate professor" and to review and reformulate each MCQ against all 38 specified construction criteria, such as good English, independent alternatives, higher-level reasoning, clear phrasing, and no unnecessary data. The intention was to simulate a common faculty request without specialized prompt engineering training, ensuring a standardized approach to AI review.
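As a rough illustration of the single-prompt approach described above, the sketch below assembles one review prompt per question from a role and a criteria list. It is not the study's prompt: the wording is invented, the helper `build_review_prompt` is hypothetical, and the list holds only the five criteria named as examples in the text, not the full set of 38. The study used the ChatGPT website, so the sketch only builds the text to paste and makes no API call.

```python
# Only the criteria named as examples in the text; the study's full list has 38.
EXAMPLE_CRITERIA = [
    "good English",
    "independent alternatives",
    "higher-level reasoning",
    "clear phrasing",
    "no unnecessary data",
]

# Role quoted from the text; the rest of the prompt wording is an assumption.
ROLE = "medical undergraduate professor"


def build_review_prompt(question: str, criteria: list[str]) -> str:
    """Assemble one self-contained review prompt for a single MCQ."""
    if not question.strip():
        raise ValueError("question must not be empty")
    # Number the criteria so the reply can refer back to them.
    numbered = "\n".join(f"{i}. {c}" for i, c in enumerate(criteria, start=1))
    return (
        f"Act as a {ROLE}. Review the multiple-choice question below and "
        "reformulate it so that it meets every criterion listed.\n\n"
        f"Criteria:\n{numbered}\n\n"
        f"Question:\n{question.strip()}\n"
    )
```

Using the same template for every question keeps the request standardized, which matches the stated intent of simulating an ordinary faculty request instead of a tuned prompt.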
Strategic Integration of AI in Medical Education Assessment
The findings emphasize that AI, particularly ChatGPT-4, serves as a powerful complementary tool, not a replacement for human expertise. It excels in refining structural clarity and basic item-writing principles, freeing up faculty time. However, human oversight remains critical for assessing higher-order cognitive skills and nuanced problem-solving. Effective prompt engineering and continuous faculty training are essential to maximize AI's potential and ensure pedagogical depth in assessments like the Progress Test.
AI’s ability to conduct complex analysis in a shorter time frame can significantly boost efficiency for educators, allowing them to focus on more intricate aspects of assessment development and student engagement.
Acknowledged Limitations & Future Directions
Limitations: The study's cross-sectional design cannot establish causality or show improvement over time. The small sample (36 MCQs) limits generalizability. Evaluator homogeneity and the deliberate use of a non-optimized prompt may have led the study to underestimate AI's full potential, especially for higher-order cognitive skills. The specific context of the Progress Test consortium may also limit direct generalizability.
Future Research: Studies should include a more diverse range of questions and evaluators. Exploring the use of ChatGPT-4 to create questions from structured instructions and optimized prompts could reveal its full capabilities. Longitudinal studies assessing the effect of AI-assisted revision on student performance and learning outcomes are also warranted.
Your AI Implementation Roadmap
A structured approach ensures successful integration and maximum return on your AI investment. Here’s a typical journey.
Phase 1: Discovery & Strategy
Conduct a comprehensive audit of existing workflows, identify high-impact AI opportunities, and define clear objectives and KPIs. Develop a tailored AI strategy aligned with enterprise goals.
Phase 2: Pilot & Proof-of-Concept
Implement AI solutions in a controlled environment, validate effectiveness against defined metrics, and gather feedback for iterative refinement. Demonstrate tangible value with a successful pilot project.
Phase 3: Integration & Scaling
Seamlessly integrate AI tools into your existing technology stack. Develop training programs for your team to maximize adoption and ensure smooth operational scaling across departments.
Phase 4: Optimization & Future-Proofing
Continuously monitor AI performance, fine-tune models, and explore advanced capabilities to sustain competitive advantage. Establish governance and ethical AI frameworks for long-term success.
Ready to Transform Your Enterprise with AI?
Don't miss out on the competitive edge AI can offer. Schedule a personalized consultation to discuss how these insights apply to your specific needs and challenges.