AI Project: Predicting Student Performance Using Machine Learning
Acknowledgement
I would like to express my sincere gratitude to my school, my AI teacher,
and the CBSE curriculum developers for providing guidance and
support throughout this Artificial Intelligence project. Their
encouragement helped me explore real-world applications of AI
meaningfully. I also thank the creators of open-source tools such as
Python, NumPy, Pandas, Matplotlib, and Scikit-learn. Finally, I
appreciate the valuable feedback received from classmates and friends
during the development of this project.
Objectives
- To understand and apply the AI Project Cycle.
- To build a simple Machine Learning model capable of predicting
student performance.
- To support SDG 4: Quality Education.
- To analyse data using statistical and visualization tools.
- To learn ethical concepts like fairness, privacy, and transparency.
Step One: Problem Scoping
Students often struggle to understand which study habits lead to better
academic performance. This project addresses that problem by predicting
student performance from factors such as study time, attendance, sleep,
and completed assignments. This supports SDG 4: Quality Education.
Type of Data Needed
To predict student performance, the following data features are useful:
• Study time (hours per day)
• Attendance percentage
• Sleep hours
• Number of completed assignments
• Previous test marks
• Final exam marks (target value)
Data Sources
The dataset can be:
• Collected from sample student surveys.
• Taken from open educational datasets (e.g., Kaggle student
performance dataset).
• Generated manually for school-project purposes.
Data Relevance
The data collected must be:
• Sufficient (enough examples for model training)
• Accurate (correct student entries)
• Relevant (only factors affecting academics)
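The "Accurate" requirement can be partly checked by a program before any training is done. The sketch below is one possible sanity check, not the method used in this project. The field names, the helper find_invalid_fields, and the allowed ranges (for example, marks out of 100) are assumptions made for illustration.

```python
# Hypothetical plausible ranges for each feature (assumed for illustration).
VALID_RANGES = {
    "study_time": (0, 24),       # hours per day
    "attendance": (0, 100),      # percent
    "sleep": (0, 24),            # hours per day
    "assignments": (0, 100),     # completed assignments (upper bound assumed)
    "previous_marks": (0, 100),  # assumes marks are out of 100
    "final_marks": (0, 100),     # assumes marks are out of 100
}


def find_invalid_fields(record):
    """Return the names of fields that are missing or outside their range."""
    problems = []
    for field, (low, high) in VALID_RANGES.items():
        value = record.get(field)
        if value is None or not (low <= value <= high):
            problems.append(field)
    return problems


good = {"study_time": 3, "attendance": 85, "sleep": 7,
        "assignments": 6, "previous_marks": 60, "final_marks": 68}
bad = dict(good, attendance=120)  # attendance above 100% is impossible

print(find_invalid_fields(good))
print(find_invalid_fields(bad))
```

A record with an empty result list passes these checks; it can still be wrong in ways a range check cannot detect, such as a mistyped but plausible value.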
Step Two: Data Acquisition
The dataset includes study time, attendance, sleep hours, assignment
completion, previous marks, and final marks. The data was created
manually for academic purposes, in line with ethical AI practice.
Sample Data
Study Time   Attendance  Sleep    Assignments  Previous  Final  Student
(hours/day)  (%)         (hours)  (completed)  Marks     Marks
1            70          6        3            48        50     1
2            75          7        4            52        55     2
2            80          6        5            55        60     3
3            85          7        6            60        68     4
3            88          8        7            64        72     5
4            90          8        8            70        78     6
4            95          7        8            74        85     7
5            97          8        9            80        90     8
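For the later steps, the sample data can be held in a Pandas DataFrame. The sketch below copies the eight rows from the table above. The column names are illustrative choices, not names fixed by the project.

```python
import pandas as pd

# The eight sample rows from the table above.
# Column names are illustrative choices, not fixed by the project.
data = pd.DataFrame({
    "student": [1, 2, 3, 4, 5, 6, 7, 8],
    "study_time": [1, 2, 2, 3, 3, 4, 4, 5],          # hours per day
    "attendance": [70, 75, 80, 85, 88, 90, 95, 97],  # percent
    "sleep": [6, 7, 6, 7, 8, 8, 7, 8],               # hours
    "assignments": [3, 4, 5, 6, 7, 8, 8, 9],         # completed assignments
    "previous_marks": [48, 52, 55, 60, 64, 70, 74, 80],
    "final_marks": [50, 55, 60, 68, 72, 78, 85, 90],  # target value
})

print(data.shape)   # (rows, columns)
print(data.head())  # first five rows
```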
Step Three: Data Exploration
Data visualization shows clear patterns: more study time is associated
with higher marks. Basic statistics such as mean, median, and mode help
understand central tendencies.
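As a small illustration, the statistics mentioned above can be computed with Python's built-in statistics module on two columns of the sample data. The pearson helper is a hypothetical function written for this sketch; it measures how strongly study time and final marks move together, with values near 1 indicating a strong positive relationship.

```python
from statistics import mean, median, multimode

# Two columns copied from the sample data.
study_time = [1, 2, 2, 3, 3, 4, 4, 5]
final_marks = [50, 55, 60, 68, 72, 78, 85, 90]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


print("Mean final marks:", mean(final_marks))
print("Median final marks:", median(final_marks))
print("Most common study times:", multimode(study_time))
print("Correlation (study time vs marks):", round(pearson(study_time, final_marks), 2))
```

With only eight rows, a high correlation shows an association in this sample; it does not prove that study time alone causes higher marks.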
Graph: Study Time vs Marks
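A graph of this kind could be drawn with Matplotlib roughly as follows. This is a minimal sketch, not necessarily how the original graph was made; the output file name study_time_vs_marks.png is an assumed example.

```python
import matplotlib
matplotlib.use("Agg")  # draw to a file, so no display window is needed
import matplotlib.pyplot as plt

# Two columns copied from the sample data.
study_time = [1, 2, 2, 3, 3, 4, 4, 5]
final_marks = [50, 55, 60, 68, 72, 78, 85, 90]

fig, ax = plt.subplots()
ax.scatter(study_time, final_marks)  # one point per student
ax.set_xlabel("Study time (hours per day)")
ax.set_ylabel("Final exam marks")
ax.set_title("Study Time vs Marks")

# Assumed file name, chosen only for this example.
fig.savefig("study_time_vs_marks.png")
```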
Step Four: Modelling
A Decision Tree Regressor model was used because it is simple,
explainable, and suitable for school-level AI projects. The model learns
patterns between study habits and academic performance.
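A simplified example of the kind of rules such a tree might learn:

def predict_band(study_time_hours, attendance_percent):
    """Illustrative rule only; a trained tree picks its own thresholds."""
    if study_time_hours > 2:
        if attendance_percent > 85:
            return "high marks"
        else:
            return "medium marks"
    else:
        return "low marks"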
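The rule sketch above can be compared with what Scikit-learn's DecisionTreeRegressor actually learns from the sample data. The example below is a minimal sketch under stated assumptions: it uses only study time and attendance as features so that the learned rules stay easy to read, and the settings max_depth=2 and random_state=0 are illustrative choices, not values reported by the project.

```python
from sklearn.tree import DecisionTreeRegressor, export_text

# Study time (hours per day) and attendance (%) from the sample data.
feature_names = ["study_time", "attendance"]
X = [[1, 70], [2, 75], [2, 80], [3, 85],
     [3, 88], [4, 90], [4, 95], [5, 97]]
y = [50, 55, 60, 68, 72, 78, 85, 90]  # final marks (target value)

# A shallow tree keeps the learned rules short and explainable.
# max_depth=2 and random_state=0 are assumed settings for this sketch.
model = DecisionTreeRegressor(max_depth=2, random_state=0)
model.fit(X, y)

# Print the learned if/else rules in plain text.
print(export_text(model, feature_names=feature_names))

# Predict marks for a hypothetical student: 3 hours of study, 86% attendance.
prediction = model.predict([[3, 86]])[0]
print("Predicted final marks:", prediction)
```

Unlike the hand-written rules, a regressor predicts a number (the average marks of the training rows in a leaf) rather than a label such as "high" or "low".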
Step Five: Evaluation
Evaluation Metrics
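• Accuracy: How close predictions are to actual marks. Because the
model predicts a number rather than a category, this is measured
with the two error metrics below, not as a percentage of correct
answers.
• Mean Absolute Error (MAE): Average absolute difference between
predicted and actual marks
• R² Score: Proportion of the variation in marks that the model
explains (1 is a perfect fit)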
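To make the two numeric metrics concrete, they can be written out in a few lines of plain Python. The actual and predicted values below are hypothetical numbers chosen only to show the arithmetic; they are not results from this project. Scikit-learn provides functions with the same names in sklearn.metrics.

```python
def mean_absolute_error(actual, predicted):
    """Average of the absolute differences |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r2_score(actual, predicted):
    """1 minus (squared prediction error / squared spread around the mean)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical values for illustration only, not the project's results.
actual = [60, 72, 85]
predicted = [58, 75, 85]

mae = mean_absolute_error(actual, predicted)  # (2 + 3 + 0) / 3
r2 = r2_score(actual, predicted)

print("MAE:", round(mae, 2))
print("R2:", round(r2, 2))
```

A lower MAE means predictions are closer to the real marks on average, and an R² closer to 1 means the model explains more of the variation in marks.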
Results
• The model performed reasonably well with small error values.
• Study time and attendance were the most important predictors.
What Went Well
• Data was easy to visualize.
• Decision Tree made predictions understandable.
• Patterns matched real-life expectations.
Limitations
• Dataset size was small (school-project level).
• Self-reported data may not be perfectly accurate.
• More features (like class participation) could improve accuracy.
Ethical Reflection
• The model should not be used to judge students unfairly.
• Predictions must be used only to support learning, not label or
pressure students.
• All personally identifiable information (PII) must be kept private.
Conclusion
This project successfully applied the AI Project Cycle to create an
interpretable model that predicts student performance based on simple
learning-related features. The project demonstrates how AI can support
Quality Education (SDG 4) by helping students understand how their
habits influence academic success. Ethical aspects such as fairness,
transparency, and bias reduction were considered throughout the
model-building process. The results show that with responsible use, AI
can be a powerful tool to enhance learning in schools.
Bibliography
- CBSE AI Curriculum 2025–26
- Kaggle Student Performance Dataset
- Python, NumPy, Pandas, Matplotlib Documentation
- Scikit-learn Documentation
- United Nations SDG Website