Predicting Student Academic Achievement by Using the Decision Tree and Neural Network Techniques

Pimpa Cheewaprakobkit

Abstract

The aims of this study are (1) to compare the prediction accuracy of two data mining techniques, decision tree and neural network, in classifying students by academic achievement, and (2) to analyze the factors that contribute to predicting students’ academic performance. The researcher used WEKA, an open-source data mining tool, to analyze attributes for predicting the academic performance of undergraduate students in an international program. The data set comprised 1,600 student records with 22 attributes, covering students registered between 2001 and 2011 at a university in Thailand. Preprocessing included attribute importance analysis. The researcher applied the data set to two classifiers (decision tree and neural network), used 10-fold cross-validation to evaluate prediction accuracy, and compared the classifiers' performance experimentally. The results show that the decision tree classifier achieved an accuracy of 85.188%, which is 1.313 percentage points higher than that of the neural network classifier.
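The 10-fold cross-validation mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the study's WEKA workflow: the function names, the majority-class stand-in classifier, and the synthetic "pass"/"fail" labels are all assumptions made for illustration, and WEKA's own procedure may differ (for example, by stratifying folds). The sketch only shows the general idea of holding out each tenth of the records in turn, training on the rest, and pooling the correct predictions into a single accuracy figure.

```python
import random


def k_fold_indices(n_records, k=10, seed=0):
    """Shuffle record indices and deal them into k nearly equal folds."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validated_accuracy(records, labels, train, predict, k=10, seed=0):
    """Return the fraction of records classified correctly over k folds."""
    folds = k_fold_indices(len(records), k, seed)
    correct = 0
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(records)) if i not in held_out]
        # Train only on the records outside the current fold.
        model = train([records[i] for i in train_idx],
                      [labels[i] for i in train_idx])
        # Score the model on the held-out fold.
        correct += sum(predict(model, records[i]) == labels[i]
                       for i in test_idx)
    return correct / len(records)


# Hypothetical stand-in classifier: always predicts the most common
# training label. A decision tree or neural network would plug in here.
def train_majority(train_records, train_labels):
    return max(set(train_labels), key=train_labels.count)


def predict_majority(model, record):
    return model


# Synthetic data (not the study's data): 1,600 records, 1,200 labelled "pass".
records = [[i] for i in range(1600)]
labels = ["pass" if i % 4 else "fail" for i in range(1600)]

accuracy = cross_validated_accuracy(
    records, labels, train_majority, predict_majority, k=10
)
print(f"10-fold accuracy of the stand-in classifier: {accuracy:.3%}")
```

Because every record is held out exactly once, the pooled accuracy is a proportion of the full data set, so the two classifiers in the study can be compared on the same 1,600 records.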

Section
Research Articles
