ISSN: 1013-0799
This study identifies the factors that influence middle school students’ reading behavior by applying several machine learning methods, and determines which method performs best. Using data from the 2018 Korean Child and Youth Panel Survey, students who read more than one hour per weekday were classified as the reading group, and those who did not read at all as the non-reading group. Seven machine learning methods (logistic regression, decision tree, random forest, adaptive LASSO, SVM, gradient boosting, and kNN) were then compared. Random forest and gradient boosting achieved the highest predictive accuracy, which suggests that ensemble methods can capture nonlinear patterns and distinguish reading behavior more precisely. The partial dependence plots identified academic engagement, relationships with peers and teachers, smartphone dependence, and self-esteem as important variables in determining whether students read. Stronger academic engagement and higher self-esteem increased the likelihood of reading, whereas academic helplessness and excessive smartphone use reduced it. The study contributes by identifying the specific factors needed to design effective reading promotion policies. In particular, it underscores the need for a multifaceted approach that integrates academic motivation strategies, peer and teacher support, and education on media use.
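As a rough illustration of what a partial dependence curve measures, the sketch below fixes one feature at each value on a grid, leaves the other features at their observed values, and averages the model's predicted probability over all rows. Everything in it is an assumption made for illustration: `toy_model`, its coefficients, the two feature names, and the randomly generated rows are hypothetical and are not the study's model, variables, or data.

```python
import math
import random


def toy_model(row):
    """Hypothetical classifier returning P(reading group).

    The coefficients are made up for illustration; they are NOT
    estimates from the study.
    """
    engagement, smartphone = row
    z = 1.2 * engagement - 0.8 * smartphone
    return 1.0 / (1.0 + math.exp(-z))


def partial_dependence(model, rows, feature_idx, grid):
    """Average prediction with one feature fixed at each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)
            modified[feature_idx] = value  # overwrite only the chosen feature
            total += model(modified)
        curve.append(total / len(rows))
    return curve


# Synthetic standardized scores (assumed, not survey data).
random.seed(0)
rows = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(200)]
grid = [-2.0, -1.0, 0.0, 1.0, 2.0]

pd_engagement = partial_dependence(toy_model, rows, 0, grid)
pd_smartphone = partial_dependence(toy_model, rows, 1, grid)

for g, e, s in zip(grid, pd_engagement, pd_smartphone):
    print(f"value={g:+.1f}  engagement PD={e:.3f}  smartphone PD={s:.3f}")
```

In this toy setup the engagement curve rises and the smartphone curve falls across the grid, which is the kind of pattern a partial dependence plot makes visible. With a fitted random forest or gradient boosting model, the same averaging would be applied to that model's predictions instead of `toy_model`.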
