Two-Class Logistic Regression: Student Performance

October 18, 2019
To predict student performance using the students' past performance data.
For this dataset, I decided to try a two-class logistic regression experiment, as it is one of the popular methods for classification problems like this one and it estimates the probability of an outcome.

Findings:
• After running the model, G1 and G2 are found to be positively skewed, that is, skewed to the right (train model section), even though the data was normalized before splitting (check the score model section).
• The algorithm predicts the probability of an event (student pass or fail), but it shows signs of overfitting (check the evaluate section).
• Even with the threshold increased to 0.85, the model still reports 87.9% accuracy in predicting student performance. The model appears to have memorized the training data and is unlikely to generalize well if the data changes.

Solutions:
• Adding more data might improve the performance of the model.
• The data is not too sparse, so increasing L2 regularization may help in getting better results.
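The effect of the L2 penalty and of the decision threshold mentioned above can be illustrated with a minimal sketch. It is not the experiment itself: the data is synthetic, the two features `g1` and `g2` are only stand-ins for the G1 and G2 columns, and the helper names (`train_logistic_l2`, `predict_proba`, `accuracy`) as well as the values `l2=0.1`, `lr` and `epochs` are assumptions chosen for illustration. The penalty convention used here (`(l2/2) * ||w||^2` added to the mean log-loss) may also differ from the one used by the tool in the original experiment.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logistic_l2(X, y, l2=0.0, lr=0.1, epochs=500):
    # Batch gradient descent on mean log-loss + (l2 / 2) * ||w||^2.
    # The bias term is not penalized. All hyperparameters are illustrative.
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * (gw / n + l2 * wj) for wj, gw in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(X, w, b):
    # Predicted probability of the positive class ("pass") for each row.
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


def accuracy(probs, y, threshold=0.5):
    # Fraction of rows where thresholded prediction matches the label.
    preds = [1 if p >= threshold else 0 for p in probs]
    return sum(int(p == t) for p, t in zip(preds, y)) / len(y)


# Synthetic stand-in data (NOT the student dataset): two standardized
# features and a noisy pass/fail label that depends on their sum.
random.seed(0)
X, y = [], []
for _ in range(200):
    g1, g2 = random.gauss(0, 1), random.gauss(0, 1)
    X.append([g1, g2])
    y.append(1 if g1 + g2 + random.gauss(0, 0.5) > 0 else 0)
X_train, y_train = X[:150], y[:150]
X_test, y_test = X[150:], y[150:]

# Same model, without and with an L2 penalty.
w_weak, b_weak = train_logistic_l2(X_train, y_train, l2=0.0)
w_strong, b_strong = train_logistic_l2(X_train, y_train, l2=0.1)

for name, w, b in [("l2=0.0", w_weak, b_weak), ("l2=0.1", w_strong, b_strong)]:
    probs = predict_proba(X_test, w, b)
    print(name,
          "weights:", [round(v, 3) for v in w],
          "acc@0.50:", accuracy(probs, y_test, 0.5),
          "acc@0.85:", accuracy(probs, y_test, 0.85))
```

Comparing train and test accuracy for each penalty strength is one way to check for overfitting: a large gap between the two suggests the model is fitting noise in the training data. The stronger penalty shrinks the weights, which makes the predicted probabilities less extreme, so fewer students clear a high threshold such as 0.85.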