Two-Class Logistic Regression: Student Performance
Predicting student performance from the students' past performance data.
For this dataset, I decided to try a two-class logistic regression experiment, as it is one of the popular methods for classification problems like this one and gives the probability of an outcome.
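As a rough illustration of how logistic regression arrives at a probability, here is a minimal sketch in plain Python. The function name, the two grade features, and the weights and bias are all hypothetical values chosen for illustration; they are not taken from the trained model in this experiment.

```python
import math

def predict_pass_probability(features, weights, bias):
    """Logistic regression: squash a weighted sum into a probability in (0, 1)."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical weights and bias for two grade features scaled to [0, 1].
weights, bias = [2.0, 3.0], -2.5

# Probability that a student with these (hypothetical) scaled grades passes.
p = predict_pass_probability([0.8, 0.9], weights, bias)
print(f"P(pass) = {p:.3f}")
```

The output is always strictly between 0 and 1, which is what allows a threshold to be applied later to turn the probability into a pass/fail label.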
Findings:
• After running the model, G1 and G2 are still positively (right) skewed (see the Train Model section), even though they were normalized before splitting (see Score Model).
• The algorithm predicts the probability of an event (student pass or fail), but the model shows signs of overfitting (see the Evaluate section).
• Even with the threshold raised to 0.85, the model still reports 87.9% accuracy in predicting student performance.
The model appears to have fit the training data too closely and is unlikely to generalize well to new or changed data.
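The effect of moving the threshold can be checked with a few lines of plain Python. This is only a sketch: the helper name and the scored probabilities and labels below are hypothetical, not the output of this experiment.

```python
def accuracy_at_threshold(probs, labels, threshold=0.5):
    """Fraction of correct predictions when 'pass' means prob >= threshold."""
    preds = [1 if p >= threshold else 0 for p in probs]
    correct = sum(1 for pred, true in zip(preds, labels) if pred == true)
    return correct / len(labels)

# Hypothetical scored probabilities and true labels (1 = pass, 0 = fail).
probs = [0.95, 0.90, 0.88, 0.70, 0.60, 0.30, 0.20, 0.10]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

acc_default = accuracy_at_threshold(probs, labels, 0.5)
acc_strict = accuracy_at_threshold(probs, labels, 0.85)
print(acc_default, acc_strict)
```

When most scored probabilities sit close to 0 or 1, raising the threshold changes few predictions, so accuracy barely moves. Comparing the same metric on the training and test splits is one way to judge whether that confidence reflects overfitting.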
Solutions:
• Adding more data might improve the performance of the model.
• The data is not very sparse, so increasing L2 regularization may help produce better results.
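To show what stronger L2 regularization does, here is a minimal plain-Python sketch of logistic regression trained by gradient descent with an L2 penalty. It is not the implementation used in the experiment; the function names, the toy data, the learning rate, and the penalty values are all assumptions made for illustration.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic_l2(X, y, l2=0.0, lr=0.1, epochs=500):
    """Batch gradient descent on mean log-loss + (l2 / 2) * ||w||^2.

    The bias term is not penalized.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j] / n
            grad_b += err / n
        # The l2 * wj term shrinks each weight toward zero at every step.
        w = [wj - lr * (gj + l2 * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def weight_norm(w):
    return math.sqrt(sum(wj * wj for wj in w))

# Hypothetical toy data: two grade features scaled to [0, 1]; 1 = pass, 0 = fail.
X = [[0.30, 0.25], [0.40, 0.45], [0.35, 0.30], [0.45, 0.40],
     [0.60, 0.65], [0.75, 0.70], [0.55, 0.60], [0.85, 0.90]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

w_weak, _ = fit_logistic_l2(X, y, l2=0.0)    # no regularization
w_strong, _ = fit_logistic_l2(X, y, l2=1.0)  # strong regularization
print(weight_norm(w_weak), weight_norm(w_strong))
```

A larger penalty keeps the weights smaller, which makes the predicted probabilities less extreme and usually reduces the gap between training and test performance. The right penalty value is best chosen by comparing results on held-out data.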