Finance: Credit Risk Classification

February 29, 2020
Credit risk classification predicts whether a borrower is a low or high credit risk. This insight can help the bank make better decisions.
## Scenario

EASY MONEY Pte Ltd is a credit co-operative in Singapore. The bank has been in operation for years and has a dataset of past customers' credit card payment defaults. The management has decided to explore using machine learning to help its employees make decisions when **assessing the credit risk of their customers**.

This walkthrough shows how to build a machine learning model that will **predict the credit risk of customers**.

## Dataset

### Description

The dataset is the [UCI Statlog (German Credit Card) dataset](https://archive.ics.uci.edu/ml/datasets/statlog+(german+credit+data)). It classifies people, described by a set of attributes, as low or high credit risks. Each example represents a person. There are 20 features, both numerical and categorical, and a binary label (the credit risk value). High credit risk entries have label = 2; low credit risk entries have label = 1. The cost of misclassifying a low risk example as high is 1, whereas the cost of misclassifying a high risk example as low is 5.

![Credit Risk Dataset Overview](https://drive.google.com/uc?export=view&id=1NQpim7lHoghQREHPfQpbh5dvFFVDjxfh)

### Attribute Information

The attribute information of the dataset can be found at the dataset's website: [UCI Statlog (German Credit Card) dataset](https://archive.ics.uci.edu/ml/datasets/statlog+(german+credit+data)).

## Content

This walkthrough demonstrates the following:

1. Build a classification model to predict the credit risk of customers
2. Address class imbalance by upsampling default cases
3. Improve performance by tuning the parameters of the model

## Outcome

The first customer in the dataset has a credit risk of 2, but the model predicted a credit risk of 1. The second customer's credit risk is correctly predicted.

![Prediction Result](https://drive.google.com/uc?export=view&id=1pf20jEmcr3b2ljl49w4TIrgVWim02HxE)

The figure below shows the performance of the credit risk classification model. The [confusion matrix](https://en.wikipedia.org/wiki/Confusion_matrix) shows that the model has a precision of 69% and a recall of 52%.

![Confusion Matrix](https://drive.google.com/uc?export=view&id=1IZqxjQSLZ13LXqn_6Nskm1PBmZj5BmLQ)
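
To make the reported metrics concrete, here is a minimal sketch of how precision, recall, and the total misclassification cost can be derived from confusion-matrix counts. The label encoding (1 = low risk, 2 = high risk) and the costs (1 and 5) come from the dataset description. Everything else is an assumption for illustration: the helper names `confusion_counts`, `precision_recall`, and `total_cost` are hypothetical, the toy labels are made up and are not the model's actual predictions, and the high-risk class is treated as the positive class, which the walkthrough does not state.

```python
from collections import Counter

# Label encoding from the dataset description: 1 = low risk, 2 = high risk.
LOW, HIGH = 1, 2

# Costs from the dataset description, keyed by (actual, predicted):
# a low-risk customer flagged as high costs 1,
# a high-risk customer flagged as low costs 5.
COST = {(LOW, HIGH): 1, (HIGH, LOW): 5}


def confusion_counts(y_true, y_pred):
    """Count how often each (actual, predicted) pair occurs."""
    return Counter(zip(y_true, y_pred))


def precision_recall(counts, positive=HIGH):
    """Precision and recall for the chosen positive class (assumed: high risk)."""
    tp = counts[(positive, positive)]
    fp = sum(n for (actual, pred), n in counts.items()
             if pred == positive and actual != positive)
    fn = sum(n for (actual, pred), n in counts.items()
             if actual == positive and pred != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def total_cost(counts):
    """Sum the misclassification costs; correct predictions cost nothing."""
    return sum(COST.get(pair, 0) * n for pair, n in counts.items())


# Hypothetical toy labels, NOT the walkthrough's real results.
y_true = [2, 1, 1, 2, 1, 2, 1, 1]
y_pred = [1, 1, 1, 2, 2, 2, 1, 1]

counts = confusion_counts(y_true, y_pred)
precision, recall = precision_recall(counts)
print(f"precision={precision:.2f} recall={recall:.2f} cost={total_cost(counts)}")
```

Because a missed high-risk customer costs five times as much as a false alarm, the cost figure can be more informative than precision or recall alone when comparing models on this dataset.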