Retail Product Category Classification Based on Features

May 12, 2015

The main objective of this project is to build a predictive model that can distinguish between the main retail product categories.
Data: The data comes from the Otto Group and covers more than 200,000 products, each described by 93 features. The training dataset contains the ID, the features, and the target class of each product. The testing dataset contains only the ID and the features.

Model: We first preprocess the data, which includes cleaning it and removing NA values. We then classify the products with the Random Forest classification algorithm, implemented in R scripts. We set ntree (the number of trees) to 100 for better classification.

Output: The output is each product together with the class predicted for it. The accuracy is 80%.
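
The workflow described above can be sketched in R roughly as follows. This is a minimal illustration under stated assumptions, not the original script: it uses the randomForest package (the write-up does not name the package), a small synthetic data frame in place of the Otto Group data, hypothetical column names (id, feat_1 to feat_5, target), and a 20% hold-out split to estimate accuracy (the write-up does not say how its accuracy was measured). Only ntree = 100 and the NA removal step come from the text.

```r
# Sketch only: synthetic stand-in data, not the Otto Group dataset.
library(randomForest)

set.seed(42)

# Hypothetical toy data: 300 products, 5 count features, 3 classes.
n <- 300
target <- factor(sample(c("Class_1", "Class_2", "Class_3"), n, replace = TRUE))
features <- as.data.frame(matrix(rpois(n * 5, lambda = 2), nrow = n))
names(features) <- paste0("feat_", 1:5)
# Make one feature depend on the class so there is something to learn.
features$feat_1 <- features$feat_1 + 3 * as.integer(target)

products <- cbind(id = seq_len(n), features, target = target)

# Preprocessing: drop rows that contain missing values.
products <- na.omit(products)

# Assumption: hold out 20% of the labelled rows to estimate accuracy.
holdout <- sample(nrow(products), size = 0.2 * nrow(products))
train <- products[-holdout, ]
valid <- products[holdout, ]

# Use every column except the ID and the label as a predictor.
feature_cols <- setdiff(names(products), c("id", "target"))

# Fit a random forest with 100 trees, as in the write-up.
model <- randomForest(x = train[, feature_cols], y = train$target, ntree = 100)

# Predict classes for the held-out rows and compute the share that match.
predicted <- predict(model, newdata = valid[, feature_cols])
accuracy <- mean(predicted == valid$target)
print(accuracy)
```

With the real data, the synthetic block would be replaced by reading the training and testing files, and the fitted model would be applied to the testing features to produce one predicted class per product ID. The accuracy printed here reflects only the toy data and says nothing about the 80% reported above.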