AzureML Part_A: Predicting Wine Quality
Predicting Wine Quality through a Binary Classification Model.
The goal of this experiment is to build a binary classification model that classifies a given wine
as high or low quality based on its properties. In the data set, wine quality is rated
on a scale of 1 to 10. For binary classification, the “Group Data into Bins” module was used to
convert the ratings into a binary categorical variable: quality ratings of 7 or more were grouped
into one bin (high quality, HQ) and quality ratings of 6 or less into another bin (low quality, LQ).
The data was split into two sets: 70% for training and 30% for testing. Three data mining models
were trained, scored and evaluated on this data: a Two-Class Boosted Decision Tree model,
a Two-Class Logistic Regression model and a Two-Class Neural Network model.
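
The binning and split described above were done with AzureML modules. As a rough illustration only, the same two steps might look like this in plain Python. The function names `to_quality_bin` and `split_70_30`, the fixed seed, and the sample ratings are all hypothetical and not taken from the experiment.

```python
import random


def to_quality_bin(rating: int) -> str:
    """Map a 1-10 quality rating to 'HQ' (7 or more) or 'LQ' (6 or less)."""
    if not 1 <= rating <= 10:
        raise ValueError(f"rating must be between 1 and 10, got {rating}")
    return "HQ" if rating >= 7 else "LQ"


def split_70_30(rows, seed=0):
    """Shuffle the rows and return (train, test) with a 70/30 split.

    The seed is an assumption made here so the split is repeatable.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# Made-up ratings, for illustration only (not from the wine data set).
ratings = [3, 5, 6, 7, 8, 6, 5, 9, 4, 7]
labels = [to_quality_bin(r) for r in ratings]
train, test = split_70_30(labels)
```

A plain random shuffle does not preserve the HQ/LQ ratio in each set; whether the original experiment used a stratified split is not stated.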
***Business Decision:*** The business decision here was to use data mining to price the wine by predicting
its quality. The threshold probability level in each model was adjusted to minimize the impact of
misclassifications. The confusion matrix of the best-performing model (the one with the highest area
under the curve, AUC) was used to calculate profitability, pricing HQ wine at $65 a bottle and
LQ wine at $30 a bottle and assuming a production cost of $20 a bottle.
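
The profitability arithmetic can be sketched as follows. The text does not say how misclassified bottles are treated, so this sketch assumes every bottle sells at the price assigned to its predicted class; that assumption, the names `Confusion`, `confusion_at` and `profit`, and the scored probabilities are hypothetical. Only the $65, $30 and $20 figures come from the description above.

```python
from typing import NamedTuple, Sequence

HQ_PRICE = 65.0   # price of a bottle predicted HQ (from the text)
LQ_PRICE = 30.0   # price of a bottle predicted LQ (from the text)
UNIT_COST = 20.0  # assumed production cost per bottle (from the text)


class Confusion(NamedTuple):
    tp: int  # actual HQ, predicted HQ
    fp: int  # actual LQ, predicted HQ
    fn: int  # actual HQ, predicted LQ
    tn: int  # actual LQ, predicted LQ


def confusion_at(threshold: float,
                 probs: Sequence[float],
                 is_hq: Sequence[bool]) -> Confusion:
    """Count confusion-matrix cells, predicting HQ when prob >= threshold."""
    tp = fp = fn = tn = 0
    for p, actual in zip(probs, is_hq):
        predicted = p >= threshold
        if predicted and actual:
            tp += 1
        elif predicted:
            fp += 1
        elif actual:
            fn += 1
        else:
            tn += 1
    return Confusion(tp, fp, fn, tn)


def profit(cm: Confusion) -> float:
    """Total profit, ASSUMING each bottle sells at its predicted-class price."""
    priced_hq = cm.tp + cm.fp
    priced_lq = cm.fn + cm.tn
    return (priced_hq * (HQ_PRICE - UNIT_COST)
            + priced_lq * (LQ_PRICE - UNIT_COST))


# Made-up scored probabilities and true labels, for illustration only.
probs = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
is_hq = [True, True, False, True, False, False]
cm = confusion_at(0.5, probs, is_hq)
total = profit(cm)
```

Under this simple assumption, lowering the threshold always raises profit, because more bottles are priced at $65. A realistic version would attach a penalty to the `fp` and `fn` cells (for example, unsold or underpriced bottles), which is presumably why the threshold was tuned rather than set to zero.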