Census Model 001

July 9, 2017
This experiment demonstrates how we can build a binary classification model to predict income levels of adult individuals. The process includes training, testing and evaluating the model on the Adult dataset.
# Binary Classification: Income Level Prediction

In this sample experiment we will train a binary classifier on the Adult dataset to predict whether an individual’s income is greater or less than $50,000. We will show how to perform basic data processing operations, split the dataset into training and test sets, train the model, score the test dataset, and evaluate the predictions.

![enter image description here][1]

# Creating the Experiment

1. Drag and drop the Adult Census Income Binary Classification dataset module into your experiment's workspace.
2. Add a Clean Missing Data module and use the default settings to replace missing values with zeros. Connect the dataset module output to its input port.
3. Add a Project Columns module, and connect the output of the Clean Missing Data module to its input port.
4. Use the column selector to exclude these columns: workclass, occupation, and native-country. We exclude them because we don't want their values to be used in the training process. By default, Azure ML Studio treats all columns as features except for the target variable (the Label column). Alternatively, you could use the Metadata Editor module, select the excluded columns, and then choose ClearFeatures from the Fields dropdown list.
5. Add a Split module to create the training and test sets. Set the Fraction of rows in the first output dataset to 0.7. This means that 70% of the data will be output to the left port and the remaining 30% to the right port of this module. We will use the left dataset for training and the right one for testing.
6. Add a Two-Class Boosted Decision Tree module to initialize a boosted decision tree classifier.
7. Add a Train Model module and connect the classifier (step 6) and the training set (left output port of the Split module) to the left and right input ports respectively. This module trains the classifier.
8. Add a Score Model module and connect the trained model and the test set (right output port of the Split module) to its input ports. This module makes the predictions. You can click on its output port to see the actual predictions and the positive class probabilities.
9. Add an Evaluate Model module and connect the scored dataset to the left input port. To see the evaluation results, click on the output port of the Evaluate Model module and select Visualize.

![enter image description here][2]

![enter image description here][3]

<br><br>

----------

> This ML experiment is for [Microsoft Azure Machine Learning Course][101].<br>
> For the complete experiment list [Click here][102].<br>
> Laploy | laploy@gmail.com | 084 007 5544 | [www.laploy.com][103]<br>
> ![enter image description here][104]

----------

[101]: https://notebooks.azure.com/laploy/libraries/loyml/html/00001%20Sessions%20summary.ipynb
[102]: https://gallery.cortanaintelligence.com/Home/Author?authorId=81E333F747E3429B55A3445E6714C36F60B397C13B4D0B07F34DEF1421F64D73
[103]: http://laploy.com
[104]: https://raw.githubusercontent.com/laploy/mli/master//loy-small.jpg
[1]: https://raw.githubusercontent.com/laploy/mli/master//13000-001.JPG
[2]: https://raw.githubusercontent.com/laploy/mli/master//13000-002.JPG
[3]: https://raw.githubusercontent.com/laploy/mli/master//13000-003.JPG
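For readers who want to see the experiment's data flow outside Azure ML Studio, it can be roughly mirrored in plain Python. This is only a minimal sketch under several assumptions. The rows are synthetic and merely mimic the shape of the Adult dataset. All function names (`make_rows`, `clean_missing`, `project_columns`, `split_rows`, `train_stump`, `score`, `evaluate`) are hypothetical and are not Azure ML APIs. A single decision stump stands in for the Two-Class Boosted Decision Tree, so it is not a boosted ensemble and its metrics say nothing about the real experiment.

```python
import random


def make_rows(n, seed=0):
    """Build synthetic rows shaped like the Adult dataset (NOT real census data)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        age = rng.randint(18, 70)
        edu = rng.randint(5, 16)
        hours = rng.choice([20, 35, 40, 50, 60, None])  # None marks a missing value
        # Toy labelling rule, invented purely so the sketch has something to learn.
        label = 1 if edu * 2 + (hours or 0) / 2 + age / 4 > 55 else 0
        rows.append({"age": age, "education-num": edu, "hours-per-week": hours,
                     "workclass": "Private", "occupation": "Sales",
                     "native-country": "United-States", "income": label})
    return rows


def clean_missing(rows):
    """Step 2: replace missing values with zeros."""
    return [{k: (0 if v is None else v) for k, v in r.items()} for r in rows]


def project_columns(rows, exclude):
    """Steps 3-4: drop the excluded columns."""
    return [{k: v for k, v in r.items() if k not in exclude} for r in rows]


def split_rows(rows, fraction=0.7, seed=0):
    """Step 5: shuffle, then send `fraction` of the rows to the first output."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * fraction)
    return shuffled[:cut], shuffled[cut:]


def train_stump(rows, features, label="income"):
    """Steps 6-7 stand-in: pick the (feature, threshold) pair with the fewest
    training errors, predicting the positive class when value > threshold."""
    best = None  # (errors, feature, threshold)
    for f in features:
        for t in sorted({r[f] for r in rows}):
            errors = sum((r[f] > t) != bool(r[label]) for r in rows)
            if best is None or errors < best[0]:
                best = (errors, f, t)
    return best[1], best[2]


def score(model, rows):
    """Step 8: predict a class for every row."""
    feature, threshold = model
    return [1 if r[feature] > threshold else 0 for r in rows]


def evaluate(actual, predicted):
    """Step 9: confusion-matrix counts plus accuracy, precision and recall."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == 1 and p == 1 for a, p in pairs)
    fp = sum(a == 0 and p == 1 for a, p in pairs)
    tn = sum(a == 0 and p == 0 for a, p in pairs)
    fn = sum(a == 1 and p == 0 for a, p in pairs)
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn,
            "accuracy": (tp + tn) / len(pairs),
            "precision": tp / (tp + fp) if tp + fp else 0.0,
            "recall": tp / (tp + fn) if tp + fn else 0.0}


rows = make_rows(200)
rows = clean_missing(rows)
rows = project_columns(rows, ["workclass", "occupation", "native-country"])
train_set, test_set = split_rows(rows, 0.7)
model = train_stump(train_set, ["age", "education-num", "hours-per-week"])
predicted = score(model, test_set)
metrics = evaluate([r["income"] for r in test_set], predicted)
print(metrics)
```

Each function corresponds to one module on the Studio canvas, so replacing `train_stump` with a real gradient-boosting implementation would not change the surrounding steps.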