Opportunity scoring - Machine Learning models
This article describes how the machine learning models for opportunity scoring are built.
<h2>Models</h2>
The installer creates an Azure ML workspace containing three experiments: <br/><br/>
1. The first experiment retrains and evaluates the predictive model. <br/>
2. The second experiment scores opportunities using the trained predictive model. <br/>
3. The third experiment produces prescriptive insights using feature ablation on top of the trained predictive model. <br/>
<br/>
<h2>Features</h2>
The input to the predictive model consists of data from the opportunity as well as related information from the original lead, the associated customer account, and the products.<br/>
We handle the data differently depending on its type (for example, numeric, categorical, or text). <br/>
First, we clean the data to fill in reasonable defaults for any missing values. For example:<br/><br/>
When quantities are missing, we replace them with zeros. <br/>
When categories are missing, we replace them with a special "missing" category. <br/>
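A minimal sketch of this cleaning step in Python, assuming a pandas DataFrame with illustrative column names (the actual pipeline runs inside the Azure ML experiments): <br/>
<pre>
import pandas as pd

# Hypothetical opportunity data with missing values (column names are illustrative).
df = pd.DataFrame({
    "estimated_revenue": [5000.0, None, 12000.0],    # numeric quantity
    "industry":          ["Retail", None, "Energy"],  # categorical
})

# Missing quantities become zeros.
numeric_cols = df.select_dtypes(include="number").columns
df[numeric_cols] = df[numeric_cols].fillna(0)

# Missing categories become a special "missing" category.
categorical_cols = df.select_dtypes(include="object").columns
df[categorical_cols] = df[categorical_cols].fillna("missing")
</pre>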
We then featurize the data in the following manner:<br/>
We encode categorical variables with one-hot encodings. <br/>
We encode text variables using a bag of words, followed by hashing. <br/>
The "hashing trick" limits the feature dimension, which reduces overfitting and makes the problem more tractable for the learning algorithm. <br/>
<h2>Predictive Model Algorithm</h2>
We train a two-class boosted decision tree on the input features to predict whether an opportunity will be won or lost.<br/>
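The production model uses a boosted decision tree in Azure ML; a local approximation of the same idea with scikit-learn's gradient-boosted trees, on synthetic stand-in data, might look like this: <br/>
<pre>
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the featurized opportunity data:
# y = 1 means the opportunity was won, y = 0 means it was lost.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = GradientBoostingClassifier()
model.fit(X_train, y_train)

# The predicted win probability serves as the opportunity score.
win_probability = model.predict_proba(X_test)[:, 1]
</pre>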
<h2>Prescriptive Model Algorithm</h2>
An algorithm called "feature ablation" runs on top of the predictive model to give further insight into the factors that influenced the score. <br/>
Feature ablation refers to selectively removing sets of features and determining the impact on the score.<br/>
1. First, we score the opportunity using the entire set of features, creating a base score. <br/>
2. Next, the algorithm iterates through each predefined group of features. <br/>
3. For each feature group, the algorithm creates a new opportunity that is identical to the original, except that all features in the group have been replaced with their default values. <br/>
4. We then score this new opportunity using the same original model, producing a new score. <br/>
5. We then compare the new score to the base score. If the new score is lower than the base score, the removed features were contributing positively to the score, by the magnitude of the difference between the two scores. Inversely, if the new score is higher than the base score, the features were negative contributors. <br/>
6. Once all feature groups have been tested independently, we rank the positive and negative influences by the magnitude of the score differences (see the sketch below). <br/>
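A minimal sketch of this ablation loop, reusing the model trained in the earlier snippet; the feature groups and default values here are hypothetical placeholders: <br/>
<pre>
import numpy as np

def ablation_influences(model, opportunity, feature_groups, defaults):
    """Rank feature groups by their contribution to the opportunity score.

    opportunity: 1-D array of featurized values for a single opportunity.
    feature_groups: dict mapping a group name to the column indices it covers.
    defaults: per-column default values used to "remove" a feature group.
    """
    base_score = model.predict_proba(opportunity.reshape(1, -1))[0, 1]
    influences = {}
    for name, columns in feature_groups.items():
        ablated = opportunity.copy()
        ablated[columns] = defaults[columns]       # replace the group with defaults
        new_score = model.predict_proba(ablated.reshape(1, -1))[0, 1]
        influences[name] = base_score - new_score  # positive => positive contributor
    # Rank influences by the magnitude of the score differences.
    return sorted(influences.items(), key=lambda kv: abs(kv[1]), reverse=True)

# Two hypothetical feature groups over the 20 synthetic columns.
groups = {"account": np.arange(0, 10), "product": np.arange(10, 20)}
defaults = np.zeros(20)
ranked = ablation_influences(model, X_test[0], groups, defaults)
</pre>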
<h2>DNN for Feature Selection</h2>
The Microsoft team used a deep neural network (DNN) to determine which features are important. An enterprise has a vast variety of data signals in different formats; using the DNN, the team prepared a first cut of the signals that are important. <br/>
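The article does not detail how importance was read off the DNN; one common approach is permutation importance on a trained network, sketched below with scikit-learn's small MLP as a stand-in for the production DNN: <br/>
<pre>
from sklearn.datasets import make_classification
from sklearn.inspection import permutation_importance
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Small neural network as a stand-in for the production DNN.
net = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)
net.fit(X, y)

# Shuffle each feature in turn and measure the drop in accuracy;
# large drops mark important features, giving a first cut to keep.
result = permutation_importance(net, X, y, n_repeats=10, random_state=0)
top_features = result.importances_mean.argsort()[::-1][:10]
</pre>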
<h2>Operationalized Boosted Decision Tree Model</h2>
Once the team had shortlisted the features, they simplified the model using the Boosted Decision Tree algorithm available in Azure ML (AML). It provides easier operationalization, faster training, and cost savings compared to a DNN. <br/>
The optimal model parameters were selected in AML.<br/>
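AML's parameter selection can be mimicked locally with a grid search over the boosted tree's hyperparameters; the grid values below are illustrative, not the production settings: <br/>
<pre>
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Illustrative search space over common boosted-tree knobs
# (number of trees, learning rate, tree depth).
grid = {
    "n_estimators": [100, 200],
    "learning_rate": [0.05, 0.1],
    "max_depth": [2, 3, 4],
}
search = GridSearchCV(GradientBoostingClassifier(random_state=0), grid, cv=3)
search.fit(X, y)
best_model = search.best_estimator_
</pre>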
<h2>Prescriptive Analytics Using Ablation & WeakAUC</h2>
To provide reasoning and recommendations on the opportunity score, the Microsoft team used the "feature ablation" technique. Ablation looks at each column's predictive power, combined with the WeakAUC technique. WeakAUC is computed from the training dataset using information-gain-like statistics and provides a benchmark for validating model results.<br/>
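WeakAUC is not defined in more detail here; the description suggests a per-column predictive-power benchmark computed on the training set. A plausible single-feature AUC sketch (an assumption, not the actual WeakAUC implementation): <br/>
<pre>
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Benchmark each column as a standalone (weak) predictor: the AUC of the
# raw column against the label measures its individual predictive power.
weak_auc = {}
for col in range(X.shape[1]):
    auc = roc_auc_score(y, X[:, col])
    weak_auc[col] = max(auc, 1 - auc)  # orientation-invariant AUC

# Columns whose ablation impact disagrees with this benchmark warrant review.
ranked_columns = sorted(weak_auc, key=weak_auc.get, reverse=True)
</pre>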