Enhanced model evaluation with multiple performance metrics for regression analysis
An Enhanced Evaluate Model module that integrates 22 performance metrics and can be used with both Azure built-in models and R script models.
**Experiment Highlights**
The purpose of this experiment was to develop an Enhanced Evaluate Model module, based on the R language, that would extend the capabilities of the built-in Azure Machine Learning Studio (MLS) Evaluate Model module. Specifically:
- The Enhanced Evaluate Model module enables evaluation of regression models with 22 performance metrics (compared to five in the built-in Azure module).
- The Enhanced Evaluate Model module can evaluate models implemented in the R language using Azure 'Create R Model' (a capability not currently available in Azure), and can combine these evaluations with evaluations of the Azure built-in regression models.
Details of this experiment are described in:
Botchkarev, A. (2018). Evaluating Performance of Regression Machine Learning Models Using Multiple Error Metrics in Azure Machine Learning Studio (May 12, 2018). Available at SSRN: http://ssrn.com/abstract=3177507
A list of the performance metrics implemented in the Enhanced Evaluate Model module, in alphabetical order of metric abbreviation:
| Metric Abbreviation | Metric Name |
| --- | --- |
| CoD | Coefficient of Determination |
| GMRAE | Geometric Mean Relative Absolute Error |
| MAE | Mean Absolute Error |
| MAPE | Mean Absolute Percentage Error |
| MASE | Mean Absolute Scaled Error |
| MdAE | Median Absolute Error |
| MdAPE | Median Absolute Percentage Error |
| MdRAE | Median Relative Absolute Error |
| ME | Mean Error |
| MPE | Mean Percentage Error |
| MRAE | Mean Relative Absolute Error |
| MSE | Mean Squared Error |
| NRMSE_mm | Normalized Root Mean Squared Error (normalized to the difference between maximum and minimum actual data) |
| NRMSE_sd | Normalized Root Mean Squared Error (normalized to the standard deviation of the actual data) |
| RAE | Relative Absolute Error |
| RMdSPE | Root Median Square Percentage Error |
| RMSE | Root Mean Squared Error |
| RMSPE | Root Mean Square Percentage Error |
| RSE | Relative Squared Error |
| sMAPE | Symmetric Mean Absolute Percentage Error |
| SMdAPE | Symmetric Median Absolute Percentage Error |
| SSE | Sum of Squared Error |
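As an illustration, the following is a minimal R sketch of how several of these metrics can be computed from vectors of actual and predicted values using their standard definitions. The example data are hypothetical, and the module's actual code may differ.

```r
# Hypothetical example data; the module computes these from the
# 'Actual' and 'Predicted' columns described below.
actual    <- c(120, 150, 180, 200, 240)
predicted <- c(130, 145, 170, 210, 235)

err <- actual - predicted

ME   <- mean(err)                               # Mean Error
MAE  <- mean(abs(err))                          # Mean Absolute Error
MSE  <- mean(err^2)                             # Mean Squared Error
RMSE <- sqrt(MSE)                               # Root Mean Squared Error
MAPE <- mean(abs(err / actual)) * 100           # Mean Absolute Percentage Error
NRMSE_mm <- RMSE / (max(actual) - min(actual))  # RMSE normalized to range of actuals
NRMSE_sd <- RMSE / sd(actual)                   # RMSE normalized to sd of actuals
RAE  <- sum(abs(err)) / sum(abs(actual - mean(actual)))   # Relative Absolute Error
CoD  <- 1 - sum(err^2) / sum((actual - mean(actual))^2)   # Coefficient of Determination

round(c(ME = ME, MAE = MAE, MSE = MSE, RMSE = RMSE, MAPE = MAPE,
        NRMSE_mm = NRMSE_mm, NRMSE_sd = NRMSE_sd, RAE = RAE, CoD = CoD), 4)
```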
**Overview of the experiment**
The focus of this experiment is on the 'Enhanced Evaluate Model' module.
However, for easier understanding of the module's operation and potential reuse, it is shown within a typical Azure regression experiment workflow: input data, initialize model, train model, score model, evaluate model.
A component of the previously published experiment is used as a sample:
Botchkarev, A. (2018). Revision 2 Integrated tool for rapid assessment of multi-type regression machine learning models. Experiment in Microsoft Azure Machine Learning Studio. Azure AI Gallery. https://gallery.azure.ai/Experiment/Revision-2-Integrated-tool-for-rapid-assessment-of-multi-type-regression-machine-learning-models
Details of this experiment are described in:
Botchkarev, A. (2018). Evaluating Hospital Case Cost Prediction Models Using Azure Machine Learning Studio. arXiv:1804.01825 [cs.LG].
**How to use the module in your experiment**
1. Use the input from the 'Score Model' module (dataset1) to create vectors (one-column data frames) for the actual and predicted values.
Check the column names by visualizing the output of the 'Score Model'. The column name of the predicted variable may differ depending on which regression model is used, e.g. 'Scored Label Mean' for Decision Forest Regression, or 'Scored Labels' for Linear Regression.
Rename the column in the input data frame that holds the target/label variable values ('Cost' in the example) to a new name: 'Actual'.
Similarly, rename the column in the input data frame that holds the predicted variable values ('Predicted Cost' in the example) to a new name: 'Predicted'.
See the hints in the R script of the 'Enhanced Evaluate Model', and the sketch after this list.
2. Copy the 'Enhanced Evaluate Model' module, paste it into your experiment canvas, and connect the input of this module to the output of the 'Score Model'.
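For reference, here is a minimal sketch of what step 1 can look like inside the module's R script. It assumes the Azure MLS 'Execute R Script' environment; the column names 'Cost' and 'Scored Labels' come from the example above and should be treated as placeholders for your own experiment.

```r
# Minimal sketch of step 1, assuming the Azure MLS 'Execute R Script'
# environment; the column names 'Cost' and 'Scored Labels' are from
# the example and will depend on your own experiment.
dataset1 <- maml.mapInputPort(1)  # output of the 'Score Model' module

# Rename the target/label column to 'Actual' and the prediction
# column to 'Predicted', as the module expects.
names(dataset1)[names(dataset1) == "Cost"]          <- "Actual"
names(dataset1)[names(dataset1) == "Scored Labels"] <- "Predicted"

actual    <- dataset1["Actual"]    # one-column data frames, as in step 1
predicted <- dataset1["Predicted"]
```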
**Output**
Right-click on the left output port (1) of the Enhanced Evaluate Model module and select the Visualize option from the drop-down menu.
The output is presented as a table with three columns: metric abbreviation, metric full name, and numerical estimate. Alternatively, attach a Convert to CSV module to the Enhanced Evaluate Model module and download the output file.
Note that results are rounded to four digits after the decimal point.
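A minimal sketch of how such an output table could be assembled in R, assuming the metric values have already been computed; the module's actual code may differ.

```r
# Minimal sketch of the output table, assuming MAE, MSE, and RMSE
# have already been computed as in the earlier metric sketch.
results <- data.frame(
  Abbreviation = c("MAE", "MSE", "RMSE"),
  Metric.Name  = c("Mean Absolute Error", "Mean Squared Error",
                   "Root Mean Squared Error"),
  Estimate     = round(c(MAE, MSE, RMSE), 4)   # four digits after the decimal point
)
maml.mapOutputPort("results")  # exposed on the module's output port
```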
**Concluding Remarks**
Note that the operation of the module assumes that all data preparation has been done: there are no NA (missing) data elements, and all data are numeric.
Note that some metrics perform division and may return error messages or unreliable values if the denominator is equal or close to zero.
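For example, percentage-based metrics such as MAPE divide by the actual values; a hypothetical illustration of the hazard:

```r
# Hypothetical illustration: MAPE divides by the actual values,
# so a zero actual value produces Inf rather than a finite estimate.
actual    <- c(0, 10, 20)
predicted <- c(1, 9, 21)
mean(abs((actual - predicted) / actual)) * 100  # Inf, because actual[1] == 0
```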
Note that the module is publicly shared for information and training purposes only. Every effort was made to make this module error-free. However, we do not guarantee the correctness, reliability, or completeness of the material, and the module is provided "as is", without warranty of any kind, express or implied. Any user is acting entirely at their own risk.
Note that the views, opinions and conclusions expressed in this document are those of the author alone and do not necessarily represent the views of the author’s current or former employers.