Retail Forecasting: Step 4 of 6, train regression models
By AzureML Team for Microsoft March 18, 2015
Accurate and timely forecasting drives success in the retail business. It is an essential enabler of supply and inventory planning, product pricing, promotion, and placement. As part of the Azure ML offering, Microsoft provides a template that lets data scientists easily build and deploy a retail forecasting solution.
# Retail Forecasting Template

Accurate and timely forecasting drives success in the retail business. It is an essential enabler of supply and inventory planning, product pricing, promotion, and placement. As part of the Azure Machine Learning offering, Microsoft provides a template that lets data scientists easily build and deploy a retail forecasting solution. In this document, you will learn how to use and customize the template through a demo use case.

## Problem Description

Elena works for a retail company whose chain-store managers report sales quantities on products at the SKU level every week. Her task is to build a pipeline that automatically provides weekly forecasts for the next month for each store and each product.

## Input Data Schema

The template takes historical time series values and retail-related information as input. In Elena’s case, she has weekly sales data from November 2010 to December 2013. The template provides two ID fields, which she associates with the store ID and product/SKU ID. (This association can be customized, and more IDs can be added with a moderate knowledge of R.) The sample data originally came from the retail industry, but was anonymized and transformed before being used in this sample. **We suggest having at least two seasons of data for training.**

![tb0_ipt1]

Retail-related information can be divided into two categories: static and temporal. **Static** information includes store location, size, etc. **Temporal** information includes temperature, marketing activities such as promotions and markdowns, and economic indices. We need values for these features not only in the past but also in the future, either as true values or as predicted values. How to forecast these features is an interesting topic, but it is beyond the scope of this template. For this sample we assume all observations are available.

Elena adds one economic index to her model: real disposable personal income.
This index reveals economic trends that influence retail customer behavior. The economic index dataset is collected at the national level, so every store shares the same information. (Real Disposable Personal Income, 11/01/2009 – 01/01/2014. Source: [http://research.stlouisfed.org/fred2/](http://research.stlouisfed.org/fred2/))

![tb0_ipt2]

## Output Data Schema

Forecasts and 95% confidence intervals are returned by the deployed web service.

![tb0_opt]

## Workflow

The graph below presents the workflow of the template. Each step corresponds to an experiment. The output of one experiment is the input of the next. As a data scientist, Elena knows that time series models and regression models are the most common approaches. However, there is no conclusive answer as to which model works best; it depends on the data. This template provides her with a framework to quickly try out multiple models and pick the best one to build a web service.

![workflow]

## Parallelization Consideration

The retail dataset includes multiple time series with different IDs. We load all IDs into the experiments in Steps 1 through 5 (the model building and evaluation phase) in order to judge models based on their performance over all stores and products. In Step 6A and Step 6B (where web services are deployed), only one time series is read at a time, and the ID of the series is set as an input parameter to the web service. This design enables **parallelized** forecasting of different IDs, which can be implemented by using either an external worker or Azure Data Factory, and it greatly improves efficiency.
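As a rough illustration of the external-worker option, the sketch below fans out one request per store/product pair using a thread pool. It is only a sketch under stated assumptions: `forecast_one`, the `ID1`/`ID2` field names, and the placeholder return values are hypothetical stand-ins for a call to the deployed web service, not part of the template.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def forecast_one(store_id, product_id, horizon=4):
    """Hypothetical stand-in for one call to the deployed web service.

    A real worker would send the two IDs to the service and parse the
    returned forecasts; here we only return placeholder values.
    """
    return {"ID1": store_id, "ID2": product_id, "forecast": [0.0] * horizon}


def forecast_all(store_ids, product_ids, max_workers=8):
    """Request forecasts for every (store, product) pair concurrently."""
    pairs = list(product(store_ids, product_ids))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # One task per time series; map() preserves the input order.
        results = list(pool.map(lambda pair: forecast_one(*pair), pairs))
    return dict(zip(pairs, results))


forecasts = forecast_all(store_ids=[2, 12], product_ids=[1, 3])
```

Because each request carries only the IDs of a single series, the tasks are independent of one another, so the number of workers can be tuned without changing the forecasting logic.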
## Data Pipeline

In Azure Machine Learning, users can either upload a dataset from a local file, or connect to an online data source, such as the web, Azure SQL database, Azure table, Hive table, or Windows Azure blob storage, by using the [**Reader**](https://msdn.microsoft.com/library/azure/4e1b0fe6-aded-4b3f-a36f-39b8862b9004) and [**Writer**](https://msdn.microsoft.com/library/azure/7a391181-b6a7-4ad4-b82d-e419c0d6522c) modules, or by using Azure Data Factory.

Because retail data are updated frequently, we recommend using online data sources for input datasets, to enable real-time updates in an end-to-end solution. Keeping data in the cloud is also the most convenient way to ensure that intermediate datasets shared between experiments are always the newest version.

A complete online data flow includes online storage together with the **Reader** and **Writer** modules. For this demo, we use the **Reader** module to connect to a sample Azure SQL database account. However, to prevent accidental deletion of this database, we decided not to include a **Writer**. Users are encouraged to set up their own connections to gain the full experience, by opening an Azure SQL database account and reading and writing data using the **Reader** and **Writer** modules. Here is a good tutorial to get started: [Getting started with Microsoft Azure SQL Database](http://azure.microsoft.com/en-us/documentation/articles/sql-database-get-started/). Option **2** below explains how to customize the settings of the **Reader** and **Writer**.

If you just want a quick look without going through setup, do not worry. The **Reader** modules used in this demo load all input and intermediate datasets. Simply run the sample experiments and you are good to go.

If you would like to get your hands dirty by playing around with the code and models, you can generate new intermediate datasets that are different from the default ones.
You can identify the modules that produce intermediate datasets by their comments, which contain `[Data Output]:` followed by the dataset name. **Reader** modules that use these intermediate datasets can be identified by their comments, which start with `[Data Input]:` followed by the dataset name.

How do you pass datasets between experiments? Here are two options:

**1.** Save the output as a dataset by clicking the output port and selecting **Save as Dataset** (see below). Then replace the corresponding **Reader** module with this dataset. Make sure that you reconnect all the lines correctly before removing the **Reader**.

![reader1]

**2.** Use your own online storage account to hold these datasets. Add a **Writer** connected to the output ports as shown below. Then replace the credentials in the corresponding **Reader** module with the information for your own accounts.

![reader2]

Ready to go? Enjoy the journey! Here are the links to each step (experiment) of the multi-step template:

**[Retail Forecasting: Step 1 of 6, data preprocessing](http://gallery.azureml.net/Details/8d1587b63ec54d03ae1276c513ab72f6)**

**[Retail Forecasting: Step 2 of 6, train time series models](http://gallery.azureml.net/Details/38131c170faf4554b05447213e0ac783)**

**[Retail Forecasting: Step 3 of 6, feature engineering](http://gallery.azureml.net/Details/6b14f98c397f40b2aea290f3dee760a3)**

**[Retail Forecasting: Step 4 of 6, train regression models](http://gallery.azureml.net/Details/678c86f992714515beab98f03bc9a44e)**

**[Retail Forecasting: Step 5 of 6, evaluate models](http://gallery.azureml.net/Details/266c5da6ec0c49ea8924479bc6fc0f37)**

**[Retail Forecasting: Step 6A of 6, deploy a web service with a time series model](http://gallery.azureml.net/Details/370c80490e774a6cb26edba69c583c9b)**

**[Retail Forecasting: Step 6B of 6, deploy a web service with a regression model](http://gallery.azureml.net/Details/bef6f84ac80d4625891f9f0ae768b356)**

## Step Description

----------

## Retail Forecasting: Step 1 of 6, data preprocessing

![1]

**1.0.** Open the experiment "Retail Forecasting: Step 1 of 6, data preprocessing".

**1.1.** Load the sample historical time series _input_.

**1.2.** Provide modeling parameters. In our sample use case, Elena has **weekly** observations for three years and a month. She uses the last year, i.e. the last **52** weeks, as testing data, which leaves more than two years of training data. The data show yearly seasonality, so the length of a season, also known as the frequency, is **52**. The time stamps in her data follow the format **“%m/%d/%Y”**. The parameter **observation.freq** accepts descriptive values including "hour", "day", "week", "month", and "year". It also supports customized time differences: for example, `difftime("2012-12-08 00:15:00 UTC", "2012-12-08 00:00:00 UTC")`.

![tb1_1]

**1.3.** Select eligible time series, based on pre-defined business rules. This template demonstrates two possible rules:

- If a time series is too short to provide enough historical information, discard it. Here, Elena only considers time series longer than **two years**.

![tb1_2]

- If a time series has any sales quantity less than a certain threshold, discard it. For instance, Elena only considers products having sales quantity larger than **20**.

![tb1_3]

**1.4.** Create a complete time series by inserting any time stamps that are missing between the earliest and latest times in the data. You can replace the corresponding missing data values with NA.

**1.5.** Select time series based on the quality of the training and testing data. Discard a time series if the last six values in the training data set are all missing or more than half of the testing data are missing. This module produces an output dataset named _Cleaned Input_.

## Retail Forecasting: Step 2 of 6, train time series models

![2]

**2.0.** Open the experiment "Retail Forecasting: Step 2 of 6, train time series models".

**2.1.** Load the dataset _Cleaned Input_.

**2.2.** Provide the same modeling parameters as those in Step 1.2.

**Note:** Steps 2.3 to 2.5 require at least two years of training data.

**2.3.** Fit demo time series model 1: Seasonal Trend Decomposition using Loess (STL) + Exponential Smoothing (ETS). Note that **STL** won’t work when seasonality equals 1. In this case, you should use R’s **ets** function instead.

**2.4.** Fit demo time series model 2: Seasonal Naive. Note that **seasonal naive** won’t work when seasonality equals 1. In this case, you should use R’s **naive** function instead.

**2.5.** Fit demo time series model 3: Seasonal Trend Decomposition using Loess + AutoRegressive Integrated Moving Average (ARIMA). Note that **STL** won’t work when seasonality equals 1. In this case, use R’s **auto.arima** function instead.

**2.6.** Join the three models’ forecasts. This module produces an output dataset, _Time Series Results_.

## Retail Forecasting: Step 3 of 6, feature engineering

![3]

**3.0.** Open the experiment "Retail Forecasting: Step 3 of 6, feature engineering".

**3.1.** Load the dataset _Cleaned Input_.

**3.2.** Provide the same modeling parameters as those in Step 1.2.

**3.3.** [Optional] Load the external economic index data. Here we use Real Disposable Personal Income as an example. The dataset name is _Real Disposable Personal Income_.

**3.4.** [Optional] Create features using the external economic index loaded in Step 3.3. As a leading indicator, this index changes before sales change. The module selects the best lag of this index based on maximum correlation. The modeling parameters from Step 1.2, including _test.length_, _seasonality_, _observation.freq_, and _timeformat_, need to be provided as well.

**3.5.** Create features. Sample code for the following features is provided. Please feel free to customize the code to suit your own use cases.

- Date features: year, month, week of month, etc.
- Time features
- Season features
- Weekday-and-weekend features
- Holiday features: New Year, U.S. Labor Day, U.S. Thanksgiving, Cyber Monday, Christmas, etc.
- Fourier features to capture seasonality

**3.6.** Indicate categorical variables.

**3.7.** [Optional] Apply a log transformation to the _value_ column. After the transformation, its distribution is closer to normal.

**3.8.** Create a training dataset. Add lag features for the training data. This module produces the output dataset, _Train Data for Regression_.

![tb3_1]

**3.9.** Create a testing dataset. Add lag features for the testing data. This module produces the output dataset, _Test Data for Regression_.

![tb3_2]

## Retail Forecasting: Step 4 of 6, train regression models

![4]

**4.0.** Open the experiment "Retail Forecasting: Step 4 of 6, train regression models".

**4.1.** Load the dataset, _Train Data for Regression_. Remove the time column, as it is not a feature of the regression model. Assign the data to five folds for cross-validation and the parameter sweep.

**4.2.** Load the dataset, _Test Data for Regression_. Remove the time column, as it is not a feature of the regression model.

**Note:** In Steps 4.3 to 4.5, the [**Sweep Parameters**](https://msdn.microsoft.com/library/azure/038d91b6-c2f2-42a1-9215-1f2c20ed1b40) module is used to select the best parameters. Here we compare 10 sets of parameters selected by the **Random sweep** option, to ensure a reasonable running time. For a comprehensive comparison, you can select the **Entire grid** option instead, but be aware that this can take a long time.
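To make the trade-off in the note concrete, the following sketch contrasts a random sweep of 10 parameter sets with an exhaustive grid, scoring each candidate by its mean error over five folds. It illustrates the general idea only and does not reproduce how the **Sweep Parameters** module works internally; the parameter names, the value ranges, and the toy `toy_fold_error` function are all assumptions made for the example.

```python
import itertools
import random

# Hypothetical parameter grid for a boosted-tree regressor.
GRID = {
    "num_leaves": [8, 32, 128],
    "learning_rate": [0.025, 0.05, 0.1, 0.2],
    "num_trees": [20, 100, 500],
}


def entire_grid(grid):
    """Every combination of parameter values (the exhaustive option)."""
    names = sorted(grid)
    return [dict(zip(names, values))
            for values in itertools.product(*(grid[n] for n in names))]


def random_sweep(grid, n_runs, seed=0):
    """A random subset of n_runs distinct combinations from the grid."""
    combos = entire_grid(grid)
    return random.Random(seed).sample(combos, min(n_runs, len(combos)))


def assign_folds(n_rows, n_folds=5):
    """Assign row i to fold i % n_folds (a simple fold assignment)."""
    return [i % n_folds for i in range(n_rows)]


def sweep(candidates, folds, fold_error):
    """Return the candidate with the lowest mean error across folds."""
    fold_ids = sorted(set(folds))

    def mean_error(params):
        return sum(fold_error(params, f) for f in fold_ids) / len(fold_ids)

    return min(candidates, key=mean_error)


def toy_fold_error(params, fold):
    """Made-up error surface; a real run would train and score a model."""
    return abs(params["learning_rate"] - 0.1) + 0.001 * fold


folds = assign_folds(n_rows=20)
best_random = sweep(random_sweep(GRID, n_runs=10), folds, toy_fold_error)
best_grid = sweep(entire_grid(GRID), folds, toy_fold_error)
```

With this toy grid the exhaustive option evaluates 36 combinations instead of 10, and each evaluation covers all five folds, which is why the exhaustive option becomes expensive as the grid grows.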

**4.3.** Fit demo regression model 1: [**Boosted Decision Tree Regression**](https://msdn.microsoft.com/library/azure/0207d252-6c41-4c77-84c3-73bdf1ac5960)

**4.4.** Fit demo regression model 2: [**Decision Forest Regression**](https://msdn.microsoft.com/library/azure/562988b2-e740-4e3a-8131-358391bad755)

**4.5.** Fit demo regression model 3: [**Fast Forest Quantile Regression**](https://msdn.microsoft.com/library/azure/b9064dc3-2d69-4e06-b307-6cebf324686a)

**4.6.** [Optional] If the log transformation in Step 3.7 was performed, apply an exponential transformation to the _value_ column to convert it back to the original scale.

**4.7.** Join the forecasts of the regression models. This module produces the output dataset, _Regression Results_.

## Retail Forecasting: Step 5 of 6, evaluate models

![5]

**5.0.** Open the experiment "Retail Forecasting: Step 5 of 6, evaluate models".

**5.1.** Load the dataset, _Time Series Results_.

**5.2.** Load the dataset, _Regression Results_.

**5.3.** Load the dataset, _Cleaned Input_.

**5.4.** Provide the same modeling parameters as those in Step 1.2.

**5.5.** Join the datasets loaded in Steps 5.1 to 5.3.

**5.6.** Evaluate and compare the model results. Choose the best model based on the result of this step. (Note: A discussion of model performance can be found at the end of this document.)

- Left port: Aggregated error metrics of all IDs

![res1]

- Right port: Box plot revealing the error distribution across IDs

![res2]

**5.7.** Identify an individual time series of interest. For example, Elena decided to review store **12** and product **1**, because this combination appears as an outlier under many models and metrics.

![tb5_1]

**5.8.** Extract this time series.

**5.9.** View the performance of this single time series.

- Left port: Error metrics of this single ID combination
- Right port: Diagnostic graphs for this outlier time series. We can improve the models based on the insights gained here.

![res3]

## Retail Forecasting: Step 6A and 6B, deploy a web service

Steps 6A and 6B demonstrate how to deploy a web service by using the best model found in Step 5. To learn how to publish a web service, see [this tutorial](http://azure.microsoft.com/en-us/documentation/articles/machine-learning-walkthrough-5-publish-web-service/). Forecasting models are typically retrained frequently, using updated data, to make more accurate predictions. Therefore, in this step, we included the training workflow in the scoring experiments 6A and 6B.

## Retail Forecasting: Step 6A of 6, deploy a web service with a time series model

![6A]

For more information about how to publish a web service, see [this tutorial](http://azure.microsoft.com/en-us/documentation/articles/machine-learning-walkthrough-5-publish-web-service/).

**6A.0.** Open the experiment "Retail Forecasting: Step 6A of 6, deploy a web service with a time series model".

**6A.1.** Load the dataset, _input_.

**6A.2.** Specify the IDs to forecast.

![tb6_1]

**6A.3.** Extract the corresponding time series.

**6A.4.** Provide modeling parameters. In our sample use case, Elena forecasts for the next **4** weeks.

![tb6_2]

**6A.5.** Check whether this time series satisfies the business rules that we set in Step 1.3. If not, the R module stops and returns a 500 error on the web service output.

**6A.6.** Plug in the best model from Step 2.

**6A.7.** Add the ID and runtime information to the result dataset.

**6A.8.** Set the web service input.

**6A.9.** Set the web service output.

## Retail Forecasting: Step 6B of 6, deploy a web service with a regression model

![6B]

For more information about how to publish a web service, see [this tutorial](http://azure.microsoft.com/en-us/documentation/articles/machine-learning-walkthrough-5-publish-web-service/).

**6B.0.** Open the experiment "Retail Forecasting: Step 6B of 6, deploy a web service with a regression model".

**6B.1.** Load the dataset, _input_.

**6B.2.** Specify the IDs to forecast.

![tb6_1]

**6B.3.** Extract the corresponding time series.

**6B.4.** Provide modeling parameters. In our sample use case, Elena forecasts for the next **4** weeks.

![tb6_2]

**6B.5.** Check whether this time series satisfies the business rules that we set in Step 1.3. If not, the R module stops and returns a 500 error on the web service output.

**6B.6.** Plug in the best model from Step 4, together with the best parameters for the model, which you can obtain from the right port of the [**Sweep Parameters**](https://msdn.microsoft.com/library/azure/038d91b6-c2f2-42a1-9215-1f2c20ed1b40) module.

**6B.7.** Add the ID and runtime information to the result dataset.

**6B.8.** Set the web service input.

**6B.9.** Set the web service output.

## Consume a Deployed Web Service

The web service can be consumed in two modes: RRS (request-response service) and BES (batch execution service). Sample code (C#/Python/R) for calling the web services is provided on the web service page. Click the **API help page** link, as shown below.

![service]

## Discussion

The best aggregated MAPE (Mean Absolute Percentage Error) that Elena has achieved so far is 19%. When looking at the graphs for a single product (Step 5.7), she noticed that her retail data do not have a clear seasonal pattern. However, in her experience, weather, pricing, promotions, discount events, and so forth all have a large impact on sales. The lack of such information might explain why the MAPE is relatively high. She decided to contact colleagues in the marketing department to get the data she needs into the pipeline, so she can start a new iteration of modeling. Developing an end-to-end architecture first and then iterating on the models is good practice.

## Summary

Microsoft Azure Machine Learning provides a cloud-based machine learning platform for data scientists to easily build and deploy machine learning applications.
This retail forecasting template, based on weekly sales quantity data, can be adapted to other retail forecasting scenarios. This template, along with other templates published by Microsoft, further enables users to perform fast prototyping and deployment of machine learning solutions.

<!-- Images -->
[workflow]:https://az712634.vo.msecnd.net/samplesimg/v1/T1/workflow.png
[reader1]:https://az712634.vo.msecnd.net/samplesimg/v1/T1/reader1.PNG
[reader2]:https://az712634.vo.msecnd.net/samplesimg/v1/T1/reader2.PNG
[1]:https://az712634.vo.msecnd.net/samplesimg/v1/T1/1.png
[2]:https://az712634.vo.msecnd.net/samplesimg/v1/T1/2.png
[3]:https://az712634.vo.msecnd.net/samplesimg/v1/T1/3.png
[4]:https://az712634.vo.msecnd.net/samplesimg/v1/T1/4.png
[5]:https://az712634.vo.msecnd.net/samplesimg/v1/T1/5.png
[res1]:https://az712634.vo.msecnd.net/samplesimg/v1/T1/res1.png
[res2]:https://az712634.vo.msecnd.net/samplesimg/v1/T1/res2.png
[res3]:https://az712634.vo.msecnd.net/samplesimg/v1/T1/res3.png
[6A]:https://az712634.vo.msecnd.net/samplesimg/v1/T1/6A.png
[6B]:https://az712634.vo.msecnd.net/samplesimg/v1/T1/6B.png
[service]:https://az712634.vo.msecnd.net/samplesimg/v1/T1/service.png
[tb0_ipt1]:https://az712634.vo.msecnd.net/samplesimg/v1/T1/tb0_ipt1.PNG
[tb0_ipt2]:https://az712634.vo.msecnd.net/samplesimg/v1/T1/tb0_ipt2.PNG
[tb0_opt]:https://az712634.vo.msecnd.net/samplesimg/v1/T1/tb0_opt.PNG
[tb1_1]:https://az712634.vo.msecnd.net/samplesimg/v1/T1/tb1_1.PNG
[tb1_2]:https://az712634.vo.msecnd.net/samplesimg/v1/T1/tb1_2.PNG
[tb1_3]:https://az712634.vo.msecnd.net/samplesimg/v1/T1/tb1_3.PNG
[tb3_1]:https://az712634.vo.msecnd.net/samplesimg/v1/T1/tb3_1.PNG
[tb3_2]:https://az712634.vo.msecnd.net/samplesimg/v1/T1/tb3_2.PNG
[tb5_1]:https://az712634.vo.msecnd.net/samplesimg/v1/T1/tb5_1.PNG
[tb6_1]:https://az712634.vo.msecnd.net/samplesimg/v1/T1/tb6_1.PNG
[tb6_2]:https://az712634.vo.msecnd.net/samplesimg/v1/T1/tb6_2.PNG