Product Demand Estimation and Stock Optimization

February 11, 2018
Predict demand for products and optimize stock availability.
The use case in this experiment is estimating demand for products sold in a store, so that stock availability can be optimized in anticipation of future sales. The estimate is based on customer-oriented features, such as age or gender, on environmental features, such as weather conditions or location, and on the specific products previously acquired by customers.

The **dataset is labelled**, which allows the use of **supervised learning** algorithms, and the label column used for training contains numerical values. The following regression (estimation) algorithms are tested:

* **Linear Regression**: This algorithm attempts to establish a linear relationship between one or more independent variables and the numeric outcome.
* **Boosted Decision Tree Regression**: Boosting is a machine learning technique for regression problems that builds each regression tree in a step-wise fashion, using a predefined loss function to measure the error in each step and then select the optimal tree.

# Dataset

Data is sourced from a Dynamics 365 for Sales application (a sample copy is provided in this experiment) and contains the following columns:

- Age Range
- Gender
- Skin Type
- Weather
- Temperature
- Product
- Applied
- Bought

![Dataset sample][1]

# Training

This experiment selects a subset of features for training:

- Gender
- Skin Type
- Weather
- Product
- Bought: Indicates how many products have been bought for the given feature values.

The **Bought** column is the label used to train the model, that is, to identify patterns of bought products in the provided historical data. Feel free to modify the selected columns to try alternative prediction models based on different features.

![Training experiment][2]

# Scoring

Two models are trained and scored, using **Linear Regression** and **Boosted Decision Tree Regression**. For regression models, the **Score Model** action generates the predicted numeric value, which appears in the scored dataset as "Scored Labels".
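To make the step-wise boosting idea behind the second model more concrete, here is a minimal sketch in plain Python. It is not the implementation used by the **Boosted Decision Tree Regression** module: it assumes squared-error loss, uses one-split trees (stumps) on categorical equality tests, and trains on a handful of invented rows. The names `FEATURES`, `TARGETS`, `fit_stump`, `fit_boosted`, and `predict` are hypothetical and exist only for this illustration.

```python
from statistics import mean

# Invented sample rows (Gender, Skin Type, Weather, Product); not real data.
FEATURES = [
    ("F", "Dry", "Sunny", "Sunscreen"),
    ("M", "Dry", "Sunny", "Sunscreen"),
    ("F", "Oily", "Rainy", "Sunscreen"),
    ("M", "Oily", "Rainy", "Moisturizer"),
    ("F", "Dry", "Rainy", "Moisturizer"),
    ("M", "Dry", "Sunny", "Moisturizer"),
]
# Invented values for the Bought label, one per row above.
TARGETS = [9.0, 7.0, 2.0, 4.0, 6.0, 3.0]


def fit_stump(rows, residuals):
    """Find the (column, value) equality split with the lowest squared error."""
    best = None
    for col in range(len(rows[0])):
        for value in {row[col] for row in rows}:
            left = [r for row, r in zip(rows, residuals) if row[col] == value]
            right = [r for row, r in zip(rows, residuals) if row[col] != value]
            if not left or not right:
                continue  # the split must separate the rows into two groups
            left_mean, right_mean = mean(left), mean(right)
            error = sum((r - left_mean) ** 2 for r in left)
            error += sum((r - right_mean) ** 2 for r in right)
            if best is None or error < best[0]:
                best = (error, col, value, left_mean, right_mean)
    return None if best is None else best[1:]


def stump_predict(stump, row):
    col, value, left_mean, right_mean = stump
    return left_mean if row[col] == value else right_mean


def fit_boosted(rows, targets, n_rounds=20, learning_rate=0.3):
    """Start from the mean, then repeatedly fit a stump to the residuals."""
    base = mean(targets)
    predictions = [base] * len(rows)
    stumps = []
    for _ in range(n_rounds):
        # With squared-error loss, the residual is the error left to correct.
        residuals = [t - p for t, p in zip(targets, predictions)]
        stump = fit_stump(rows, residuals)
        if stump is None:
            break
        stumps.append(stump)
        predictions = [
            p + learning_rate * stump_predict(stump, row)
            for p, row in zip(predictions, rows)
        ]
    return base, stumps, learning_rate


def predict(model, row):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(stump_predict(s, row) for s in stumps)


def mse(model, rows, targets):
    return mean((predict(model, row) - t) ** 2 for row, t in zip(rows, targets))


model = fit_boosted(FEATURES, TARGETS)
print(predict(model, ("F", "Dry", "Sunny", "Sunscreen")))
```

Each round adds a small correction fitted to the errors left by the previous rounds, which is the step-wise behaviour described for the boosted model. The learning rate and the number of rounds are arbitrary choices for this sketch.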
![Scored dataset][3]

# Evaluation

To generate a set of metrics for evaluating the models' accuracy (performance), the two scored datasets are connected to an **Evaluate Model** action. The metrics returned for regression models are designed to estimate the amount of error in the scored values. For more information on the generated metrics and how to interpret them, please refer to [Evaluate Model][4].

# Web Service

Once you run the **Predictive Experiment** and deploy it as a **Web Service**, you can interact with the prediction service programmatically.

![Predictive experiment][5]

Based on the columns selected in the Training experiment, testing the service requires the following input values:

- Gender
- Skin Type
- Weather
- Product

![Test the service][6]

The service then returns an estimated number of products that will be bought by customers matching the indicated conditions.

To consume the Web Service programmatically, this is the expected format of the **request message**:

    {
      "Inputs": {
        "inputData": {
          "ColumnNames": [ "Gender", "Skin Type", "Weather", "Product" ],
          "Values": [
            [ "value", "value", "value", "value" ],
            [ "value", "value", "value", "value" ]
          ]
        }
      },
      "GlobalParameters": {}
    }

And this is the format of the **response message**, containing the **Scored Labels** value:

    {
      "Results": {
        "outputData": {
          "type": "DataTable",
          "value": {
            "ColumnNames": [ "Gender", "Skin Type", "Weather", "Product", "Scored Labels" ],
            "ColumnTypes": [ "String", "String", "String", "String", "Numeric" ],
            "Values": [
              [ "value", "value", "value", "value", "0" ],
              [ "value", "value", "value", "value", "0" ]
            ]
          }
        }
      }
    }

The predicted value, that is, the estimated number of products that will be sold under the identified conditions, can be used to update the stock availability of the given product in store.

[1]:
[2]:
[3]:
[4]:
[5]:
[6]:
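As a rough illustration of the request and response formats described in the Web Service section, the following Python sketch builds a request message and extracts the **Scored Labels** values from a response. The helper names (`build_request`, `parse_scored_labels`, `call_service`) are hypothetical. The service URL and API key must come from your own deployed service, and the `Bearer` authorization header is an assumption about how the service authenticates requests, so check the consumption details of your deployment.

```python
import json
import urllib.request

# Input columns expected by the service, in the order shown in the request format.
COLUMNS = ["Gender", "Skin Type", "Weather", "Product"]


def build_request(rows):
    """Build the request message for a list of [gender, skin, weather, product] rows."""
    for row in rows:
        if len(row) != len(COLUMNS):
            raise ValueError("each row needs one value per column: %s" % COLUMNS)
    return {
        "Inputs": {
            "inputData": {
                "ColumnNames": COLUMNS,
                "Values": [list(row) for row in rows],
            }
        },
        "GlobalParameters": {},
    }


def parse_scored_labels(response):
    """Return the Scored Labels column of a response message as floats."""
    table = response["Results"]["outputData"]["value"]
    index = table["ColumnNames"].index("Scored Labels")
    return [float(row[index]) for row in table["Values"]]


def call_service(url, api_key, rows):
    """Send the rows to the service; url and api_key are supplied by the caller."""
    body = json.dumps(build_request(rows)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Assumption: the service expects the API key as a bearer token.
            "Authorization": "Bearer " + api_key,
        },
    )
    with urllib.request.urlopen(request) as reply:
        return parse_scored_labels(json.loads(reply.read().decode("utf-8")))
```

Calling `call_service` with one row per combination of conditions returns one estimate per row, which could then feed a stock update for the corresponding product.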