# Time Series Forecasting in Azure ML using R

In this article, we'll use Microsoft Azure Machine Learning Studio to build an experiment for time series forecasting using several classical forecasting algorithms available in R.

## Overview of the Experiment
The main steps of the experiment are:
- [Step 1: Get data](#step-1-get-data)
- [Step 2: Split the data into train and test](#step-2-split-the-data-into-train-and-test)
- [Step 3: Run time series forecasting using R](#step-3-run-time-series-forecasting-using-r)
- [Step 4: Generate accuracy metrics](#step-4-generate-accuracy-metrics)
- [Step 5: Results](#step-5-results)
### Step 1: Get data
We obtained the N1725 time series data from the publicly available [M3 competition dataset](http://forecasters.org/resources/time-series-data/), and uploaded the data to Azure ML Studio. This dataset has 126 rows and two columns, **`time`** and **`value`**.
### Step 2: Split the data into train and test
We used the **Split** module in Azure ML Studio to divide the data into training and testing sets, using the _Relational split_ option and specifying a time value as the split condition. We used the first 108 points for training and the remaining 18 points for testing the accuracy of various forecasting modules.
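Outside Azure ML Studio, the same split can be sketched in a few lines of base R. The 108/18 row counts follow the text; the data frame below is a placeholder standing in for the N1725 series:

```r
# Split a time series data frame into the first n_train rows (training)
# and the remainder (testing), mirroring the Split module's behavior.
split_train_test <- function(dataset, n_train = 108) {
  list(
    train = dataset[seq_len(n_train), , drop = FALSE],
    test  = dataset[-seq_len(n_train), , drop = FALSE]
  )
}

# Placeholder 126-point series with the dataset's column names
dataset <- data.frame(time = 1:126, value = rnorm(126))
parts <- split_train_test(dataset)
nrow(parts$train)  # 108
nrow(parts$test)   # 18
```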
![][image1]
### Step 3: Run time series forecasting using R
To compute forecasts, we used the following classical time series methods from the [forecast package](http://cran.r-project.org/web/packages/forecast/index.html) in R:
1. Seasonal [ARIMA](http://en.wikipedia.org/wiki/Autoregressive_integrated_moving_average)
2. Non-seasonal [ARIMA](http://en.wikipedia.org/wiki/Autoregressive_integrated_moving_average)
3. Seasonal [ETS](http://en.wikipedia.org/wiki/Exponential_smoothing)
4. Non-seasonal [ETS](http://en.wikipedia.org/wiki/Exponential_smoothing)
5. Average of Seasonal [ETS](http://en.wikipedia.org/wiki/Exponential_smoothing) and Seasonal [ARIMA](http://en.wikipedia.org/wiki/Autoregressive_integrated_moving_average)
For all seasonal methods, we used a seasonality value of 12.
The following R script was added to the **Execute R Script** module to build the seasonal [ARIMA](http://en.wikipedia.org/wiki/Autoregressive_integrated_moving_average) model. The script performs these steps:
1. Read the training data in **`dataset1`** and the test data (needed for its timestamps) in **`dataset2`**.
2. Create a **`ts`** object in R with the training data and specified seasonality.
3. Fit an [ARIMA](http://en.wikipedia.org/wiki/Autoregressive_integrated_moving_average) model using the **`auto.arima()`** function from the **`forecast`** package in R.
4. Compute the forecasting horizon by comparing the maximum timestamps in training and test datasets.
5. Forecast using the learned [ARIMA](http://en.wikipedia.org/wiki/Autoregressive_integrated_moving_average) model for the computed horizon.
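The five steps above can be sketched as the following script body, assuming the `forecast` package is installed. In Azure ML Studio, `dataset1` and `dataset2` would come from `maml.mapInputPort(1)` and `maml.mapInputPort(2)`; here they are simulated so the sketch is self-contained:

```r
library(forecast)

# Step 1: in Azure ML these come from maml.mapInputPort(); simulated here
dataset1 <- data.frame(time = 1:108,
                       value = sin(2 * pi * (1:108) / 12) + rnorm(108, sd = 0.1))
dataset2 <- data.frame(time = 109:126)

# Step 2: build a ts object from the training values with seasonality 12
train_ts <- ts(dataset1$value, frequency = 12)

# Step 3: fit a seasonal ARIMA model
fit <- auto.arima(train_ts)

# Step 4: horizon = gap between the maximum test and training timestamps
h <- max(dataset2$time) - max(dataset1$time)

# Step 5: forecast for the computed horizon and emit the predictions
fc <- forecast(fit, h = h)
output <- data.frame(time = dataset2$time, forecast = as.numeric(fc$mean))
# In Azure ML: maml.mapOutputPort("output")
```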
![][image2]
For each of the other model types, we added a new **Execute R Script** module with similar code that calls the appropriate **`forecast`** functions and parameters.
Note: To save space, not all the scripts are shown here, but you can open the experiment in Azure ML Studio and click each module to see the R script details.
### Step 4: Generate accuracy metrics
We joined the forecasting results from each of the methods with the test data, to compute the accuracy metrics. We used another instance of the **Execute R Script** module to compute the following metrics:
- [**Mean Error** (ME)](http://en.wikipedia.org/wiki/Mean_signed_difference) - The average forecasting error (an *error* is the difference between the predicted value and the actual value) on the test dataset.
- [**Root Mean Squared Error** (RMSE)](http://en.wikipedia.org/wiki/Root-mean-square_deviation) - The square root of the average of squared errors on the test dataset.
- [**Mean Absolute Error** (MAE)](http://en.wikipedia.org/wiki/Mean_absolute_error) - The average of absolute errors.
- [**Mean Percentage Error** (MPE)](http://en.wikipedia.org/wiki/Mean_percentage_error) - The average of percentage errors.
- [**Mean Absolute Percentage Error** (MAPE)](http://en.wikipedia.org/wiki/Mean_absolute_percentage_error) - The average of absolute percentage errors.
- [**Mean Absolute Scaled Error** (MASE)](http://en.wikipedia.org/wiki/Mean_absolute_scaled_error) - The MAE scaled by the in-sample MAE of a naive forecast.
- [**Symmetric Mean Absolute Percentage Error** (sMAPE)](http://en.wikipedia.org/wiki/Symmetric_mean_absolute_percentage_error) - A variant of MAPE in which each absolute error is divided by the average magnitude of the actual and predicted values.
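Using the error convention stated above (error = predicted − actual), these metrics can be sketched in base R. The seasonal-naive scaling for MASE (lag 12) is one common convention and is an assumption here, as the original script is not shown:

```r
# Compute the listed accuracy metrics from actual/predicted test vectors.
# train is the training series, used only for the MASE scaling term.
accuracy_metrics <- function(actual, predicted, train, m = 12) {
  e  <- predicted - actual               # errors, per the definition above
  pe <- 100 * e / actual                 # percentage errors
  naive_mae <- mean(abs(diff(train, lag = m)))  # in-sample seasonal-naive MAE
  c(
    ME    = mean(e),
    RMSE  = sqrt(mean(e^2)),
    MAE   = mean(abs(e)),
    MPE   = mean(pe),
    MAPE  = mean(abs(pe)),
    MASE  = mean(abs(e)) / naive_mae,
    sMAPE = mean(200 * abs(e) / (abs(actual) + abs(predicted)))
  )
}

# Tiny worked example: errors are +2 and -2
metrics <- accuracy_metrics(actual = c(10, 20), predicted = c(12, 18), train = 1:24)
metrics["MAE"]   # 2
metrics["MAPE"]  # 15 (mean of 20% and 10%)
```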
### Step 5: Results
We found that the average of the seasonal [ETS](http://en.wikipedia.org/wiki/Exponential_smoothing) and seasonal [ARIMA](http://en.wikipedia.org/wiki/Autoregressive_integrated_moving_average) models performs better than either algorithm individually, as measured by MASE, sMAPE, and MAPE.
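The ensemble itself is simple; assuming the combined forecast is the pointwise mean of the two models' point forecasts, it reduces to:

```r
# Average two point-forecast vectors element-wise.
average_forecast <- function(fc_ets, fc_arima) {
  (as.numeric(fc_ets) + as.numeric(fc_arima)) / 2
}

average_forecast(c(10, 12), c(14, 16))  # c(12, 14)
```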
![][image4]
The final experiment looks like this:
![][image3]
<!-- Images -->
[image1]:http://az712634.vo.msecnd.net/samplesimg/v1/12/split.png
[image2]:http://az712634.vo.msecnd.net/samplesimg/v1/12/seasonal_arima.png
[image3]:http://az712634.vo.msecnd.net/samplesimg/v1/12/full_experiment.png
[image4]:http://az712634.vo.msecnd.net/samplesimg/v1/12/table.png