Cache Hit Ratio Optimization

March 13, 2018
Optimize cache hit ratios and reduce "miss rates" with machine learning regression algorithms.
The use case described in this experiment is the estimation of demand for specific objects or pages in a software application. Objects that are requested more frequently can be cached to optimize response time to the client. The performance of a cache is measured by its "hit ratio" and "miss ratio". The hit ratio is the fraction of accesses that are a hit (object found in cache) over all requests. The miss ratio is the fraction of accesses that are a miss (object not found in cache), that is, the complement of the hit ratio to 100 percent. To optimize the performance of a cache instance, we want to increase the hit ratio and decrease the miss ratio.

A prediction technique based on a machine learning regression (demand estimation) algorithm predicts the likelihood that an object will be used; the object can therefore be allocated in cache before a request is submitted, increasing the chance of a hit. A full description of this technique is in my article for MSDN Magazine, July 2017 issue: [Scale Applications with Microsoft Azure Redis and Machine Learning](

The dataset contains information about the total number of object/page views and the total number of cache hits, which allows us to define the column "Hit Ratio" as the percentage of cache hits for a specific object at a given time.

# Dataset

The data contains the following columns:

- Date
- Object
- Total Hits
- Cache Hits
- Hit Ratio, defined as the percentage of cache hits over total hits

![Dataset sample][1]

# Training

Two training processes are tried, to identify the best performance:

- **Poisson Regression**: Poisson regression is intended for use in regression models that predict numeric values in a Poisson distribution. A [Poisson distribution]( is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time or space, if these events occur with a known constant rate and independently of the time since the last event.
- **Neural Network Regression**: Although neural networks are widely known for their use in deep learning and for modeling complex problems such as image recognition, they are easily adapted to regression problems. Neural network regression is a supervised learning method, and therefore requires a tagged dataset, which includes a label column. Because a regression model predicts a numerical value, the label column must be a numerical data type.

Both methods apply to the problem we are trying to solve. Executing them in parallel allows us to compare their performance and accuracy at the evaluation step at the end of the flow.

Using the Poisson regression method requires the label column (Hit Ratio) to be a whole (integer) number. For this reason, the initial value in the Hit Ratio column, which is a decimal fraction, has to be converted to an integer number by adding two sequential "Apply Math Operation" tasks:

- The first math operation multiplies the value in the Hit Ratio column by 100 and replaces (Output mode = Inplace) the existing value with the result of the multiplication.
- The second math operation rounds the value to a whole number (Constant Precision = 0) and, again, replaces the previous value with the new one.

![Training experiment][2]

# Scoring

Two scored models are produced, one for each training method. The scored model contains an additional column, "Scored Labels", appended to the original dataset.

# Evaluation

The **Evaluate Model** action assesses the performance of the scored models. The metrics returned for regression models are generally designed to estimate the amount of error. A model is considered to fit the data well if the difference between observed and predicted values is small.
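As a minimal illustration of how these error metrics are derived, the following sketch computes them with NumPy. The observed and predicted hit-ratio values are made up for illustration; they are not taken from the experiment.

```python
import numpy as np

# Hypothetical observed and predicted hit ratios (integer percentages)
y_true = np.array([80.0, 65.0, 90.0, 70.0, 85.0])
y_pred = np.array([78.0, 70.0, 88.0, 72.0, 83.0])

err = y_true - y_pred

mae = np.mean(np.abs(err))            # Mean Absolute Error
rmse = np.sqrt(np.mean(err ** 2))     # Root Mean Squared Error

# Errors of the naive baseline that always predicts the mean of y_true
baseline = y_true - np.mean(y_true)

rae = np.sum(np.abs(err)) / np.sum(np.abs(baseline))  # Relative Absolute Error
rse = np.sum(err ** 2) / np.sum(baseline ** 2)        # Relative Squared Error
r2 = 1.0 - rse                                        # Coefficient of Determination

print(mae, rmse, rae, rse, r2)
```

Lower MAE, RMSE, RAE, and RSE mean a better fit, while the coefficient of determination improves as it approaches 1, which matches the observations below.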
From the evaluation results generated by this experiment, with the results on the left-hand side being those for the Poisson regression method, and the ones on the right-hand side being those for the Neural Network regression, we can observe:

- **Mean absolute error** measures how close the predictions are to the actual outcomes. The lower the score, the better.
- **Root mean squared error** is a single value that summarizes the error in the model. A lower value, as for the model on the right, indicates better accuracy.
- **Relative absolute error** is the relative absolute difference between expected and actual values. The value on the right has a better score.
- **Relative squared error** normalizes the total squared error of the predicted values. The right model outperforms the left one by a large margin.
- **Coefficient of determination** represents the predictive power of the model. The closer the value is to 1, the better; a value of one (1) means a perfect fit.

![Evaluation results][3]

# Web Service

Because the performance of the Neural Network regression is clearly better in this experiment, we select its trained model to build a predictive experiment. Once you run the **Predictive Experiment** and deploy it as a **Web Service**, it is possible to interact with the "Cache Hit Ratio Optimization" service programmatically.

![Predictive experiment][4]

Based on the selected columns in the Training experiment, testing the service asks for the following values in input:

- Date
- Object
- Total Hits

It then generates an estimated percentage of hits for the indicated object at a specific time. With these figures in mind, we can build an external process that preallocates in cache the objects with the highest estimated hit ratio.
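Such an external preallocation process could be sketched as follows. Here `predict_hit_ratio` is a placeholder standing in for a call to the deployed web service, and a plain dict stands in for a real cache store such as Redis; the object names and scores are invented for illustration.

```python
def predict_hit_ratio(date, obj, total_hits):
    # Placeholder: in practice this would call the "Cache Hit Ratio
    # Optimization" web service and return its Scored Labels value.
    return {"object-a": 92, "object-b": 45, "object-c": 78}[obj]

def preallocate(objects, date, cache, top_n=2):
    # Rank objects by their predicted hit ratio, highest first.
    ranked = sorted(
        objects,
        key=lambda o: predict_hit_ratio(date, o["name"], o["total_hits"]),
        reverse=True,
    )
    # Load the top candidates into the cache before any request arrives.
    for obj in ranked[:top_n]:
        cache[obj["name"]] = obj["payload"]

cache = {}
objects = [
    {"name": "object-a", "total_hits": 1000, "payload": "<page A>"},
    {"name": "object-b", "total_hits": 300, "payload": "<page B>"},
    {"name": "object-c", "total_hits": 700, "payload": "<page C>"},
]
preallocate(objects, "2018-03-13", cache)
print(sorted(cache))  # the objects with the highest predicted hit ratio
```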
To consume the Web Service programmatically, this is the expected format of the **request message**:

```json
{
  "Inputs": {
    "input1": {
      "ColumnNames": [ "DATE", "OBJECT", "TOTAL HITS" ],
      "Values": [
        [ "", "value", "0" ],
        [ "", "value", "0" ]
      ]
    }
  },
  "GlobalParameters": {}
}
```

And this is the format of the **response message**, containing the **Scored Labels** value:

```json
{
  "Results": {
    "output1": {
      "type": "DataTable",
      "value": {
        "ColumnNames": [ "Scored Labels" ],
        "ColumnTypes": [ "Numeric" ],
        "Values": [
          [ "0" ],
          [ "0" ]
        ]
      }
    }
  }
}
```

[1]:
[2]:
[3]:
[4]:
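Building the request body and extracting the scores from the response can be sketched in Python as below. The service URL, API key, and the sample scores are placeholders; the actual POST (omitted here) would send the request body as JSON with an `Authorization: Bearer <api key>` header, for example via `urllib.request`.

```python
import json

SERVICE_URL = "https://<region>.services.azureml.net/<workspace>/execute"  # placeholder
API_KEY = "<your-api-key>"  # placeholder

def build_request(rows):
    # Build the request message in the format expected by the web service.
    # Each row is a (date, obj, total_hits) tuple.
    return {
        "Inputs": {
            "input1": {
                "ColumnNames": ["DATE", "OBJECT", "TOTAL HITS"],
                "Values": [[date, obj, str(hits)] for date, obj, hits in rows],
            }
        },
        "GlobalParameters": {},
    }

def parse_scores(response_body):
    # Extract the Scored Labels column from the response message.
    values = response_body["Results"]["output1"]["value"]["Values"]
    return [float(row[0]) for row in values]

# Illustrative response in the format shown above (scores are made up).
sample_response = {
    "Results": {
        "output1": {
            "type": "DataTable",
            "value": {
                "ColumnNames": ["Scored Labels"],
                "ColumnTypes": ["Numeric"],
                "Values": [["87"], ["42"]],
            }
        }
    }
}

print(json.dumps(build_request([("", "value", 0)])))
print(parse_scores(sample_response))
```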