Anomaly Detection

June 24, 2017
Example ML experiment that attempts to predict credit risk by treating risky applications as anomalies within the data. Data: German Credit data.
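As a rough illustration of treating risky cases as anomalies, the sketch below scores a row by its PCA reconstruction error: a model is fit on "normal" rows only, and rows that lie far from the fitted principal direction receive a high score. This is a minimal sketch of the general technique, not the implementation of the Azure ML module, whose internals may differ. The helper names `fit_pca_1d` and `anomaly_score` are hypothetical, only one principal component is used, and the data is synthetic rather than the German Credit data (whose categorical attributes would first need a numeric encoding).

```python
import math
import random


def fit_pca_1d(rows, iterations=100):
    """Fit a one-component PCA to rows (a list of equal-length numeric lists).

    Returns (mean, component); component is a unit vector along the direction
    of greatest variance, found by power iteration on the covariance matrix.
    Assumes the rows are not all identical (otherwise the norm below is zero).
    """
    n, d = len(rows), len(rows[0])
    mean = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    component = [1.0 / math.sqrt(d)] * d  # arbitrary unit-length starting guess
    for _ in range(iterations):
        product = [sum(cov[i][j] * component[j] for j in range(d))
                   for i in range(d)]
        norm = math.sqrt(sum(x * x for x in product))
        component = [x / norm for x in product]
    return mean, component


def anomaly_score(row, mean, component):
    """Squared distance between a row and its projection onto the component."""
    centered = [x - m for x, m in zip(row, mean)]
    t = sum(c * v for c, v in zip(centered, component))
    return sum((c - t * v) ** 2 for c, v in zip(centered, component))


# Synthetic "normal" rows lying close to the line y = 2x (an assumption made
# purely for illustration; this is not the German Credit data).
rng = random.Random(0)
normal_rows = []
for _ in range(200):
    x = rng.uniform(0.0, 10.0)
    normal_rows.append([x, 2.0 * x + rng.gauss(0.0, 0.1)])

mean, component = fit_pca_1d(normal_rows)

score_normal = anomaly_score([5.0, 10.0], mean, component)  # follows the pattern
score_risky = anomaly_score([5.0, 0.0], mean, component)    # breaks the pattern

print("normal-looking row:", score_normal)
print("anomalous row:", score_risky)
```

In practice a threshold on the score would separate the two groups, and choosing that threshold is part of what the scoring and evaluation steps of an experiment are for.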
**Anomaly Detection**

#In this session
• Anomaly Detection
• One-Class SVM Algorithm
• PCA-Based Algorithm
• Data set
• Data attribute
• Experiment Steps

![enter image description here][1]

#Anomaly Detection
• Applied to credit card fraud, transactions, medical data, text, etc.
• Anomalies are also referred to as outliers, novelties, noise, deviations, and exceptions
• The data consists of 'normal' applications and 'risky' applications
• Risky applications = anomalies

#One-Class SVM
• SVM = Support Vector Machine
• SVMs are supervised learning models; the one-class variant is trained on a single class
• Analyze data and recognize patterns
• Use when you have a lot of "normal" data and not many cases of the anomalies
• Use with Train Anomaly Detection Model
• The training data set contains all or mostly normal cases

#PCA-Based Anomaly Detection module
• Principal Component Analysis (PCA)
• Use when it is easy to obtain training data from one class
• One class = acceptable transactions
• Use when it is difficult to obtain sufficient samples of the targeted anomalies
• Example: detecting fraudulent transactions
• You might not have enough examples of fraud to train the model
• But you have many examples of good transactions

#Data set
https://archive.ics.uci.edu/ml/datasets/Statlog+(German+Credit+Data)

#German credit dataset
• Credit applications
• 1000 instances (rows)
• Attributes = 20 (7 numerical, 13 categorical)
• Label 1 = normal, 2 = risky

#Data attribute
Attributes: account status, duration (months), credit history, purpose, credit amount, savings, employed since, installment rate, sex …

#Experiment Steps
1. Import data set
2. Edit metadata
3. Split data for training
4. Split data for scoring
5. Add PCA-Based Anomaly Detection
6. Add Tune Model Hyperparameters
7. Add Train Anomaly Detection Model
8. Add Score Model
9. Add Evaluate Model

A. Import data set<br>
B. Edit metadata<br>
C. Split data for training<br>
D. Set Split data 1 property<br>
E. Set Split data 2 property<br>
F. Set Split data 3 property<br>
G. Set PCA-Based Anomaly Detection property<br>
H. Add Tune Model Hyperparameters<br>
I. Add Train Anomaly Detection Model<br>
J. Add Score Model<br>
K. Add Evaluate Model<br>

#Training mode
• Single Parameter: If you know how you want to configure the model, you can provide a specific set of values as arguments. You might have learned these values by experimentation or received them as guidance.<br>
• Parameter Range: If you are not sure of the best parameters, you can specify multiple values and use a parameter sweep to find the optimal configuration.<br>

![enter image description here][2]
![enter image description here][3]

#More Information
PCA-Based Anomaly Detection
https://msdn.microsoft.com/en-us/library/azure/dn913102.aspx
<br><br>

----------

> This ML experiment is for [Microsoft Azure Machine Learning Course][101].<br>
For the complete experiment list, [click here][102].<br>
Laploy | laploy@gmail.com | 084 007 5544 | [www.laploy.com][103]<br>
![enter image description here][104]

----------

[101]: https://notebooks.azure.com/laploy/libraries/loyml/html/00001%20Sessions%20summary.ipynb
[102]: https://gallery.cortanaintelligence.com/Home/Author?authorId=81E333F747E3429B55A3445E6714C36F60B397C13B4D0B07F34DEF1421F64D73
[103]: http://laploy.com
[104]: https://raw.githubusercontent.com/laploy/mli/master//loy-small.jpg
[1]: https://raw.githubusercontent.com/laploy/mli/master//12500-000.PNG
[2]: https://raw.githubusercontent.com/laploy/mli/master//12500-001.JPG
[3]: https://raw.githubusercontent.com/laploy/mli/master//12500-002.JPG