# Fraud Detection with Azure HDInsight Spark Clusters
This solution demonstrates how online retailers can build and deploy a machine learning model with Microsoft R Server on Azure HDInsight Spark clusters to detect fraudulent purchase transactions. Running Microsoft R Server on Spark lets the solution handle big data efficiently.
> **Note:** You can read more about this solution, including its deployment guides, in the [Fraud Detection solution](https://github.com/Microsoft/r-server-fraud-detection) repository on GitHub.
> This solution will create an HDInsight Spark cluster with Microsoft R Server. This cluster will contain 2 head nodes, 2 worker nodes, and 1 edge node, with a total of 32 cores. The approximate cost for this HDInsight Spark cluster is 3.11 USD/hour. Billing starts once the cluster is created and stops when the cluster is deleted. Billing is pro-rated per minute, so you should always **delete your cluster** when it is no longer in use. Use the Deployments page to delete the entire solution once you are done.
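
To get a feel for how the per-minute billing adds up, here is a minimal Python sketch that estimates the cluster charge from its running time. It uses only the approximate rate quoted above (3.11 USD/hour). The function name `estimate_cluster_cost` is hypothetical, and rounding partial minutes up to a whole minute is an assumption, not documented Azure behavior. Actual charges depend on region and current pricing.

```python
import math

# Approximate hourly rate quoted in this README; real pricing may differ.
HOURLY_RATE_USD = 3.11


def estimate_cluster_cost(minutes_running, hourly_rate_usd=HOURLY_RATE_USD):
    """Estimate the cluster charge in USD for a given running time in minutes.

    Assumption: billing is pro-rated per minute and a partial minute
    is billed as a whole minute.
    """
    if minutes_running < 0:
        raise ValueError("minutes_running must be non-negative")
    billed_minutes = math.ceil(minutes_running)  # assumed rounding rule
    return hourly_rate_usd * billed_minutes / 60


if __name__ == "__main__":
    # Compare a short experiment with a cluster left running for a day.
    for hours in (1, 8, 24):
        cost = estimate_cluster_cost(hours * 60)
        print(f"{hours:>2} h: about {cost:.2f} USD")
```

Under these assumptions, a cluster left running for a full day costs roughly 75 USD, which is why deleting it after use matters.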
## Overview
Fraud detection is one of the earliest industrial applications of data mining and machine learning. This solution shows how to build and deploy a machine learning model for online retailers to detect fraudulent purchase transactions.
Read more about this solution, including step-by-step instructions on how to deploy it, at the [Fraud Detection Website](https://microsoft.github.io/r-server-fraud-detection/).
## Disclaimer
©2017 Microsoft Corporation. All rights reserved. This information is provided "as-is" and may change without notice. Microsoft makes no warranties, express or implied, with respect to the information provided here. Third party data was used to generate the Solution. You are responsible for respecting the rights of others, including procuring and complying with relevant licenses in order to create similar datasets.