Loan Credit Risk with Azure HDInsight Spark Clusters

June 29, 2017

This solution demonstrates how to build and deploy a machine learning model with Microsoft R Server on Azure HDInsight Spark clusters. The model gives a lending institution predictive analytics for reducing the number of loans it offers to the borrowers most likely to default, which increases the profitability of its loan portfolio. Running on Spark with Microsoft R Server lets the solution handle big data efficiently.
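The cluster note below quotes an approximate rate of 3.11 USD/hour, billed pro-rated per minute. The following is a minimal sketch of that arithmetic, not an official pricing calculator: the function name `estimate_cluster_cost`, the whole-minute input, and the rounding to cents are assumptions made for illustration, and actual Azure charges may differ.

```python
from decimal import Decimal, ROUND_HALF_UP

# Approximate hourly rate quoted in this document; real pricing may differ.
HOURLY_RATE_USD = Decimal("3.11")


def estimate_cluster_cost(minutes_running: int,
                          hourly_rate: Decimal = HOURLY_RATE_USD) -> Decimal:
    """Estimate the cluster cost for a number of whole minutes of uptime.

    Assumption: billing is linear per minute (hourly rate / 60) and the
    total is rounded to the nearest cent.
    """
    if minutes_running < 0:
        raise ValueError("minutes_running must be non-negative")
    cost = hourly_rate * Decimal(minutes_running) / Decimal(60)
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    print(estimate_cluster_cost(90))        # 4.67  (an hour and a half)
    print(estimate_cluster_cost(24 * 60))   # 74.64 (a cluster left running for a day)
```

Under these assumptions, a cluster left running for a full day costs roughly 75 USD, which is why deleting it when it is no longer in use matters.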
> **Note:** You can read more about this solution and its deployment guides in the Loan Credit Risk solution published on GitHub.
>
> This solution will create an HDInsight Spark cluster with Microsoft R Server. This cluster will contain 2 head nodes, 2 worker nodes, and 1 edge node with a total of 32 cores. The approximate cost for this HDInsight Spark cluster is 3.11 USD/hour. Billing starts once a cluster is created and stops when the cluster is deleted. Billing is pro-rated per minute, so you should always **delete your cluster** when it is no longer in use. Use the Deployments page to delete the entire solution once you are done.

## Overview

If we had a crystal ball, we would only loan money to someone we knew would pay us back. A lending institution can use predictive analytics to reduce the number of loans it offers to the borrowers most likely to default, increasing the profitability of its loan portfolio. This solution uses simulated data for a small personal loan financial institution and builds a model to help detect whether a borrower will default on a loan.

Read more about this solution, including step-by-step instructions on how to deploy it, at the Loan Credit Risk website.

## Disclaimer

©2017 Microsoft Corporation. All rights reserved. This information is provided "as-is" and may change without notice. Microsoft makes no warranties, express or implied, with respect to the information provided here. Third-party data was used to generate the Solution. You are responsible for respecting the rights of others, including procuring and complying with relevant licenses in order to create similar datasets.