Data Virtualization Patterns for Advanced Hybrid Analytics
By AzureML Team for Microsoft September 29, 2016
Over a wide variety of customer engagements and collateral-building efforts, we have identified a few commonly occurring Hybrid Analytics use cases and have developed powerful yet simple solutions using the Cortana Intelligence Suite of products.
Many companies now rely on data to make decisions that advance organizational growth, drive business strategy, and generate profit. Because data is generated at high velocity from disparate sources, a company's data may reside both on premises and in the cloud, depending on the situation. There is a growing need for hybrid on-prem/cloud systems that provide routes for bringing this data together. The purpose of this tutorial is to discuss some identified use cases and the solutions that achieve them using Microsoft Azure products. These patterns not only recur across various clients but also appear to be a growing business process trend. We've done some research and put together this tutorial to cover these cases. A few possible scenarios are:
- Low-horsepower on-premises systems offload large data computations to the cloud; the results are sent back to the on-premises systems to be consumed by the applications that depend on them (Query Execution Scale-out).
- The user may want to stay closest to the largest source of data (locality of reference), probably relational, while integrating and referencing NoSQL data such as clickstream information to drive sales.
- Leverage SQL knowledge both on-prem and in the cloud, on both relational and non-relational data.
- Avoid low throughput on the data pipeline caused by constant large data transfers.
- Segregate and secure sensitive information on-prem while using the cloud for large-scale crunching of other types of data.
- Avoid replicating business logic on multiple systems; save time, be more efficient, and minimize latency and network I/O by using incremental copying.
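To make the idea of incremental copying in the last scenario concrete, here is a minimal sketch in Python. It is not an Azure API and not the implementation used in this tutorial: the function name `incremental_copy`, the `modified_at` column, and the in-memory list and dict standing in for the on-prem source and cloud target are all hypothetical. It assumes each row carries a last-modified timestamp, so only rows changed since the previous run (the "watermark") need to cross the network.

```python
from datetime import datetime
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def incremental_copy(
    source: List[Row], target: Dict[int, Row], watermark: datetime
) -> Tuple[int, datetime]:
    """Copy only rows modified after `watermark` (hypothetical helper).

    `source` stands in for an on-prem table and `target` for a cloud table
    keyed by row id. Returns the number of rows copied and the new watermark.
    """
    # Select only the rows that changed since the last successful copy.
    changed = [row for row in source if row["modified_at"] > watermark]

    # Upsert the changed rows into the target by primary key.
    for row in changed:
        target[row["id"]] = row

    # Advance the watermark to the newest timestamp seen; keep it if nothing changed.
    new_watermark = max((row["modified_at"] for row in changed), default=watermark)
    return len(changed), new_watermark
```

Calling the function again with the returned watermark transfers nothing until a source row changes, which is what keeps repeated transfers small.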