Data Science Virtual Machine
This virtual machine contains popular tools for data science modeling and development activities. It is offered in both Windows and Linux (CentOS 7) editions.
The Data Science Virtual Machine (VM) is a custom Azure VM with several popular tools for data science modeling and development. It is offered in both Windows and Linux editions.
The main tools included are Microsoft R Server, the Anaconda Python distribution, Jupyter notebooks for Python and R, SQL Server 2016 Developer edition (Windows) or a PostgreSQL database (Linux), Azure tools, and libraries to access various Azure services such as AzureML, databases, and big data services. It also has machine learning tools and algorithms such as CNTK (a deep learning toolkit), Vowpal Wabbit, and xgboost.
In this collection, the items below help you quickly create your own instance of the Windows or the Linux Data Science Virtual Machine.
With the Data Science Virtual Machine you can jump-start modeling and development for your data science project using pre-installed software commonly used for analytics and machine learning tasks, in a variety of languages including R, Python, SQL, Java, and more. Jupyter notebooks offer a browser-based experimentation and development environment for both Python and R. Microsoft R Server is included in the VM.
On the VM, the Azure SDKs for Python, R, Java, Node.js, Ruby, and PHP let you build applications using various Azure services in the cloud, including the Cortana Intelligence Suite, which comprises Azure Machine Learning, Azure Data Factory, Stream Analytics, SQL Data Warehouse, Hadoop, Data Lake, Spark, and more. Other powerful machine learning tools and algorithms, such as CNTK (a deep learning toolkit from Microsoft Research), Vowpal Wabbit, and xgboost, are pre-installed locally. Azure command-line tools allow you to manage your Azure resources.
The Linux VM also includes runtimes for Ruby, PHP, Node.js, Java, and Perl, as well as Eclipse and standard editors such as vim, Emacs, and gedit. A remote graphical desktop is also provided, pre-configured on the VM side (it needs a one-time download of the X2Go client). You have full access to the virtual machine and the shell, including sudo access for the account that is created when the VM is provisioned. The Linux VM is built on the OpenLogic CentOS-based 7.2 distribution.
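As a quick way to see which of the tools described above are visible from a given VM session, a minimal sketch along the following lines may help. It uses only the Python standard library. The module and executable names in the two lists are assumptions chosen for illustration; they are not taken from the DSVM documentation and may differ between the Windows and Linux editions.

```python
import importlib.util
import shutil

# Assumed names for illustration only; adjust them to the tools you care about.
PYTHON_MODULES = ["numpy", "pandas", "xgboost", "azure"]
EXECUTABLES = ["python", "R", "jupyter", "java", "node", "ruby", "php", "perl"]


def check_modules(names):
    """Map each top-level module name to True if it can be located.

    find_spec only locates the module; it does not import it.
    """
    found = {}
    for name in names:
        try:
            found[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found[name] = False
    return found


def check_executables(names):
    """Map each executable name to True if it is found on the PATH."""
    return {name: shutil.which(name) is not None for name in names}


if __name__ == "__main__":
    for label, results in (
        ("Python modules", check_modules(PYTHON_MODULES)),
        ("Executables", check_executables(EXECUTABLES)),
    ):
        print(label)
        for name, present in sorted(results.items()):
            print(f"  {name}: {'found' if present else 'not found'}")
```

A "not found" result only means the tool is not on the current PATH or in the active Python environment; it may still be installed elsewhere on the VM.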
In this collection, we will feature interesting experiments and tutorials that use the Data Science Virtual Machine. Please send in your suggestions in the comments box below.
Here are resources to help you get started with the DSVM.
**Windows Edition:**
- Documentation: https://azure.microsoft.com/en-us/documentation/articles/machine-learning-data-science-provision-vm/
- Article/Tutorial – Ten things you can do on the DSVM: https://azure.microsoft.com/en-us/documentation/articles/machine-learning-data-science-vm-do-ten-things/
- Link to Create VM instance: https://azure.microsoft.com/marketplace/partners/microsoft-ads/standard-data-science-vm/
- Link to Create DSVM for Deep Learning on Azure GPU VM: http://aka.ms/dsvm/deeplearning
**Linux Edition:**
- Documentation: https://azure.microsoft.com/en-us/documentation/articles/machine-learning-data-science-linux-dsvm-intro/
- Link to Create VM instance: https://azure.microsoft.com/marketplace/partners/microsoft-ads/linux-data-science-vm/
**Webinar:**
https://channel9.msdn.com/blogs/Cloud-and-Enterprise-Premium/Inside-the-Data-Science-Virtual-Machine (Duration: 1 Hour)