Vowpal Wabbit Samples

September 15, 2016

A series of sample experiments using Vowpal Wabbit modules in Azure ML Studio.
[Vowpal Wabbit](http://hunch.net/~vw), or VW for short, is a powerful open-source, online, out-of-core machine learning system created by [John Langford](http://hunch.net/~jl/) and colleagues at Microsoft Research. Azure ML has native support for VW through the [Train VW](https://msdn.microsoft.com/library/azure/86f666bb-d459-4117-bbb0-4edfd566c3a9?f=255&MSPPError=-2147217396) and [Score VW](https://msdn.microsoft.com/library/azure/43d255dc-c03d-4dce-acc4-884e660210d9) modules. You can use VW to train on datasets much larger than 10 GB, which is generally the upper limit allowed by other learning algorithms in Azure ML. VW supports many learning algorithms, including OLS regression, matrix factorization, single-layer neural networks, Latent Dirichlet Allocation, contextual bandits, and more. For full documentation, see the [tutorials on the official GitHub site](https://github.com/JohnLangford/vowpal_wabbit/wiki/Tutorial). There is also an [excellent article](http://www.zinkov.com/posts/2013-08-13-vowpal-tutorial/), albeit a little dated, to get you started quickly.

This collection of samples shows how to use VW in Azure ML. We start by [converting the well-known Adult Income Census data into VW format](https://gallery.cortanaintelligence.com/Experiment/Convert-Dataset-to-VW-Format-2), and then [train a simple VW model](https://gallery.cortanaintelligence.com/Experiment/Train-a-VW-Model-with-Small-Dataset-1) using the converted dataset. Later samples will show how to train on a 37 GB flight dataset and how to continuously improve an existing model with new batches of training data.

Special thanks to [Luong Hoang](https://www.microsoft.com/en-us/research/people/lhoang/), who helped create these samples.