Neural Network: Convolution and pooling deep net
This experiment demonstrates how to use the **Multiclass Neural Network** module to train a neural network that is defined in the Net# language.
#Deep learning with neural networks and Net\# #
In these experiments we will use the **Multiclass Neural Network** module to train models to recognize hand-written digits. We will use the [Net#](http://azure.microsoft.com/en-us/documentation/articles/machine-learning-azure-ml-netsharp-reference-guide) language to define the network architecture, and consider several architectures, ranging from a simple network with one hidden layer to more advanced convolutional neural networks ("deep nets").
##Dataset##
The data used in this experiment is the popular MNIST dataset, which consists of 70,000 grayscale images of hand-written digits. The dataset is publicly available on the [MNIST website](http://yann.lecun.com/exdb/mnist).
##Running the Experiments##
1. We will start with the most basic type of neural network: one hidden layer, fully connected. In the list of samples in Azure ML Studio, find the sample experiment, **`Neural Network: 1 hidden layer`**. Save a copy of the experiment to your workspace by clicking **Save As**. The experiment should look like the following graphic:
![experiment][experiment]
2. Run the copy of the experiment that you just made, and when the experiment has completed, right-click the **Result Dataset** port of the **Execute R script** module named **`Calculate accuracy`**, and select **Visualize**.
3. The chart shows you the accuracy of the model, which should be around **0.9767** (**97.67%**). Another way to interpret these results is that the model has an error of **2.33%**.
![result][result]
4. To view the confusion table for this model (shown in the following diagram), right-click the output of the **Evaluate Model** module and select **Visualize**.
![conftable][conftable]
5. Click the **Multiclass Neural Network** module, and review the custom script that defines the neural network architecture. The lines of code in the **Neural network definition** text box are written in the Net\# language.
![netsharp][netsharp]
This particular network has one input layer, named **Picture**, of size 28 \* 28. The product of these dimensions is 784, which corresponds to the number of pixels in each image in the MNIST dataset: 28 pixels in width by 28 pixels in height.
In the next line of Net\# code, the input layer is connected to the hidden layer **H** which has 100 nodes.
Finally, the hidden layer is connected to the output layer **Result**. The output layer has a size of 10 because there are 10 digits to classify. The keyword _softmax_ in this line denotes the activation function. For more information on the activation functions supported in Net#, refer to the [Net# guide](http://azure.microsoft.com/en-us/documentation/articles/machine-learning-azure-ml-netsharp-reference-guide).
We recommend that you follow the same procedure to explore the three related experiments, which use increasingly complex network architectures and achieve progressively lower error rates.
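To make the data flow through these three layers concrete, here is a minimal Python sketch of a forward pass through a network of the same shape (784 inputs, 100 hidden nodes, 10 softmax outputs). It is not how the Azure ML module is implemented: the random weights, the zero biases, the sigmoid activation in the hidden layer, and all function names are assumptions made for illustration, and no training takes place.

```python
import math
import random


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output node."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def sigmoid(values):
    """Element-wise logistic function (assumed hidden-layer activation)."""
    return [1.0 / (1.0 + math.exp(-v)) for v in values]


def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def random_layer(n_out, n_in, rng):
    """Hypothetical initialization: small random weights, zero biases."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
picture = [rng.random() for _ in range(28 * 28)]  # stand-in for one flattened image
w_h, b_h = random_layer(100, 28 * 28, rng)        # Picture -> H
w_out, b_out = random_layer(10, 100, rng)         # H -> Result

hidden = sigmoid(dense(picture, w_h, b_h))
result = softmax(dense(hidden, w_out, b_out))
predicted_digit = max(range(10), key=lambda d: result[d])

print(len(hidden), len(result), round(sum(result), 6))
```

The ten values in `result` are non-negative and sum to 1, which is why softmax is a natural choice for a ten-class output layer: each value can be read as the score assigned to one digit.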
The following sections provide a short description of the highlights from each experiment:
## Neural Network: 2 hidden layers
This experiment uses two fully connected hidden layers, each with 200 nodes. The accuracy of this model is **0.981** (**98.1%**), which means an error rate of **1.9%**.
The following Net# code defines this architecture:

    input Picture [28,28];
    hidden H1 [200] from Picture all;
    hidden H2 [200] from H1 all;
    output Result [10] softmax from H2 all;

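One way to get a feel for how much larger this network is than the one with a single hidden layer is to count its trainable parameters. The Python sketch below does that for a chain of fully connected layers. The helper name `dense_parameter_count` is hypothetical, and the count assumes one bias per non-input node, which is an assumption rather than something stated in this document.

```python
def dense_parameter_count(layer_sizes, include_bias=True):
    """Count weights (and optionally biases) in a chain of fully connected layers."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out      # one weight per input/output pair
        if include_bias:
            total += n_out         # assumed: one bias per output node
    return total


one_hidden = dense_parameter_count([28 * 28, 100, 10])
two_hidden = dense_parameter_count([28 * 28, 200, 200, 10])

print(one_hidden)  # 79510
print(two_hidden)  # 199210
```

Under these assumptions, the two-layer network has roughly 2.5 times as many parameters as the single-layer one, in exchange for the drop in error from 2.33% to 1.9%.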
## Neural Network: Basic convolution
The code in this module defines a simple convolutional neural network with 3 hidden layers, in which the first 2 layers are convolutional and the remaining layer is fully connected. This network has an accuracy of **0.9841** (**98.41%**), or an error rate of **1.59%**.
This architecture is defined by these lines:

    const { T = true; F = false; }
    input Picture [28, 28];
    hidden C1 [5, 12, 12]
      from Picture convolve {
        InputShape  = [28, 28];
        KernelShape = [ 5,  5];
        Stride      = [ 2,  2];
        MapCount    = 5;
      }
    hidden C2 [50, 4, 4]
      from C1 convolve {
        InputShape  = [ 5, 12, 12];
        KernelShape = [ 1,  5,  5];
        Stride      = [ 1,  2,  2];
        Sharing     = [ F,  T,  T];
        MapCount    = 10;
      }
    hidden H3 [100]
      from C2 all;
    output Result [10] softmax
      from H3 all;

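The layer sizes `[5, 12, 12]` and `[50, 4, 4]` in this script follow from the kernel and stride settings. As a check, the Python sketch below applies the no-padding formula that appears in the comments of the last script in this document, `(input - kernel) / stride + 1`, with integer division. The function name `conv_output_size` is an illustrative assumption.

```python
def conv_output_size(input_size, kernel_size, stride):
    """Output length along one dimension for a convolution without padding."""
    return (input_size - kernel_size) // stride + 1


# First convolutional layer: 28x28 input, 5x5 kernel, stride 2, 5 maps.
c1_maps = 5
c1_side = conv_output_size(28, 5, 2)

# Second convolutional layer: 5x5 kernel, stride 2, 10 maps per input map.
# The kernel has size 1 in the map dimension and sharing is off there,
# so every one of the 5 input maps yields its own 10 output maps.
c2_maps = c1_maps * 10
c2_side = conv_output_size(c1_side, 5, 2)

print([c1_maps, c1_side, c1_side])  # [5, 12, 12]
print([c2_maps, c2_side, c2_side])  # [50, 4, 4]
```

The fully connected layer `H3` therefore receives 50 \* 4 \* 4 = 800 values from `C2`.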
## Neural Network: Convolution and pooling deep net
Our last example uses a more advanced architecture that can be considered a "deep neural network", because it stacks several kinds of layers: convolutional, max pooling, and fully connected. This network produces a model with an error of approximately **1.1%**, less than half the **2.33%** error of the simple neural network in the first experiment.
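To put the four experiments side by side, the short Python sketch below converts the accuracies reported in this document into error rates and expresses each one relative to the first experiment. The figures are the ones quoted above (the 1.1% value is approximate); the variable names are illustrative only.

```python
# Accuracies as reported in this document.
reported_accuracy = {
    "1 hidden layer": 0.9767,
    "2 hidden layers": 0.981,
    "Basic convolution": 0.9841,
}

# Error rate in percent is (1 - accuracy) * 100.
error_percent = {name: round((1 - acc) * 100, 2)
                 for name, acc in reported_accuracy.items()}
# Only an approximate error rate is reported for the last experiment.
error_percent["Convolution and pooling deep net"] = 1.1

baseline = error_percent["1 hidden layer"]
relative_reduction = {name: round((baseline - err) / baseline * 100, 1)
                      for name, err in error_percent.items()}

for name, err in error_percent.items():
    print(f"{name}: error {err:.2f}%, "
          f"{relative_reduction[name]:.1f}% fewer errors than the baseline")
```

Measured this way, the deep net removes a little over half of the errors made by the network with one hidden layer.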
Defining this more complex network requires a more elaborate Net# script, which illustrates some of the advanced features of Net#, such as named constants and layer dimensions computed from them:

    const { T = true; F = false; }
    const {
      // input image size
      ImgW = 28;
      ImgH = 28;
      // first convolutional layer parameters
      C1Maps = 5;
      C1KernW = 5;
      C1KernH = 5;
      C1StrideW = 1;
      C1StrideH = 1;
      // The following formula computes dimensions with padding enabled.
      C1OutW = (ImgW - 1) / C1StrideW + 1;
      C1OutH = (ImgH - 1) / C1StrideH + 1;
      // first pooling layer parameters
      P1KernW = 2;
      P1KernH = 2;
      P1StrideW = 2;
      P1StrideH = 2;
      // The following formula computes dimensions with no padding.
      P1OutW = (C1OutW - P1KernW) / P1StrideW + 1;
      P1OutH = (C1OutH - P1KernH) / P1StrideH + 1;
      // second convolutional layer parameters
      C2Maps = 10;
      C2KernW = 5;
      C2KernH = 5;
      C2StrideW = 1;
      C2StrideH = 1;
      // The following formula computes dimensions with padding enabled.
      C2OutW = (P1OutW - 1) / C2StrideW + 1;
      C2OutH = (P1OutH - 1) / C2StrideH + 1;
      // Since Z dimension of the kernel is 1 and sharing is disabled in Z dimension
      // total number of maps is a product of input maps and layer maps.
      C2OutZ = C2Maps * C1Maps;
      // second pooling layer parameters
      P2KernW = 2;
      P2KernH = 2;
      P2StrideW = 2;
      P2StrideH = 2;
      // The following formula computes dimensions with no padding.
      P2OutW = (C2OutW - P2KernW) / P2StrideW + 1;
      P2OutH = (C2OutH - P2KernH) / P2StrideH + 1;
    }
    input Picture [ImgH, ImgW];
    hidden C1 [C1Maps, C1OutH, C1OutW]
      from Picture convolve {
        InputShape  = [ImgH, ImgW];
        KernelShape = [C1KernH, C1KernW];
        Stride      = [C1StrideH, C1StrideW];
        Padding     = [T, T];
        MapCount    = C1Maps;
      }
    hidden P1 [C1Maps, P1OutH, P1OutW]
      from C1 max pool {
        InputShape  = [C1Maps, C1OutH, C1OutW];
        KernelShape = [1, P1KernH, P1KernW];
        Stride      = [1, P1StrideH, P1StrideW];
      }
    hidden C2 [C2OutZ, C2OutH, C2OutW]
      from P1 convolve {
        InputShape  = [C1Maps, P1OutH, P1OutW];
        KernelShape = [1, C2KernH, C2KernW];
        Stride      = [1, C2StrideH, C2StrideW];
        Sharing     = [F, T, T];
        Padding     = [F, T, T];
        MapCount    = C2Maps;
      }
    hidden P2 [C2OutZ, P2OutH, P2OutW]
      from C2 max pool {
        InputShape  = [C2OutZ, C2OutH, C2OutW];
        KernelShape = [1, P2KernH, P2KernW];
        Stride      = [1, P2StrideH, P2StrideW];
      }
    hidden H3 [100]
      from P2 all;
    output Result [10] softmax
      from H3 all;

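Because every layer size in this script is derived from constants, it can help to evaluate those constants by hand. The Python sketch below mirrors the two formulas from the script's comments (with and without padding) and prints the resulting layer shapes. The function names are illustrative assumptions, and integer division is used; for the values in this script every division is exact, so the choice does not affect the result.

```python
def padded_size(input_size, stride):
    """Output length along one dimension with padding enabled."""
    return (input_size - 1) // stride + 1


def unpadded_size(input_size, kernel_size, stride):
    """Output length along one dimension with no padding."""
    return (input_size - kernel_size) // stride + 1


img = 28                      # ImgW = ImgH
c1_maps, c2_maps = 5, 10      # C1Maps, C2Maps

c1_out = padded_size(img, stride=1)                         # C1OutW = C1OutH
p1_out = unpadded_size(c1_out, kernel_size=2, stride=2)     # P1OutW = P1OutH
c2_out = padded_size(p1_out, stride=1)                      # C2OutW = C2OutH
c2_out_z = c2_maps * c1_maps                                # C2OutZ
p2_out = unpadded_size(c2_out, kernel_size=2, stride=2)     # P2OutW = P2OutH

layer_shapes = {
    "C1": [c1_maps, c1_out, c1_out],
    "P1": [c1_maps, p1_out, p1_out],
    "C2": [c2_out_z, c2_out, c2_out],
    "P2": [c2_out_z, p2_out, p2_out],
}
for name, shape in layer_shapes.items():
    print(name, shape)

h3_inputs = c2_out_z * p2_out * p2_out
print("values fed into H3:", h3_inputs)
```

With padding enabled, the convolutional layers keep the spatial size of their input (28 and 14), while each max pooling layer halves it. The final pooling layer `P2` has shape `[50, 7, 7]`, so `H3` receives 2,450 values.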
<!-- Images -->
[experiment]:http://az712634.vo.msecnd.net/samplesimg/v1/24/experiment.png
[result]:http://az712634.vo.msecnd.net/samplesimg/v1/24/result.png
[conftable]:http://az712634.vo.msecnd.net/samplesimg/v1/24/conftable.png
[netsharp]:http://az712634.vo.msecnd.net/samplesimg/v1/24/netsharp.png