In this post we will learn how to create a multi scale neural network from scratch. This differs from a typical feed forward neural network, in which the output of each layer is connected only to the subsequent layer. In a multi scale network the output of the lower layers is also fed into the classifier, which lets the model make direct use of the lower level features that those layers identify.
For reference, the architecture of the model we will create is similar to the model published in this paper by Pierre Sermanet and Yann LeCun. This post assumes that the reader knows how to build a feed forward network and is familiar with TensorFlow.
Feed Forward Neural Net
In a strict feed forward neural network the output of each layer is fed only to the layer above it, so the classifier at the end receives input only from the layer that precedes it. Building such a network is straightforward with TensorFlow. The following code creates a strict feed forward neural network (with two convolutional layers and one fully connected layer) using TensorFlow.
Creating the architecture
A few helper methods to define the weights, biases and layers we will be using
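As a rough sketch of what such helpers might look like, the block below defines weight and bias initialisers plus convolution, pooling and fully connected layers. The function names, the 5x5 kernels, the `VALID` padding and the 32x32 grayscale input used in the example are illustrative assumptions, not necessarily the original code, and the block assumes a TensorFlow version that provides `tf.random.truncated_normal`.

```python
import tensorflow as tf


def weight_variable(shape, stddev=0.1):
    """Trainable weights drawn from a truncated normal distribution."""
    return tf.Variable(tf.random.truncated_normal(shape, stddev=stddev))


def bias_variable(size):
    """Trainable biases initialised to zero."""
    return tf.Variable(tf.zeros([size]))


def conv_layer(x, weights, biases):
    """Stride-1 convolution without padding, followed by a ReLU."""
    conv = tf.nn.conv2d(x, weights, strides=[1, 1, 1, 1], padding="VALID")
    return tf.nn.relu(tf.nn.bias_add(conv, biases))


def max_pool_2x2(x):
    """2x2 max pooling with stride 2, which halves the height and width."""
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1],
                          padding="VALID")


def fully_connected(x, weights, biases):
    """Plain affine layer: x @ weights + biases."""
    return tf.matmul(x, weights) + biases


# Example: a batch of two 32x32 grayscale images (assumed input size).
images = tf.zeros([2, 32, 32, 1])
layer1 = max_pool_2x2(conv_layer(images, weight_variable([5, 5, 1, 6]),
                                 bias_variable(6)))
```

With these assumed sizes, `layer1` has shape `[2, 14, 14, 6]`: the 5x5 convolution shrinks 32 to 28, and the pooling halves it to 14.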
Next, we define the main architecture.
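A strict feed forward network with two convolutional layers and one fully connected layer could be sketched as follows. The layer sizes follow a LeNet-style layout for 32x32 grayscale inputs, and the 43 output classes correspond to the German traffic sign dataset; both are assumptions made for illustration, as are all the names.

```python
import tensorflow as tf

NUM_CLASSES = 43  # assumption: number of German traffic sign classes


def init(shape):
    """Small random initial weights (hypothetical helper)."""
    return tf.Variable(tf.random.truncated_normal(shape, stddev=0.1))


params = {
    "w1": init([5, 5, 1, 6]), "b1": tf.Variable(tf.zeros([6])),
    "w2": init([5, 5, 6, 16]), "b2": tf.Variable(tf.zeros([16])),
    "w_fc": init([5 * 5 * 16, NUM_CLASSES]),
    "b_fc": tf.Variable(tf.zeros([NUM_CLASSES])),
}


def conv_relu_pool(x, w, b):
    """Convolution, ReLU, then 2x2 max pooling."""
    conv = tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding="VALID")
    act = tf.nn.relu(tf.nn.bias_add(conv, b))
    return tf.nn.max_pool(act, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1],
                          padding="VALID")


def feed_forward(x, p):
    # Layer 1. Input: 32x32x1. Output: 14x14x6.
    conv1 = conv_relu_pool(x, p["w1"], p["b1"])
    # Layer 2. Input: 14x14x6. Output: 5x5x16.
    conv2 = conv_relu_pool(conv1, p["w2"], p["b2"])
    # Flatten. Input: 5x5x16. Output: 400.
    flat = tf.reshape(conv2, [-1, 5 * 5 * 16])
    # Classifier. Input: 400. Output: NUM_CLASSES logits.
    return tf.matmul(flat, p["w_fc"]) + p["b_fc"]


logits = feed_forward(tf.zeros([4, 32, 32, 1]), params)
```

Note that the classifier only ever sees `conv2`; nothing from `conv1` reaches it directly.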
Multi Scale Neural Net
In a multi scale architecture the output of the initial convolutional layers is subsampled and then fed directly into the classifier. The advantage is that the information gained by the lower layers reaches the classifier without passing through the later layers, so it has much more of a bearing on the final output.
Creating the architecture
As an initial step we will have to change the dimensions of the weights for the flattened layer, as its input will now be the output of the second convolutional layer combined with the subsampled output of the first convolutional layer. The dimensions chosen will be clear if you follow the input and output shapes mentioned above each of the layers.
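To make the dimension bookkeeping concrete, here is a small framework-free sketch that traces the shapes by hand. It assumes a LeNet-style layout (32x32 input, 5x5 convolutions without padding, 2x2 pooling, 6 and 16 filters); these numbers are illustrative assumptions and should be replaced with the ones from your own network.

```python
def out_size(size, window, stride=1):
    """Spatial size after an unpadded ('VALID') convolution or pooling."""
    return (size - window) // stride + 1


def multi_scale_fc_inputs(side=32, filters1=6, filters2=16):
    """Return (flat1, flat2, total) input sizes for the classifier."""
    conv1 = out_size(side, 5)        # 5x5 convolution
    pool1 = out_size(conv1, 2, 2)    # 2x2 max pooling
    conv2 = out_size(pool1, 5)       # second 5x5 convolution
    pool2 = out_size(conv2, 2, 2)    # second 2x2 max pooling
    sub1 = out_size(pool1, 2, 2)     # extra subsampling of the first layer

    flat1 = sub1 * sub1 * filters1   # flattened first-layer branch
    flat2 = pool2 * pool2 * filters2 # flattened second-layer branch
    return flat1, flat2, flat1 + flat2


flat1, flat2, total = multi_scale_fc_inputs()
print(flat1, flat2, total)  # 294 400 694
```

Under these assumptions the fully connected weights grow from `[400, num_classes]` to `[694, num_classes]`.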
Now let’s go through the changes that we will have to make to our LeNet to transform it into a multi scale architecture. First of all we will subsample the output of our first convolutional layer. This can easily be done by passing the output through a max pooling layer.
Next we flatten the outputs of both the first and the second convolutional layers.
Finally, the flattened layers are concatenated using tf.concat.
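A minimal sketch of the subsampling step, assuming the first layer's (already pooled) output has shape 14x14x6. The random tensor stands in for that output, and the variable names are hypothetical.

```python
import tensorflow as tf

# Stand-in for the first layer's output: batch of 4, 14x14 pixels, 6 channels.
conv1 = tf.random.uniform([4, 14, 14, 6])

# An extra 2x2 max pooling with stride 2 halves the height and width again.
conv1_subsampled = tf.nn.max_pool(conv1, ksize=[1, 2, 2, 1],
                                  strides=[1, 2, 2, 1], padding="VALID")
```

Here `conv1_subsampled` has shape `[4, 7, 7, 6]`. The original `conv1` is left untouched, so it can still feed the second convolutional layer as before.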
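Flattening can be sketched with `tf.reshape`, collapsing every dimension except the batch dimension. The `flatten` helper and the tensor shapes below are assumptions for illustration (7x7x6 for the subsampled first layer, 5x5x16 for the second layer).

```python
import tensorflow as tf


def flatten(x):
    """Collapse all dimensions except the batch dimension."""
    features = 1
    for dim in x.shape.as_list()[1:]:
        features *= dim
    return tf.reshape(x, [-1, features])


# Stand-ins for the two branches (batch of 4).
conv1_subsampled = tf.zeros([4, 7, 7, 6])
conv2 = tf.zeros([4, 5, 5, 16])

flat1 = flatten(conv1_subsampled)  # 7 * 7 * 6  = 294 features per image
flat2 = flatten(conv2)             # 5 * 5 * 16 = 400 features per image
```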
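The concatenation joins the two flattened branches along the feature axis, and the result feeds the fully connected classifier. In the sketch below the branch sizes (294 and 400) and the 43 output classes are assumed values, and the zero tensors are placeholders for the real activations.

```python
import tensorflow as tf

NUM_CLASSES = 43  # assumption: number of German traffic sign classes

# Stand-ins for the flattened branches (batch of 4).
flat1 = tf.zeros([4, 294])  # subsampled first convolutional layer
flat2 = tf.zeros([4, 400])  # second convolutional layer

# Join along axis 1 (features) so each image gets one combined vector.
combined = tf.concat([flat1, flat2], axis=1)

# The classifier weights must match the combined width: 294 + 400 = 694.
w_fc = tf.Variable(tf.random.truncated_normal([694, NUM_CLASSES], stddev=0.1))
b_fc = tf.Variable(tf.zeros([NUM_CLASSES]))
logits = tf.matmul(combined, w_fc) + b_fc
```

Concatenating along `axis=1` rather than `axis=0` matters here: axis 0 would stack the branches as extra batch entries instead of widening each image's feature vector.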
And we are done 🎉🎉🎉
TensorFlow makes it quite easy to transform a strict feed forward neural net into a multi scale architecture. I recently used this architecture, based on the paper mentioned above, in one of my projects, where I classified German traffic signs. You can check out the project here: https://github.com/manibatra/Traffic-Sign-Classifier