The different layers involved in the architecture of a CNN are as follows:
1. Input Layer: The input layer of a CNN holds the raw image data, represented as a three-dimensional matrix (height × width × channels). Note that the convolutional layers that follow operate directly on this spatial structure; flattening the image into a single column is only needed when pixels are fed straight into a fully connected layer.
For example, an MNIST image has dimension 28 × 28 = 784 pixels, so its flattened form is a 784 × 1 column vector. If the dataset contains k training examples, the flattened input matrix has dimensions (784, k).
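The flattening step above can be sketched in NumPy. The batch size k = 3 and the random pixel values are placeholders for illustration:

```python
import numpy as np

# Hypothetical batch: k = 3 MNIST-style images of 28 x 28 pixels each.
k = 3
images = np.random.rand(k, 28, 28)        # k images, each 28 x 28

# Flatten each image into a 784 x 1 column, giving a (784, k) input matrix.
flattened = images.reshape(k, 28 * 28).T  # shape (784, 3)
print(flattened.shape)
```

Each column of `flattened` is one training example, matching the (784, k) layout described above.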
2. Convolutional Layer: This layer performs the convolution operation. A set of learnable filters slides over the input, and each filter produces a feature map by computing the element-wise product between the filter and successive small windows of the image, summing the result at each position.
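A minimal NumPy sketch of this sliding-window operation, assuming a single channel, valid padding, and a fixed (not learned) kernel for illustration:

```python
import numpy as np

def convolve2d(image, kernel):
    """Valid-mode 2D convolution: slide the kernel window over the image."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Each output value is the sum of an element-wise product
            # between the kernel and one window of the image.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.ones((2, 2))
feature_map = convolve2d(image, kernel)
print(feature_map.shape)  # (3, 3): a 2x2 kernel over a 4x4 image
```

Real CNN libraries implement this far more efficiently, but the arithmetic is the same.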
3. ReLU Layer: This layer introduces non-linearity into the network by converting all negative values in the feature map to zero, leaving positive values unchanged. The output is a rectified feature map.
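ReLU is a one-line element-wise operation; the small feature map here is illustrative:

```python
import numpy as np

def relu(x):
    # Negative values become zero; positive values pass through unchanged.
    return np.maximum(0, x)

feature_map = np.array([[-1.5, 2.0],
                        [3.0, -0.5]])
rectified = relu(feature_map)
print(rectified)  # [[0. 2.], [3. 0.]]
```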
4. Pooling Layer: Pooling is a down-sampling operation that reduces the spatial dimensions of each feature map while retaining the most salient information; max pooling and average pooling are the most common variants.
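A sketch of 2 × 2 max pooling with stride 2, the most common configuration (the input values are illustrative):

```python
import numpy as np

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling: keep the largest value in each window."""
    h, w = feature_map.shape
    out = np.zeros((h // size, w // size))
    for i in range(0, h - size + 1, size):
        for j in range(0, w - size + 1, size):
            out[i // size, j // size] = np.max(feature_map[i:i + size, j:j + size])
    return out

fm = np.array([[1., 3., 2., 4.],
               [5., 6., 1., 2.],
               [7., 2., 9., 0.],
               [1., 8., 3., 4.]])
pooled = max_pool(fm)
print(pooled)  # 4x4 reduced to 2x2: [[6. 4.], [8. 9.]]
```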
5. Fully Connected Layer: The pooled feature maps are flattened and fed into one or more fully connected layers, which combine the extracted features to identify and classify the objects in the image.
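A fully connected layer is a matrix multiplication plus a bias. The sizes here (784 flattened features, 10 classes) and the random weights are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
flattened = rng.standard_normal(784)       # flattened feature vector (assumed size)
W = rng.standard_normal((10, 784)) * 0.01  # weights: one row per class (assumed 10)
b = np.zeros(10)                           # one bias per class

# Every input feature connects to every output unit, hence "fully connected".
scores = W @ flattened + b
print(scores.shape)  # (10,): one raw score per class
```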
6. Softmax / Logistic Layer: The softmax or logistic layer is the last activation layer of the CNN, placed at the end of the fully connected layers. The logistic (sigmoid) function is used for binary classification problems, while softmax is used for multi-class classification problems.
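Both functions map raw class scores to probabilities; the scores below are illustrative:

```python
import numpy as np

def sigmoid(z):
    # Logistic function for binary classification: maps a score to (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Multi-class case: exponentiate, then normalise so the outputs sum to 1.
    e = np.exp(z - np.max(z))  # subtract the max for numerical stability
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1])
probs = softmax(scores)
print(probs.sum())   # 1.0: a valid probability distribution
print(sigmoid(0.0))  # 0.5: a score of zero means maximal uncertainty
```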
7. Output Layer: This layer contains the label in the form of a one-hot encoded vector.
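One-hot encoding represents a class label as a vector with a single 1 at the label's index. A minimal sketch, assuming 10 classes as in MNIST:

```python
import numpy as np

def one_hot(label, num_classes=10):
    # A zero vector with a 1.0 at the position of the class label.
    vec = np.zeros(num_classes)
    vec[label] = 1.0
    return vec

encoded = one_hot(3)
print(encoded)  # 1.0 at index 3, zeros everywhere else
```

During training, the network's softmax output is compared against this one-hot target.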