Parameters: these are the coefficients of the model, and they are learned by the model itself. While learning, the algorithm optimizes these coefficients (according to a given optimization strategy) and returns an array of parameters that minimizes the error.
In a CNN, each learnable layer (such as a Conv Layer) has two kinds of parameters: weights and biases. The total number of parameters is the sum of all weights and biases.
Wc = Number of weights of the Conv Layer.
Bc = Number of biases of the Conv Layer.
Pc = Number of parameters of the Conv Layer.
K = Size (width) of kernels used in the Conv Layer.
N = Number of kernels.
C = Number of channels of the input image.
- Wc = K^2 x C x N
- Bc = N
- Pc = Wc+Bc
In a Conv Layer, the depth of every kernel is always equal to the number of channels in the input image. So every kernel has K^2 x C weights, and there are N such kernels, which gives Wc = K^2 x C x N. Each kernel also has one bias, so Bc = N. That’s how we come up with the above formula.
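As a quick check of the formulas, here is a minimal Python sketch. The function name `conv_layer_params` and the example values (K = 11, C = 3, N = 96) are illustrative assumptions, not part of the original text. It also assumes square kernels and one bias per kernel, as in the formulas above.

```python
def conv_layer_params(k: int, c: int, n: int) -> tuple[int, int, int]:
    """Return (Wc, Bc, Pc) for a Conv Layer.

    k: kernel size (width), assuming square K x K kernels
    c: number of channels of the input
    n: number of kernels
    """
    wc = k * k * c * n  # each kernel holds K^2 x C weights, and there are N kernels
    bc = n              # one bias per kernel
    pc = wc + bc        # total parameters of the layer
    return wc, bc, pc


# Illustrative values only: 96 kernels of size 11 x 11 on a 3-channel input.
wc, bc, pc = conv_layer_params(k=11, c=3, n=96)
print(wc, bc, pc)  # 34848 96 34944
```

Here Wc = 11^2 x 3 x 96 = 34848, Bc = 96, and Pc = 34848 + 96 = 34944.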