Detailed Introduction of Recurrent Neural Network

vishrut-singhal · 13 May 2021 15:07

A recurrent neural network (RNN) is a kind of artificial neural network mainly used in speech recognition and natural language processing (NLP). RNNs are used in deep learning and in the development of models that imitate the activity of neurons in the human brain.

Recurrent Networks are designed to recognize patterns in sequences of data, such as text, genomes, handwriting, the spoken word, and numerical time series data emanating from sensors, stock markets, and government agencies.

A recurrent neural network looks similar to a traditional neural network except that a memory state is added to the neurons, so the computation includes a simple form of memory.

The recurrent neural network is a deep learning algorithm that follows a sequential approach. In traditional neural networks, we assume that all inputs and outputs are independent of each other. These networks are called recurrent because they perform the same mathematical computation for every element of a sequence, with each output depending on the previous computations.

Recurrent Neural Network in TensorFlow

Application of RNN

RNNs have multiple uses when it comes to predicting the future. In the financial industry, an RNN can help predict stock prices or the direction of the stock market (i.e., positive or negative).

RNNs are also useful for autonomous cars, since they can help avoid accidents by anticipating the trajectory of the vehicle.

RNNs are widely used in image captioning, text analysis, machine translation, and sentiment analysis. For example, a movie review can be used to understand how the viewer felt after watching the movie. Automating this task is very useful when the movie company does not have the time to review, consolidate, label, and analyze the reviews. A machine can do the job at scale, often with a high level of accuracy.

The following are some applications of RNNs:

1. Machine Translation

Recurrent neural networks are used in translation engines to translate text from one language to another. They typically do this in combination with other models, such as LSTMs (long short-term memory networks).

2. Speech Recognition

Recurrent neural networks have largely replaced the traditional speech recognition models that relied on Hidden Markov Models. RNNs, along with LSTMs, are better suited to recognizing speech and converting it into text without losing context.

3. Sentiment Analysis

Sentiment analysis is used to determine whether a sentence is positive, negative, or neutral. Because RNNs process data sequentially, they are well suited to finding the sentiment of a sentence.

4. Automatic Image Tagger

RNNs, in conjunction with convolutional neural networks, can detect the contents of images and describe them in the form of tags. For example, a picture of a fox jumping over a fence can be described appropriately using RNNs.

Limitations of RNN

An RNN is supposed to carry information through time. However, it is quite challenging to propagate all this information when the sequence spans many time steps. Unrolled over time, an RNN behaves like a very deep network, and when a network has too many layers it becomes hard to train. This is known as the vanishing gradient problem.

Recall that a neural network updates its weights using the gradient descent algorithm. The gradient grows smaller as it is propagated back towards the earlier layers (in an RNN, the earlier time steps).

When the gradient becomes close to zero, there is little room for improvement. The model learns from its gradients, which determine how the weights, and therefore the network's output, change. If the gradient is too small (i.e., the weights change only a little), the network cannot learn anything and its output barely changes. Therefore, a network facing a vanishing gradient problem cannot converge towards a good solution.
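The shrinking gradient can be illustrated numerically. The sketch below is only an illustration under simplifying assumptions: it uses a scalar RNN with a tanh activation, a hypothetical recurrent weight of 0.5, and random inputs, and it tracks how strongly the state after t steps still depends on the initial state. The function name `state_gradients` is made up for this example.

```python
import numpy as np

def state_gradients(w, xs, h0=0.0):
    """Return |d h_t / d h_0| for t = 1..T in a scalar RNN h_t = tanh(w * h_{t-1} + x_t)."""
    h, grad, grads = h0, 1.0, []
    for x_t in xs:
        h = np.tanh(w * h + x_t)
        # Chain rule: d h_t / d h_{t-1} = w * (1 - h_t^2), which is at most |w| in magnitude.
        grad *= w * (1.0 - h ** 2)
        grads.append(abs(grad))
    return grads

rng = np.random.default_rng(0)
xs = rng.standard_normal(50)          # assumed toy input sequence of 50 time steps
grads = state_gradients(w=0.5, xs=xs)

for t in (1, 10, 25, 50):
    print(f"after {t:2d} steps: |dh_t/dh_0| = {grads[t - 1]:.3e}")
```

Each step multiplies the gradient by a factor of at most 0.5 here, so after 50 steps the influence of the initial state is bounded by 0.5^50, which is effectively zero for training purposes.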

A recurrent neural network performs the following.

The recurrent network first converts independent activations into dependent ones. It assigns the same weights and biases to all the layers, which reduces the number of parameters the RNN needs. It also provides a standard way to memorize previous outputs by feeding each previous output as an input to the next layer.

Because these layers share the same weights and biases, they can be combined into a single recurrent unit.
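To make the effect of weight sharing concrete, here is a small counting sketch. The helper `rnn_parameter_count` and the sizes used are hypothetical; it assumes one recurrent weight matrix, one input weight matrix, and one bias vector per step, and ignores the output layer.

```python
def rnn_parameter_count(input_size, hidden_size, num_steps, shared=True):
    """Count the hidden-layer parameters of an RNN unrolled over num_steps."""
    per_step = (hidden_size * hidden_size    # recurrent weights
                + hidden_size * input_size   # input weights
                + hidden_size)               # bias
    # With sharing, every time step reuses the same parameters.
    return per_step if shared else per_step * num_steps

shared = rnn_parameter_count(input_size=10, hidden_size=32, num_steps=50, shared=True)
unshared = rnn_parameter_count(input_size=10, hidden_size=32, num_steps=50, shared=False)
print(shared, unshared)  # 1376 68800
```

With shared weights, the parameter count does not depend on the sequence length at all.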

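To calculate the current state:

$$h_t = f(h_{t-1}, X_t)$$

where $h_t$ is the current state, $h_{t-1}$ is the previous state, and $X_t$ is the input at the current time step.

Applying the tanh activation function, we have:

$$h_t = \tanh(W_{hh} h_{t-1} + W_{xh} X_t)$$

where $W_{hh}$ is the weight of the recurrent neuron and $W_{xh}$ is the weight of the input neuron.

The formula for calculating the output:

$$Y_t = W_{hy} h_t$$

where $W_{hy}$ is the weight of the output layer.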
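The state and output formulas above can be sketched in a few lines of NumPy. This is a minimal illustration, not a TensorFlow implementation: the function name `rnn_forward`, the layer sizes, and the random weights are all assumptions made for the example, and biases are left out to match the formulas.

```python
import numpy as np

def rnn_forward(xs, W_hh, W_xh, W_hy, h0=None):
    """Run a simple RNN over a sequence xs of shape (T, input_size).

    Returns the hidden states (T, hidden_size) and outputs (T, output_size).
    """
    h = np.zeros(W_hh.shape[0]) if h0 is None else h0
    hs, ys = [], []
    for x_t in xs:
        h = np.tanh(W_hh @ h + W_xh @ x_t)   # current state from previous state and input
        ys.append(W_hy @ h)                  # output computed from the current state
        hs.append(h)
    return np.stack(hs), np.stack(ys)

# Assumed toy dimensions and random weights, for illustration only.
rng = np.random.default_rng(0)
T, input_size, hidden_size, output_size = 5, 3, 4, 2
W_hh = 0.1 * rng.standard_normal((hidden_size, hidden_size))
W_xh = 0.1 * rng.standard_normal((hidden_size, input_size))
W_hy = 0.1 * rng.standard_normal((output_size, hidden_size))
xs = rng.standard_normal((T, input_size))

hs, ys = rnn_forward(xs, W_hh, W_xh, W_hy)
print(hs.shape, ys.shape)  # (5, 4) (5, 2)
```

Note that the same three weight matrices are reused at every time step; only the state `h` changes as the sequence is consumed.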

Training through RNN

1. The network takes a single time step of the input.
2. The current state is calculated from the current input and the previous state.
3. The current state h_t then becomes h_{t-1} for the next time step.
4. This repeats for as many time steps as the problem requires, and the information from all previous states is combined.
5. Once all the time steps are completed, the final state is used to calculate the output.
6. The error is then computed as the difference between the actual output and the predicted output.
7. The error is backpropagated through the network to adjust the weights and produce a better outcome.
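The training procedure above can be sketched end to end with NumPy. This is a hedged illustration rather than a reference implementation: it assumes a squared-error loss on the output of the final time step, plain gradient descent, a single toy sequence, and no biases. The function names, sizes, target value, and learning rate are all hypothetical choices for the example.

```python
import numpy as np

def forward(params, xs):
    """Run through every time step, then compute the output from the final state."""
    W_hh, W_xh, W_hy = params
    hs = [np.zeros(W_hh.shape[0])]                 # hs[0] is the initial state
    for x_t in xs:
        hs.append(np.tanh(W_hh @ hs[-1] + W_xh @ x_t))
    return hs, W_hy @ hs[-1]

def loss_and_grads(params, xs, target):
    """Squared error and its gradients, backpropagated through all time steps."""
    W_hh, W_xh, W_hy = params
    hs, y = forward(params, xs)
    dy = y - target                                # predicted minus actual output
    loss = 0.5 * float(dy @ dy)
    dW_hh, dW_xh = np.zeros_like(W_hh), np.zeros_like(W_xh)
    dW_hy = np.outer(dy, hs[-1])
    dh = W_hy.T @ dy                               # gradient w.r.t. the final state
    for t in range(len(xs), 0, -1):                # walk backwards in time
        dz = dh * (1.0 - hs[t] ** 2)               # derivative of tanh
        dW_hh += np.outer(dz, hs[t - 1])
        dW_xh += np.outer(dz, xs[t - 1])
        dh = W_hh.T @ dz                           # pass the gradient to the previous state
    return loss, (dW_hh, dW_xh, dW_hy)

# Assumed toy problem: one sequence of 6 steps and a scalar target.
rng = np.random.default_rng(0)
input_size, hidden_size, output_size, T = 2, 4, 1, 6
shapes = [(hidden_size, hidden_size), (hidden_size, input_size), (output_size, hidden_size)]
params = [0.1 * rng.standard_normal(s) for s in shapes]
xs = rng.standard_normal((T, input_size))
target = np.array([0.5])

learning_rate = 0.1
losses = []
for step in range(200):
    loss, grads = loss_and_grads(params, xs, target)
    losses.append(loss)
    # Gradient descent: adjust every weight matrix against its gradient.
    params = [p - learning_rate * g for p, g in zip(params, grads)]

print(f"loss: {losses[0]:.4f} -> {losses[-1]:.6f}")
```

Because the same weights are used at every time step, their gradients are accumulated across all steps in the backward loop before a single update is applied.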