Explain Huffman Coding in detail?

Huffman coding is a lossless data compression algorithm. In this algorithm, a variable-length code is assigned to input different characters. The code length is related to how frequently characters are used. Most frequent characters have the smallest codes and longer codes for least frequent characters.

There are mainly two parts. First one to create a Huffman tree, and another one to traverse the tree to find codes.
For an example, consider some strings “YYYZXXYYX”, the frequency of character Y is larger than X and the character Z has the least frequency. So the length of the code for Y is smaller than X, and code for X will be smaller than Z.
Complexity for assigning the code for each character according to their frequency is O(n log n)

Input and Output

A string with different characters, say “ACCEBFFFFAAXXBLKE”
Code for different characters:
Data: K, Frequency: 1, Code: 0000
Data: L, Frequency: 1, Code: 0001
Data: E, Frequency: 2, Code: 001
Data: F, Frequency: 4, Code: 01
Data: B, Frequency: 2, Code: 100
Data: C, Frequency: 2, Code: 101
Data: X, Frequency: 2, Code: 110
Data: A, Frequency: 3, Code: 111

Algorithm: huffmanCoding(string)

Input: A string with different characters.
Output: The codes for each individual characters.

   define a node with character, frequency, left and right child of the node for Huffman tree.
   create a list ‘freq’ to store frequency of each character, initially, all are 0
   for each character c in the string do
      increase the frequency for character ch in freq list.

   for all type of character ch do
      if the frequency of ch is non zero then
         add ch and its frequency as a node of priority queue Q.

   while Q is not empty do
      remove item from Q and assign it to left child of node
      remove item from Q and assign to the right child of node
      traverse the node to find the assigned code

traverseNode(n: node, code)

Input: The node n of the Huffman tree, and the code assigned from the previous call
Output: Code assigned with each character

if a left child of node n ≠φ then
   traverseNode(leftChild(n), code+’0’)     //traverse through the left child
   traverseNode(rightChild(n), code+’1’)    //traverse through the right child
   display the character and data of current node.