Art of Huffman PI Coding by DonCaffeDeLaSkobalj !
Huffman PI coding is a lossless PI signal acuaring and analyse algorithm. The main idea is to assign variable-length codes to different input signals including ground noise signals, emi signals etc., etc., where the lengths of the assigned signals are based on the frequencies of their apperiance after ADC implementation procesdure. The most frequent input signals (like environment or groundsignals) gets the smallest code weight coding, while the
least frequent signals (like golden coins, gets the largest code weight.
Due to the well known fact that the normal accuired signals in noise backgrounds like heavy mineralised soils keep the law of uniform Gaussian distribution or keep the law of central limit theorem. In its most general form, under normal conditions (naturally finite signals variance), averages of input random samples of ground signals variables independently drawn from independent gaussian distributions converge in distribution to the normal, that is, they normally become distributed when the number of aquired samples is sufficiently large.
Physical quantities of random ground signals that are expected to be the sum of many independent processes (but always finite for given time frame) have distributions that are nearly normal.
In high tech prospecting art, many results and methods like are propagation of signals uncertainty and least squares parameter fitting, can be analytically derived in explicit form when the random variable sampled signals are normally distributed. The normal distribution is also called the bell curve. However, many other similar distributions have also bell-shaped form lik Cauchy, Student's, and logistic distributions.
In Huffman PI coding the variable-length codes were assigned to random input signals like Prefix Codes, means the codes (random sequences) are assigned in such a way that the code assigned to one input signal is not prefix of code assigned to any other input signal. This is how Huffman Coding makes that there is no ambiguity when decoding random input signals stream.
Let us understand how prefix codes work in Huffman PI with a small counter example.
Suppose there are four input Pi signals after preamplification and sampling and let's mark them like signal a, b, c and d, and let's mark their corresponding variable length like codes 00, 01, 0 and 1 (from a to d). In this simple example such coding leads to ambiguity because code assigned to signal c is prefix of codes assigned to a and b. If the compressed code stream is 0001, than the decompressed output may be any of “cccd” or “ccb” or “acd” or “ab”.
In Huffman PI Coding there are mainly two major parts
P1. Make a build of Huffman PI Tree from input signals.
P2. Traverse the Huffman PI Tree and assign codes to input.
Simple steps how to build Huffman PI Tree:
Input is array of unique signals along with their frequency of occurrences and output is Huffman PI Tree.
a. Let's create a leaf node for each unique signal and build a min heap of all leaf nodes. Here min heap is used as a priority queue. The value of frequency table is used to compare two nodes in min heap. Initially, the least frequent signal is at root of Huffman PI Tree.
Now we would extract two nodes with the minimum occurrence frequency from the min heap. Then we would create a new internal node with occurrence frequency equal to the sum of the two nodes frequencies. Now we would make the first extracted node as its left child and the other extracted node as its right child. Than we would add this node to the min heap. Then we would repeat these steps until the heap contains only one node.
This remaining node is the root node and the tree is complete.
Let us understand the algorithm with an example:
Input Signal Frequency of occurrences
a 5
b 9
c 12
d 13
e 16
f 45
First we need to build a min heap that contains 6 nodes where each node represents root of a Huffman PI Tree with single node.
Than we would extract two minimum frequency nodes from min heap. And add a new internal node with frequency 5 + 9 = 14.
Now our min heap contains 5 nodes where 4 nodes are roots of trees with single element each, and one heap node is root of tree with 3 elements
Input Signal Frequency of occurrences
c 12
d 13
Internal Node =14
e 16
f 45
Than we should extract two minimum frequency nodes from the heap and add a new internal node with frequency 12 + 13 = 25.
Now our min heap contains 4 nodes, where 2 nodes are roots of trees with single element each, and two heap nodes are root of tree with more than one nodes.
Input Signal Frequency of occurrences
Internal Node 14
e 16
Internal Node =25
f 45
Next step would be extraction of two minimum frequency nodes. So we add a new internal node with frequency 14 + 16 = 30.
Now our min heap contains 3 nodes:
Input Signal Frequency of occurrences
Internal Node 25
Internal Node = 30
f 45
Then we woulkd extract two minimum frequency nodes and add a new internal node with frequency 25 + 30 = 55
Now our min heap contains 2 nodes.
Input Signal Frequency of occurrences
f 45
Internal Node =55
Last step is extraction of two minimum frequency nodes. Add a new internal node with frequency 45 + 55 = 100. Now we got min heap which contains only one node.
Input Signal Frequency of occurrences
Internal Node 100
Since our heap contains only one remaining node, the Huffman PI algorithm stops here.
Now we would decode codes from Huffman PI Tree. We would traverse the formed tree, starting from the root, while maintain an auxiliary code array. While moving to the left child we would assign 0 to the array. While we are moving to the right child, we would assign 1 to the code array. Then decode array till the end when a leaf node is encountered.
The decoded codes are now:
Signal Code weight
f 0
c 100
d 101
a 1100
b 1101
e 111
Now we would write small C-program for Huffman PI Coding
#include <stdio.h>
#include <stdlib.h>
#define MAX_TREE_HT 100
// Huffman PI tree node
struct MinHeapNode {
// One of the input signals
signal data;
// Frequency of the occurrenced signal
unsigned freq;
// Let's make left and right child of this node
struct MinHeapNode *left, *right;
};
// Let's make min heap - collection of Hufmman PI Tree nodes
struct MinHeap {
// Size of min heap
unsigned size;
// capacity of min heap
unsigned capacity;
// minheap node pointers array
struct MinHeapNode** array;
};
// This small function would allocate a new min heap node with given signal and his occurrence frequency
struct MinHeapNode* newNode(signal data, unsigned freq)
{
struct MinHeapNode* temp
= (struct MinHeapNode*)malloc
(sizeof(struct MinHeapNode));
temp->left = temp->right = NULL;
temp->data = data;
temp->freq = freq;
return temp;
}
// This function would create min heap of given capacity
struct MinHeap* createMinHeap(unsigned capacity)
{
struct MinHeap* minHeap
= (struct MinHeap*)malloc(sizeof(struct MinHeap));
// current size is 0
minHeap->size = 0;
minHeap->capacity = capacity;
minHeap->array
= (struct MinHeapNode**)malloc(minHeap->
capacity * sizeof(struct MinHeapNode*));
return minHeap;
}
// Small swap min heap nodes function
void swapMinHeapNode(struct MinHeapNode** a,
struct MinHeapNode** b)
{
struct MinHeapNode* t = *a;
*a = *b;
*b = t;
}
// The normal min heap function
void minHeapify(struct MinHeap* minHeap, int idx)
{
int smallest = idx;
int left = 2 * idx + 1;
int right = 2 * idx + 2;
if (left < minHeap->size && minHeap->array[left]->
freq < minHeap->array[smallest]->freq)
smallest = left;
if (right < minHeap->size && minHeap->array[right]->
freq < minHeap->array[smallest]->freq)
smallest = right;
if (smallest != idx) {
swapMinHeapNode(&minHeap->array[smallest],
&minHeap->array[idx]);
minHeapify(minHeap, smallest);
}
}
// Check size of heap function
int isSizeOne(struct MinHeap* minHeap)
{
return (minHeap->size == 1);
}
// Small extract function for extraction of min. value node from heap
struct MinHeapNode* extractMin(struct MinHeap* minHeap)
{
struct MinHeapNode* temp = minHeap->array[0];
minHeap->array[0]
= minHeap->array[minHeap->size - 1];
--minHeap->size;
minHeapify(minHeap, 0);
return temp;
}
// Insert new node to Min Heap function
void insertMinHeap(struct MinHeap* minHeap,
struct MinHeapNode* minHeapNode)
{
++minHeap->size;
int i = minHeap->size - 1;
while (i && minHeapNode->freq < minHeap->array[(i - 1) / 2]->freq) {
minHeap->array[i] = minHeap->array[(i - 1) / 2];
i = (i - 1) / 2;
}
minHeap->array[i] = minHeapNode;
}
// min heap build funvtion
void buildMinHeap(struct MinHeap* minHeap)
{
int n = minHeap->size - 1;
int i;
for (i = (n - 1) / 2; i >= 0; --i)
minHeapify(minHeap, i);
}
// Decode function of an array of size n
void printArr(int arr[], int n)
{
int i;
for (i = 0; i < n; ++i)
printf("%d", arr[i]);
printf("\n");
}
// Check function if this node is leaf
int isLeaf(struct MinHeapNode* root)
{
return !(root->left) && !(root->right);
}
// This fuction creates a min heap of capacity equal to size and inserts all signals data in min heap
// Initially size of min heap is equal to capacity
struct MinHeap* createAndBuildMinHeap(signal data[], int freq[], int size)
{
struct MinHeap* minHeap = createMinHeap(size);
for (int i = 0; i < size; ++i)
minHeap->array[i] = newNode(data[i], freq[i]);
minHeap->size = size;
buildMinHeap(minHeap);
return minHeap;
}
// This is main Huffman Tree Build fucntion
struct MinHeapNode* buildHuffmanTree(signal data[], int freq[], int size)
{
struct MinHeapNode *left, *right, *top;
// Step 1: Create a min heap of capacity
// equal to size. Initially, there are
// modes equal to size.
struct MinHeap* minHeap = createAndBuildMinHeap(data, freq, size);
// Do iterate while size of heap doesn't become 1
while (!isSizeOne(minHeap)) {
// Extraction the two minimum freq. from min heap
left = extractMin(minHeap);
right = extractMin(minHeap);
// This function would create new internal node with occurrence frequency
// is equal to the sum of the two nodes frequencies. Then we make the two extracted
// nodes as left and right children of this new node.
// Finally we would add this node to the min heap
top = newNode('$', left->freq + right->freq);
top->left = left;
top->right = right;
insertMinHeap(minHeap, top);
}
// Root node is remaining node and the tree is complete.
return extractMin(minHeap);
}
// Decode Huffman PI codes from the root of Huffman Tree.
// This function uses array to store codes
void printCodes(struct MinHeapNode* root, int arr[], int top)
{
// Assign 0 to left edge and recur
if (root->left) {
arr[top] = 0;
printCodes(root->left, arr, top + 1);
}
// Assign 1 to right edge and recur
if (root->right) {
arr[top] = 1;
printCodes(root->right, arr, top + 1);
}
// If this is a leaf node, it would contains one of the input signals, its corresponding code from array
if (isLeaf(root)) {
printf("%c: ", root->data);
printArr(arr, top);
}
}
// Traversing function which builds Huffman PI Tree and decode codes by traversing built Huffman Tree
void HuffmanCodes(signal data[], int freq[], int size)
{
// Let's buid Huffman PI Tree
struct MinHeapNode* root
= buildHuffmanTree(data, freq, size);
// Decode Huffman PI codes
int arr[MAX_TREE_HT], top = 0;
printCodes(root, arr, top);
}
// Let's test all functions
int main()
{
signal arr[] = { 'a', 'b', 'c', 'd', 'e', 'f' };
int freq[] = { 5, 9, 12, 13, 16, 45 };
int size = sizeof(arr) / sizeof(arr[0]);
HuffmanCodes(arr, freq, size);
return 0;
}
Result would be such:
f: 0
c: 100
d: 101
a: 1100
b: 1101
e: 111
Huffman PI coding overal time complexity is n*log(n), where n is the number of unique sampled signals
If there are n nodes, ExtractMin(function) is called 2*(n – 1) times. Each ExtractMin(function) takes log(n) time as
it calles minHeapify(function).
That's it. Happy analysing and prospecting.
DonCaffeDeLaSkobalj - LRL Master
Huffman PI coding is a lossless PI signal acuaring and analyse algorithm. The main idea is to assign variable-length codes to different input signals including ground noise signals, emi signals etc., etc., where the lengths of the assigned signals are based on the frequencies of their apperiance after ADC implementation procesdure. The most frequent input signals (like environment or groundsignals) gets the smallest code weight coding, while the
least frequent signals (like golden coins, gets the largest code weight.
Due to the well known fact that the normal accuired signals in noise backgrounds like heavy mineralised soils keep the law of uniform Gaussian distribution or keep the law of central limit theorem. In its most general form, under normal conditions (naturally finite signals variance), averages of input random samples of ground signals variables independently drawn from independent gaussian distributions converge in distribution to the normal, that is, they normally become distributed when the number of aquired samples is sufficiently large.
Physical quantities of random ground signals that are expected to be the sum of many independent processes (but always finite for given time frame) have distributions that are nearly normal.
In high tech prospecting art, many results and methods like are propagation of signals uncertainty and least squares parameter fitting, can be analytically derived in explicit form when the random variable sampled signals are normally distributed. The normal distribution is also called the bell curve. However, many other similar distributions have also bell-shaped form lik Cauchy, Student's, and logistic distributions.
In Huffman PI coding the variable-length codes were assigned to random input signals like Prefix Codes, means the codes (random sequences) are assigned in such a way that the code assigned to one input signal is not prefix of code assigned to any other input signal. This is how Huffman Coding makes that there is no ambiguity when decoding random input signals stream.
Let us understand how prefix codes work in Huffman PI with a small counter example.
Suppose there are four input Pi signals after preamplification and sampling and let's mark them like signal a, b, c and d, and let's mark their corresponding variable length like codes 00, 01, 0 and 1 (from a to d). In this simple example such coding leads to ambiguity because code assigned to signal c is prefix of codes assigned to a and b. If the compressed code stream is 0001, than the decompressed output may be any of “cccd” or “ccb” or “acd” or “ab”.
In Huffman PI Coding there are mainly two major parts
P1. Make a build of Huffman PI Tree from input signals.
P2. Traverse the Huffman PI Tree and assign codes to input.
Simple steps how to build Huffman PI Tree:
Input is array of unique signals along with their frequency of occurrences and output is Huffman PI Tree.
a. Let's create a leaf node for each unique signal and build a min heap of all leaf nodes. Here min heap is used as a priority queue. The value of frequency table is used to compare two nodes in min heap. Initially, the least frequent signal is at root of Huffman PI Tree.
Now we would extract two nodes with the minimum occurrence frequency from the min heap. Then we would create a new internal node with occurrence frequency equal to the sum of the two nodes frequencies. Now we would make the first extracted node as its left child and the other extracted node as its right child. Than we would add this node to the min heap. Then we would repeat these steps until the heap contains only one node.
This remaining node is the root node and the tree is complete.
Let us understand the algorithm with an example:
Input Signal Frequency of occurrences
a 5
b 9
c 12
d 13
e 16
f 45
First we need to build a min heap that contains 6 nodes where each node represents root of a Huffman PI Tree with single node.
Than we would extract two minimum frequency nodes from min heap. And add a new internal node with frequency 5 + 9 = 14.
Now our min heap contains 5 nodes where 4 nodes are roots of trees with single element each, and one heap node is root of tree with 3 elements
Input Signal Frequency of occurrences
c 12
d 13
Internal Node =14
e 16
f 45
Than we should extract two minimum frequency nodes from the heap and add a new internal node with frequency 12 + 13 = 25.
Now our min heap contains 4 nodes, where 2 nodes are roots of trees with single element each, and two heap nodes are root of tree with more than one nodes.
Input Signal Frequency of occurrences
Internal Node 14
e 16
Internal Node =25
f 45
Next step would be extraction of two minimum frequency nodes. So we add a new internal node with frequency 14 + 16 = 30.
Now our min heap contains 3 nodes:
Input Signal Frequency of occurrences
Internal Node 25
Internal Node = 30
f 45
Then we woulkd extract two minimum frequency nodes and add a new internal node with frequency 25 + 30 = 55
Now our min heap contains 2 nodes.
Input Signal Frequency of occurrences
f 45
Internal Node =55
Last step is extraction of two minimum frequency nodes. Add a new internal node with frequency 45 + 55 = 100. Now we got min heap which contains only one node.
Input Signal Frequency of occurrences
Internal Node 100
Since our heap contains only one remaining node, the Huffman PI algorithm stops here.
Now we would decode codes from Huffman PI Tree. We would traverse the formed tree, starting from the root, while maintain an auxiliary code array. While moving to the left child we would assign 0 to the array. While we are moving to the right child, we would assign 1 to the code array. Then decode array till the end when a leaf node is encountered.
The decoded codes are now:
Signal Code weight
f 0
c 100
d 101
a 1100
b 1101
e 111
Now we would write small C-program for Huffman PI Coding
#include <stdio.h>
#include <stdlib.h>
#define MAX_TREE_HT 100
// Huffman PI tree node
struct MinHeapNode {
// One of the input signals
signal data;
// Frequency of the occurrenced signal
unsigned freq;
// Let's make left and right child of this node
struct MinHeapNode *left, *right;
};
// Let's make min heap - collection of Hufmman PI Tree nodes
struct MinHeap {
// Size of min heap
unsigned size;
// capacity of min heap
unsigned capacity;
// minheap node pointers array
struct MinHeapNode** array;
};
// This small function would allocate a new min heap node with given signal and his occurrence frequency
struct MinHeapNode* newNode(signal data, unsigned freq)
{
struct MinHeapNode* temp
= (struct MinHeapNode*)malloc
(sizeof(struct MinHeapNode));
temp->left = temp->right = NULL;
temp->data = data;
temp->freq = freq;
return temp;
}
// This function would create min heap of given capacity
struct MinHeap* createMinHeap(unsigned capacity)
{
struct MinHeap* minHeap
= (struct MinHeap*)malloc(sizeof(struct MinHeap));
// current size is 0
minHeap->size = 0;
minHeap->capacity = capacity;
minHeap->array
= (struct MinHeapNode**)malloc(minHeap->
capacity * sizeof(struct MinHeapNode*));
return minHeap;
}
// Small swap min heap nodes function
void swapMinHeapNode(struct MinHeapNode** a,
struct MinHeapNode** b)
{
struct MinHeapNode* t = *a;
*a = *b;
*b = t;
}
// The normal min heap function
void minHeapify(struct MinHeap* minHeap, int idx)
{
int smallest = idx;
int left = 2 * idx + 1;
int right = 2 * idx + 2;
if (left < minHeap->size && minHeap->array[left]->
freq < minHeap->array[smallest]->freq)
smallest = left;
if (right < minHeap->size && minHeap->array[right]->
freq < minHeap->array[smallest]->freq)
smallest = right;
if (smallest != idx) {
swapMinHeapNode(&minHeap->array[smallest],
&minHeap->array[idx]);
minHeapify(minHeap, smallest);
}
}
// Check size of heap function
int isSizeOne(struct MinHeap* minHeap)
{
return (minHeap->size == 1);
}
// Small extract function for extraction of min. value node from heap
struct MinHeapNode* extractMin(struct MinHeap* minHeap)
{
struct MinHeapNode* temp = minHeap->array[0];
minHeap->array[0]
= minHeap->array[minHeap->size - 1];
--minHeap->size;
minHeapify(minHeap, 0);
return temp;
}
// Insert new node to Min Heap function
void insertMinHeap(struct MinHeap* minHeap,
struct MinHeapNode* minHeapNode)
{
++minHeap->size;
int i = minHeap->size - 1;
while (i && minHeapNode->freq < minHeap->array[(i - 1) / 2]->freq) {
minHeap->array[i] = minHeap->array[(i - 1) / 2];
i = (i - 1) / 2;
}
minHeap->array[i] = minHeapNode;
}
// min heap build funvtion
void buildMinHeap(struct MinHeap* minHeap)
{
int n = minHeap->size - 1;
int i;
for (i = (n - 1) / 2; i >= 0; --i)
minHeapify(minHeap, i);
}
// Decode function of an array of size n
void printArr(int arr[], int n)
{
int i;
for (i = 0; i < n; ++i)
printf("%d", arr[i]);
printf("\n");
}
// Check function if this node is leaf
int isLeaf(struct MinHeapNode* root)
{
return !(root->left) && !(root->right);
}
// This fuction creates a min heap of capacity equal to size and inserts all signals data in min heap
// Initially size of min heap is equal to capacity
struct MinHeap* createAndBuildMinHeap(signal data[], int freq[], int size)
{
struct MinHeap* minHeap = createMinHeap(size);
for (int i = 0; i < size; ++i)
minHeap->array[i] = newNode(data[i], freq[i]);
minHeap->size = size;
buildMinHeap(minHeap);
return minHeap;
}
// This is main Huffman Tree Build fucntion
struct MinHeapNode* buildHuffmanTree(signal data[], int freq[], int size)
{
struct MinHeapNode *left, *right, *top;
// Step 1: Create a min heap of capacity
// equal to size. Initially, there are
// modes equal to size.
struct MinHeap* minHeap = createAndBuildMinHeap(data, freq, size);
// Do iterate while size of heap doesn't become 1
while (!isSizeOne(minHeap)) {
// Extraction the two minimum freq. from min heap
left = extractMin(minHeap);
right = extractMin(minHeap);
// This function would create new internal node with occurrence frequency
// is equal to the sum of the two nodes frequencies. Then we make the two extracted
// nodes as left and right children of this new node.
// Finally we would add this node to the min heap
top = newNode('$', left->freq + right->freq);
top->left = left;
top->right = right;
insertMinHeap(minHeap, top);
}
// Root node is remaining node and the tree is complete.
return extractMin(minHeap);
}
// Decode Huffman PI codes from the root of Huffman Tree.
// This function uses array to store codes
void printCodes(struct MinHeapNode* root, int arr[], int top)
{
// Assign 0 to left edge and recur
if (root->left) {
arr[top] = 0;
printCodes(root->left, arr, top + 1);
}
// Assign 1 to right edge and recur
if (root->right) {
arr[top] = 1;
printCodes(root->right, arr, top + 1);
}
// If this is a leaf node, it would contains one of the input signals, its corresponding code from array
if (isLeaf(root)) {
printf("%c: ", root->data);
printArr(arr, top);
}
}
// Traversing function which builds Huffman PI Tree and decode codes by traversing built Huffman Tree
void HuffmanCodes(signal data[], int freq[], int size)
{
// Let's buid Huffman PI Tree
struct MinHeapNode* root
= buildHuffmanTree(data, freq, size);
// Decode Huffman PI codes
int arr[MAX_TREE_HT], top = 0;
printCodes(root, arr, top);
}
// Let's test all functions
int main()
{
signal arr[] = { 'a', 'b', 'c', 'd', 'e', 'f' };
int freq[] = { 5, 9, 12, 13, 16, 45 };
int size = sizeof(arr) / sizeof(arr[0]);
HuffmanCodes(arr, freq, size);
return 0;
}
Result would be such:
f: 0
c: 100
d: 101
a: 1100
b: 1101
e: 111
Huffman PI coding overal time complexity is n*log(n), where n is the number of unique sampled signals
If there are n nodes, ExtractMin(function) is called 2*(n – 1) times. Each ExtractMin(function) takes log(n) time as
it calles minHeapify(function).
That's it. Happy analysing and prospecting.
DonCaffeDeLaSkobalj - LRL Master

Comment