Saturday, July 30, 2016

Introduction

Geometric coding is a new method of encoding data by representing each byte or word of data using geometric shapes. Data is split either in to bytes or fixed length words and each byte or word is encoded using a specific geometric shape.

A byte contains 8 bits and the value is represented as a binary number using bits 0 and 1. Each bit is multiplied with the power of two where the exponent is the position of the bit. The least significant bit represents 20  and the most significant bit represents 27.

In geometric encoding an octagon is used to represent a byte. The corners of the octagon represent each  bit value either 0 or 1. The corners are numbered from 0 to 7 to represent each bit position of a byte. A byte value is encoded using the shape formed by connecting the 1 bit corners of the octagon. We can also use 0 bit corners instead of 1 bit corners.

The number of 1 bits in a byte determine the geometric shape to  represent it and the position of the 1 bits determine the type of the geometric shape.

For example the decimal number 153 is 10011001 in binary representation and a rectangle shape is formed by connecting the 1 bits in it.


There are three other binary numbers which are represented by a rectangle




The four numbers 51, 102, 153 and 204 are represented with the same shape rectangle but the orientation of the rectangle is different for each number. Each orientation of the rectangle connects different corners of the octagon. These orientations are also called rotations.

There are nine other possible types of shapes with four edges and connecting four different corners of the octagon. These nine types with their rotations represent 66 binary numbers.

In geometric coding a binary number is represented by three parts Shape, sub type of shape and its orientation.

Shape

The Shape of a binary number is determined by number of 1 bits in it. For example a binary number containing five 1 bits is represented using a Pentagon.

Sub Type of Shape

The spread of 1 bits in a binary number is represented by the types of the selected shape representing it. For example there are four types of hexagons representing different spreads of six 1 bits.

Orientation (Rotation)

Positions of 1 bits in a binary number is represented by the orientation of the selected shape representing it. For example there are eight binary numbers with single 1 bit in eight different positions.

All possible shapes representing number of 1 bits and their types are given below.


Following table describes each shape and how many binary numbers it can represent with its sub types and rotations.

# 1 bits Shape Types Pattern count
0 None 0 1
1 Point 0 8
2 Line 4 28
3 Triangle 7 56
4 Quadrilateral 10 70
5 Pentagon 7 56
6 Hexagon 4 28
7 Heptagon 0 8
8 Octagon 0 1


There are different types of schemes to encode binary numbers using geometric code.

Fixed Encoding

Fixed encoding scheme requires 3  bits to represent eight different shapes, 0 to 4 bits to represent the type of shape and 0 to 3 bits to represent up to 8 possible orientations. For example the binary number 255 is encoded with 3 bits representing the Octagon shape. The number 51 is encoded with 9 bits, 3 bits representing the Quadrilateral shape, 4 bits representing the type Rectangle and 2 bits to represent the orientation. The binary number 0 is also encoded by same 3 bits to encode the binary number 255. To differentiate 0 from 255 an additional bit is used. In this scheme, the numbers are encoded in 4 bits to maximum 10 bits in the worst case. If the data contains more numbers represented by Quadrilateral shape then the compressed data size can be larger than the actual data size.

Variable Encoding

Variable encoding scheme uses 2 to 3 bits to represent Shape. There are seventy binary numbers represented using Quadrilateral shapes and only two binary numbers 255 and 0 represented using Octagon shape. Like Huffman coding the most frequently occurring shape is encoded with fewer number of bits and least occurring shapes encoded with more bits. With this scheme 9 bits required in the worst case.


Block Encoding

In block encoding the data is split in to blocks of fixed number of bytes and each block is encoded independently. In each block the bytes are encoded using variable encoding scheme. There are 8 possible different shapes and some shapes occur multiple times in a block and some shapes may not present in a block. We can use statistical redundancy to further reduce number bits used to compress a block.

Once each number in a block is converted in to shape,type and orientation, there are two ways to encode them. First method is to encode each number's components together and the second method is to encode each component separately.



Using words instead of bytes

We can also split data in to 16 or 32 bit words to achieve greater statistical redundancy. If 16 bit words used then there are 16 different shapes to represent binary numbers. In a sample experiment with jpeg images split in to 64 byte blocks (32 16-bit words), some blocks contained fewer than 8 shapes and some shapes were repeated in succession. If a pattern is repeated in succession we can use single bit to encode the repeated patterns.

Error Correction

There is a possibility of using Geometric coding to correct errors. It uses fixed patterns to encode binary numbers so at receiving end errors can be detected.

patterns.py

patterns.py is a tool to count the patterns for a byte or 16-bit word.


In the coming blog articles I will discuss in detail the block encoding method.