Getting started¶

Building¶

To build the library run:

python waf configure
python waf build


To integrate this library into another project, you can also install all the nessecary files needed to use the library (the static library, headers, etc.). You can change the output folder by passing a different path to --destdir:

python waf install --destdir=kodo-slide_install


Usage¶

See the API of the encoder and decoder in src/kodo_slide/encoder.hpp and src/kodo_slide/decoder.hpp.

An example is available in test/src/test_code.cpp.

Definitions¶

To understand the API of kodo-slide’s ECC implementation a bit of terminology is needed:

• A source symbol: Unit of data used for the encoding. Typically a symbol corresponds to a data packet to be sent over the network.

• A symbol: The output of the encoder. Typically a linear combination of a set of source symbols.

• The symbol size: This is the size in bytes of a symbol. Currently the algorithm expects that all symbols have the same size - for most applications this is not a problem. In special applications with varying symbol sizes we currently need to perform padding before passing the symbol data to the encoding/decoding algorithms.

• The finite field and coding coefficients: The finite field determines the size of the coding coefficients used to produces a coded symbol. As a rough rule of thumb:

• A larger finite field yields a “stronger” code i.e. better decoding probability. However, using a large finite field is also more computational complex.

We have one coding coefficient for each symbol used to produce a coded symbol.

• The coding rate: Most erasure coding algorithms allow you to select a coding rate. The rate is typically specified by two numbers n and k where the coding rate is k/n (n divided by k). They specify that after sending n symbols, k of those will be original source symbols. Often r is defined as r = n - k, which denotes the number of repair symbols we are sending. See Parameter selection.

• The stream: This is the set of symbols currently available at the encoder for encoding or at the decoder for decoding. Using the API it is possible to either push new symbols to the front of the stream or pop symbols from the back of the stream.

• The window: This is the set of symbols included in a specific encoding. The window is a subset of the stream. The API allows full control over both the stream and window.

The following diagram gives an overview of how these three concepts relate:

Data Symbols:
+---+  +---+  +---+  +---+  +---+  +---+  +---+  +---+
| 0 |  | 1 |  | 2 |  | 3 |  | 4 |  | 5 |  | 6 |  | 7 |      ...
+---+  +---+  +---+  +---+  +---+  +---+  +---+  +---+
|            stream             |
+-------------------------------+
|    window      |
+----------------+                           ...


In the above example the stream includes five symbols 1 through 5. The window which is a subset of the stream includes symbol 2 through 4. To recap:

• Symbols 1 through to 5 are currently available in memory (in the stream).
• Symbols 2 through to 4 are in the window i.e. they will be included in a coded symbol produced by the encoder. At the decoder the next coded symbol read is expected to be a linear combination of the symbols in the window.

Coefficients memory layout¶

When encoding or decoding data the API expects to receive the coding coefficients. You can generate these coding coefficients yourself - or use the kodo-slide in-built coefficient generator.

The algorithms expect the coding coefficients to have a specific memory layout which we will explain here.

The reason we choose this memory layout was to achieve two goals:

1. The bit position of a coefficient corresponding to a specific symbol index in the window should be constant within memory.
2. It should be possible to resize the coding window without remapping coefficients.

We will use LSB 0 bit numbering (see Bit numbering (bit endianness)).

Example: Binary field¶

In the binary field each coefficient is a single bit.

Initial example, if we have three symbols in the window with the first symbol being index zero (we call this the window_lower_bound).

In this case we would have three coefficients at bit index 0, 1 and 2 below represented by capital C:

  7   6   5   4   3   2   1   0
+-------------------------------+
| 0   0   0   0   0   C   C   C |
+-------------------------------+
^
|
least significant +------+
bit


As the window moves to encode a different set of symbols, so will the coefficients offset within the byte e.g. lets see the coefficients for encoding symbol 2,3 and 4:

  7   6   5   4   3   2   1   0
+-------------------------------+
| 0   0   0   C   C   C   0   0 |
+-------------------------------+
^
|
least significant +------+
bit


Notice how the coefficient for symbol 2 remains at the same position with in the byte.

Once the window_lower_bound reaches 6, coefficients will “spill” and we need to use an additional byte:

  7   6   5   4   3   2   1   0   7   6   5   4   3   2   1   0
+-------------------------------+-------------------------------+
| C   C   0   0   0   0   0   0 | 0   0   0   0   0   0   0   C |
+-------------------------------+-------------------------------+
^                               ^
|                               |
least significant +------+      least significant +------+
bit in first byte               in second byte


Effectively bit index 0 in the second byte corresponds to the coefficient encoding the symbol with index 8.

Once the coding window leaves the first byte i.e. window_lower_bound >= 8 we can omit the fist byte since it only contains leading zeros.

It is important that all bits not in the window are zero’ed.

Example: Binary8 field¶

In the binary8 field each coefficient is eight bits (or one byte).

As in the binary example, lets imagine we have three symbols in the window. Since we are in binary8 we simply need set three bytes:

                  0                 8                 16
+-----------------+-----------------+-----------------+
| C C C C C C C C | C C C C C C C C | C C C C C C C C |
+-----------------+-----------------+-----------------+
^                 ^                 ^
|                 |                 |
first byte +     second byte +      third byte +



The coding coefficient corresponding to symbol zero is the first byte, the second byte corresponds to symbol with index one and so forth.

As the window moves we interpret the first byte as the one corresponding to the symbol at the window_lower_bound.

Bit numbering (bit endianness)¶

In the following we will use LSB 0 bit numbering. LSB 0 is where the bit numbering starts at zero for the least significant bit (LSB). As an example say we have a single byte (8 bits):

   least significant +--------+
bit                        |
v
+-------------------------------+
| 0   1   0   1   1   1   0   0 |
+-------------------------------+
^
|             most significant
+-----------+ bit


Lets number the bits inside byte given earlier according to the LSB 0 bit numbering:

  7   6   5   4   3   2   1   0
+-------------------------------+
| 0   1   0   1   1   1   0   0 |
+-------------------------------+


In the following we will omit all but the LSB index:

                              0
+-------------------------------+
| 0   1   0   1   1   1   0   0 |
+-------------------------------+


For a stream of bytes we number assume little endian byte order least significant byte first:

                              0                               8
+-------------------------------+-------------------------------+
| 0   0   0   0   0   0   0   0 | 0   0   0   0   0   0   0   0 |
+-------------------------------+-------------------------------+
^                               ^
|                               |
least significant +------+      least significant +------+
bit in first byte               in second byte