Methods and Apparatus Employing FEC Codes with Permanent Inactivation of Symbols for Encoding and Decoding Processes (HK1172164B)
Description
Cross-Reference to Related Applications
The present application is a continuation-in-part of U.S. Patent Application No. 12/604,773, filed October 23, 2009, entitled "Methods and Apparatus Employing FEC Codes with Permanent Inactivation of Symbols for Encoding and Decoding Processes," naming M. Amin Shokrollahi et al. as inventors. The present application further claims priority to the following provisional applications, each naming M. Amin Shokrollahi et al. as inventors: U.S. Provisional Patent Application No. 61/353,910, filed June 11, 2010; U.S. Provisional Patent Application No. 61/257,146, filed November 2, 2009; and U.S. Provisional Patent Application No. 61/235,285, filed August 19, 2009. Each of the provisional and non-provisional applications cited above is hereby incorporated by reference in its entirety.
The following references are hereby incorporated by reference in their entirety:
1) Luby, U.S. Patent No. 6,307,487, entitled "Information Additive Code Generator and Decoder for Communication Systems" (hereinafter "Luby I");
2) Luby, U.S. Patent No. 6,320,520, entitled "Information Additive Group Code Generator and Decoder for Communication Systems" (hereinafter "Luby II");
3) M. Amin Shokrollahi, U.S. Patent No. 7,068,729, entitled "Multi-Stage Code Generator and Decoder for Communication Systems" (hereinafter "Shokrollahi I");
4) M. Amin Shokrollahi, U.S. Patent No. 6,856,263, entitled "Systems and Processes for Decoding a Chain Reaction Code Through Inactivation" (hereinafter "Shokrollahi II");
5) M. Amin Shokrollahi, U.S. Patent No. 6,909,383, entitled "Systematic Encoding and Decoding of Chain Reaction Codes" (hereinafter "Shokrollahi III");
6) Michael G. Luby and M. Amin Shokrollahi, U.S. Patent Publication No. 2006/0280254, entitled "In-Place Transformations with Applications to Encoding and Decoding Various Classes of Codes" (hereinafter "Luby III");
7) M. Amin Shokrollahi, U.S. Patent Publication No. 2007/0195894, entitled "Multiple-Field Based Code Generator and Decoder for Communications Systems" (hereinafter "Shokrollahi IV").
Technical Field
The present invention relates to encoding and decoding data in a communication system, and more particularly to a communication system that encodes and decodes data in a highly efficient manner to account for errors and gaps in the communicated data.
Background
Techniques for transferring files between a sender and a receiver over a communication channel are the subject of much literature. Preferably, the receiver desires to receive, with some degree of certainty, an exact copy of the data transmitted by the sender over the channel. Where the channel does not have perfect fidelity, which covers almost all physically realizable systems, one concern is how to deal with data that is lost or garbled in transmission. Lost data (erasures) are often easier to deal with than corrupted data (errors), because the receiver cannot always tell when received data is data that was received in error. Many error-correcting codes have been developed to correct erasures and/or errors. Typically, the particular code used is chosen based on some information about the infidelities of the channel through which the data is being transmitted and the nature of the data being transmitted. For example, where the channel is known to have long periods of infidelity, a burst error code might be best suited for that application. Where only short, infrequent errors are expected, a simple parity code might be best.
As used herein, "source data" refers to data available at one or more senders and obtained using a receiver by recovering from a transmitted sequence with or without errors and/or erasures, etc. As used herein, "encoded data" refers to data that is communicated and can be used to recover or obtain source data. In a simple case, the encoded data is a copy of the source data, but if the received encoded data is different (e.g., due to errors and/or erasures) than the transmitted encoded data, then in the simple case, the source data may not be fully recoverable in the absence of additional data about the source data. The transmission may be in space or time. In more complex cases, encoded data is generated in a transform based on source data and transmitted from one or more senders to a receiver. If the source data is found to be part of the encoded data, the encoding is referred to as "systematic". In a simple example of systematic encoding, redundant information about the source data is appended to the end of the source data to form encoded data.
As used herein, "input data" refers to data present at an input of an FEC (forward error correction) encoder device or FEC encoder module, component, step, etc. ("FEC encoder"), while "output data" refers to data present at an output of the FEC encoder. Accordingly, the desired output data will appear at the input of the FEC decoder, and the FEC decoder will be desired to output the input data or its corresponding data based on the output data it processes. In some cases, the input data is or includes source data, and in some cases, the output data is or includes encoded data. In other cases, the sender device or sender program code may include more than one FEC encoder, i.e., the source data is transformed into encoded data in a series of multiple FEC encoders. Similarly, at the receiving side, there may be more than one FEC decoder applied to generate the source data from the received encoded data.
The data may be considered to be divided into symbols. An encoder is a computer system, device, electronic circuit, or the like that generates encoded symbols or output symbols from a sequence of source symbols or input symbols, and a decoder is the counterpart that recovers the sequence of source symbols or input symbols from received or recovered encoded symbols or output symbols. The encoder and decoder are separated in time and/or space by a channel, and any received encoded symbols may not be exactly the same as the corresponding transmitted encoded symbols, and they may not be received in exactly the same order in which they were transmitted. The "size" of a symbol can be measured in bits, whether or not the symbol is actually broken into a stream of bits: a symbol selected from an alphabet of 2^M symbols has a size of M bits. In many of the examples herein, symbols are measured in bytes and the code might operate over a field of 256 possibilities (the 256 possible 8-bit patterns), but it should be understood that different units of data measurement can be used and that data can be measured in various ways.
Luby I describes the use of codes, such as chain reaction codes, to address error correction in a computationally efficient, memory efficient, and bandwidth efficient manner. One property of the encoded symbols produced by a chain reaction encoder is that a receiver is able to recover the original file as soon as enough encoded symbols have been received. Specifically, to recover the original K source symbols with high probability, the receiver needs approximately K+A encoded symbols.
The "absolute reception overhead" for a given situation is represented by the value A, while the "relative reception overhead" can be calculated as the ratio A/K. The absolute reception overhead is a measure of how much extra data needs to be received beyond the information-theoretic minimum amount of data, and it may depend on the reliability of the decoder and may vary as a function of the number, K, of source symbols. Similarly, the relative reception overhead A/K is a measure of how much extra data needs to be received, relative to the size of the source data being recovered, beyond the information-theoretic minimum, and it too may depend on the reliability of the decoder and may vary as a function of K.
Chain reaction codes are extremely useful for communication over a packet-based network. However, they can at times be fairly computationally intensive. A decoder might be able to decode more often, or more easily, if the source symbols are encoded using a static encoder prior to a dynamic encoder that encodes using a chain reaction or other rateless code. Such decoders are shown, for example, in Shokrollahi I. In examples shown there, the source symbols are input symbols to a static encoder that produces output symbols that are input symbols to a dynamic encoder that produces output symbols that are the encoded symbols, wherein the dynamic encoder is a rateless encoder that can generate a number of output symbols in a quantity that is not a fixed ratio of the number of input symbols. The static encoder might include more than one fixed-rate encoder. For example, a static encoder might include a Hamming encoder, a low-density parity-check ("LDPC") encoder, a high-density parity-check ("HDPC") encoder, and/or the like.
Chain reaction codes have the property that, as some symbols are recovered at the decoder from the received symbols, those symbols may be able to be used to recover additional symbols, which in turn may be used to recover yet more symbols. Preferably, the chain reaction of symbol solving at the decoder can continue such that all of the desired symbols are recovered before the pool of received symbols is used up. Preferably, the computational complexity of performing the chain reaction encoding and decoding processes is low.
A recovery process at the decoder might involve determining which symbols were received, creating a matrix that would map the original input symbols to those encoded symbols that were received, then inverting the matrix and performing a matrix multiplication of the inverted matrix and a vector of the received encoded symbols. In a typical system, a brute-force implementation of this can consume excessive computing effort and memory. Of course, for a particular set of received encoded symbols, it might be impossible to recover all of the original input symbols, and even where it is possible, it might be computationally expensive to compute the result.
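As a non-normative sketch of this brute-force recovery, assuming XOR-based (GF(2)) coding in which each received encoded symbol is the XOR of a known subset of the input symbols; the function name solve_gf2 and the toy values below are illustrative only:

    # Recover input symbols by solving a GF(2) linear system (Python sketch).
    # Row r of A marks which input symbols were XORed into received symbol r;
    # d holds the received encoded symbol values (XORed as whole symbols).
    def solve_gf2(A, d):
        n = len(A[0])
        A = [row[:] for row in A]
        d = d[:]
        for col in range(n):
            # Find a pivot row having a 1 in this column.
            pivot = next((r for r in range(col, len(A)) if A[r][col]), None)
            if pivot is None:
                return None  # rank-deficient: not all symbols recoverable
            A[col], A[pivot] = A[pivot], A[col]
            d[col], d[pivot] = d[pivot], d[col]
            for r in range(len(A)):
                if r != col and A[r][col]:
                    A[r] = [a ^ b for a, b in zip(A[r], A[col])]
                    d[r] ^= d[col]
        return d[:n]  # recovered input symbol values

    # Example: 3 input symbols, 3 received encoded symbols.
    A = [[1, 0, 0], [1, 1, 0], [0, 1, 1]]
    d = [0x41, 0x07, 0x52]
    print(solve_gf2(A, d))  # [0x41, 0x46, 0x14]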
Shokrollahi II describes an approach called "inactivation", in which decoding occurs in two steps. In the first step, the decoder determines, from the received encoded symbols it has available, what the decoding matrix might look like, and at least approximately determines a sequence of decoding steps that would allow the chain reaction process to complete given those received encoded symbols. In the second step, the decoder runs the chain reaction decoding according to the determined sequence of decoding steps. This can be done in a memory-efficient manner (i.e., a manner that requires less working memory than a less memory-efficient process).
In the inactivation approach, the first decoding step involves manipulating the matrix, or its equivalent, to determine some number of input symbols that can be solved for and, when that determination stalls, designating one of the input symbols as an "inactivated symbol" and continuing the determination process by assuming that the inactivated symbol is in fact solved, and then, at the end, solving for the inactivated symbols using Gaussian elimination or some other method to invert a matrix that is much smaller than the original decoding matrix. Using that determined schedule, a chain reaction sequence can be performed on the received encoded symbols to obtain the recovered input symbols, which may be all, or some suitable set, of the original input symbols.
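The sketch below illustrates only the scheduling idea, under simplified assumptions: it inactivates an arbitrary symbol when peeling stalls, rather than using the selection heuristics an actual decoder would apply, and schedule_inactivation is an illustrative name, not a routine from Shokrollahi II:

    # First (scheduling) phase of inactivation decoding (Python sketch).
    # equations: one set of input-symbol indices per received encoded symbol.
    def schedule_inactivation(equations, num_symbols):
        eqs = [set(e) for e in equations]
        solved, inactivated, peel_order = set(), set(), []
        pending = set(range(num_symbols))
        while pending:
            progress = False
            for i, e in enumerate(eqs):
                live = e - solved - inactivated
                if len(live) == 1:        # this equation solves one symbol
                    s = live.pop()
                    if s in pending:
                        solved.add(s)
                        pending.discard(s)
                        peel_order.append((s, i))
                        progress = True
            if not progress and pending:
                # Stall: set one symbol aside as inactivated and keep
                # peeling; all inactivated symbols are later solved jointly,
                # e.g. by Gaussian elimination on a much smaller residual
                # system, then back-substituted.
                inactivated.add(pending.pop())
        return peel_order, inactivated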
For some applications that place tight constraints on the decoder, such as where the decoder is in a low-power device with limited memory and computational power, or where there are tight constraints on the allowable absolute or relative reception overhead, improvements over the inactivation approach described above may be called for.
In addition, methods for dividing a file or large data block into as few source blocks as possible subject to a minimum sub-symbol size, and then splitting the source blocks into as few sub-blocks as possible subject to a maximum sub-block size, may be useful.
Brief summary of the invention
According to one embodiment of an encoder in accordance with aspects of the invention, an encoder at a sender transmits an ordered set of source symbols from one or more senders to one or more receivers over a communication channel, wherein the encoder generates data to be transmitted that includes a plurality of encoded symbols generated from the source symbols. In a first step, intermediate symbols are generated from the source symbols using an invertible method, i.e., there is also an inverse method for generating the source symbols from the intermediate symbols. In another step, the intermediate symbols are partitioned into a first set of intermediate symbols and a second set of intermediate symbols, wherein there is at least one intermediate symbol in the first set and at least one intermediate symbol in the second set, and at least one encoded symbol is generated from at least one intermediate symbol from each of the two sets. In some variations, there are more than two sets.
In some embodiments, values of the first and second sets of temporary symbols are generated, wherein the values of the first set of temporary symbols depend on the values of the first set of intermediate symbols and the values of the second set of temporary symbols depend on the values of the second set of intermediate symbols. The values of the encoded symbols are generated from the first and second sets of temporary symbols.
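A minimal sketch of this two-set structure, assuming XOR symbol arithmetic; the degree choices are illustrative and do not reproduce the degree distributions of any actual code:

    import random

    # One encoded symbol from intermediate symbols split into two sets
    # (Python sketch); symbols are modeled as ints combined by XOR.
    def encode_symbol(key, first_set, second_set):
        rng = random.Random(key)  # repeatable pseudo-random choices per key
        # Temporary symbol from the first set of intermediate symbols.
        deg = rng.randint(1, len(first_set))
        t1 = 0
        for i in rng.sample(range(len(first_set)), deg):
            t1 ^= first_set[i]
        # Temporary symbol from the second set of intermediate symbols.
        t2 = 0
        for i in rng.sample(range(len(second_set)), min(2, len(second_set))):
            t2 ^= second_set[i]
        return t1 ^ t2  # the encoded symbol depends on both sets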
In some variations, the number of encoded symbols that can be generated is independent of the number of source symbols.
Decoder embodiments are also provided. In one embodiment of the decoder according to aspects of the present invention, the decoder of the receiver receives encoded symbols generated from intermediate symbols, wherein the intermediate symbols are generated from the source symbols using a reversible method, i.e., there is also an inverse method for generating the source symbols from the intermediate symbols, and wherein at least one intermediate symbol is designated as a permanently inactive symbol and there is at least one other intermediate symbol that is not among the permanently inactive symbols. A decoder decodes the set of intermediate symbols from the received encoded symbols, and the decoder accounts for the at least one permanently inactive symbol and generates source symbols from the set of decoded intermediate symbols using the inverse method.
Upon decoding, the decoding steps are scheduled, with the solving of the permanently inactive symbols set aside. The permanently inactive symbols may be solved for using novel or conventional methods and then used to solve for the other intermediate symbols. One approach to solving for the permanently inactivated symbols (and any on-the-fly inactivated symbols, if used) is to apply Gaussian elimination to the inactivated symbols. Some of the remaining intermediate symbols are then recovered based on the recovered permanently inactive symbols and the values of the received encoded symbols.
In some variations of the decoding method, the permanently inactive symbols comprise the second set of intermediate symbols from the encoding embodiment. In some variations of the decoding method, the permanently inactive symbols comprise a subset of the intermediate symbols, wherein the corresponding encoding method is not a multi-level chain reaction code. Such encoding methods might include one or more of a Tornado code, a Reed-Solomon code, a chain reaction code (an example of which is described in Luby I), or the like, for that subset of intermediate symbols.
The intermediate symbols are used for encoding and decoding, wherein the method for generating the intermediate symbols from the source symbols and the corresponding inverse method are chosen for a desired set of performance characteristics, such as decodability. In some embodiments, the intermediate symbols comprise the source symbols. In some embodiments, the intermediate symbols comprise the source symbols together with redundant symbols generated from the source symbols, where the redundant symbols might be chain reaction symbols, LDPC symbols, HDPC symbols, or other types of redundant symbols. Alternatively, the intermediate symbols might be based on prescribed relationships among the symbols, for example relationships between the intermediate symbols and the source symbols together with additional LDPC and HDPC relationships among the intermediate symbols, wherein a decoding method is used to generate the intermediate symbols from the source symbols based on the prescribed relationships.
The methods and systems may be implemented by electronic circuitry, or by a processing device executing programming instructions and having appropriate instruction program code for implementing the encoding and/or decoding.
Numerous benefits are achieved by the present invention. For example, in particular embodiments, computational overhead of encoding data for transmission over a channel is reduced. In another embodiment, computational expense for decoding such data is reduced. In another embodiment, the absolute and relative reception overhead is reduced considerably. Depending on the embodiment, one or more of these benefits may be achieved. These and other benefits are provided in more detail throughout the present specification and more particularly below.
A further understanding of the nature and advantages of the inventions disclosed herein may be realized by reference to the remaining portions of the specification and the attached drawings.
Brief Description of Drawings
Fig. 1 is a block diagram of a communication system that uses multi-level coding including permanent inactivation, along with other features and elements.
Fig. 2 is a table of variables, arrays, and the like as used in the other figures herein.
Fig. 3 is a block diagram of a particular embodiment of the encoder shown in fig. 1.
Fig. 4 is a block diagram illustrating the dynamic encoder of fig. 3 in more detail.
Fig. 5 is a flow chart illustrating a permanent inactivation ("PI") encoding process.
Fig. 6 is a flow chart illustrating a dynamic encoding process.
Fig. 7 is a flow diagram of operations to calculate weights for symbol calculation.
Fig. 8 illustrates a table, which may be stored in memory, that may be used to determine a degree of a symbol based on a lookup value.
Fig. 9 shows a matrix used in an encoding or decoding process.
FIG. 10 shows equations representing portions of the matrix shown in FIG. 9 with respect to specific minimum polynomials.
Fig. 11 is a flow chart illustrating a process for building an array for use in encoding or decoding.
Fig. 12 illustrates a matrix representation of a system of equations to be solved by a decoder using a sub-matrix SE representing R static symbols or equations known to the decoder to recover an array C () representing recovered source symbols from an array D () representing received encoded symbols.
Fig. 13 illustrates a matrix resulting from row/column permutations of the matrix of fig. 12 using on-the-fly ("OTF") inactivation.
FIG. 14 is a block diagram describing a process for generating the matrix in FIG. 12.
Fig. 15 illustrates a matrix representation of a system of equations to be solved by a decoder using the sub-matrix SE and the sub-matrix corresponding to the permanently inactive symbols to recover the array C () representing the recovered source symbols from the array D () representing the received encoded symbols.
Fig. 16 is a flow chart illustrating a process for generating an LT sub-matrix as may be used in the matrix of fig. 12 or the matrix of fig. 15.
Fig. 17 is a flow chart illustrating a process for generating a PI sub-matrix as may be used in the matrix of fig. 15.
Fig. 18 is a block diagram of a matrix generator.
Fig. 19 is a flowchart illustrating a process for generating SE submatrices.
Fig. 20 is a flowchart illustrating a process for generating a PI submatrix.
Fig. 21 is a flow chart illustrating a process for resolving recovered symbols in a decoder.
Fig. 22 illustrates a matrix representation after permutation of a system of equations to be solved by a decoder to recover an array C () representing recovered source symbols from an array D () representing received encoded symbols.
Fig. 23 illustrates a matrix representation of a system of equations to be solved by the decoder and corresponding to the matrix shown in fig. 26.
Fig. 24 illustrates a matrix representation that may be used as part of a decoding process.
Fig. 25 illustrates a matrix representation that may be used as another part of the decoding process.
Fig. 26 illustrates a matrix representation of the system of equations to be solved by the decoder after partial solution.
Fig. 27 is a flow chart illustrating another process for resolving recovered symbols in a decoder.
Fig. 28 illustrates a matrix representation of a system of equations to be solved by the decoder.
Fig. 29 illustrates a matrix representation of the system of equations to be solved by the decoder.
Fig. 30 illustrates an example encoding system that may be implemented as hardware modules, software modules, or portions of program code stored in a program storage and executed by a processor, perhaps as integrated code units that are not separate as shown in this figure.
Fig. 31 illustrates an example decoding system that can be implemented as hardware modules, software modules, or portions of program code stored in a program storage and executed by a processor, possibly as integrated code units that are not separate as shown in this figure.
Appendix A, attached hereto, is a specification of a specific embodiment of an encoder/decoder system, an error correction scheme, and applications for the reliable delivery of data objects, employing certain teachings of the present invention; it also includes a specification of how a systematic encoder/decoder can be used for object delivery transport. It should be understood that the specific embodiments described in Appendix A are not limiting examples of the invention, and that some aspects of the invention might use the teachings of Appendix A while others might not. It should also be understood that limiting language in Appendix A may pertain to requirements of that specific embodiment rather than to the claimed invention, and therefore the claim language is not limited by such limiting language.
Detailed description of the specific embodiments
Details for implementing parts of the encoders and decoders referenced herein are provided in Luby I, Luby II, Shokrollahi I, Shokrollahi II, Shokrollahi III, Luby III, and Shokrollahi IV, and are not all repeated here for the sake of brevity. The entire disclosures of those documents are hereby incorporated by reference in their entirety, and it should be understood that the implementations therein are not required for the invention unless otherwise indicated, and that many other variations, modifications, or alternatives can also be used.
Multi-level encoding as described herein encodes source data in multiple levels. Typically, but not always, the first stage adds a predetermined amount of redundancy to the source data. The second stage then uses a chain reaction code or the like to generate encoded symbols from the original source data and the redundant symbols calculated by the first stage encoding. In one embodiment, a chain reaction decoding process is used to first decode the received data. If the process does not successfully recover the original data in its entirety, a second decoding step may be applied.
Some embodiments taught herein can be applied to many other types of codes as well, for example to the codes described in Internet Engineering Task Force (IETF) Request for Comments (RFC) 5170 (hereinafter "IETF LDPC codes") and the codes described in U.S. Patent Nos. 6,073,250, 6,081,909, and 6,163,870 (hereinafter "Tornado codes"), to improve the reliability and/or the CPU and/or memory performance of those types of codes.
One advantage of some embodiments taught herein is that fewer arithmetic operations are necessary to produce encoded symbols than with chain reaction coding alone. Another advantage of some embodiments that include a first-level encoding and a second-level encoding is that the two levels of encoding can be done at separate times and/or by separate devices, thus splitting up the computational load and minimizing the overall computational load, as well as the memory size and access pattern requirements. In multi-level encoding embodiments, redundant symbols are generated from the input file during the first level of encoding. In those embodiments, in the second level of encoding, encoded symbols are generated from the combination of the input file and the redundant symbols. In some of those embodiments, the encoded symbols can be generated as needed. In embodiments in which the second level comprises chain reaction encoding, each encoded symbol can be generated without regard to how other encoded symbols are generated. Once generated, the encoded symbols can then be placed into packets and transmitted to their destinations, with each packet containing one or more encoded symbols. Non-packetized transmission techniques can be used instead of, or in addition to, packetized transmission.
As used herein, the term "file" refers to any data stored at one or more sources and to be delivered as a unit to one or more destinations. Thus, documents, images, and files from a file server or computer storage device are all examples of "files" that can be delivered. The file may be of a known size (such as a one megabyte image stored on a hard disk) or may be of an unknown size (such as a file retrieved from the output of a streaming source). In either case, the file is a sequence of source symbols, where each source symbol has a position and a value within the file. "file" may also be used to refer to a short portion of the streaming source, i.e., the data stream may be divided into one-second intervals, and the source data blocks within each such one-second interval may be considered a "file". As another example, blocks of data from a video streaming source may be further divided into portions based on, for example, a priority of the data as defined by a video system capable of playing the video stream, and each portion of each block may be considered a "file". Thus, the term "document" is used generically and is not intended to be broadly limiting.
As used herein, a source symbol represents data to be transmitted or communicated, and an encoded symbol represents data generated based on the source symbol, communicated over a communication network, or stored to enable reliable reception and/or regeneration of the source symbol. An intermediate symbol represents a symbol used or generated during an intermediate step of an encoding or decoding process, where there is typically a method for generating the intermediate symbol from a source symbol and a corresponding inverse method for generating the source symbol from the intermediate symbol. The input symbols represent data input to one or more steps during an encoding or decoding process, and the output symbols represent data output from one or more steps during the encoding or decoding process.
In many embodiments, these differently labeled types of symbols can be the same, or can at least partially comprise other types of symbols, and in some examples the terms are used interchangeably. In one example, suppose the file to be transmitted is a text file of 1,000 characters, where each character is treated as a source symbol. If those 1,000 source symbols are provided as-is to an encoder that in turn outputs encoded symbols that are transmitted, then the source symbols are also the input symbols. However, in an embodiment in which the 1,000 source symbols are converted, in a first step, into 1,000 (or more or fewer) intermediate symbols, and the intermediate symbols are provided to an encoder that generates encoded symbols in a second step, then in the first step the source symbols are the input symbols and the intermediate symbols are the output symbols, and in the second step the intermediate symbols are the input symbols and the encoded symbols are the output symbols; overall, the source symbols are the input symbols of the two-step encoder and the encoded symbols are the output symbols of the two-step encoder. In this example, if the encoder is a systematic encoder, the encoded symbols may comprise the source symbols together with repair symbols generated from the intermediate symbols, where the intermediate symbols are distinct from both the source symbols and the encoded symbols. If the encoder is instead a non-systematic encoder, the intermediate symbols may comprise the source symbols together with redundant symbols generated from the source symbols in the first step using, for example, an LDPC and/or HDPC encoder, and the encoded symbols are distinct from both the source symbols and the intermediate symbols.
In other examples, there are more symbols and each symbol represents more than one character. In either case where there is source-to-intermediate symbol conversion in the transmitter, the receiver may have the corresponding intermediate-to-source symbol conversion as the inverse conversion.
Transmission is the process of transmitting data from one or more senders to one or more receivers through a channel in order to deliver a file. A sender is also sometimes referred to as the encoder. If one sender is connected to any number of receivers by a perfect channel, the received data can be an exact copy of the input file, as all of the data will be received correctly. Here, it is assumed that the channel is not perfect, which is the case for most real-world channels. Of the many channel imperfections, two imperfections of interest are data erasure and data incompleteness (which can be treated as a special case of data erasure). Data erasure occurs when the channel loses or drops data. Data incompleteness occurs when a receiver does not start receiving data until some of the data has already passed it by, the receiver stops receiving data before the end of the transmission, the receiver chooses to receive only a portion of the transmitted data, and/or the receiver intermittently stops and starts receiving data again. As an example of data incompleteness, a moving satellite sender might be transmitting data representing an input file and start the transmission before a receiver is in range. Once the receiver is in range, data can be received until the satellite moves out of range, at which point the receiver can redirect its satellite dish (during which time it is not receiving data) to start receiving data for the same input file being transmitted by another satellite that has moved into range. As should be apparent from reading this description, data incompleteness is a special case of data erasure, since the receiver can treat the data incompleteness as if the receiver had been in range the entire time, but with a lossy channel in which all of the data up to the point at which the receiver started receiving data was lost; the receiver then has the same problems in either case. Also, as is well known in communication systems design, detectable errors can be treated as equivalent to erasures by simply dropping all data blocks or symbols that have detectable errors.
In some communication systems, a receiver receives data generated by multiple senders, or by one sender using multiple connections. For example, to speed up a download, a receiver might simultaneously connect to more than one sender transmitting data about the same file. As another example, in a multicast transmission, multiple multicast data streams might be transmitted to allow receivers to connect to one or more of those streams to match the aggregate transmission rate with the bandwidth of the channel connecting them to the sender. In all such cases, a concern is to ensure that all of the transmitted data is of independent use to a receiver, i.e., that the source data in the various streams is not redundant among the streams, even when the transmission rates are vastly different for the different streams and when there are arbitrary patterns of loss.
In general, a communication channel is a channel that connects a sender and a receiver for data transmission. The communication channel may be a real-time channel, where the channel moves data from the sender to the receiver as the data is obtained, or the communication channel may be a storage channel that stores some or all of the data as it is transported from the sender to the receiver. Examples of the latter are disk storage or other storage devices. In this example, the program or device that generates the data may be considered the sender that transmits the data to the storage device. The receiving party is a program or device that reads data from the storage device. The mechanism by which the sender sends data to the storage device, the storage device itself, and the mechanism by which the receiver retrieves data from the storage device collectively form a channel. If there is a chance that these mechanisms or storage devices will lose data, it can be considered a data erasure in the communication channel.
When a sender and a receiver are separated by a communication channel in which symbols can be erased, it is preferable not to transmit an exact copy of the input file, but instead to transmit data generated from the input file that assists with recovery of erasures. An encoder is a circuit, device, module, or code segment that handles that task. One way of viewing the operation of the encoder is that the encoder generates encoded symbols from source symbols, where a sequence of source symbol values represents the input file. Each source symbol would thus have a position within the input file and a value. A decoder is a circuit, device, module, or code segment that reconstructs the source symbols from the encoded symbols received by the receiver. In multi-level coding, the encoder and the decoder are sometimes further divided into sub-modules, each performing a different task.
In embodiments of a multi-level encoding system, the encoder and the decoder can be further divided into sub-modules, each performing a different task. For example, in some embodiments, the encoder comprises what is referred to herein as a static encoder and a dynamic encoder. As used herein, a "static encoder" is an encoder that generates a number of redundant symbols from a set of source symbols, where the number of redundant symbols is determined prior to encoding. When static encoding is used in a multi-level encoding system, the combination of the source symbols and the redundant symbols generated from the source symbols using a static encoder is often referred to as the intermediate symbols. Examples of codes usable for static encoding include Reed-Solomon codes, Tornado codes, Hamming codes, LDPC codes such as the IETF LDPC codes, etc. The term "static decoder" is used herein to refer to a decoder that can decode data that was encoded by a static encoder.
As used herein, a "dynamic encoder" is an encoder that generates encoded symbols from a set of input symbols, where the number of possible encoded symbols is independent of the number of input symbols, and where the number of encoded symbols to be generated need not be fixed. Typically in multi-level coding, the input symbols are intermediate symbols generated using static coding, while the encoded symbols are generated from the intermediate symbols using a dynamic encoder. One example of a dynamic encoder is a chain reaction encoder, such as the encoders taught in Luby I and Luby II. The term "dynamic decoder" is used herein to refer to a decoder capable of decoding data encoded by a dynamic encoder.
In some embodiments of multi-level encoding that are also systematic, a decoding process is applied to the source symbols to obtain the intermediate symbol values, based on the relationships among the intermediate symbols defined by the static encoder and the relationships between the intermediate symbols and the source symbols defined by the dynamic encoder, and then additional encoded symbols, or repair symbols, are generated from the intermediate symbols using the dynamic encoder. Similarly, a corresponding decoder has a decoding process that receives encoded symbols and decodes the intermediate symbol values from the received encoded symbols, based on the relationships among the intermediate symbols defined by the static encoder and the relationships between the intermediate symbols and the received encoded symbols defined by the dynamic encoder, and then generates any missing source symbols from the intermediate symbols using the dynamic encoder.
Embodiments of multi-level coding need not be limited to any particular type of symbol. Typically, symbol values are selected from an alphabet of 2^M symbols for some positive integer M. In such cases, a source symbol can be represented by a sequence of M bits of data from the input file. The value of M is often determined based on, for example, the application, the communication channel, and/or the size of the encoded symbols. In addition, the size of an encoded symbol is often determined based on the application, the channel, and/or the size of the source symbols. In some cases, the encoding process can be simplified if the encoded symbol values and the source symbol values have the same size (i.e., can be represented by the same number of bits or selected from the same alphabet). If that is the case, then the source symbol value size is limited whenever the encoded symbol value size is limited. For example, it may be desirable to put encoded symbols in packets of limited size. If some data about a key associated with an encoded symbol were to be transmitted in order to recover the key at the receiver, the encoded symbol would preferably be small enough to accommodate, in one packet, the encoded symbol value and the data about the key.
As an example, if an input file is a multiple-megabyte file, the input file might be broken into thousands, tens of thousands, or hundreds of thousands of source symbols, with each source symbol encoding thousands, hundreds, or only a few bytes. As another example, for a packet-based Internet channel, a packet with a payload size of 1024 bytes (8 bits per byte) might be appropriate. In this example, assuming each packet contains one encoded symbol and 8 bytes of auxiliary information, an encoded symbol size of 8128 bits ((1024-8)×8) would be appropriate. Thus, the source symbol size could be chosen as M = (1024-8)×8, or 8128 bits. As another example, some satellite systems use the MPEG packet standard, where the payload of each packet comprises 188 bytes. In that example, assuming each packet contains one encoded symbol and 4 bytes of auxiliary information, an encoded symbol size of 1472 bits ((188-4)×8) would be appropriate. Thus, the source symbol size could be chosen as M = (188-4)×8, or 1472 bits. In a general-purpose communication system using multi-level coding, application-specific parameters such as the source symbol size (i.e., M, the number of bits encoded by a source symbol) might be variables set by the application.
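The arithmetic of the two packet examples above, restated as a short computation:

    # Symbol-size arithmetic for the two packet examples above (Python).
    internet_M = (1024 - 8) * 8  # payload minus 8 bytes auxiliary info, bits
    mpeg_M = (188 - 4) * 8       # MPEG payload minus 4 bytes auxiliary info
    print(internet_M, mpeg_M)    # 8128 and 1472 bits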
Each encoded symbol has a value. In one preferred embodiment considered below, each encoded symbol also has an associated identifier called its "key". Preferably, the key of each encoded symbol can be easily determined by the receiver, to allow the receiver to distinguish one encoded symbol from other encoded symbols. Preferably, the key of an encoded symbol is distinct from the keys of all other encoded symbols. Various forms of keying are discussed in the prior art. For example, Luby I describes various forms of keying that can be employed in embodiments of the present invention. In other preferred embodiments, such as the one described in Appendix A, the key of an encoded symbol is referred to as an "encoded symbol identifier", or more simply an "ESI".
Multi-level coding is particularly useful where there is an expectation of data erasure or where the receiver does not begin and end reception exactly when a transmission begins and ends. The latter condition is referred to herein as "data incompleteness". Regarding erasure events, multi-level coding shares many of the benefits of chain reaction coding taught in Luby I. In particular, multi-level encoded symbols are information additive, so any suitable number of packets can be used to recover an input file to a desired degree of accuracy. These conditions do not adversely affect the communication process when multi-level coding is used, because the encoded symbols generated with multi-level coding are information additive. For example, if a hundred packets are lost due to a burst of noise causing data erasure, an extra hundred packets can be picked up after the burst to replace the loss of the erased packets. If thousands of packets are lost because a receiver did not tune into a transmitter when it began transmitting, the receiver can just pick up those thousands of packets from any other period of transmission, or even from another transmitter. With multi-level coding, a receiver is not constrained to pick up any particular set of packets, so it can receive some packets from one transmitter, switch to another transmitter, lose some packets, miss the beginning or end of a given transmission, and still recover an input file. The ability to join and leave a transmission without receiver-transmitter coordination helps to simplify the communication process.
In some embodiments, transmitting a file using multi-level encoding may include generating, forming, or extracting source symbols from an input file, computing redundant symbols, encoding the source symbols and the redundant symbols into one or more encoded symbols, where each encoded symbol is generated independently of all other encoded symbols based on its key, and transmitting the encoded symbols over a channel to one or more recipients. Additionally, in some embodiments, receiving (and reconstructing) a copy of the input file using multi-level encoding may include receiving some set or subset of encoded symbols from one or more data streams and decoding the source symbols from the values and keys of the received encoded symbols.
Systematic codes and non-systematic codes
Systematic codes are codes in which the source symbols are among the encoded symbols that can be transmitted. In this case, the encoded symbols include source symbols and redundant symbols, also referred to as repair symbols, generated from the source symbols. Systematic codes are preferred over non-systematic codes for a number of applications for a variety of reasons. For example, in a file delivery application, it is useful to be able to start transmitting data in sequential order while the data is being used to generate repair data, where the process of generating repair data may take a certain amount of time. As another example, many applications preferably send original source data to one channel and repair data to another channel in sequential order in unmodified form. One typical reason for this is to both support legacy receivers that do not incorporate FEC decoding, while at the same time providing a better experience to enhanced receivers that incorporate FEC decoding, where legacy receivers only join the source data channel, while enhanced receivers join both the source data channel and the repair data channel.
In these and related types of applications, it is sometimes the case that the loss pattern and the fraction of loss among the source symbols received by a receiver are quite different from the loss pattern and the fraction of loss experienced among the received repair symbols. For example, when the source symbols are sent before the repair symbols, the fraction and pattern of loss among the source symbols may be quite different from the corresponding fraction and pattern of loss among the repair symbols, due to bursty loss conditions on the channel, and the loss pattern among the source symbols can be far from what would be a typical loss pattern if the losses were uniformly random. As another example, when the source data is sent on one channel and the repair data on another channel, the loss conditions on the two channels can be quite different. Therefore, it is desirable to have systematic FEC codes that work well under different types of loss conditions.
Although examples herein refer to systematic codes (where output or encoded symbols comprise source or input symbols) or non-systematic codes, the teachings herein should be considered applicable to both codes unless otherwise noted. Shokrollahi III teaches a method for converting non-systematic chain reaction code into systematic code such that the systematic code so constructed maintains the robustness properties of the non-systematic code.
Specifically, using the methods taught in Shokrollahi III, the systematic code so constructed has the property that there is little differentiation in the decoder's recovery capability between lost source symbols and lost repair symbols, i.e., for a given total amount of loss, the probability of decoding recovery is essentially the same regardless of the proportion of that loss falling among the source symbols compared to the proportion falling among the repair symbols. Furthermore, the pattern of loss among the encoded symbols does not significantly affect the probability of decoding recovery. By comparison, for other systematic code constructions, such as those described for Tornado codes or for IETF LDPC codes, there is in many cases a large differentiation in the decoder's recovery capability between lost source symbols and lost repair symbols, i.e., for a given total amount of loss, the probability of decoding recovery varies widely depending on the proportion of the loss among the source symbols compared to the proportion among the repair symbols. Furthermore, patterns of loss among the encoded symbols can have a strong effect on the probability of decoding recovery. Tornado codes and IETF LDPC codes have reasonably good recovery properties if the losses of encoded symbols are uniformly random among all of the encoded symbols, but the recovery properties deteriorate as the loss model deviates from uniformly random loss. Thus, in this sense, the embodiments taught in Shokrollahi III have advantages over other constructions of systematic codes.
For FEC codes whose recovery properties depend on the proportions of lost source symbols and lost repair symbols, and for which the pattern of loss has a strong effect on the decoder's recovery capability, one way to overcome these properties, where applicable, is to send the encoded symbols in a uniformly random order, i.e., to send the mixture of source symbols and repair symbols in a uniformly random order, so that the source symbols are randomly scattered among the repair symbols. Sending the encoded symbols in a random order has the advantage that, whatever the channel loss model, the loss of encoded symbols remains random, whether the loss is bursty, uniformly random, or of some other type. However, as noted above, this approach might not be desirable for some applications, such as applications in which it is desirable to send the source symbols in order and before the repair symbols, or in which the source symbols are sent over a different channel than the repair symbols.
In such cases, construction of systematic codes in which the loss pattern in the encoded symbols does not significantly affect the recovery properties of the decoder is desirable, and some examples are provided herein.
As used herein, "random" and "pseudo-random" are intended to be equivalent and/or interchangeable and may depend on the context. For example, random loss may refer to which symbols are lost by the channel, which may be truly random events, while random selection of symbol neighbors may actually be a repeatable pseudo-random selection according to a non-random process, but with the same or similar properties or behavior as in the case of truly random selection. Unless explicitly or otherwise indicated by context, characterizing something as random does not mean excluding pseudorandom.
In one approach to such a systematic FEC encoder, the source symbols are obtained by an encoder that comprises multiple encoder sub-blocks or sub-processes, one of which acts as a decoder that generates intermediate symbols that are the input symbols of another sub-block or sub-process. The intermediate symbols are then applied to another sub-block or sub-process that encodes the intermediate symbols into the encoded symbols, such that the encoded symbols include the source symbols (along with additional redundant symbols) generated by one consistent process. This provides robustness benefits and other benefits relative to an encoder that is systematic merely by virtue of using one process (e.g., copying) to obtain the source symbols among the set of encoded symbols and another process to obtain the redundant symbols of the set of encoded symbols.
The output encoding can be a chain reaction encoder, a static encoder, or other variations. Appendix A describes a systematic code embodiment. Upon reading this disclosure, one of ordinary skill in the art should be able to readily extend the teachings of Shokrollahi III to apply to systematic codes such as Tornado codes and IETF LDPC codes, to produce new versions of these codes that are also systematic codes but have better recovery properties. In particular, the new versions of these codes, obtained by applying the general methods described below, are enhanced to have the property that the fraction of loss among the source symbols, compared to the fraction of loss among the repair symbols, does not significantly affect the probability of decoding recovery, and furthermore that the pattern of loss does not significantly affect the probability of decoding recovery. Thus, these codes can be effectively used in the above-mentioned applications that require the use of systematic FEC codes whose recovery properties are not strongly affected by differing fractions of loss among the source and repair symbols or by different loss patterns.
The new coding method can be applied generally to the coding of systematic FEC codes, non-systematic FEC codes, fixed rate FEC codes and chain reaction FEC codes to produce an overall coding method for new enhanced systematic FEC codes. There are corresponding new decoding methods that can be applied.
Examples of decoders in encoders
An example of a decoder in an encoder will now be provided.
Let encoding method E be the encoding method used by an encoder (in the transmitter or elsewhere) for a fixed-rate (non-systematic or systematic) FEC code E that generates N encoded symbols from K source symbols, where N is at least K. Similarly, let decoding method E be the corresponding decoding method for FEC code E used by a decoder in the receiver or elsewhere.
Assume that FEC code E has the following properties: a random set of K of the N encoded symbols is sufficient to recover the original K source symbols with a reasonable probability, which may be, for example, probability 1/2, using decoding method E. The reasonable probability may be some requirement set by the use or application and may be a value other than 1/2. It should be understood that the construction of a particular code need not vary for a particular recovery probability, but applications and systems may be designed for a particular degree of robustness thereof. In some examples, the recovery probability may be improved by considering more than K symbols and then using a decoding process to determine a set of K symbols of the considered symbols that allow successful decoding.
Assume for FEC code E that an ESI (encoded symbol identifier) is associated with each encoded symbol and that the ESI identifies that encoded symbol. Without loss of generality, the ESIs are labeled herein 0, 1, 2, …, N-1.
In one embodiment of a systematic encoding method F for a systematic FEC code F generated using the method for FEC code E, K and N are input parameters. The source symbols of FEC code F would have ESI 0, …, K-1, and the repair symbols of FEC code F would have ESI K, …, N-1. Systematic encoding method F for FEC code F generates N encoded symbols from K source symbols C (0), …, C (K-1) using an encoding method E and a decoding method E for FEC code E performed by hardware and/or software as follows:
(1) randomly permuting the N ESIs associated with FEC code E to arrive at a permuted set of ESIs X(0), …, X(N-1) for FEC code E, where the permutation is organized such that the K source symbols of FEC code E can be decoded from the first K encoded symbols of FEC code E, in the permuted ESI order X(0), …, X(K-1),
(2) for each i = 0, …, N-1, associating ESI i of FEC code F with ESI X(i) of FEC code E,
(3) for each i = 0, …, K-1, setting the value of the encoded symbol of FEC code E with ESI X(i) to the value of source symbol C(i),
(4) applying decoding method E to the source symbols C(0), …, C(K-1), with corresponding FEC code E ESIs X(0), …, X(K-1), to generate the decoded symbols E(0), …, E(K-1),
(5) applying encoding method E to the decoded symbols E(0), …, E(K-1) to generate the FEC code E encoded symbols D(0), …, D(N-1), with associated FEC code E ESIs 0, …, N-1, and
(6) the encoded symbols of encoding method F with ESIs 0, 1, …, N-1 are D(X(0)), D(X(1)), …, D(X(N-1)).
Note that the output of encoding method F is N encoded symbols, the first K of which are the source symbols C(0), …, C(K-1) with associated ESIs 0, 1, …, K-1. Thus, encoding method F produces a systematic encoding of the source data.
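A non-normative sketch of steps (1) through (6) above; encode_E, decode_E, and permutation are assumed interfaces standing in for the FEC code E methods and the step (1) permutation, not APIs defined by this specification:

    # Systematic encoding method F (Python sketch of steps (1)-(6)).
    # encode_E(symbols) -> the N code E encoded symbols for K given symbols.
    # decode_E(values, esis) -> K code E symbols decoded from those values.
    # permutation: X(0..N-1), arranged so its first K ESIs are decodable.
    def encode_F(source, K, N, encode_E, decode_E, permutation):
        X = permutation
        # Steps (2)-(3): treat source symbol C(i) as the code E encoded
        # symbol with ESI X(i).
        # Step (4): run code E's *decoder* inside the encoder.
        E_syms = decode_E(source, [X[i] for i in range(K)])
        # Step (5): re-encode to get all N code E encoded symbols D(0..N-1).
        D = encode_E(E_syms)
        # Step (6): code F encoded symbol i is D(X(i)); the first K outputs
        # are the source symbols, so the code is systematic.
        return [D[X[i]] for i in range(N)]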
An embodiment of a decoding method F corresponding to the encoding method F just described is as follows, where K and N are input parameters used throughout the method. Decoding method F recovers the K source symbols C(0), …, C(K-1) from K received encoded symbols D(0), …, D(K-1) with associated FEC code F ESIs Y(0), …, Y(K-1). The received symbols need not be exactly the symbols that were sent. The method, performed by hardware and/or software, is as follows:
(1) randomly permuting the N ESIs associated with FEC code E to arrive at a permuted set of ESIs X(0), …, X(N-1) for FEC code E, where the permutation is organized such that the K source symbols of FEC code E can be decoded from the first K encoded symbols of FEC code E, in the permuted ESI order X(0), …, X(K-1),
(2) applying decoding method E to the encoded symbols D(0), …, D(K-1), with associated FEC code E ESIs X(Y(0)), …, X(Y(K-1)), to generate the decoded symbols E(0), …, E(K-1),
(3) using encoding method E, generating from E(0), …, E(K-1) the encoded symbols C(0), …, C(K-1) with FEC code E ESIs X(0), …, X(K-1), and
(4) the decoded source symbols of FEC code F with ESIs 0, …, K-1 are C(0), …, C(K-1).
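A matching non-normative sketch of decoding method F, using the same assumed interfaces as the encoding sketch above; received[j] carries the code F encoded symbol with ESI Y[j]:

    # Decoding method F (Python sketch of steps (1)-(4)).
    def decode_F(received, Y, K, encode_E, decode_E, permutation):
        X = permutation  # step (1): same permutation as the encoder
        # Step (2): decode code E from the received symbols under the
        # remapped ESIs X(Y(j)).
        E_syms = decode_E(received, [X[Y[j]] for j in range(K)])
        # Step (3): re-encode and read off the code E encoded symbols with
        # ESIs X(0), ..., X(K-1).
        D = encode_E(E_syms)
        # Step (4): these are the decoded source symbols C(0), ..., C(K-1).
        return [D[X[i]] for i in range(K)]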
The method and apparatus operating as just described have some desirable attributes. For example, consider FEC code E, which is a systematic code and has the property that it can decode a random set of K received encoded symbols with high probability, but also has the property that it cannot decode with high probability when K encoded symbols are received and the proportion of source symbols in the received encoded symbols is not close to K/N. In this case, the embodiment describes a new FEC code F using the encoding and decoding method of the FEC code E, and the new FEC code F has the following desirable properties: it decodes with high probability from a set of K received encoded symbols, independent of the proportion of received encoded symbols that are source symbols.
There are many variations of the above embodiments. For example, in step (1) of encoding method F, the permutation of the ESIs can be pseudo-random, or based on some other method that produces a good selection of the ESIs but is neither random nor pseudo-random. Where FEC code E is a systematic code, it is preferable that the fraction of the first K ESIs of the permutation in step (1) that are chosen from among the systematic ESIs be proportional to the code rate of FEC code E, i.e., proportional to K/N. The random selection of ESIs made by the new encoding method F in step (1) can preferably be represented by a compact amount of data, for example by a seed for a known or agreed-upon pseudo-random generator, together with an agreed-upon method for selecting the ESIs based on the seed and the operation of the pseudo-random generator, so that the new decoding method F can make exactly the same permuted ESI selections in step (1) based on the same seed, pseudo-random generator, and ESI-generation method. In general, it is preferable that the process used by the new encoding method F to generate the ESI sequence in step (1) and the process used by the new decoding method F to generate the ESI sequence in step (1) generate the same ESI sequence, to ensure that the new decoding method F is the inverse of the new encoding method F.
Other variations exist where, for example, no explicit ESI is used and the unique identifier of an encoded symbol is instead its position relative to the other encoded symbols, or is determined by other means.
In the above description, the original ESI of FEC code E is remapped by FEC code F, so that an ordered set of source symbols is assigned ESI 0, …, K-1 in sequential order, and repair symbols are assigned ESI K, …, N-1. Other variations are possible, e.g., remapping of ESI may occur at the sender just after encoding method F has generated the encoded symbols but before the encoded symbols are transmitted, and inverse remapping of ESI may occur at the receiver as the encoded symbols are received but before the encoded symbols are processed by decoding method F to recover the original source symbols.
As another variation, in step (1) of the new encoding method F, the permutation may be performed by first selecting K + A FEC code E ESIs, where A is a value chosen to ensure a high probability of decodability, then determining by simulation of the decoding process which K of the K + A ESIs are actually used during decoding, and selecting those K ESIs as the first K ESIs of the permutation. A similar variation applies to the new decoding method F.
As another variant of encoding method F, the seed used to generate the random permutation may be pre-computed for each value of K so as to ensure that the first K encoded symbols of the FEC code E, under the ESI permutation produced in step (1), are decodable; that seed is then always used for that K in step (1) of encoding method F and of the corresponding decoding method F. One method for selecting such a seed is to try randomly chosen seeds until one is found that ensures decodability in step (1), and then to fix that seed (a sketch follows this paragraph). Alternatively, a seed with this property may be generated dynamically by encoding method F and then communicated to decoding method F.
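A sketch of this seed pre-computation; decodable() stands in for a simulation of decoding method E from a candidate set of K ESIs and is an assumed oracle, not a routine from Appendix A:

    import random

    def find_decodable_seed(K, N, decodable, max_tries=100000):
        # Try random seeds until the first K permuted ESIs form a decodable
        # set; the winning seed can then be fixed for this K at both the
        # encoder and the decoder.
        for _ in range(max_tries):
            seed = random.randrange(2**32)
            esis = list(range(N))
            random.Random(seed).shuffle(esis)
            if decodable(esis[:K]):
                return seed
        raise RuntimeError("no suitable seed found for K=%d" % K)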
As another variation of the encoding method F, a partial permutation may be selected in step (1), i.e. all ESIs need not be generated in step (1) of the new encoding method F, and all encoded symbols need not be generated if all encoded symbols are not needed in steps (5) and (6), e.g. because these encoded symbols correspond to source symbols that are part of the encoded symbols or because less than N encoded symbols need to be generated. In other variations, it is not necessary to recalculate all of the encoded symbols in steps (3) and (4) of the new decoding method F, as some of the received encoded symbols may correspond to some of the source symbols being recovered. Similarly, in step (2) of the new decoding method F, it is not necessary to decode all K symbols E (0), …, E (K-1), for example if some of the symbols decoded in step (2) are not needed in the subsequent steps for generating the encoded symbols.
The methods and embodiments described above have many applications. For example, encoding method F and decoding method F and their variations can be applied to Tornado codes and IETF LDPC codes to provide improved reception overhead and decoding failure probability performance. In general, these new methods are applicable to any fixed rate FEC code. Variations of these new methods may also be applied to FEC codes without a fixed code rate, i.e., may be applied to FEC codes such as chain reaction codes where the number of encoded symbols that may be generated is independent of the number of source symbols.
Shokrollahi III contains similar teachings for creating systematic encoding and decoding methods for chain reaction codes. In some embodiments, the encoding and decoding methods E for these codes are those taught in Luby I, Luby II, Shokrollahi I, Shokrollahi II, Luby III, and Shokrollahi IV. For the purpose of describing the systematic encoder, it is often sufficient to describe the encoding method E and the decoding method E and to use the general principles described above and known from these references to transform these methods into the systematic encoding method F and the systematic decoding method F. Thus, upon reading this disclosure and the cited references, one of ordinary skill in the art would understand how to utilize the teachings describing encoding method E and decoding method E and apply them to systematic encoding method F and systematic decoding method F or the like.
Passivation
As taught in Shokrollahi II, passivation decoding is a general method that can be applied in conjunction with belief propagation whenever a set of unknown variables is to be solved from a set of known linear equation values, and it is particularly beneficial in implementing efficient encoding and decoding methods based on systems of linear equations. To distinguish the passivation decoding described in Shokrollahi II from the permanent passivation decoding described below, the methods and teachings of Shokrollahi II are referred to as "on-the-fly" passivation (abbreviated "OTF passivation" throughout), while the methods and teachings herein, in which the passivated symbols are selected ahead of time, are referred to as "permanent passivation".
One principle of belief propagation decoding is that, whenever possible during the decoding process, the decoder should solve for a remaining unknown variable using a (possibly simplified) equation that depends on only that variable, and then simplify the remaining unused equations by eliminating their dependence on the solved variable. Such simple belief propagation based decoding processes have been used, for example, in some embodiments of Tornado codes, chain reaction codes such as those described in Luby I, Luby II, Shokrollahi I, Shokrollahi II, Luby III, and Shokrollahi IV, and IETF LDPC codes.
OTF passivation decoding is performed in multiple stages. In the first stage of the OTF passivation decoding method, whenever the belief propagation decoding process cannot continue because no remaining equation depends on only one remaining unknown variable, the decoder "OTF passivates" one or more unknown variables and treats them as "solved" with respect to the belief propagation process and as "removed" from the remaining equations (even though they are not in fact solved or removed), thereby possibly allowing the belief propagation decoding process to continue. The variables OTF passivated during the first stage are then solved in a second stage, for example using Gaussian elimination or a computationally more efficient method, and then in a third stage the values of these OTF passivated variables are used to fully solve for the variables associated with equations during the first decoding stage.
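The following toy sketch illustrates the first stage structurally, i.e., which variables get peeled by belief propagation and which get OTF passivated. Equations are modeled as sets of variable indices over GF(2); symbol values and the later stages are elided, and the passivation heuristic is an assumption for illustration only:

    def peel_with_otf_passivation(equations, num_vars):
        # Stage 1: belief propagation (peeling), OTF-passivating one variable
        # whenever no remaining equation depends on exactly one unknown.
        eqs = [set(e) for e in equations]
        solved, passivated = set(), set()
        def reduced(e):
            return e - solved - passivated
        while len(solved) + len(passivated) < num_vars:
            hit = next((e for e in eqs if len(reduced(e)) == 1), None)
            if hit is not None:
                solved |= reduced(hit)          # peel the lone unknown
                continue
            remaining = [reduced(e) for e in eqs if reduced(e)]
            if not remaining:
                break                           # leftover vars are unconstrained
            # heuristic: passivate a variable from the densest remaining equation
            passivated.add(next(iter(max(remaining, key=len))))
        # Stage 2 would solve 'passivated' by Gaussian elimination; stage 3 then
        # back-substitutes those values through the peeling schedule.
        return solved, passivated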
OTF passivation decoding, as taught in more detail in Shokrollahi II, is applicable to many other types of codes in addition to chain reaction codes. For example, it can be applied to the general class of LDPC and LDGM codes, and in particular to IETF LDPC codes and Tornado codes, resulting in improved reliability (reduced probability of decoding failure) and/or CPU and/or memory performance (increased encoding and/or decoding speed, reduced required memory size, and/or relaxed memory access pattern requirements) for these types of codes.
Chain reaction code embodiments are described in Shokrollahi IV in conjunction with some variations of OTF passivation decoding. Other variations are described in the present application.
Overview of the System
Fig. 1 is a block diagram of a communication system 100 using multi-level coding. It is similar to that shown in Shokrollahi I, but in this case, encoder 115 takes into account the designation of which intermediate symbols are "permanently inactivated" and operates differently on these intermediate symbols during the dynamic encoding process than intermediate symbols that are not permanently inactivated. Likewise, decoder 155 also accounts for intermediate symbols that are permanently passivated when decoding.
As illustrated in fig. 1, K source symbols (C(0), …, C(K-1)) are input to encoder 115, and decoder 155 may output copies of these K source symbols if decoding from the symbols that become available to decoder 155 is successful. In some embodiments, a stream is parsed into blocks of K symbols, and in some embodiments a file comprising more than K source symbols is divided into blocks of K symbols each and transmitted as such. In some embodiments where a block size K' > K is preferred, K' - K padding symbols may be added to the K source symbols. These padding symbols may have value 0, or any other fixed value known to both encoder 115 and decoder 155 (or a value otherwise determinable at decoder 155). It is to be understood that encoder 115 may comprise multiple encoders, modules, or the like, and likewise for decoder 155.
As illustrated, the encoder 115 also receives a dynamic key sequence from dynamic key generator 120 and a static key sequence from static key generator 130, each of which may be driven by a random number generator 135. The output of dynamic key generator 120 may simply be a sequence of cardinal numbers, but this need not be the case. The operation of the key generators may be as shown in Shokrollahi I.
It should be understood that the various functional blocks shown in the figures may be implemented as hardware specifying that the inputs be provided as input signals, or they may be implemented by a processor executing instructions stored in an instruction memory and executed in an appropriate order to perform the corresponding functions. In some cases, specialized hardware is used to perform these functions and/or execute program code. Program code and processors are not always shown, but one of ordinary skill in the art will know how to implement such details upon reading this disclosure.
Encoder 115 also receives input from a passivation designator 125, as well as other parameters input to system 100 along the lines described elsewhere herein. The output of passivation designator 125 may include a value P representing the number of intermediate symbols designated as "permanently passivated" for decoding purposes, together with a "PI list" indicating which P intermediate symbols those are. As explained elsewhere, in some embodiments the intermediate symbols used for the encoding process are exactly the K source symbols, while in other embodiments some kind of processing, conversion, encoding, or decoding generates the intermediate symbols from the K source symbols, beyond merely copying them.
The input parameters may include a random seed (described in more detail below) used by the key generator and/or the encoder's encoding process, the number of encoded symbols to be generated, the number of LDPC symbols to be generated, the number of HDPC symbols to be generated, the number of intermediate symbols to be generated, the number of redundant symbols to be generated, etc., and/or some of these values are calculated from other values available to the encoder 115. For example, the number of LDPC symbols to be generated may be calculated entirely from a fixed formula and the value of K.
Encoder 115 generates, from its input, a sequence of encoded symbols (B(I_0), B(I_1), B(I_2), …) and provides them to a transmit module 140, which also receives the dynamic key values (I_0, I_1, I_2, …) from dynamic key generator 120; this may not be necessary if there are other ways of conveying that information. Transmit module 140 conveys what it is given to channel 145 in a conventional manner that need not be described in greater detail herein. Receive module 150 receives the encoded symbols and the dynamic key values (where needed). Channel 145 may be a channel through space (for transmission from one place to be received at another place) or a channel through time (e.g., for recording to media for playback at a later time). Channel 145 may cause the loss of some encoded symbols. Thus, the encoded symbols B(I_a), B(I_b), … that decoder 155 receives from receive module 150 may not be identical to the encoded symbols sent by the transmit module; this is indicated by the different subscript indices.
Decoder 155 is preferably capable of regenerating keys for received symbols (which may be different) using a dynamic key regenerator 160, a random number generator 163, and a static key generator 165, and receiving as input various decoding parameters. Some of these inputs may be hard-coded (i.e., input during construction of the device) and some may be changeable inputs.
Fig. 2 is a table of variables, arrays, and the like, along with a summary of the notation used in the other figures and most often throughout this disclosure. Unless otherwise indicated, K denotes the number of source symbols input to the encoder, R denotes the number of redundant symbols generated by the static encoder, and L is the number of "intermediate symbols", i.e., the combination of source and redundant symbols, so that L = K + R.
As explained below, in some embodiments of the static encoder, two types of redundant symbols are generated. In the particular embodiments used in many of the examples herein, the first set comprises LDPC symbols and the second set comprises HDPC symbols. Without loss of generality, many examples herein denote the number of LDPC symbols by S and the number of HDPC symbols by H. There may be more than two types of redundant symbols, in which case R = S + H is not required. LDPC symbols and HDPC symbols have different degree distributions, and one of ordinary skill in the art upon reading this disclosure will see how to use redundant symbols that are neither LDPC nor HDPC symbols, but where the redundant symbols comprise two (or more) sets of symbols, each set having a degree distribution different from those of the other sets. As is well known, the degree distribution of a set of redundant symbols is the distribution of their degrees, where the degree of a redundant symbol refers to the number of source symbols on which that redundant symbol depends.
P denotes the number of permanently passivated symbols among the intermediate symbols. Permanently passivated symbols are those designated for particular treatment, namely to be "set aside" or "passivated" so that belief propagation can continue in the belief propagation network (they are solved for later, and their values are then substituted back); a permanently passivated symbol differs from other passivated symbols in that it is designated for this treatment at the encoder.
N denotes the number of received symbols from which decoder 155 attempts to decode, and A is the number of "overhead" symbols, i.e., the number of received encoded symbols beyond K. Thus, A = N - K.
K, R, S, H, P, N, and A are integers, typically all greater than or equal to 1, although in particular embodiments some of them may be 1 or 0 (e.g., R = 0 is the case where there are no redundant symbols, and P = 0 falls back to Shokrollahi II, where there is only OTF passivation).
The source symbol vector is denoted (C(0), …, C(K-1)), and the redundant symbol vector is denoted (C(K), …, C(L-1)). Thus, in the systematic case, (C(0), …, C(L-1)) denotes the intermediate symbol vector. P of these intermediate symbols are designated "permanently inactive". The "PI list" indicates which of the intermediate symbols are the permanently passivated ones. In many embodiments, the PI list simply comprises the last P intermediate symbols, namely C(L-P), …, C(L-1), but this is not a requirement; that scenario is assumed here merely to simplify the remainder of this description.
The intermediate symbols that are not on the PI list are referred to herein as "LT intermediate symbols". In this example, the LT intermediate symbols would be C (0), …, C (L-P-1). D (0), …, D (N-1) represent the received encoded symbols.
It should be noted that where the array of values is described as "N (0), …, N (x)" or the like, this should not be assumed to require at least 3 values, as it is not intended to exclude the case of only one or two values.
Encoding method using permanent passivation
Fig. 3 is a block diagram of one particular embodiment of the encoder 115 shown in fig. 1. As illustrated therein, source symbols are stored in the input buffer 205 and provided to the static encoder 210 and the dynamic encoder 220, the static encoder 210 and the dynamic encoder 220 also receiving key inputs and other inputs. The static encoder 210 may include internal storage 215 (memory, buffers, virtual memory, register storage, etc.) for storing internal values and program instructions. Likewise, the dynamic encoder 220 may include internal storage 225 (memory, buffers, virtual memory, register storage, etc.) for storing internal values and program instructions.
In some embodiments, the redundancy calculator 230 determines the number R of redundant symbols to create. In some embodiments, the static encoder 210 generates two distinct sets of redundant symbols; in a particular embodiment, the first set comprises the first S redundant symbols C(K), …, C(K+S-1), which are LDPC symbols, while the second set comprises the next H redundant symbols C(L-H), …, C(L-1), which are HDPC symbols. If the PI list comprises the last P intermediate symbols, then either all H HDPC symbols are on the PI list (if P ≥ H), or all symbols on the PI list are HDPC symbols (if P < H).
The operations that result in the generation of these two symbol sets may be quite different. For example, in some embodiments described below, the operation used to generate LDPC redundant symbols is a binary operation, while the operation used to generate HDPC symbols is non-binary.
The operation of the dynamic encoder 220 is explained in more detail in fig. 4. According to one embodiment, the dynamic encoder 220 includes two encoders, a PI encoder 240 and an LT encoder 250. In some embodiments, LT encoder 250 is a chain reaction encoder and PI encoder 240 is a particular type of chain reaction encoder. In other embodiments, the two encoders may be very similar, or the PI encoder 240 is not a chain reaction encoder. Regardless of how these encoders are defined, they generate symbols, with LT encoder 250 generating its symbols from LT intermediate symbols C (0), …, C (L-P-1) designated as being non-permanently inactive, and PI encoder 240 generating its symbols from permanently inactive intermediate symbols C (L-P), …, C (L-1). The two generated symbols enter a combiner 260, and the combiner 260 generates a final encoded symbol 270.
In some embodiments of the present invention, some of the permanently inactive symbols may participate in the LT encoding process, and some symbols that are not permanently inactive symbols may participate in the PI encoding process. In other words, the PI list and the set of symbols comprising LT intermediate symbols need not be non-intersecting.
In a preferred embodiment, the symbols provided to combiner 260 may be of the same length, and the function performed by combiner 260 is an exclusive-or (XOR) operation on these symbols to generate encoded symbol 270. However, this is not essential to the operation of the invention. Other types of combiners that can lead to similar results are envisioned.
In other embodiments, the intermediate symbols are subdivided into more than two sets, e.g., one set of LT symbols and several sets (more than one) of PI symbols, each set of PI symbols having its associated encoder 240. Of course, each associated encoder may be implemented as a common computing element or hardware element that operates on different instructions according to an encoding process while acting as different encoders of different sets.
An example of a PI encoding process 241, as may be performed by PI encoder 240, is illustrated in fig. 5. Using the key I_a corresponding to the encoded symbol to be generated, the encoder determines, in step 261, a positive weight WP and a list ALP containing WP integers between L-P and L-1 (inclusive). In step 263, if the list ALP is (t(0), …, t(WP-1)), the value of symbol X is set to X = C(t(0)) ⊕ … ⊕ C(t(WP-1)), where ⊕ denotes the exclusive-or operation.
In some embodiments, the weight WP is fixed at a certain number, such as 3 or 4 or some other fixed number. In other embodiments, the weight WP may be drawn from a small set of possible numbers, for example chosen to equal 2 or 3. For example, as in the embodiment of Appendix A, the weight WP may depend on the weight of the symbol generated by LT encoding process 251 as performed by LT encoder 250: if the weight generated by LT encoder 250 is 2, WP is chosen to be 2 or 3 depending on the key I_a, with the proportions of WP = 2 and WP = 3 approximately equal; if the weight generated by LT encoder 250 is greater than 3, WP is chosen to be 2.
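A minimal sketch of steps 261 and 263, with symbols modeled as equal-length byte strings; the derivation of WP and ALP from the key I_a is abstracted away, so ALP is taken as a given input here:

    from functools import reduce

    def pi_encode_symbol(C, L, P, ALP):
        # Step 263: X = C(t(0)) XOR ... XOR C(t(WP-1)) over the WP permanently
        # passivated intermediate symbols listed in ALP (indices L-P .. L-1).
        assert all(L - P <= t <= L - 1 for t in ALP)
        xor = lambda a, b: bytes(x ^ y for x, y in zip(a, b))
        return reduce(xor, (C[t] for t in ALP))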
Fig. 6 is an example of an LT encoding process 251 using the teachings of Luby I and Shokrollahi I, according to one embodiment of the present invention. At step 267, the key I_a is used to generate a weight WL and a list AL. In step 269, if the list AL is (j(0), …, j(WL-1)), the value of symbol X is set to X = C(j(0)) ⊕ C(j(1)) ⊕ … ⊕ C(j(WL-1)).
Fig. 7 illustrates an operation for calculating the weight WL. As shown therein, at step 272, a number v associated with the encoded symbol to be generated is created, and may be calculated based on the key I_a for that encoded symbol. It may be the index of the encoded symbol, a representative label, or some other number, so long as the encoder and decoder can agree on it. In this example, v is between 0 and 2^20; in other examples, other ranges are possible (such as 0 to 2^32). The generation of v may be done explicitly using tables of random numbers, and the exact manner in which these numbers are generated may vary.
Assume that the encoder has access to a table M; an example of table M is provided in fig. 8. The table M, referred to as a "degree distribution lookup" table, contains two columns and multiple rows. The left column lists the possible values of the weight WL, and the right column lists integers between 0 and 2^20 (inclusive). For any value of v, there is exactly one cell M[d] in the right column of the degree distribution lookup table such that M[d-1] < v ≤ M[d]; the corresponding value d in the left column is used by the encoder as the weight WL for the encoded symbol. For example, where an encoded symbol has v = 900,000, the weight of that encoded symbol would be WL = 7.
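A sketch of this lookup, using a hypothetical table chosen only so that v = 900,000 maps to WL = 7 as in the example above; the actual table of Appendix A has different entries:

    import bisect

    # Hypothetical degree distribution lookup table: weights[i] is the left
    # column, thresholds[i] is the right column M[.]; thresholds end at 2**20.
    weights    = [1,    2,      3,      4,      7,      10]
    thresholds = [5243, 529531, 704294, 791675, 955398, 1048576]

    def lookup_weight(v):
        # Return the WL such that M[d-1] < v <= M[d].
        return weights[bisect.bisect_left(thresholds, v)]

    assert lookup_weight(900000) == 7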
The static encoder 210 has access to elements SE(k, j), where k = 0, …, R-1 and j = 0, …, L-1. These elements may belong to any finite field for which there is an operation * between a field element α and a symbol X such that α*X is a symbol, and α*(X ⊕ Y) = α*X ⊕ α*Y, where ⊕ denotes the exclusive-or operation. Such fields and operations are detailed in Shokrollahi IV. The operation of the static encoder 210 may be described as follows: for a given sequence of source symbols C(0), …, C(K-1), compute a sequence of redundant symbols C(K), …, C(L-1) satisfying the relationship shown in Equation 1, where Z(0), …, Z(R-1) are values known to the encoder and decoder (e.g., 0).
SE · (C(0), …, C(L-1))^T = (Z(0), …, Z(R-1))^T    (Equation 1)
In Equation 1, the entries SE(k, j) may all be binary, or some of the entries may belong to the field GF(2) while others belong to other fields. For example, the matrix corresponding to the embodiment of Appendix A is given in fig. 9. It comprises two sub-matrices: one with S rows and one with H rows. The upper sub-matrix comprises two parts: the sub-matrix formed by the last P columns, in which each row has two consecutive 1s (with positions counted modulo P), and the first W = L - P columns, which consist of circulant matrices followed by an S×S identity matrix. The circulant matrices have S rows each and together comprise B = W - S columns; each has S columns, except possibly the last, which may have fewer. The number of these circulant matrices is ceil(B/S). Each column of these circulant matrices has exactly three 1s: the first column of the kth circulant matrix has 1s at positions 0, (k+1) mod S, and 2(k+1) mod S, and the other columns are cyclic shifts of the first. The lower H rows of fig. 9 comprise a matrix Q with entries in GF(256), followed by an H×H identity matrix.
If α denotes the element of GF(256) with minimal polynomial x^8 + x^4 + x^3 + x^2 + 1, then the matrix Q is equivalent to the matrix given in fig. 10. Here, Δ_1, …, Δ_{K+S-1} are columns of weight 2 whose two non-zero entry positions are determined pseudo-randomly according to the procedure outlined in section 5.3.3.3 of Appendix A. With a judicious choice of the values S, P, and H (such as the values provided in Appendix A), the matrix of fig. 10 yields superior recovery properties for the corresponding code. The procedure described above is illustrated in fig. 11. At step 276, the matrix SE is initialized to 0. At step 278, an input variable S equal to the number of LDPC symbols is provided to the process, and SE(i, j) is set to 1 for the pairs (i, j) with i = j mod S, i = (1 + floor(j/S) + j) mod S, or i = (2·(1 + floor(j/S)) + j) mod S. This step accounts for the circulant matrices of fig. 9.
In step 280, the positions corresponding to the identity matrix I_S of fig. 9 are set to 1. At step 282, the positions corresponding to the PI portion of the matrix in fig. 9 are set to 1; these positions are of the form (i, l) and (i, t), where l = i mod P and t = (i+1) mod P. At step 284, the positions corresponding to the matrix Q in fig. 9 are set; accordingly, the matrix Q is provided as an additional input to this step. At step 286, the positions corresponding to the identity matrix I_H of fig. 9 are set to 1.
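A sketch of steps 276 through 282 for the binary part of SE (the circulant LDPC columns, the S×S identity I_S, and the two-1s-per-row PI part); the GF(256) matrix Q and the identity I_H of steps 284 and 286 are omitted, and B = W - S is assumed per the description of fig. 9:

    def build_se_binary_part(S, H, P, W):
        # SE has R = S + H rows and L = W + P columns; only the first S rows
        # (the LDPC part) are filled in by this sketch.
        L, R, B = W + P, S + H, W - S
        SE = [[0] * L for _ in range(R)]
        for j in range(B):                                   # circulant columns
            k = j // S                                       # kth circulant
            for i in (j % S, (k + 1 + j) % S, (2 * (k + 1) + j) % S):
                SE[i][j] = 1
        for i in range(S):
            SE[i][B + i] = 1                                 # SxS identity I_S
            SE[i][W + i % P] = 1                             # PI part: two
            SE[i][W + (i + 1) % P] = 1                       # consecutive 1s
        return SE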
Other choices for the matrix SE are possible and depend on the requirements of the particular application and the overall code requirements. Regardless of how the matrix in equation 1 is selected, the tasks of the static encoder 210 may be accomplished in a variety of ways. For example, Gaussian elimination may be used as a process to recover the unknown values C (K), …, C (L-1), as will be apparent to one of ordinary skill in the art upon reading this disclosure.
Decoding and permanent passivation
The decoding problem can be stated as follows: decoder 155 has N encoded symbols B(I_a), B(I_b), … with corresponding keys I_a, I_b, …. The entire set of these encoded symbols, or a subset thereof, may have been received by the decoder, while the other encoded symbols may have been given to the decoder by other means. The goal of the decoder is to recover the source symbols C(0), …, C(K-1). For simplicity of presentation, the received encoded symbols are denoted D(0), …, D(N-1).
Many decoding operations can be described succinctly using the language of matrices and operations on them, in particular the solving of systems of equations with such matrices. In the following description, equations may correspond to received encoded symbols, and variables may correspond to the source symbols, or to a combined set of source and redundant symbols (often called intermediate symbols) generated from the source symbols, to be solved for based on the received encoded symbols. In the specification provided as Appendix A, encoded symbols may be referred to as "encoding symbols" (and other variations exist), but it will be apparent how these usages correspond after reading the entire specification and appendix. It should also be understood that the matrices and the operations and solutions of equations can be implemented as computer instructions corresponding to these mathematical operations, and indeed it is not practical to perform such operations without a computer, processor, hardware, or some electronic element.
Prior to initiating the first stage of the decoding process, permanent passivation is used at the decoder to determine a set of variables to be passivated, referred to as the permanently passivated symbols or variables. The permanent passivation decoding method described below may be applied to existing codes, or codes may be designed specifically to work even better with permanent passivation decoding. Permanent passivation decoding can be applied to solve any system of linear equations, and in particular to chain reaction codes, IETF LDPC codes, and Tornado codes.
Permanent passivation decoding is a general method that can be applied in conjunction with belief propagation decoding and/or OTF passivation decoding whenever a set of unknown variables is to be solved from a set of known linear equation values, and it is particularly beneficial in implementing efficient encoding and decoding methods based on systems of linear equations. In the first stage, based on the structure of the known encoding method or on the received equations, a set of unknown variables is declared to be permanently passivated; the permanently passivated variables are removed from the linear equations and treated as "solved" in the second stage of the decoding process (except that the equations are still simplified with respect to the permanently passivated variables as the second-stage linear equations are reduced).
In a second stage, the previously described belief propagation decoding is applied to the unknown variables that are not permanently passivated, or OTF passivation decoding similar to that described for the first stage of the OTF passivation decoding method is applied to those variables, thereby producing a simplified set of encoded symbols or equations. The simplified encoded symbols or equations resulting from the second stage have the following property: their dependence on variables or symbols that are not passivated has been eliminated, so the simplified encoded symbols or equations depend only on passivated variables or symbols. Note that in some implementations the original encoded symbols or equations may also be retained, so that both the original and the simplified encoded symbols are available.
In the third stage, the passivated variables (the permanently passivated variables together with any additional OTF passivated variables generated by OTF passivation decoding in the second stage) are solved using the simplified encoded symbols or equations, e.g., using Gaussian elimination or, where special structure exists in the relationship between the permanently passivated variables and the linear equations, a method more efficient than Gaussian elimination.
In the fourth stage, the solved values of the passivated variables (OTF passivated or permanently passivated) are used in conjunction with the original encoded symbols or equations (or re-derived original encoded symbols or equations) to solve for the variables that were not passivated.
One advantage of the permanent passivation decoding method is that the number of OTF passivations, beyond the permanent passivations, is generally small or even 0, and can be largely independent of which encoded symbols are received. This can make the decoding complexity consistently small independent of which encoded symbols are received, allow more reliable decoding, and produce fewer and more predictable memory accesses that can be scheduled more efficiently. Because there are only a small number of OTF passivations in the second stage, and because it is these OTF passivations (determined only during the decoding process itself) that make the pattern of symbol operations somewhat unpredictable, the memory access pattern is more predictable during decoding, allowing an overall more predictable and efficient decoding process.
There are many variations of the above. For example, the stages may be performed in a non-sequential interleaved order. As another example, the passivated symbols may in turn be solved in the third stage using OTF passivation decoding or permanent passivation decoding in a number of additional stages. As another example, permanent deactivation decoding may be applied to linear equations and variables that may be used for error correction codes, or erasure correction codes, or for other applications that may be solved using linear equations. As another example, the methods may be applied to both systematic and non-systematic codes. As another example, these methods may also be applied during the encoding process, for example when generating systematic code from non-systematic code using the methods taught in Shokrollahi III.
In some cases, it is possible to design the encoding process such that a permanent passivation decoding method would be particularly effective. For example, belief propagation decoding is known to be computationally efficient whenever it can be applied, but it is also known that it cannot provide high reliability decoding when used alone. When using belief propagation decoding within OTF passivation decoding, the belief propagation steps can be handled very efficiently, but OTF passivation steps interspersed within belief propagation steps can slow down the decoding, and the more such OTF passivation steps, the slower the decoding process.
In a typical embodiment of OTF passivation decoding, the number of OTF passivation steps is largest when attempting to solve for K + R unknown variables using N + R linear equation values with N = K, i.e., when attempting to solve for the variables at 0 overhead. On the other hand, as N grows beyond K, the complexity of OTF passivation decoding typically decreases due to fewer OTF passivation steps, until N is large enough that in some cases there are no OTF passivation steps at all and passivation decoding is as computationally efficient, or nearly as efficient, as belief propagation decoding. In other embodiments of OTF passivation decoding, the number of OTF passivations may remain large even when N is considerably larger than K.
In a preferred embodiment of the permanent passivation decoding, the number of permanent passivation variables P and the structure of the linear equations are designed such that when solving the L-P variables that are not permanently passivated from the K + R linear equation values using OTF passivation decoding, the number of OTF passivation steps during OTF passivation decoding is small and in some cases 0, so the OTF passivation decoding steps are almost as computationally efficient as belief propagation.
In a preferred embodiment, the structure of the linear equations is designed such that the OTF passivation decoding stage is almost as efficient as belief propagation decoding. In such preferred embodiments, the relationship of the permanent passivation variables to the linear equations enables the phase of solving for the passivation variables (including the permanent passivation variables along with any OTF passivation variables from the OTF passivation decoding phase) to be performed efficiently. Furthermore, in a preferred embodiment, the structure of the permanently passivated symbols is such that the phase of solving for the unpassivated variables from the solved passivated symbols is computationally efficient to accomplish.
Chain reaction code decoding with permanent deactivation
Fig. 12 illustrates a matrix representation of a set of variables to be solved by a decoder using N received encoded symbols or equations and R known static symbols or equations. The task of the decoder is to solve the system of linear equations given in the figure. Typically, the symbols/equations are represented by values stored in a memory or storage accessible to the decoder, and the matrix operations described below are implemented by instructions executable by the decoder.
The matrix shown in fig. 12 has L = K + R columns and N + R rows. The LT sub-matrix represents the relationship between the N encoded symbols and the L - P LT symbols among the L intermediate symbols, as determined by LT encoding process 251. The PI sub-matrix represents the relationship between the N encoded symbols and the P PI symbols among the L intermediate symbols, as determined by PI encoding process 241. The matrix SE of Equation 1 represents the relationships among the intermediate symbols determined by static encoder 210. The decoder can determine these relationships from the keys of the received encoded symbols and from the code construction.
The system of linear equations of fig. 12 is solved by row/column permutations of the above matrix, using the OTF passivation methods taught in Shokrollahi II, to transform it into the form shown in fig. 13. That form includes a lower triangular matrix LO 310, a number of columns comprising a matrix 320 corresponding to the OTF passivations (referred to as OTFI), a matrix 330 PI corresponding to the set of permanently passivated intermediate symbols or a subset thereof, and a matrix 340 EL corresponding to the encoded or static symbols not used in the triangularization process that produced matrix LO.
Fig. 14 is a block diagram depicting elements that may perform the process of obtaining the matrix of fig. 12. It includes an LT matrix generator 347, a PI matrix generator 349, and a static matrix generator 350. Upon receipt of the keys I_a, I_b, …, LT matrix generator 347 creates the matrix LT of fig. 12, and PI matrix generator 349 creates the matrix PI of fig. 12. The concatenation of these two matrices is forwarded to static matrix generator 350, which may use the static keys S_0, S_1, … as additional input. The task of the static matrix generator is to create the matrix SE, and its output is the complete matrix given in fig. 12.
The operations of LT matrix generator 347 and PI matrix generator 349 are closely related to the operations of LT encoder 250 and PI encoder 240 in fig. 15, respectively. The operation of the static matrix generator 350 is to recreate the matrix SE of equation 1 for static encoding.
LT matrix generator 347, PI matrix generator 349, and static matrix generator 350 will now be described in further detail with reference to operations they may perform.
Fig. 16 is a flow chart illustrating one embodiment 500 of a method employed by LT matrix generator 347. In step 505, LT matrix generator 347 initializes the N × (L-P) matrix LT to all zeros. Next, at step 510, the keys I_a, I_b, … are used to generate the weights WL(0), …, WL(N-1) and the lists AL(0), …, AL(N-1), respectively. Each list AL(i) comprises WL(i) integers (j(0), …, j(WL(i)-1)) in the range 0, …, L-P-1. In step 515, these integers are used to set the entries LT(i, j(0)), …, LT(i, j(WL(i)-1)) to 1. As explained above, the matrix LT contributes to the system of equations that the received symbols (D(0), …, D(N-1)) impose on the unknowns (C(0), …, C(L-1)).
As can be appreciated by those skilled in the art, the operation of the LT matrix generator as described herein is similar to the operation of the LT encoding process 251 of fig. 6.
Fig. 17 is a flow chart illustrating one embodiment 600 of a method employed by PI matrix generator 349. At step 610, PI matrix generator 349 initializes the N × P matrix PI to all zeros. Next, at step 615, the keys I_a, I_b, … are used to generate the weights WP(0), …, WP(N-1) and the lists ALP(0), …, ALP(N-1), respectively. Each list ALP(i) comprises WP(i) integers (j(0), …, j(WP(i)-1)) in the range 0, …, P-1. In step 620, these integers are used to set the entries PI(i, j(0)), …, PI(i, j(WP(i)-1)) to 1. The operation of the PI matrix generator is similar to the operation of PI encoding process 241 of fig. 5.
As explained above, the matrices LT and PI define the system of equations that the received symbols (D(0), …, D(N-1)) impose on the unknowns (C(0), …, C(L-1)). The reason is as follows: once the LT encoder chooses the weight WL(i) and the associated list AL(i) = (j(0), …, j(WL(i)-1)), and the PI encoder chooses the weight WP(i) and the associated list ALP(i) = (t(0), …, t(WP(i)-1)), the corresponding encoded symbol D(i) is obtained as shown below. These equations, accumulated over all values of i between 0 and N-1, yield the desired system of equations expressed in Equation 2.
D(i) = C(j(0)) ⊕ … ⊕ C(j(WL(i)-1)) ⊕ C(t(0)) ⊕ … ⊕ C(t(WP(i)-1))    (Equation 2)
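A sketch tying Equation 2 to the matrix rows: given the lists AL(i) and ALP(i) for encoded symbol i, one row of the concatenated [LT | PI] matrix is built and D(i) is computed. Symbols are modeled as byte strings, and ALP entries are taken in the range 0, …, P-1 as in fig. 17, so they index columns L-P, …, L-1:

    from functools import reduce

    def lt_pi_row(L, P, AL_i, ALP_i):
        # Ones at the LT neighbors (columns 0..L-P-1) and the PI neighbors
        # (columns L-P..L-1) of encoded symbol i.
        row = [0] * L
        for j in AL_i:
            row[j] = 1
        for t in ALP_i:
            row[L - P + t] = 1
        return row

    def encoded_symbol(C, AL_i, ALP_i, L, P):
        # Equation 2: D(i) = XOR of C(j(.)) over AL(i) and of C(L-P+t(.)) over ALP(i).
        xor = lambda a, b: bytes(x ^ y for x, y in zip(a, b))
        return reduce(xor, [C[j] for j in AL_i] + [C[L - P + t] for t in ALP_i])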
The weight WL may be calculated using a procedure similar to the one given in fig. 7. One of ordinary skill in the art, upon reviewing this disclosure, will see how to extend this to situations where there are more than two encoders, each operating with a different degree distribution.
A slightly different flow diagram of a matrix generator is provided in fig. 18. It includes an LT matrix generator 710, a static matrix generator 715, and a PI matrix generator 720. Upon receipt of the keys I_a, I_b, …, LT matrix generator 710 creates the matrix LT illustrated in fig. 15, while static matrix generator 715 creates the matrix SE illustrated in fig. 15, taking the additional static keys S_0, S_1, … as its other input. The concatenation of these two matrices is forwarded to PI matrix generator 720, which creates the matrix PI. The operation of LT matrix generator 710 may be identical to the operation of LT matrix generator 347 detailed in fig. 16. The operation of static matrix generator 715 may differ from the operation of static matrix generator 350 of fig. 14; in particular, fig. 19 details an exemplary embodiment of such an operation.
At step 725, the matrix SE is initialized to 0. At step 730, an input variable S equal to the number of LDPC symbols is provided to the process, and SE(i, j) is set to 1 for the pairs (i, j) with i = j mod S, i = (1 + floor(j/S) + j) mod S, or i = (2·(1 + floor(j/S)) + j) mod S. In step 735, the positions corresponding to the identity matrix I_S of fig. 9 are set to 1. At step 740, the positions corresponding to a matrix T are set; the matrix T is provided as an additional input to this step. The matrix T may have entries in multiple finite fields and may differ across applications; it may be chosen based on the requirements of the particular application and the overall code requirements.
Fig. 20 is a simplified flow diagram illustrating one embodiment of a method employed by PI matrix generator 720. At step 745, PI matrix generator 720 initializes the (N+R) × P matrix PI to all zeros. Next, at step 750, the keys I_a, I_b, … are used to generate the weights WP(0), …, WP(N-1) and the lists ALP(0), …, ALP(N-1), respectively. Each list ALP(i) comprises WP(i) integers (j(0), …, j(WP(i)-1)) in the range 0, …, P-1. In step 755, these integers are used to set the entries PI(i, j(0)), …, PI(i, j(WP(i)-1)) to 1. The operation of the PI matrix generator of fig. 20 is similar to that of the PI matrix generator of fig. 17, except that the matrix it creates has R additional rows and corresponds to the matrix of fig. 15.
The system of equations in fig. 12 or fig. 15 is typically sparse, i.e. the number of non-zero entries in the involved matrix is typically much less than half of the possible entries. In this case, it may not be necessary to store the matrices directly, but rather an indication of each individual entry that helps recreate the matrices may be stored. For example, for each row of the matrix LT or PI, the process may want to store the weights and neighbor lists as calculated in fig. 5-6. Other methods are also possible, and many of them have been explained herein or in the disclosure incorporated by reference herein.
Once the matrix generator has created a system of equations of the form given in fig. 12 or fig. 15, the task of the decoder will be to solve the unknown values C (0), …, C (L-1) of the system of equations. Several different methods can be applied to achieve this goal, including but not limited to gaussian elimination, or any of the methods described in Luby I, Luby II, Shokrollahi I, II, III, IV, or V.
A possible method for solving the system of equations of fig. 12 or fig. 15 is now outlined with reference to figs. 21-26. A flow chart of the operation of a decoder according to some embodiments of the present invention is given in fig. 21. At step 1305, a decoding matrix is created using some of the methods described earlier. At step 1310, the matrix is rearranged using row and column permutations. As mentioned above, such a matrix may be obtained from the matrix of fig. 12 or fig. 15 by applying row and column permutations; chain reaction decoding in combination with the on-the-fly passivation decoding of Shokrollahi II can be used to achieve this. There is thus a permutation pi operating on the set {0, 1, …, L-1} and a permutation tau operating on the set {0, 1, …, N+R-1} such that the equations of fig. 22 are satisfied.
In this context, w denotes the number of rows and columns of the matrix LO of fig. 13, i.e., the number of intermediate symbols that are neither permanently passivated nor OTF passivated. At step 1315, all entries of matrix LO below the diagonal are zeroed out using the matrix LO of fig. 13. In doing so, the set of symbols on the right side of the equation in fig. 23 must follow the same operations, so the new right side of the system of equations is obtained by XORs of some of the D(tau(i)).
As illustrated in fig. 24, after this operation, matrix 810 becomes the identity matrix, matrix EL in 840 is untouched, and matrices OTFI and PI become OTFI-2 in 820 and PI-2 in 830, since the decoding process requires XORing rows of these matrices together according to the operations needed to reduce matrix LO to the identity matrix.
The next step in the decoding process may be step 1320, in which the rest of the matrix below LO is eliminated to obtain a matrix of the form indicated in fig. 25. Denoting by E(0), …, E(N+R-1) the permuted and reduced versions of the original symbols D(0), …, D(N+R-1) after this step, by u the number of rows of matrix EL_2, and by g the number of columns of EL_2, the structure of the matrix in fig. 25 yields the smaller system of u linear equations of Equation 3 in the values C(pi(L-g)), …, C(pi(L-1)).
EL_2 · (C(pi(L-g)), …, C(pi(L-1)))^T = (E(N+R-u), …, E(N+R-1))^T    (Equation 3)
A decoding process such as that described in fig. 21 may solve this system of equations in step 1330 by various means, for example using a Gaussian elimination process, a combination of chain reaction decoding and Gaussian elimination, another application of passivation decoding, or other means. If the matrix EL has elements belonging to multiple fields, the Gaussian elimination method may be modified to separate the computations in GF(2) from those in larger fields such as GF(256), as taught for example in Shokrollahi IV.
If the system of equations of equation 3 is not solvable using the processes employed by the decoder, the decoder may apply countermeasures in step 1335. Such countermeasures may include marking errors and stopping the process, or may include requesting more encoded symbols, or may stop the process and return to the application using the decoder a list of intermediate or source symbols that it has been able to recover so far. If the system of equations is solvable, the decoder may recover the values of the inactive intermediate symbols C (pi (L-g)), …, C (pi (L-1)). In some variations, some other intermediate symbols may also be recovered in step 1330 in addition to the inactive intermediate symbols.
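A sketch of step 1330 by Gauss-Jordan elimination over GF(2), returning None when the system of Equation 3 is not solvable so that the countermeasures of step 1335 can be applied; rows of EL_2 are modeled as 0/1 lists and right-side entries as byte-string symbols:

    def solve_inactivated(el2_rows, E_tail):
        # Solve EL_2 * (C(pi(L-g)), ..., C(pi(L-1)))^T = E_tail by Gauss-Jordan
        # elimination over GF(2), XORing symbol values alongside the rows.
        rows = [r[:] for r in el2_rows]
        rhs = list(E_tail)
        u, g = len(rows), len(rows[0])
        xor = lambda a, b: bytes(x ^ y for x, y in zip(a, b))
        for col in range(g):
            p = next((r for r in range(col, u) if rows[r][col]), None)
            if p is None:
                return None                    # EL_2 lacks full column rank
            rows[col], rows[p] = rows[p], rows[col]
            rhs[col], rhs[p] = rhs[p], rhs[col]
            for r in range(u):
                if r != col and rows[r][col]:
                    rows[r] = [a ^ b for a, b in zip(rows[r], rows[col])]
                    rhs[r] = xor(rhs[r], rhs[col])
        return rhs[:g]                         # rhs[j] is C(pi(L-g+j))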
Once the values of these symbols are recovered, the decoder proceeds to step 1340, which involves back substitution. Recovering the values of C(pi(L-g)), …, C(pi(L-1)) yields a system of equations of the type given in fig. 26, which is easier to solve than a general system; for example, the decoder may use the process indicated in fig. 23. The process of obtaining the first vector on the right side of fig. 23 may be called back substitution, since it substitutes the values of the known symbols into the system of equations. As one of ordinary skill in the art will see upon reading this disclosure, the systems of equations set forth in figs. 23 and 26 are mathematically equivalent.
In fig. 23, the decoder obtains the unknown values C(pi(0)), …, C(pi(L-g-1)) by a process in which the entries of the matrix on the right are multiplied by the already-solved entries of the vector C(pi(L-g)), …, C(pi(L-1)), and the results are XORed with E(0), …, E(L-g-1). This process of XORing the obtained entries with E(0), …, E(L-g-1), thereby recovering the values C(pi(0)), …, C(pi(L-g-1)), constitutes step 1345 of the decoder in fig. 21.
Although useful in some applications, this approach may result in large computational overhead in some preferred embodiments: the matrix on the right side of fig. 23 is typically not sparse, so obtaining one of the elements C(pi(j)) may require a number of XORs proportional to g. In some embodiments this number can be large, for example because the number of permanent passivations P is chosen large to begin with, and g is at least as large as P. This can place a severe limit on the number of permanently passivated symbols, i.e., on the value of P, and if a smaller value of P is used, the number of OTF passivated intermediate symbols may increase.
Fig. 27 depicts a modified decoding process that may be more computationally efficient than the process of fig. 21. Steps 1405 through 1435 of this process may be the same as the corresponding steps 1305 through 1335 of the process in fig. 21. Optionally, the process may save a copy of the original matrix of fig. 12 or fig. 15 (or the relevant portions of this matrix) and of the original symbols D(0), …, D(N+R-1) in additional memory locations for later use. This is not necessary for the process to work, but can lead to further speed advantages if the application has sufficient memory resources to hold these copies. Alternatively, the process may save only a copy of the original symbols D(0), …, D(N+R-1), without the matrix, and recreate the matrix when needed. Step 1440 uses the stored copy of the matrix, or undoes the process of step 1415, to recover the original system of equations of fig. 22, or only the upper part of that system as presented in fig. 28. At this point the matrix 1510 given in fig. 29 is sparse, and the values of C(pi(w)), …, C(pi(L-1)) are known, where w = L - g.
As is well known, the right side of the equation in fig. 29 can be computed via a computationally efficient process involving a small number of symbol XORs, namely a number equal to the number of non-zero entries in the matrix OTFI plus the number of non-zero entries in the matrix PI. This step of the process is represented by step 1445 in fig. 27. After this step is complete, the right side of the equation in fig. 29 has been calculated, and the system of equations whose unknowns are the values C(pi(0)), …, C(pi(w-1)) remains to be solved. That system may be solved in step 1450 using chain reaction decoding, since the lower triangular matrix LO is sparse: the number of symbol XORs needed to solve the system equals the number of non-zero entries in the matrix LO, which is typically much smaller than the maximum possible number of non-zero entries.
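A sketch of step 1450: forward substitution through the sparse lower triangular LO, where LO_rows[i] lists the columns of the 1s in row i (including the diagonal) and E[i] is the already-reduced right side of fig. 29. The cost is one symbol XOR per off-diagonal 1, matching the sparsity argument above:

    def chain_reaction_solve(LO_rows, E):
        # Solve LO * (C(pi(0)), ..., C(pi(w-1)))^T = (E(0), ..., E(w-1))^T for
        # a sparse unit-diagonal lower triangular LO by forward substitution.
        xor = lambda a, b: bytes(x ^ y for x, y in zip(a, b))
        w = len(LO_rows)
        C = [None] * w
        for i in range(w):
            acc = E[i]
            for j in LO_rows[i]:
                if j < i:                      # XOR out already-solved symbols
                    acc = xor(acc, C[j])
            C[i] = acc                         # unit diagonal: row i gives C(pi(i))
        return C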
Selection of the number of permanent passivations
The choice of the number of permanent passivations can affect overall performance, and is therefore important. On the one hand, the number should be chosen as large as possible: if it is large, the number of OTF passivations can drop to a very small number, sometimes even 0. This is because the combination of the LT and SE matrices of fig. 15 (or its corresponding variant in fig. 23) is effectively the decoding matrix of a chain reaction code at large overhead, which keeps the number of OTF passivations small. In certain embodiments OTF passivations can be more difficult to manage than permanent passivations, so reducing their number can be advantageous in terms of speed and/or memory.
On the other hand, increasing the number of permanent passivations can have a negative effect on running time: for example, step 1330 of the decoding process of fig. 21 and the corresponding step 1430 of the process of fig. 27 require solving a system of equations with at least P rows and P columns. One way to do this is to identify an invertible sub-matrix of the matrix EL_2 of fig. 25, invert it, and use the inverse to obtain the values of the intermediate symbols C(pi(L-g)), …, C(pi(L-1)). Since the matrix EL_2 may not be sparse in many embodiments, obtaining the values of these intermediate symbols can require on the order of g × g symbol XORs. Since g is at least P, the number of symbol XORs is at least P × P, so if the total number of symbol XORs is to remain linear in K, a good choice is to make P proportional to the square root of K. The specific embodiment of Appendix A picks P on the order of 2.5·sqrt(K), consistent with this observation. This is a good choice for P because, with it, the number of OTF passivations is typically quite small, varying from about P to very close to or equal to 0.
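A small worked sketch of this trade-off; the 2.5 factor follows the Appendix A choice noted above, and the XOR counts are order-of-magnitude illustrations only:

    import math

    def pick_P(K):
        # P proportional to sqrt(K) keeps the ~P*P XORs of the dense phase
        # growing linearly in K.
        return math.ceil(2.5 * math.sqrt(K))

    K = 10000
    P = pick_P(K)                  # P = 250
    dense_xors = P * P             # 62,500: on the order of K, not K*K
    print(K, P, dense_xors)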
Another quantity of interest is the average number I of permanently passivated intermediate symbols that an encoded symbol or static symbol has as neighbors. Step 1445 of the decoding process of fig. 27 may require, on average, up to I symbol XORs per unrecovered intermediate symbol to complete. If I is large, the number of XORs may be excessive for the memory and computational resources of the process performing the decoding or encoding. On the other hand, if I is too small, the matrix EL_2 of fig. 25 may not have full rank, and decodability may be compromised.
A more detailed analysis reveals that an important aspect of permanent passivation is the behavior of the matrix PI of fig. 15: its columns should be as linearly independent of one another as possible, i.e., the matrix should have rank as close to full as possible. It is well known to those of skill in the art that if PI is a random binary matrix, the largest possible rank is achieved with high probability. On the other hand, PI can have, on average, a fraction of 1s in each column that is inversely proportional to the square root of K and still satisfy essentially the same rank properties as a purely random matrix. For this reason, the specific embodiment of Appendix A chooses I to be a number between 2 and 3 and P proportional to the square root of K, which means that the fraction of 1s in each column of PI is on average inversely proportional to the square root of K.
There are many variations of these methods, as those of skill in the art will recognize upon reading this disclosure. For example, the XOR operations may be replaced by other operations, e.g., linear operations over a larger finite field, or the operations may be a mixture, e.g., linear operations over a large finite field for some of the operations and linear operations over a smaller finite field for others.
Reference to the detailed example in appendix A
As detailed above, without permanent passivation (i.e., a predetermined decision as to which intermediate symbols will not take part in the matrix manipulation that determines the chain reaction decoding schedule), the number of OTF passivations can be quite random, causing potential problems in terms of memory consumption. Where the number of source symbols is very large and the overhead is small, the error probability can be unacceptably close to 1.
Because of the high error probability at small overheads, finding good systematic information can become increasingly difficult when the number of source symbols is large. Here, systematic information refers to the information that needs to be provided to the encoder and decoder to construct a systematic code in the sense of Shokrollahi III. Furthermore, even when systematic information is obtained, the code behavior can be expected to deviate far from its average behavior, since the code should fail at 0 overhead "on average".
Some parameters for constructing chain reaction codes through permanent passivation may include the degree distribution Ω for LT encoder 250 of fig. 4, parameters for PI encoder 240, a determination as to the number of permanently passivated symbols, a determination as to the number of redundant static symbols and their structure, and the particular manner in which random numbers may be generated and shared between encoder 115 and decoder 155 of fig. 1.
Encoder and decoder using RQ codes
A preferred embodiment of a code using the methods described herein, referred to hereinafter as the "RQ code", is specified in more detail in Section 5 of Appendix A. The remainder of Appendix A describes how the RQ code can be applied for reliable object delivery over broadcast or multicast networks.
The RQ code is a systematic code constructed using the methods described above and below, meaning that all of the source symbols are among the encoded symbols that can be generated; the encoded symbols can thus be viewed as a combination of the original source symbols and repair symbols generated by the encoder.
Although some previous codes have good properties, certain improvements would extend their practical applicability. Two important potential improvements are a steeper overhead-failure curve and a larger number of supported source symbols per source block. The overhead is the difference between the number of received encoded symbols and the number of source symbols in the source block; e.g., an overhead of 2 means that K+2 encoded symbols are received to decode a source block with K source symbols. The failure probability at a given overhead is the probability that the decoder fails to fully recover the source block when the number of received encoded symbols corresponds to that overhead. The overhead-failure curve is a plot of how the failure probability drops as a function of increasing overhead, starting from overhead 0. An overhead-failure curve is better if the decoder's failure probability drops quickly, or steeply, as a function of the overhead.
Random binary codes have an overhead-failure curve in which the failure probability drops by essentially a factor of 2 for each additional overhead symbol, albeit with infeasible computational complexity; the present discussion, however, is limited to the overhead-failure curve rather than computational complexity. In some applications this is a sufficient overhead-failure curve, but for other applications a steeper one is preferred. For example, in streaming applications the number of source symbols in a source block may range widely, e.g., K = 40, K = 200, K = 1,000, K = 10,000. To provide a good streaming experience, the failure probability may need to be low, e.g., 10^-5 or 10^-6. Since bandwidth tends to be at a premium for streaming applications, the percentage of repair symbols sent as a fraction of the source symbols should be minimized. For example, assume that a stream using source blocks with K = 200 should be protected against packet loss of up to 10%, with a required failure probability of at most 10^-6. A random binary code requires an overhead of at least 20 to achieve 10^-6, i.e., the receiver needs 220 encoded symbols to decode with that failure probability. A total of 245 encoded symbols must be sent for each source block to meet this requirement, since ceil(220/(1-0.1)) = 245. The repair symbols thus add an extra 22.5% to the bandwidth requirement of the stream.
The RQ codes described herein and in Section 5 of Appendix A achieve, for overheads 0, 1, and 2 respectively, failure probabilities of less than 10^-2, 10^-4, and 10^-6, both for K = K' (for all supported values of K') and for K = K'+1 (for all supported values of K' except the last). Various loss probabilities have been tested, such as 10%, 20%, 50%, 70%, 90%, and 95%.
For the above example using RQ codes, an overhead of 2 suffices to achieve a failure probability of 10^-6, so only 225 encoded symbols in total need to be sent for each source block to meet the requirement, since ceil(202/(1-0.1)) = 225. In this case the repair symbols add an extra 12.5% to the bandwidth requirement of the stream, i.e., 10% less bandwidth overhead than the random binary code requires. RQ codes' improved overhead-failure curve therefore has some very positive practical consequences.
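As a quick check of the arithmetic above, the following short calculation (a non-normative sketch in Python; the function name and the factor-of-2 failure model for random binary codes are our illustrative assumptions) reproduces the 245 and 225 encoded symbol counts:

    import math

    def symbols_to_send(K, overhead, loss_rate):
        # Send enough encoded symbols that, after losing a fraction
        # loss_rate of them, the receiver still expects K + overhead.
        return math.ceil((K + overhead) / (1 - loss_rate))

    # Random binary code: failure probability halves per extra overhead
    # symbol, so reaching 10^-6 takes an overhead of about 20.
    print(symbols_to_send(200, 20, 0.10))   # -> 245, i.e., 22.5% extra bandwidth
    # RQ code: overhead 2 already achieves failure probability < 10^-6.
    print(symbols_to_send(200, 2, 0.10))    # -> 225, i.e., 12.5% extra bandwidth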
There are applications where it is desirable to support a large number of source symbols per source block. For example, in mobile file broadcast applications it is advantageous from a network-efficiency point of view to encode the file as a single source block, or more generally to divide the file into as few source blocks as possible. Suppose, for example, that a 50-megabyte file is to be broadcast and the space available within each packet for carrying encoded symbols is one kilobyte. Encoding the file as a single source block then requires that K = 50,000 be supported. (Note that there are sub-blocking techniques, described previously, that allow decoding using much less memory.)
The number of source symbols supported by a code may be limited for several reasons. One typical reason is that the computational complexity becomes unreasonable as K grows, as for Reed-Solomon codes; this is not the case for codes such as chain reaction codes. Another reason may be that the failure probability at 0 overhead approaches 1 as K grows, making it harder and harder to find a systematic index that yields a good systematic code construction. The failure probability at 0 overhead indicates the difficulty of deriving a good code construction, since it is essentially the probability that, for a randomly chosen systematic index, the resulting systematic code construction has the property that the first K encoded symbols can decode the K source symbols.
Since the overhead-failure curve of the RQ code design is steep for all values of K, it is easy to find good systematic indices, and thus much larger values of K can be supported. The RQ code as described in Section 5 of Appendix A supports values of K up to 56,403, and supports a total number of encoded symbols per source block of up to 16,777,216. These limits on the supported values are set for practical reasons based on perceived application requirements, not because of limitations of the RQ code design; embodiments other than the one shown in Appendix A may use different values.
The RQ code limits the number of different supported source block sizes as follows. Given a source block of K source symbols to encode or decode, a value K' is selected based on the table shown in Section 5.6 of Appendix A, whose first column lists the possible values of K'. The selected K' is the smallest possible value satisfying K <= K'. The K source symbols C'(0), …, C'(K-1) are padded with K'-K symbols C'(K), …, C'(K'-1), each set to the value 0, to form a source block of K' source symbols C'(0), …, C'(K'-1), and encoding and decoding are then performed on the padded source block.
This approach has the benefit of reducing the number of systematic indices that need to be supported, i.e., only a few hundred instead of tens of thousands. There is no disadvantage in terms of the overhead-failure probability for K, since it is the same as the overhead-failure curve for the selected K': given K, the decoder can compute K', set the values of C'(K), …, C'(K'-1) to 0, and then decode only the remaining K of the K' source symbols of the source block. The only potential drawback is that encoding and decoding with slightly more source symbols may require slightly more memory or computational resources. However, the spacing between consecutive values of K' is about 1% for larger values of K', so this potential disadvantage is negligible.
Because the source block is padded from K to K' symbols, the identifiers of the symbols C'(0), C'(1), … within the RQ code are called Internal Symbol Identifiers, abbreviated ISI, where C'(0), …, C'(K'-1) are the source symbols and C'(K'), C'(K'+1), … are the repair symbols.
External applications employing the encoder and decoder use Encoded Symbol Identifiers (also called Encoding Symbol Identifiers, abbreviated ESI) ranging from 0 to K-1 to identify the original source symbols C'(0), …, C'(K-1), and ESIs continuing with K, K+1, … to identify the repair symbols C'(K'), C'(K'+1), …. Thus the repair symbol identified by ISI X within the RQ code is identified externally by ESI X-(K'-K). This is described in more detail in Section 5.3.1 of Appendix A.
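The padding and the ISI/ESI mapping can be sketched as follows (Python, illustrative only; the short list below is a placeholder for the first column of the K' table in Section 5.6 of Appendix A, not the real table):

    import bisect

    SUPPORTED_K_PRIMES = [10, 12, 18, 20, 26, 30, 56403]   # placeholder values

    def k_prime_for(K):
        # Smallest supported K' with K <= K'; the source block is then padded
        # with K' - K zero-valued symbols C'(K), ..., C'(K'-1).
        return SUPPORTED_K_PRIMES[bisect.bisect_left(SUPPORTED_K_PRIMES, K)]

    def isi_from_esi(esi, K, K_prime):
        # Source symbols: ESI equals ISI for 0 <= ESI < K.
        # Repair symbols: external ESIs K, K+1, ... map to ISIs K', K'+1, ...
        return esi if esi < K else esi + (K_prime - K)

    def esi_from_isi(isi, K, K_prime):
        return isi if isi < K else isi - (K_prime - K)

For example, with K = 25 this placeholder table gives K' = 26, and the repair symbol with ISI 26 carries the external ESI 25, matching the rule ESI = X - (K' - K).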
The encoding and decoding of RQ codes are defined by two types of relationships: constraint relationships among the intermediate symbols, and LT-PI relationships between the intermediate symbols and the encoded symbols. The constraint relationships correspond to the relationships among intermediate symbols defined, e.g., by the SE matrix shown in FIG. 12 or FIG. 15. The LT-PI relationships correspond to the relationships between intermediate symbols and encoded symbols defined, e.g., by the LT and PI matrices shown in FIG. 12 or FIG. 15.
Encoding is performed by determining the intermediate symbol values based on: (1) the source symbol values; (2) the LT-PI relationships between the source symbols and the intermediate symbols; and (3) the constraint relationships among the intermediate symbols. The repair symbol values can then be generated from the intermediate symbols based on the LT-PI relationships between the intermediate symbols and the repair symbols.

Similarly, decoding is performed by determining the intermediate symbol values based on: (1) the received encoded symbol values; (2) the LT-PI relationships between the received encoded symbols and the intermediate symbols; and (3) the constraint relationships among the intermediate symbols. The values of missing source symbols can then be generated from the intermediate symbols based on the LT-PI relationships between the intermediate symbols and the missing source symbols. Encoding and decoding are thus essentially symmetric procedures.
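Conceptually, both directions solve a linear system A*C = D for the intermediate symbols C, where the rows of A encode the constraint and LT-PI relationships and D holds zeros (for the constraint rows) and known symbol values. The toy solver below (Python) works purely over GF(2) with dense elimination; it is a conceptual sketch only, since the actual RQ code also uses octet arithmetic for the HDPC rows and uses the far more efficient inactivation decoding rather than dense elimination:

    def solve_gf2(A, D):
        # Solve A*C = D over GF(2) by Gaussian elimination.
        # A: list of row bitmasks (bit i set means intermediate symbol i appears);
        # D: list of equal-length byte strings (known symbol values).
        # Assumes A is square and invertible.
        A, D = A[:], D[:]
        n = len(A)
        for col in range(n):
            pivot = next(r for r in range(col, n) if (A[r] >> col) & 1)
            A[col], A[pivot] = A[pivot], A[col]
            D[col], D[pivot] = D[pivot], D[col]
            for r in range(n):
                if r != col and (A[r] >> col) & 1:
                    A[r] ^= A[col]                          # row add over GF(2)
                    D[r] = bytes(x ^ y for x, y in zip(D[r], D[col]))
        return D                                            # D[i] is now C[i]

Encoding solves such a system with D built from the source symbols; decoding solves the same kind of system with D built from whichever encoded symbols arrived, which is the symmetry noted above.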
Example hardware Components
FIGS. 30-31 illustrate block diagrams of hardware that may be used to implement the methods described above. Each element may be hardware, program code or instructions executed by a general-purpose or special-purpose processor, or a combination thereof.
FIG. 30 illustrates an example encoding system 1000 that can be implemented as hardware modules, software modules, or portions of program code stored in a program store 1002 and executed by a processor 1004, possibly as an integrated unit rather than the separate units shown in the figure. The encoding system 1000 receives an input signal conveying source symbols and parameter information, and outputs a signal conveying that information.
An input interface 1006 stores incoming source symbols in a source symbol buffer 1008. A source-to-intermediate symbol generator 1010 generates intermediate symbols from the source symbols. This may be a pass-through in some embodiments and a decoder module in others (such as "systematic" embodiments).
A redundant symbol generator 1012 generates redundant symbols from the source symbols. This may be implemented as a chain reaction encoder, an LDPC encoder, an HDPC encoder, or a similar encoder. An inactivator 1014 receives source, intermediate, and/or redundant symbols as the case may be, stores the permanently inactivated symbols among them in a PI buffer 1018, and provides the other symbols to an output encoder 1016. This process may be logical rather than physical.
An operator 1020, such as an XOR operator, operates on one or more (in some embodiments, exactly one) encoded symbols from the output encoder 1016 and one or more (in some embodiments, exactly one) PI symbols from the PI buffer 1018, and the result of the operation is provided to a transmit interface 1030, which outputs the signal from the system 1000.
FIG. 31 illustrates an example decoding system 1100 that can be implemented as hardware modules, software modules, or portions of program code stored in a program store 1102 and executed by a processor 1104, possibly as an integrated unit rather than the separate units shown in the figure. Some processes may be implemented logically rather than physically.
The decoding system 1100 takes an input signal, and possibly other information, and outputs source data if possible. The input signal is provided to a receive interface 1106, which stores the received symbols in a buffer 1108. The ESIs of the received symbols are provided to a matrix generator 1110, which generates a matrix that depends on the particular symbols received, as described herein, and stores the result in a matrix memory 1112.
A scheduler 1114 may read matrix details from the matrix memory 1112 and generate a schedule, which is stored in a schedule memory 1116. The scheduler 1114 may also generate a done signal when complete and communicate the PI matrix to a PI solver 1118. The PI solver 1118 provides the solved PI symbol values to a solver 1120, which uses the schedule, the received symbols, and the PI symbols to decode the intermediate symbols.
The intermediate symbols are provided to an intermediate-to-source symbol generator 1122, which can be an encoder or a pass-through. The output of the intermediate-to-source symbol generator 1122 is provided to an output interface 1124, which outputs the source data, or whatever source data is available for output.
Other considerations
In some cases, enhanced decodability may be desired. In examples provided elsewhere herein, LDPC symbols have only LT neighbors and PI neighbors chosen among the PI symbols that are not HDPC symbols, whereas encoded symbols have both LT neighbors and PI neighbors. In some examples, decodability is improved if the LDPC symbols also have PI neighbors that may include HDPC symbols. With neighbors chosen among all the PI symbols, including the HDPC symbols, an LDPC symbol behaves during decoding more like an encoded symbol. As explained elsewhere herein, such symbols depend both on LT symbols (which can be easily encoded and decoded) and on PI symbols including HDPC symbols (which can provide high-reliability decoding), so there can be advantages on both counts.
In one example, each LDPC symbol has two PI neighbors, i.e., the value of each LDPC symbol depends on the values of two PI symbols.
Decodability may also be improved in some cases by reducing the occurrence of duplicate encoded symbols, where two encoded symbols are duplicates if they have identical overall neighbor sets, the overall neighbor set of an encoded symbol being the union of its LT neighbor set and its PI neighbor set. Duplicate encoded symbols with the same overall neighbor set carry exactly the same information about the intermediate source block from which they were generated; receiving more than one duplicate is therefore no better for decoding than receiving exactly one of them, i.e., additional duplicates increase the reception overhead while only one of them is useful for decoding.

The preferred property is that no received encoded symbol is a duplicate of any other, since then every received encoded symbol can be useful for decoding. It may therefore be preferable to reduce the number of such duplicates, or the probability of duplicates occurring.
One approach is to limit the number of LT neighbors that each encoded symbol can have. For example, if there are W possible neighbors, the maximum number of LT neighbors may be limited to W-2. This reduces the chance of duplicate overall neighbor sets, since the neighbor set consisting of all W possible neighbors is no longer allowed. With the constraint Deg[v] = min(d, W-2), there are W*(W-1)/2 different neighbor sets of degree W-2, so a duplicate overall neighbor set is unlikely to be generated for an encoded symbol. Other constraints may be used instead, such as min(d, W-Wg) for some Wg other than Wg = 2, or some other constraint.
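As a one-line sketch (Python; it assumes d has already been drawn from the degree distribution of Appendix A, and the function name is ours):

    def capped_lt_degree(d, W, Wg=2):
        # Cap the LT degree at W - Wg so the neighbor set consisting of all W
        # possible neighbors can never occur; with Wg = 2 there are still
        # W*(W-1)/2 distinct degree-(W-2) neighbor sets to draw from.
        return min(d, W - Wg)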
Another technique, usable alone or in combination with the duplicate-reduction technique above, is to choose more than one PI neighbor for each encoded symbol, so that encoded symbols are less likely to have duplicate PI neighbor sets and thus less likely to have duplicate overall neighbor sets. The PI neighbors may be generated similarly to the LT neighbors, for example by first generating (d1, a1, b1) as shown in Section 5.3.5.4 of Appendix A, according to the following code fragment:
if (d < 4) then { d1 = 2 + Rand[y, 3, 2] } else { d1 = 2 };
a1 = 1 + Rand[y, 4, P1-1];
b1 = Rand[y, 5, P1];
Note that in this example there is a non-trivial random distribution defined for the number of PI neighbors d1, that the distribution depends on the chosen number of LT neighbors d, and that the number of PI neighbors is likely to be larger when the number of LT neighbors is small. This yields the property that the overall degree of each encoded symbol is such that the chance of duplicate encoded symbols being generated, and thus received, is reduced.
The encoded symbol values may then be generated as shown in Section 5.3.5.3 of Appendix A, using the LT neighbors defined by (d, a, b) together with the PI neighbors defined by (d1, a1, b1); the corresponding code fragment appears in that section.
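Purely as a non-normative illustration of the shape of that computation (the normative fragment is in Appendix A), the following Python sketch XOR-accumulates the d LT neighbors walked with (a, b) modulo W and the d1 PI neighbors walked with (a1, b1) modulo P1, skipping indices at or beyond P so that only the P real PI symbols are used; W, P, and P1 (the smallest prime >= P) are taken as given from the surrounding context:

    def encode_symbol_sketch(C, W, P, P1, d, a, b, d1, a1, b1):
        # C holds the intermediate symbols: LT symbols C[0..W-1],
        # followed by the P PI symbols C[W..W+P-1].
        def xor(x, y):
            return bytes(u ^ v for u, v in zip(x, y))

        result = C[b]
        for _ in range(d - 1):                # remaining LT neighbors
            b = (b + a) % W
            result = xor(result, C[b])
        while b1 >= P:                        # land on a real PI index
            b1 = (b1 + a1) % P1
        result = xor(result, C[W + b1])
        for _ in range(d1 - 1):               # remaining PI neighbors
            b1 = (b1 + a1) % P1
            while b1 >= P:
                b1 = (b1 + a1) % P1
            result = xor(result, C[W + b1])
        return result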
to support these decodability features or to provide decodability separately, a different system index J (K ') for the value of K' may be used, such as the system index shown in table 2 of section 5.6 of appendix a.
An example of a process that may be performed in a transmission and/or reception system to generate the systematic indices J(K') is as follows. For each K' in the list of possible K' values, a procedure, typically performed by an appropriately programmed circuit or processor, checks candidate indices for suitability. For example, the circuit/processor may check, for J = 1, …, 1000 [or some other limit], whether the following criteria are met for the candidate systematic index J:
(a) Is decoding possible from the K' source symbols at 0 overhead?
If so, the number of on-the-fly inactivations is recorded.
(b) Is there a duplicate overall neighbor set among the first K'/0.06 possible encoded symbols (those with ESIs 0, …, K'/0.06)? [Other thresholds may be used instead.]
(c) Over 10,000 decoding trials [or some other test] using the first K' received encoded symbols, where in each trial each encoded symbol is lost with probability 0.93 [or some other threshold] independently of the other encoded symbols, is the decoding failure probability below 0.007 [or some other threshold]?
The circuit/processor then selects, from among the candidate systematic indices J that satisfy criteria (a), (b), and (c) above, the systematic index with the smallest number of on-the-fly inactivations recorded in step (a).
Note that there are many variations of the above selection criteria. For example, in some cases it may be preferable to choose a systematic index that satisfies (a), (b), and (c) above and produces the fewest decoding failures in step (c) within a specified number of trials. As another example, a combination of the number of on-the-fly inactivations and the decoding failure probability may be considered when choosing the systematic index. As yet another example, there may be multiple systematic indices available for each K' value, with one of them chosen randomly within a particular application.
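A schematic of the selection loop follows (Python). The three hook functions are placeholders for calls into an actual RQ encoder/decoder; their names and stub behavior are ours, so only the control flow should be read as meaningful:

    import random

    def decodes_at_zero_overhead(k_prime, J):
        return True, random.randint(10, 30)     # placeholder for criterion (a)

    def has_duplicate_neighbor_sets(k_prime, J, limit):
        return False                            # placeholder for criterion (b)

    def trial_decode_fails(k_prime, J, loss):
        return random.random() < 0.001          # placeholder for one trial of (c)

    def pick_systematic_index(k_prime, max_J=1000, trials=10_000):
        best_J, best_inact = None, None
        for J in range(1, max_J + 1):
            ok, inact = decodes_at_zero_overhead(k_prime, J)                  # (a)
            if not ok:
                continue
            if has_duplicate_neighbor_sets(k_prime, J, int(k_prime / 0.06)):  # (b)
                continue
            failures = sum(trial_decode_fails(k_prime, J, loss=0.93)
                           for _ in range(trials))                            # (c)
            if failures / trials >= 0.007:
                continue
            if best_inact is None or inact < best_inact:   # fewest inactivations wins
                best_J, best_inact = J, inact
        return best_J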
The systematic indices for the K' values listed in Table 2 of Section 5.6 of Appendix A form one potential list of systematic indices for the codes described in Appendix A.
Variations of the sub-blocking process
It is known, for various purposes, to divide a block, physically or logically, into smaller sub-blocks for further processing. This is used, for example, in IETF RFC 5053, and is also known from U.S. Patent No. 7,072,971. One of the main uses of the sub-blocking approach is to allow an FEC code to protect a large data block as a single entity, while letting the FEC decoder at the receiver recover the data block using an amount of memory much smaller than the size of the data block.
One approach for choosing the number of sub-blocks, described in IETF RFC 5053, provides good source block and sub-block partitioning for many reasonable parameter settings, but in some circumstances may produce solutions that do not strictly meet the upper limit WS on the sub-block size (although even in those cases the sub-block size exceeds the given WS constraint only by a moderate factor). As another example, Section 4.2 of draft-luby-rmt-bb-fec-raptorg-object-00 (where the maximum number of source symbols per source block is much larger than in IETF RFC 5053) provides the following recipe to calculate T, Z, and N, where T is the symbol size, Z is the number of source blocks into which the file (or data block) is divided, and N is the number of sub-blocks per source block. Additionally, P' is the packet payload size available for symbols, F is the file size in bytes, K'_max is the maximum supported number of source symbols per source block (e.g., 56,404), Al is an alignment factor specifying that symbol and sub-symbol sizes should be multiples of Al bytes to allow more efficient decoding (e.g., Al = 4 is preferred for modern CPUs), and WS is the desired upper limit, in bytes, on the sub-block size.
Note that the derivation of parameters T, Z and N may be done on the sender or an alternative server based on the values of F, Al and P'. The receiver only needs to know the values of F, Al, T, Z and N to determine the sub-block and source block structure of a file or data block in a received packet for that file or data block. The receiver may determine P' from the size of the received packet. Note that the transmitted and received packets typically also contain other information identifying the packet content, such as an FEC payload ID, typically 4 bytes in size and carrying the Source Block Number (SBN), and an Encoded Symbol Identifier (ESI) of the first symbol carried in the packet.
The previous method, described in Section 4.2 of draft-luby-rmt-bb-fec-raptorg-object-00, calculates T, Z, and N as follows:
● T = P'
● Kt = ceil(F/T)
● Z = ceil(Kt/K'_max)
● N = min{ceil(ceil(Kt/Z)*T/WS), T/Al}
In these calculations, ceil() is the function that outputs the smallest integer greater than or equal to its input, floor() is the function that outputs the largest integer less than or equal to its input, and min() is the function that outputs the minimum of its inputs.
One problem with deriving the source block and sub-block partitioning this way is that if T/Al is smaller than ceil(ceil(Kt/Z)*T/WS), the upper bound WS on the sub-block size is not observed.
A potential ancillary problem is that this allows sub-symbols as small as Al, which is typically set to 4 bytes and may be too small to be practical. Generally, the smaller the sub-symbol size, the more processing overhead is required to encode or decode a sub-block. Furthermore, especially at the receiver, smaller sub-symbol sizes mean more sub-blocks to demultiplex and decode, which consumes receiver resources such as CPU cycles and memory accesses. On the other hand, a smaller allowable sub-symbol size means that a larger source block can be divided into sub-blocks complying with the specified upper limit WS on the sub-block size; larger source blocks mean the FEC protection is provided across more data, which yields better protection and better network efficiency. Indeed, in many cases it is preferable to ensure that sub-symbols are at least a specified minimum size, which provides the opportunity to better balance the processing and memory requirements at the receiver against efficient use of network resources.
As an example of the derived parameters, using the previous method described in Section 4.2 of draft-luby-rmt-bb-fec-raptorg-object-00 to calculate T, Z, and N:
Input:
F = 56,404 KB (1 KB = 1,024 bytes)
WS = 128 KB
Al = 4
K'_max = 56,404
Calculations:
T = 1 KB
Kt = 56,404
Z = 1
N = 256 (due to the second input of the min function)
In this example there is one source block comprising 256 sub-blocks, where each sub-block is approximately 220 KB (larger than WS), and at least some sub-blocks have a sub-symbol size of 4 bytes (extremely small).
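The problem is easy to reproduce. A small Python transcription of the four formulas above (our transcription, not normative) shows the min() taking its second argument and the WS bound being exceeded:

    import math

    def old_partition(F, P_prime, WS, Al, K_max):
        T = P_prime
        Kt = math.ceil(F / T)
        Z = math.ceil(Kt / K_max)
        N = min(math.ceil(math.ceil(Kt / Z) * T / WS), T // Al)
        return T, Kt, Z, N

    T, Kt, Z, N = old_partition(56404 * 1024, 1024, 131072, 4, 56404)
    print(Z, N)                          # 1 source block, 256 sub-blocks
    print(math.ceil(Kt / Z) * (T // N))  # 225,616 bytes per sub-block, > WS = 131,072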
A third problem is that an AL-FEC solution may not support all possible numbers of source symbols, i.e., it may support only a selected list of K' values, where K' is the supported number of source symbols in a source block; if the desired actual number K of source symbols in a source block is not among these K' values, K is padded up to the closest K' value, which means the source block actually used may be somewhat larger than K would suggest.
A new sub-blocking method is now described that improves on the previous approach. For descriptive purposes, the module that partitions blocks may take as input the data F to be partitioned together with values including WS, Al, SS, and P', where the meanings of these variables are described in more detail below.
WS represents a prescribed constraint on the maximum size, e.g., in bytes, of a sub-block that is decodable in the receiver's working memory. Al denotes a memory alignment parameter. Since receiver memory typically operates more efficiently when symbols and sub-symbols sit on memory-alignment boundaries, it can be useful to track Al and store values whose sizes are multiples of Al bytes. For example, Al = 4 is typical, because many memory devices inherently address data in memory on 4-byte boundaries. Other values of Al are possible, e.g., Al = 2 or Al = 8. In general, Al can be set to the least common multiple of the memory alignments of all the potential receivers; e.g., if some receivers support 2-byte memory alignment but others support 4-byte alignment, Al = 4 would be recommended.
The parameter SS is determined based on a preferred lower bound on the sub-symbol size, such that the lower bound is SS*Al bytes. Having the sub-symbol size be a multiple of Al can be preferable because decoding operations are typically performed on sub-symbols.
The following describes in detail a method for dividing the data F into Z source blocks, each of which is then divided into N sub-blocks. In this description, P' refers to a variable, stored in memory (or implied), representing the number of bytes available within each packet for symbols; P' is assumed to be a multiple of Al. T is a variable representing the size of the symbols placed in transmitted packets. Other variables can be inferred from the text.
New sub-blocking method for determining T, Z, and N
● T = P'
● Kt = ceil(F/T)
● N_max = floor(T/(SS*Al))
● For all n = 1, …, N_max:
○ KL(n) is the maximum value of K' supported as a possible number of source symbols in a source block, satisfying
■ K' <= WS/(Al*(ceil(T/(Al*n))))
● Z = ceil(Kt/KL(N_max))
● N = the minimum n such that ceil(Kt/Z) <= KL(n)
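Assuming, as in the examples below, that every value of K' is supported (a real implementation would instead look K' up in the table of Section 5.6 of Appendix A), the new method can be sketched in Python as:

    import math

    def new_partition(F, P_prime, WS, Al, SS):
        T = P_prime
        Kt = math.ceil(F / T)
        N_max = T // (SS * Al)                       # floor(T/(SS*Al))

        def KL(n):
            # Largest K' with K' <= WS/(Al*ceil(T/(Al*n))); assumes all K'
            # values are supported rather than consulting the K' table.
            return WS // (Al * math.ceil(T / (Al * n)))

        Z = math.ceil(Kt / KL(N_max))
        N = next(n for n in range(1, N_max + 1) if math.ceil(Kt / Z) <= KL(n))
        return T, Kt, Z, N

For Example 1 below, new_partition(6291456, 1240, 131072, 4, 5) returns (1240, 5074, 1, 52).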
Once these parameters are determined, the sizes of the Z source blocks and the sub-symbol sizes of the N sub-blocks of each source block can be determined as described in IETF RFC 5053, i.e., Kt = ceil(F/T), (KL, KS, ZL, ZS) = Partition[Kt, Z], and (TL, TS, NL, NS) = Partition[T/Al, N].
Kt is the number of source symbols in the file. In sub-blocking, the Kt source symbols are divided into Z source blocks: ZL source blocks with KL source symbols each, and ZS source blocks with KS source symbols each. KL is then rounded up to KL', the smallest supported number of source symbols that is at least KL (with KL'-KL zero-padding symbols added to such a source block for encoding and decoding purposes, though these padding symbols are typically neither sent nor received); similarly, KS is rounded up to KS', the smallest supported number of source symbols that is at least KS (again with KS'-KS zero-padding symbols added for encoding and decoding purposes, typically neither sent nor received).
These algorithms (performed by a sub-blocking module, another software module, or hardware) ensure that the numbers of source symbols in the source blocks are as equal as possible, subject to summing to the total number Kt of source symbols in the file. They likewise ensure that the sub-symbol sizes of the sub-blocks are as equal as possible, subject to being multiples of Al and summing to the symbol size.
Sub-symbol parameters TL, TS, NL, and NS are then calculated, giving NL sub-blocks with the larger sub-symbol size TL*Al and NS sub-blocks with the smaller sub-symbol size TS*Al. The function Partition[I, J], implemented in software or hardware, is defined as the function whose output is the sequence of four integers (IL, IS, JL, JS), where IL = ceil(I/J), IS = floor(I/J), JL = I - IS*J, and JS = J - JL.
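A direct Python transcription of Partition[] (a sketch, standing in for the software or hardware of the embodiment):

    def partition(I, J):
        # Split I items into J nearly equal blocks:
        # JL blocks of size IL and JS blocks of size IS.
        IL = -(-I // J)          # ceil(I/J)
        IS = I // J              # floor(I/J)
        JL = I - IS * J
        JS = J - JL
        return IL, IS, JL, JS

For Example 1 below, partition(1240 // 4, 52) returns (TL, TS, NL, NS) = (6, 5, 50, 2), i.e., 50 sub-blocks with 24-byte sub-symbols and 2 sub-blocks with 20-byte sub-symbols.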
Some properties of these new methods are notable. The sub-blocking module can establish a lower bound on the minimum sub-symbol size used. From the formulas above, TS = floor(T/(Al*N)), and TS*Al is the smallest sub-symbol size used since TS <= TL. The minimum sub-symbol size occurs when N = N_max. Since X/floor(Y) >= X/Y for positive X and Y, TS is at least floor(T/(Al*floor(T/(SS*Al)))) >= floor(SS) = SS. It follows that the minimum sub-symbol size produced by the partitioning method described herein is at least TS*Al >= SS*Al, as desired.
The sub-blocking module can likewise establish an upper bound on the maximum sub-block size. The maximum sub-block size used is TL*Al*KL', where KL' is the minimum K' value in the table above that is at least KL = ceil(Kt/Z). By the definition of N, KL' <= KL(N) and TL = ceil(T/(Al*N)). Since KL(N) <= WS/(Al*ceil(T/(Al*N))), it follows that WS >= KL(N)*Al*ceil(T/(Al*N)) >= KL'*Al*TL.
The variable N_max represents the maximum number of sub-symbols into which a symbol of size T can be divided. Setting N_max = floor(T/(SS*Al)) ensures that the minimum sub-symbol size is at least SS*Al bytes. KL(n) is the maximum number of source symbols per source block that can be supported when each symbol of the source block is divided into n sub-symbols, such that each sub-block of the source block is of size at most WS.
The number Z of source blocks is chosen as small as possible, subject to the constraint that the number of source symbols in each source block is at most KL(N_max), which ensures that each source symbol can be divided into sub-symbols of size at least SS*Al while each resulting sub-block is of size at most WS. From the value of Z, the sub-blocking module determines the number of source blocks and the number of symbols in each of the Z source blocks.
Note that if any Z smaller than the value produced by this partitioning method were used, some source block would either have a sub-block larger than WS or have a sub-block with sub-symbol size smaller than SS*Al. Moreover, the smallest source block produced by this partitioning method is as large as possible subject to both constraints, i.e., no other way of partitioning the file or data block into source blocks obeying both constraints yields a smallest source block larger than the one this method produces. In this sense, the Z value generated by this partitioning method is optimal.
The number N of sub-blocks into which each source block is divided is chosen as small as possible, subject to the constraint that, for each sub-block, the sub-symbol size multiplied by the number of source symbols in the enclosing source block is at most WS.
Note that if any N smaller than the value this partitioning method derives from Z were used, at least one sub-block would exceed WS in size. In addition, subject to the constraint that the maximum sub-block size not exceed WS, the minimum sub-symbol size this method produces for the given Z is as large as possible, i.e., no other partitioning of the source blocks determined by Z that obeys the maximum sub-block constraint yields a minimum sub-symbol size larger than the one this method produces. In this sense, the N value produced by this method is optimal.
In the following examples, it is assumed that all possible values of K' are supported as numbers of source symbols in a source block.
Example 1

Input:
SS = 5
Al = 4 bytes
(minimum sub-symbol size = 20 bytes)
WS = 128 KB = 131,072 bytes
P' = 1,240 bytes
F = 6 MB = 6,291,456 bytes

Calculations:
T = 1,240 bytes
Kt = 5,074
N_max = 62
KL(N_max) = 6,553
Z = 1
KL = ceil(Kt/Z) = 5,074
N = 52 (KL(N) = 5,461)
TL = 6, larger sub-symbol = 24 bytes
TS = 5, smaller sub-symbol = 20 bytes
TL*Al*KL = 121,776
Example 2

Input:
SS = 8
Al = 4 bytes
(minimum sub-symbol size = 32 bytes)
WS = 128 KB = 131,072 bytes
P' = 1 KB = 1,024 bytes
F = 56,404 KB = 57,757,696 bytes

Calculations:
T = 1,024 bytes
Kt = 56,404
N_max = 32
KL(N_max) = 4,096
Z = 14
KL = ceil(Kt/Z) = 4,029
N = 32 (KL(N) = 4,096)
TL = 8, larger sub-symbol = 32 bytes
TS = 8, smaller sub-symbol = 32 bytes
TL*Al*KL = 128,928
There are many variations of the above method. For example, for some FEC codes it is desirable to have at least some minimum number of source symbols in a source block, to keep the per-source-block reception overhead of the FEC code small. Since for truly small file or data block sizes F the source symbols could otherwise become too small, there may also be a maximum number of symbols into which each packet payload is divided. For example, in IETF RFC 5053 the desired minimum number of source symbols in a source block is Kmin = 1024, and the maximum number of symbols into which a packet is divided is Gmax = 10.
The following is another variant of the new sub-blocking method, taking into account the additional parameters Kmin and Gmax just described, where G is the number of symbols of a source block carried in each packet. This variant can be performed by the sub-blocking module or, more generally, by some module, software, or hardware at the encoder, decoder, transmitter, and/or receiver.
In this variation, each packet carries the ESI of the first symbol in the packet, and then each subsequent symbol in the packet implicitly has an ESI that is 1 greater than the previous symbol in the packet.
New sub-blocking method for determining G, T, Z, and N
● G = min(ceil(P'*Kmin/F), floor(P'/(SS*Al)), Gmax)
● T = floor(P'/(Al*G))*Al
● Kt = ceil(F/T)
● N_max = floor(T/(SS*Al))
● For all n = 1, …, N_max:
○ KL(n) is the maximum value of K' supported as a possible number of source symbols in a source block, satisfying
■ K' <= WS/(Al*(ceil(T/(Al*n))))
● Z = ceil(Kt/KL(N_max))
● N = the minimum n such that ceil(Kt/Z) <= KL(n)
Note that the calculation of G ensures that the symbol size is at least SS*Al, i.e., at least the minimum sub-symbol size. Note also that P' should be at least SS*Al for this to work (if it is not, G evaluates to 0). A code sketch of this variant follows Example 3 below.
Example 3

Input:
SS = 5
Al = 4 bytes
(minimum sub-symbol size = 20 bytes)
WS = 256 KB = 262,144 bytes
P' = 1,240 bytes
F = 500 KB = 512,000 bytes
Kmin = 1,024
Gmax = 10

Calculations:
G = 3
T = 412
Kt = 1,243
N_max = 20
KL(N_max) = 10,922
Z = 1
KL = ceil(Kt/Z) = 1,243
N = 2 (KL(N) = 1,260)
TL = 52, larger sub-symbol = 208 bytes
TS = 51, smaller sub-symbol = 204 bytes
TL*Al*KL = 258,544
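Under the same all-K'-supported assumption as before, the variant is the same sketch with the extra G and T steps in front; it reproduces Example 3 exactly:

    import math

    def new_partition_with_G(F, P_prime, WS, Al, SS, Kmin, Gmax):
        G = min(math.ceil(P_prime * Kmin / F), P_prime // (SS * Al), Gmax)
        T = (P_prime // (Al * G)) * Al
        Kt = math.ceil(F / T)
        N_max = T // (SS * Al)

        def KL(n):
            return WS // (Al * math.ceil(T / (Al * n)))   # all K' assumed supported

        Z = math.ceil(Kt / KL(N_max))
        N = next(n for n in range(1, N_max + 1) if math.ceil(Kt / Z) <= KL(n))
        return G, T, Kt, Z, N

    print(new_partition_with_G(512000, 1240, 262144, 4, 5, 1024, 10))
    # -> (3, 412, 1243, 1, 2), matching Example 3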
As has now been described, these new methods introduce a constraint on the minimum sub-symbol size of any sub-block. The present disclosure provides a new partitioning method that respects this additional constraint while strictly respecting the maximum sub-block size constraint. These methods produce source block and sub-block partitioning solutions that divide a file or data block into as few source blocks as possible subject to the minimum sub-symbol size, and then split each source block into as few sub-blocks as possible subject to the maximum sub-block size.
Variants
In some applications it may be acceptable not to decode all of the source symbols, or to decode all of them only with relatively low probability. In such applications, a receiver may stop attempting to decode all of the source symbols after receiving K+A encoded symbols. Or the receiver may stop receiving encoded symbols after receiving fewer than K+A encoded symbols. In some applications, the receiver may even receive only K or fewer encoded symbols. Accordingly, it is to be understood that in some embodiments of the invention, the desired degree of accuracy need not be complete recovery of all the source symbols.
Furthermore, in some applications where incomplete recovery is acceptable, the data may be encoded such that not all of the source symbols can be recovered, or such that complete recovery of the source symbols would require receiving many more encoded symbols than the number of source symbols. Such an encoding generally requires less computational expense and may therefore be an acceptable way to reduce the computational cost of encoding.
It will be appreciated that the various functional blocks of the above-described figures may be implemented by combinations of hardware and/or software, and that in a particular implementation, some or all of the functionality of some of the blocks may be combined. Similarly, it will also be understood that the various methods taught herein may be implemented by a combination of hardware and/or software.
The above description is illustrative only and not restrictive. Many variations of the invention will become apparent to those skilled in the art upon review of this disclosure. The scope of the invention should, therefore, be determined not with reference to the above description, but instead should be determined with reference to the appended claims along with their full scope of equivalents.
Appendix A
RaptorQ forward error correction scheme for object delivery
draft-ietf-rmt-bb-fec-raptorq-04
Abstract
This document specifies a Fully-Specified FEC scheme for the RaptorQ forward error correction code and its application to reliable delivery of data objects, corresponding to FEC Encoding ID 6 (to be confirmed (tbc)).
RaptorQ codes are a new family of codes that provide superior flexibility, support for larger source block sizes, and better coding efficiency than Raptor codes in RFC 5053. RaptorQ is also a fountain code, i.e., as many encoding symbols as needed can be generated on the fly by the encoder from the source symbols of a block of source data. The decoder is able to recover the source block from any set of encoding symbols equal in number to the source symbols in most cases, and in rare cases from a set of encoding symbols only slightly more numerous than the source symbols.
The RaptorQ code described herein is a systematic code, meaning that all source symbols are among the code symbols that can be generated.
Status of This Memo
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on December 2, 2011.
Copyright Notice
Copyright (c) 2010 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
Table of Contents

1. Introduction
2. Requirements Notation
3. Formats and Codes
3.1. FEC Payload ID
3.2. FEC Object Transmission Information
3.2.1. Mandatory
3.2.2. Common
3.2.3. Scheme-Specific
4. Procedures
4.1. Introduction
4.2. Content Delivery Protocol Requirements
4.3. Example Parameter Derivation Algorithm
4.4. Object Delivery
4.4.1. Source Block Construction
4.4.2. Encoding Packet Construction
5. RaptorQ FEC Code Specification
5.1. Definitions, Symbols, and Abbreviations
5.1.1. Definitions
5.1.2. Symbols
5.1.3. Abbreviations
5.2. Overview
5.3. Systematic RaptorQ Encoder
5.3.1. Introduction
5.3.2. Encoding Overview
5.3.3. First Encoding Step: Intermediate Symbol Generation
5.3.4. Second Encoding Step: Encoding
5.3.5. Generators
5.4. Example FEC Decoder
5.4.1. Overview
5.4.2. Decoding an Extended Source Block
5.5. Random Numbers
5.5.1. The Table V0
5.5.2. The Table V1
5.5.3. The Table V2
5.5.4. The Table V3
5.6. Systematic Indices and Other Parameters
5.7. Operations on Octets, Symbols, and Matrices
5.7.1. Overview
5.7.2. Arithmetic Operations on Octets
5.7.3. The Table OCT_EXP
5.7.4. The Table OCT_LOG
5.7.5. Operations on Symbols
5.7.6. Operations on Matrices
5.8. Requirements for a Compliant Decoder
6. Security Considerations
7. IANA Considerations
8. Acknowledgements
9. References
9.1. Normative References
9.2. Informative References
Authors' Addresses
1. Introduction
This document specifies an FEC scheme for the RaptorQ forward error correction code for object delivery applications. The concept of an FEC scheme is defined in RFC 5052 [RFC5052], and this document follows the format prescribed there and uses the terminology of that document. The RaptorQ code described herein is a next-generation code beyond the Raptor code described in RFC 5053 [RFC5053]. The RaptorQ code provides superior reliability, better coding efficiency, and support for larger source block sizes than the Raptor code of RFC 5053 [RFC5053]. These improvements simplify the use of the RaptorQ code in object delivery Content Delivery Protocols compared to RFC 5053 [RFC5053].
The RaptorQ FEC scheme is a Fully-Specified FEC scheme corresponding to FEC Encoding ID 6 (tbc).

Editor's note: The final FEC Encoding ID is to be confirmed; "6 (tbc)" is used in this Internet-Draft as a placeholder. It is desirable that FEC Encoding IDs be used sequentially in IANA registration.
RaptorQ is a fountain code, i.e., the encoder can generate as many coded symbols as needed on the fly from the source symbols of the block. The decoder is able to recover the source block from any set of coded symbols that are only slightly more in number than the number of source symbols.
The code described in this document is a systematic code, i.e., the original unmodified source symbols can be sent from the sender to the receiver, together with a number of repair symbols. See [RFC3453] for more background on the use of forward error correction codes in reliable multicast.
2. Requirements Notation

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].
3. Formats and Codes

3.1. FEC Payload ID

The FEC Payload ID MUST be a 4-octet field defined as follows:
FIG. 1: FEC payload ID format
o Source Block Number (SBN), (8 bits, unsigned integer): A non-negative integer identifier for the source block that the encoding symbols within the packet relate to.

o Encoding Symbol ID (ESI), (24 bits, unsigned integer): A non-negative integer identifier for the encoding symbols within the packet.

The interpretation of the Source Block Number and Encoding Symbol Identifier is defined in Section 4.
3.2. FEC Object Transmission Information

3.2.1. Mandatory

The value of the FEC Encoding ID MUST be 6, as assigned by IANA (see Section 7).
3.2.2. Common

The Common FEC Object Transmission Information elements used by this FEC scheme are:

o Transfer Length (F), (40 bits, unsigned integer): A non-negative integer that is at most 946270874880. This is the transfer length of the object in octets.

o Symbol Size (T), (16 bits, unsigned integer): A positive integer that is less than 2^16. This is the size of a symbol in octets.
The encoded common FEC object transmission information format is shown in fig. 2.
FIG. 2: encoded common FEC OTI for RaptorQ FEC scheme
Note 1: The limit of 946270874880 on the transfer length is a consequence of limiting the symbol size to 2^16-1, limiting the number of symbols in a source block to 56403, and limiting the number of source blocks to 2^8.
3.2.3. Scheme-Specific

The following parameters are carried in the Scheme-Specific FEC Object Transmission Information element for this FEC scheme:

o The number of source blocks (Z) (12 bits, unsigned integer)

o The number of sub-blocks (N) (12 bits, unsigned integer)

o A symbol alignment parameter (Al) (8 bits, unsigned integer)

These parameters are all positive integers. The encoded Scheme-Specific Object Transmission Information is a 4-octet field consisting of the parameters Z, N, and Al as shown in Fig. 3.
FIG. 3: encoded scheme-specific FEC object transport information
The encoded FEC object transmission information is a field of 12 octets consisting of a concatenation of the encoded common FEC object transmission information and the encoded scheme-specific FEC object transmission information.
These three parameters define the source block partitioning as described in section 4.4.1.2.
4. Procedures

4.1. Introduction

For any undefined symbols or functions used in this section, in particular the functions "ceil" and "floor", refer to Section 5.1.
4.2. Content delivery protocol requirements
This section describes the exchange of information between the RaptorQ FEC scheme and any Content Delivery Protocol (CDP) that utilizes the RaptorQ FEC scheme for object delivery.
The RaptorQ encoder scheme and RaptorQ decoder scheme for object delivery require the following information from the CDP:
o The transfer length of the object, F, in octets

o A symbol alignment parameter, Al

o The symbol size, T, in octets, which MUST be a multiple of Al

o The number of source blocks, Z

o The number of sub-blocks in each source block, N
The RaptorQ encoder scheme for object delivery additionally requires:

o The object to be encoded, which is F octets long
The RaptorQ encoder scheme provides the CDP with the following information about each packet to be transmitted:
o Source Block Number (SBN)
O Encoding Symbol ID (ESI)
O encoding code element
The CDP must communicate this information to the receiver.
4.3. Example parameter derivation Algorithm
This section provides recommendations for the derivation of the three transport parameters T, Z, and N. The recommendation is based on the following input parameters:

o F, the transfer length of the object, in octets

o WS, the maximum size block that is decodable in working memory, in octets

o P', the maximum payload size in octets, which is assumed to be a multiple of Al

o Al, the symbol alignment parameter, in octets

o SS, a parameter such that the desired lower bound on the sub-symbol size is SS*Al

o K'_max, the maximum number of source symbols per source block.

Note: Section 5.1.2 defines K'_max to be 56403.
Based on the above inputs, the transport parameters T, Z, and N are calculated as follows:

Let:

○ T = P'

○ Kt = ceil(F/T)

○ N_max = floor(T/(SS*Al))

○ For all n = 1, …, N_max: KL(n) is the maximum K' value in Table 2 of Section 5.6 such that K' <= WS/(Al*(ceil(T/(Al*n))))

○ Z = ceil(Kt/KL(N_max))

○ N = the minimum n = 1, …, N_max such that ceil(Kt/Z) <= KL(n)
It is recommended that each packet contain exactly one symbol. However, receivers SHOULD support the reception of packets containing multiple symbols.

The value Kt is the total number of symbols required to represent the source data of the object.

The algorithm defined above and in Section 4.4.1.2 ensures that the sub-symbol sizes are a multiple of the symbol alignment parameter Al. This is useful because the sum operations used for encoding and decoding are generally performed several octets at a time, e.g., at least 4 octets at a time on a 32-bit processor; encoding and decoding can thus be performed faster if the sub-symbol sizes are a multiple of this number of octets.

The recommended setting for the input parameter Al is 4.

The parameter WS can be used to generate encoded data that can be decoded efficiently with limited working memory at the decoder. Note that for a given WS value the actual maximum decoder memory requirement depends on the implementation, but it is possible to implement decoding using working memory only slightly larger than WS.
4.4. Object delivery
4.4.1. Source block structure
4.4.1.1. Overview
In order to apply the RaptorQ encoder to a source object, the object may be broken into Z >= 1 blocks, known as source blocks. The RaptorQ encoder is applied independently to each source block. Each source block is identified by a unique Source Block Number (SBN), where the first source block has SBN zero, the second has SBN one, and so on. Each source block is divided into a number, K, of source symbols of size T octets each. Each source symbol is identified by a unique Encoding Symbol Identifier (ESI), where the first source symbol of a source block has ESI zero, the second has ESI one, and so on.

Each source block with K source symbols is divided into N >= 1 sub-blocks, which are small enough to be decoded in working memory. Each sub-block is divided into K sub-symbols of size T'.

Note that the value of K is not necessarily the same for each source block of an object, and the value of T' may not necessarily be the same for each sub-block of a source block. However, the symbol size T is the same for all source blocks of an object, and the number of symbols, K, is the same for every sub-block of a given source block. Exact partitioning of the object into source blocks and sub-blocks is described in Section 4.4.1.2 below.
4.4.1.2. Source block and sub-block partitioning
The construction of source blocks and sub-blocks is determined based on five input parameters, F, Al, T, Z, and N, and a function Partition[]. These five input parameters are defined as follows:

o F, the transfer length of the object, in octets

o Al, the symbol alignment parameter, in octets

o T, the symbol size, in octets, which MUST be a multiple of Al

o Z, the number of source blocks

o N, the number of sub-blocks in each source block
These parameters MUST be set so that ceil(ceil(F/T)/Z) <= K'_max. Recommendations for the derivation of these parameters are provided in Section 4.3.

The function Partition[] takes a pair of positive integers (I, J) as input and derives four non-negative integers (IL, IS, JL, JS) as output. Specifically, the value of Partition[I, J] is the sequence (IL, IS, JL, JS), where IL = ceil(I/J), IS = floor(I/J), JL = I - IS*J, and JS = J - JL. Partition[] derives parameters for partitioning a block of size I into J approximately equal-sized blocks: JL blocks of length IL and JS blocks of length IS.
The source object MUST be partitioned into source blocks and sub-blocks as follows:

Let:

○ Kt = ceil(F/T),

○ (KL, KS, ZL, ZS) = Partition[Kt, Z],

○ (TL, TS, NL, NS) = Partition[T/Al, N].
Then, the object MUST be partitioned into Z = ZL + ZS contiguous source blocks, the first ZL source blocks each having KL*T octets, i.e., KL source symbols of T octets each, and the remaining ZS source blocks each having KS*T octets, i.e., KS source symbols of T octets each.

If Kt*T > F, then for encoding purposes the last symbol of the last source block MUST be padded at the end with Kt*T - F zero octets.

Next, each source block with K source symbols MUST be divided into N = NL + NS contiguous sub-blocks, the first NL sub-blocks each consisting of K contiguous sub-symbols of size TL*Al octets, and the remaining NS sub-blocks each consisting of K contiguous sub-symbols of size TS*Al octets. The symbol alignment parameter Al ensures that sub-symbols are always a multiple of Al octets.

Finally, the m-th symbol of a source block consists of the concatenation of the m-th sub-symbols from each of the N sub-blocks. Note that this implies that when N > 1, a symbol is not a contiguous portion of the object.
4.4.2. Encoding Packet Construction

Each encoding packet contains the following information:
o Source Block Number (SBN)
O Encoding Symbol ID (ESI)
O encoding code element
Each source block is encoded independently of the other source blocks. Source blocks are numbered consecutively from zero.

The Encoding Symbol IDs from 0 to K-1 identify the source symbols of a source block in sequential order, where K is the number of source symbols in the source block. Encoding Symbol IDs from K onwards identify repair symbols generated from the source symbols using the RaptorQ encoder.

Each encoding packet either consists entirely of source symbols (source packet) or entirely of repair symbols (repair packet). A packet may contain any number of symbols from the same source block. In the case that the last source symbol in a source packet includes padding octets added for FEC encoding purposes, these octets need not be included in the packet; otherwise, only whole symbols MUST be included.

The Encoding Symbol ID, X, carried in each source packet is the Encoding Symbol ID of the first source symbol carried in that packet. The subsequent source symbols in the packet have Encoding Symbol IDs X+1 to X+G-1, in sequential order, where G is the number of symbols in the packet.

Similarly, the Encoding Symbol ID, X, placed in a repair packet is the Encoding Symbol ID of the first repair symbol in the repair packet, and the subsequent repair symbols in the packet have Encoding Symbol IDs X+1 to X+G-1, in sequential order, where G is the number of symbols in the packet.

Note that it is not necessary for the receiver to know the total number of repair packets.
5. RaptorQ FEC code specification
5.1. Definitions, symbols and abbreviations
For the purposes of the RaptorQ FEC code specification in this section, the following definitions, symbols, and abbreviations apply.
5.1.1. Definitions
o Source block: a block of K source symbols that are considered together for RaptorQ encoding and decoding purposes.
o Extended source block: a block of K' source symbols constructed from a source block and zero or more padding symbols, where K' >= K.
o Symbol: a unit of data. The size of a symbol in octets is referred to as the symbol size. The symbol size is always a positive integer.
o Source symbol: the smallest unit of data used during the encoding process. All source symbols within a source block have the same size.
o Padding symbol: a symbol with all zero bits that is added to the source block to form the extended source block.
o Encoding symbol: a symbol that may be transmitted as part of the encoding of a source block. The encoding symbols of a source block consist of the source symbols of the source block and the repair symbols generated from the source block. Repair symbols generated from a source block have the same size as the source symbols of that source block.
o Repair symbol: an encoding symbol of a source block that is not a source symbol. Repair symbols are generated based on the source symbols of the source block.
o Intermediate symbols: symbols generated from the source symbols using an inverse encoding process. Repair symbols are then generated directly from the intermediate symbols. The encoding symbols do not include the intermediate symbols, i.e., the intermediate symbols are not transmitted as part of the encoding of a source block. The intermediate symbols are partitioned into LT symbols and PI symbols.
o LT symbols: the subset of the intermediate symbols that may be LT neighbors of an encoding symbol.
o PI symbols: the subset of the intermediate symbols that may be PI neighbors of an encoding symbol.
o Systematic code: a code in which all source symbols are included as part of the encoding symbols of a source block. The RaptorQ code as described herein is a systematic code.
o Encoding Symbol ID (ESI): information that uniquely identifies each encoding symbol associated with a source block for transmission and reception purposes.
o Internal Symbol ID (ISI): information that uniquely identifies each symbol associated with an extended source block for encoding and decoding purposes.
o Arithmetic operations on octets, symbols, and matrices: operations used to generate encoding symbols from source symbols and vice versa. See section 5.7.
5.1.2. Symbols
i, j, u, v, h, d, a, b, d1, a1, b1, m, x, y denote values or variables of one type or another, depending on the context.
X denotes a non-negative integer value that is either an ISI value or an ESI value, depending on the context.
ceil (x) denotes the smallest integer greater than or equal to x, where x is a real value.
floor (x) denotes the largest integer less than or equal to x, where x is a real value.
min (x, y) denotes the minimum of the values x and y, and in general the minimum of all the argument values.
max (x, y) denotes the maximum of the values x and y, and in general the maximum of all the argument values.
i% j denotes i modulo j.
i + j denotes the sum of i and j. If i and j are octets/symbols, this specifies the arithmetic on octets/symbols, as defined in section 5.7. If i and j are integers, this represents normal integer addition.
i x j denotes the product of i and j. If i and j are octets, this specifies the arithmetic on octets, as defined in section 5.7. If i is an octet and j is a symbol, this means a multiplication of the symbol with the octet, as also defined in section 5.7. Finally, if i and j are integers, i x j represents the ordinary integer product.
a ^^ b denotes a raised to the power b. If a is an octet and b is a non-negative integer, this means a*a*…*a (b terms), where '*' is the octet product as defined in section 5.7.
u ^ v denotes the bit-wise XOR of u and v for the equal length bit string u and v.
Transpose [ A ] designates the Transpose of matrix A. In this specification, all matrices have entries that are octets.
A^^-1 denotes the inverse of matrix A. In this specification, all matrices have octets as entries, so it is understood that operations on the matrix entries are performed as set forth in section 5.7, and A^^-1 is the matrix inverse of A with respect to octet arithmetic.
K denotes the number of symbols in a single source block.
K' denotes the number of source symbols plus padding symbols in the extended source block. For most of this specification, the padding symbols are considered as additional source symbols.
K' _ max denotes the maximum number of source symbols that can exist in a single source block. Set to 56403.
L denotes the number of intermediate symbols with respect to a single extended source block.
S denotes the number of LDPC symbols with respect to a single extended source block. These are LT symbols. For each value of K' shown in Table 2, section 5.6, the corresponding value of S is prime.
H denotes the number of HDPC symbols for a single extended source block. These are PI symbols.
B denotes the number of intermediate symbols that are LT symbols but not LDPC symbols.
W denotes the number of intermediate symbols that are LT symbols. For each value of K' shown in Table 2 of section 5.6, the corresponding value of W is prime.
P denotes the number of intermediate symbols that are PI symbols. These include all of the HDPC symbols.
P1 denotes the smallest prime number greater than or equal to P.
U denotes the number of PI intermediate symbols that are not HDPC symbols.
C denotes the array of L intermediate symbols, C[0], C[1], C[2], …, C[L-1].
C ' denotes a symbol array of the extended source block, wherein C ' [0], C ' [1], C ' [2], …, C ' [ K-1] are source symbols of the source block, and C ' [ K ], C ' [ K +1], …, C ' [ K ' -1] are padding symbols.
V0, V1, V2, V3 denote an array of 4 32-bit unsigned integers, V0[0], V0[1], …, V0[255 ]; v1[0], V1[1], … and V1[255 ]; v2[0], V2[1], … and V2[255 ]; and V3[0], V3[1], …, V3[255], as shown in section 5.5.
Rand[y, i, m] denotes the pseudo-random number generator
Deg[v] denotes the degree generator
Enc[K', C, (d, a, b, d1, a1, b1)] denotes the encoding symbol generator
Tuple[K', X] denotes the tuple generator function
T denotes the symbol size in octets.
J(K') denotes the systematic index associated with K'.
G denotes any generator matrix.
I_S denotes the S x S identity matrix.
5.1.3. Abbreviations
ESI encoding symbol ID
HDPC high density parity check
ISI internal symbol ID
LDPC low density parity check
LT Luby transform
PI permanently inactivated
SBN source block number
SBL source block length (in symbols)
5.2. Overview
This section defines the systematic RaptorQ FEC code.
A symbol is the basic unit of data for the encoding and decoding process. For each source block, all symbols have the same size, referred to as the symbol size T. The atomic operations performed on the symbols for both encoding and decoding are the arithmetic operations defined in section 5.7.
The basic encoder is described in section 5.3. The encoder first derives a block of intermediate symbols from the source symbols of a source block. The intermediate block has the property that both source symbols and repair symbols can be generated from it using the same process. The encoder generates repair symbols from the intermediate block using an efficient process, in which each such repair symbol is the exclusive-or of a small number of intermediate symbols from the block. The same process can also be used to regenerate source symbols from the intermediate block. The encoding symbols are the combination of the source symbols and the repair symbols.
An example of a decoder is described in section 5.4. The process for generating source symbols and repair symbols from the intermediate block is designed such that the intermediate block can be recovered from any sufficiently large set of encoding symbols independently of the mix of source symbols and repair symbols in the set. Once the intermediate blocks are recovered, the missing source symbols of the source block can be recovered using an encoding process.
The requirements for a RaptorQ-compatible decoder are provided in section 5.8. Several decoding algorithms are possible to achieve these requirements. An efficient decoding algorithm to achieve these requirements is provided in section 5.4.
The construction of the intermediate symbols and the repair symbols is based in part on the pseudo-random number generator described in section 5.3. The generator is based on a fixed set of 1024 random numbers that must be available to both the sender and the receiver. These numbers are provided in section 5.5. The RaptorQ encoding and decoding operations use operations on octets. Section 5.7 describes how these operations are performed.
Finally, the construction of the intermediate symbols from the source symbols is governed by a "systematic index", whose value is provided in section 5.6 for each supported extended source block size between 6 and K'_max = 56403 source symbols. Thus, the RaptorQ code supports source blocks with between 1 and 56403 source symbols.
5.3. Systematic RaptorQ encoder
5.3.1. Introduction
For a given source block with K source symbols, the source block is augmented with K'-K additional padding symbols for encoding and decoding purposes, where K' is the smallest value in the systematic index Table 2 of section 5.6 that is greater than or equal to K. The reason for padding a source block out to K' symbols is to enable faster encoding and decoding and to minimize the amount of table information that needs to be stored in the encoder and decoder.
For the purposes of transmitting and receiving data, the value of K is used to determine the number of symbols in a source block, and therefore K needs to be known at the sender and the receiver. In this case, the sender and receiver can each compute K' from K, and the K'-K padding symbols can be added to the source block automatically, without any additional communication. The sender and receiver use the Encoding Symbol ID (ESI) to identify the encoding symbols of a source block, where the encoding symbols of a source block consist of the source symbols and the repair symbols associated with the source block. For a source block with K source symbols, the ESIs of the source symbols are 0, 1, 2, …, K-1, and the ESIs of the repair symbols are K, K+1, K+2, …. Using the ESI to identify encoding symbols in transmission ensures that the ESI values continue consecutively between the source symbols and the repair symbols.
For the purposes of encoding and decoding data, the value K' derived from K is used as the number of source symbols of the extended source block on which the encoding and decoding operations are performed, where the K' source symbols consist of the original K source symbols and an additional K'-K padding symbols. The Internal Symbol ID (ISI) is used by the encoder and decoder to identify the symbols associated with the extended source block, i.e., for generating encoding symbols and for decoding. For a source block with K original source symbols, the ISIs of the original source symbols are 0, 1, 2, …, K-1, the ISIs of the K'-K padding symbols are K, K+1, K+2, …, K'-1, and the ISIs of the repair symbols are K', K'+1, K'+2, …. Using the ISI for encoding and decoding allows the padding symbols of the extended source block to be treated in the same way as the other source symbols of the extended source block, and a given prefix of repair symbols is generated in a manner that depends only on the number K' of source symbols in the extended source block, independent of K.
The relationship between the ESIs and the ISIs is simple: the ESIs and ISIs of the original K source symbols are identical; the K'-K padding symbols have ISIs but no corresponding ESIs (since they are neither transmitted nor received); and the ISI of a repair symbol is simply its ESI plus K'-K. The translation between the ESIs of transmitted and received encoding symbols and the corresponding ISIs used for encoding and decoding, as well as the proper padding of the extended source block with the padding symbols used for encoding and decoding, is the responsibility of the padding function in the RaptorQ encoder/decoder.
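A minimal sketch of this ESI/ISI translation, assuming only the rules stated above (the function names are ours):

    def esi_to_isi(esi, k, k_prime):
        # Source symbols keep their index; repair ESIs shift up by K'-K.
        return esi if esi < k else esi + (k_prime - k)

    def isi_to_esi(isi, k, k_prime):
        # Padding symbols (K <= ISI < K') are neither sent nor received,
        # so they have no ESI.
        if isi < k:
            return isi
        if isi < k_prime:
            return None
        return isi - (k_prime - k)

    # e.g., with K = 10 source symbols padded to K' = 12, the first repair
    # symbol has ESI 10 and ISI 12.
    assert esi_to_isi(10, 10, 12) == 12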
5.3.2. Encoding overview
Using a systematic RaptorQ encoder, any number of repair symbols can be generated from a source block consisting of K source symbols placed into the extended source block C'. FIG. 4 shows an encoding overview.
The first step of encoding is to construct the extended source block by adding zero or more padding symbols, such that the total number of symbols, K', is one of the values listed in section 5.6. Each padding symbol consists of T octets, where the value of each octet is 0. K' must be selected as the smallest value in the table of section 5.6 that is greater than or equal to K.
FIG. 4: Encoding overview
Let C '[ 0], …, C' [ K-1] denote K source symbols.
Let C'[K], …, C'[K'-1] denote the K'-K padding symbols, all of which are set to 0 bits. Then C'[0], …, C'[K'-1] is the extended source block on which encoding and decoding are performed.
In the remainder of this description, these padding symbols will be considered to be additional source symbols and will be referred to as such. However, the padding symbols are not part of the encoding symbols, i.e., they are not transmitted as part of the encoding. At the receiver, the value of K' can be computed from K; the receiver can then insert K'-K padding symbols at the end of the source block to form an extended source block with K' source symbols, and recover the remaining K source symbols of the source block from the received encoding symbols.
The second step of encoding is to generate a number L > K 'of intermediate symbols from K' source symbols. In this step, K ' source tuples (d [0], a [0], b [0], d1[0], a1[0], b1[0]), …, (d [ K ' -1], a [ K ' -1], b [ K ' -1], d1[ K ' -1], a1[ K ' -1], b1[ K ' -1]) are generated using the Tuple [ ] generator as described in section 5.3.5.4. The K 'source tuples and ISIs associated with the K' source symbols are used to determine L intermediate symbols C [0], …, C [ L-1] from the source symbols using an inverse encoding process. This process may be implemented by a RaptorQ decoding process.
Within the L intermediate symbols, some "precoding relationship" must be established. Section 5.3.3.3 describes these relationships. Section 5.3.3.4 describes how intermediate symbols are generated from source symbols.
Once the intermediate symbols have been generated, repair symbols can be produced. For a repair symbol with ISI X >= K', the non-negative integer tuple (d, a, b, d1, a1, b1) can be generated using the Tuple[] generator described in section 5.3.5.4. Then the (d, a, b, d1, a1, b1) tuple and the ISI X are used to generate the corresponding repair symbol from the intermediate symbols using the Enc[] generator described in section 5.3.5.3. The corresponding ESI of this repair symbol is X - (K'-K). Note that the source symbols of the extended source block can also be generated using the same process, i.e., for any X < K', the symbol generated using this process has the same value as C'[X].
5.3.3. A first encoding step: intermediate symbol generation
5.3.3.1. Overview
This encoding step is a precoding step that generates L intermediate symbols C[0], …, C[L-1] from the source symbols C'[0], …, C'[K'-1], where L > K' is defined in section 5.3.3.3. The intermediate symbols are uniquely defined by two sets of constraints:
1. The intermediate symbols are related to the source symbols by the set of source symbol tuples and the ISIs of the source symbols. The generation of the source symbol tuples is defined in section 5.3.3.2, using the Tuple[] generator described in section 5.3.5.4.
2. A number of precoding relationships hold within the intermediate symbols themselves. These are defined in section 5.3.3.3.
The generation of the L intermediate symbols is then defined in section 5.3.3.4.
5.3.3.2. Source symbol tuple
Each of the K 'source symbols is associated with a source symbol tuple (d [ X ], a [ X ], b [ X ], d1[ X ], a1[ X ], b1[ X ]), where 0 < ═ X < K'. The source symbol tuple is determined using the tuple generator defined in section 5.3.5.4 as:
for each X, 0 < ═ X < K'
(d[X],a[X],b[X],d1[X],a1[X],b1[X])=Tuple[K',X]
5.3.3.3. Precoding relationships
The precoding relationships among the L intermediate symbols are defined by requiring that a set of S+H linear combinations of the intermediate symbols evaluate to zero. There are S LDPC symbols and H HDPC symbols, so L = K'+S+H. Another partition of the L intermediate symbols is into a set of W LT symbols and a set of P PI symbols, so that also L = W+P. The P PI symbols are treated differently from the W LT symbols in the encoding process. The P PI symbols consist of the H HDPC symbols together with a set of U = P-H of the other K' intermediate symbols. The W LT symbols consist of the S LDPC symbols together with W-S of the other K' intermediate symbols. The values of these parameters are determined from K' as described below, where H(K'), S(K'), and W(K') are derived from Table 2 in section 5.6.
Let:
○S=S(K′)
○H=H(K′)
○W=W(K′)
○L=K′+S+H
○P=L-W
o P1 denotes the smallest prime number greater than or equal to P
○U=P-H
○B=W-S
o C[0], …, C[B-1] denote the intermediate symbols that are LT symbols but not LDPC symbols.
o C[B], …, C[B+S-1] denote the S LDPC symbols, which are also LT symbols.
o C[W], …, C[W+U-1] denote the intermediate symbols that are PI symbols but not HDPC symbols.
o C[L-H], …, C[L-1] denote the H HDPC symbols, which are also PI symbols.
The first set of precoding relationships, referred to as the LDPC relationships, is described below and requires that at the end of the process the symbols D[0], …, D[S-1] are all zero:
the initialization symbol D [0] ═ C [ B ], …, and D [ S-1] ═ C [ B + S-1 ].
○For i=0,…,B-1 do
*a=1+floor(i/S)
*b=i%S
*D[b]=D[b]+C[i]
*b=(b+a)%S
*D[b]=D[b]+C[i]
*b=(b+a)%S
*D[b]=D[b]+C[i]
○For i=0,…,S-1 do
*a=i%P
*b=(i+1)%P
*D[i]=D[i]+C[W+a]+C[W+b]
Recall that symbol addition will be performed as specified in section 5.7.
Note that the LDPC relationships defined by the above algorithm are linear, so there exist an S x B matrix G_LDPC,1 and an S x P matrix G_LDPC,2 such that
G_LDPC,1*Transpose[(C[0],…,C[B-1])]+G_LDPC,2*Transpose[(C[W],…,C[W+P-1])]+Transpose[(C[B],…,C[B+S-1])]=0
(The matrix G_LDPC,1 is defined by the first loop of the above algorithm, and G_LDPC,2 can be deduced from the second loop.)
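As a sketch, the LDPC relationships above can be transcribed directly into Python, modeling each symbol as an integer so that symbol addition (section 5.7) becomes XOR; the names are ours, C is the full list of L intermediate symbols, and S, B, W, and P are derived from K' as described above:

    def ldpc_checks(C, S, B, W, P):
        # Returns D[0..S-1]; for a valid set of intermediate symbols,
        # every entry must be zero.
        D = [C[B + i] for i in range(S)]
        for i in range(B):
            a = 1 + i // S
            b = i % S
            D[b] ^= C[i]
            b = (b + a) % S
            D[b] ^= C[i]
            b = (b + a) % S
            D[b] ^= C[i]
        for i in range(S):
            a = i % P
            b = (i + 1) % P
            D[i] ^= C[W + a] ^ C[W + b]
        return D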
The second set of relationships among the intermediate symbols C[0], …, C[L-1] are the HDPC relationships, and they are defined as follows:
Let:
o α denote the octet represented by the integer 2, as defined in section 5.7.
o MT denote the H x (K'+S) matrix of octets, where for j = 0, …, K'+S-2, the entry MT[i,j] is the octet represented by the integer 1 if i = Rand[j+1, 6, H] or i = (Rand[j+1, 6, H] + Rand[j+1, 7, H-1] + 1) % H, and MT[i,j] is the zero element for all other values of i; and where for j = K'+S-1, MT[i,j] = α^^i for i = 0, …, H-1.
o GAMMA denote the (K'+S) x (K'+S) matrix of octets, where
GAMMA[i,j] =
α^^(i-j), for i >= j,
0, otherwise.
Then, the relationship between the first K '+ S intermediate symbols C [0], …, C [ K' + S-1] and the H HDPC symbols C [ K '+ S ], …, C [ K' + S + H-1] is given by:
Transpose[C[K′+S],…,C[K′+S+H-1]]+MT*GAMMA*Transpose[C[0],…,C[K′+S-1]]=0,
where '*' denotes standard matrix multiplication, using octet multiplication to define the multiplication between a matrix of octets and a matrix of symbols (in particular, a column vector of symbols), and '+' denotes addition over octet vectors.
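The construction of MT and GAMMA can be sketched as follows; here rand is assumed to be the Rand[y, i, m] generator of section 5.3.5.1, and alpha_pow(i) returns α^^i. Since the OCT_EXP table of section 5.7.3 has period 255, alpha_pow(i) can be taken as OCT_EXP[i % 255] for the exponents i-j >= 256 that occur in GAMMA. The function names are ours:

    def build_mt_gamma(k_prime, S, H, rand, alpha_pow):
        n = k_prime + S
        MT = [[0] * n for _ in range(H)]
        for j in range(n - 1):
            # the two row positions in column j that hold the octet 1
            i1 = rand(j + 1, 6, H)
            i2 = (rand(j + 1, 6, H) + rand(j + 1, 7, H - 1) + 1) % H
            MT[i1][j] = 1
            MT[i2][j] = 1
        for i in range(H):
            MT[i][n - 1] = alpha_pow(i)   # last column: alpha^^i
        GAMMA = [[alpha_pow(i - j) if i >= j else 0 for j in range(n)]
                 for i in range(n)]
        return MT, GAMMA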
5.3.3.4. Intermediate code element
5.3.3.4.1. Definition
Given the K' source symbols C'[0], C'[1], …, C'[K'-1], the L intermediate symbols C[0], C[1], …, C[L-1] are the uniquely defined symbol values that satisfy the following conditions:
1. The K' source symbols C'[0], C'[1], …, C'[K'-1] satisfy the K' constraints
C'[X] = Enc[K', (C[0], …, C[L-1]), (d[X], a[X], b[X], d1[X], a1[X], b1[X])], for all X, 0 <= X < K',
where (d[X], a[X], b[X], d1[X], a1[X], b1[X]) = Tuple[K', X]; Tuple[] is defined in section 5.3.5.4, and Enc[] is described in section 5.3.5.3.
2. The L intermediate symbols C[0], C[1], …, C[L-1] satisfy the precoding relationships defined in section 5.3.3.3.
5.3.3.4.2. Example method of calculating intermediate symbols
This section describes a possible method for computing the L intermediate symbols C[0], C[1], …, C[L-1] that satisfy the constraints of section 5.3.3.4.1.
The L intermediate symbols can be calculated as follows:
Let:
o C denote the column vector of the L intermediate symbols C[0], C[1], …, C[L-1].
o D denote the column vector of S+H zero symbols followed by the K' source symbols C'[0], C'[1], …, C'[K'-1].
The above constraints define an L x L matrix A of octets such that:
A*C=D
The matrix A can be constructed as follows:
Let:
o G_LDPC,1 and G_LDPC,2 be the S x B and S x P matrices defined in section 5.3.3.3.
o G_HDPC be the H x (K'+S) matrix such that
G_HDPC * Transpose[(C[0], …, C[K'+S-1])] = Transpose[(C[K'+S], …, C[L-1])], i.e., G_HDPC = MT*GAMMA
o I_S be the S x S identity matrix
o I_H be the H x H identity matrix
o G_ENC be the K' x L matrix such that
G_ENC*Transpose[(C[0],…,C[L-1])]=Transpose[(C′[0],C′[1],…,C′[K′-1])],
That is, G_ENC[i,j] = 1 if and only if C[j] is included in the symbols that are summed to produce Enc[K', (C[0], …, C[L-1]), (d[i], a[i], b[i], d1[i], a1[i], b1[i])], and G_ENC[i,j] = 0 otherwise.
Then:
the first S row of A is equal to G _ LDPC 1| I _ S | G _ LDPC 2.
The next H row of A is equal to G _ HDPC | I _ H.
The remaining K' rows of A are equal to G _ ENC.
The matrix A is depicted in the following figure (FIG. 5):
FIG. 5: matrix A
The intermediate symbols can then be computed as:
C=(A^^-1)*D
The source tuples are generated such that for any K' the matrix A has full rank and is therefore invertible. This computation can be realized by applying a RaptorQ decoding process to the K' source symbols C'[0], C'[1], …, C'[K'-1] to produce the L intermediate symbols C[0], C[1], …, C[L-1].
To efficiently generate intermediate symbols from the source symbols, it is recommended to use an efficient decoder implementation such as that described in section 5.4.
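The block structure of A described above (FIG. 5) can be sketched directly; each block is a list of rows with octets as entries, and the names are ours:

    def assemble_A(G_LDPC1, I_S, G_LDPC2, G_HDPC, I_H, G_ENC):
        A = []
        # first S rows: G_LDPC,1 | I_S | G_LDPC,2  (B + S + P = L columns)
        for r1, r2, r3 in zip(G_LDPC1, I_S, G_LDPC2):
            A.append(r1 + r2 + r3)
        # next H rows: G_HDPC | I_H  ((K'+S) + H = L columns)
        for r1, r2 in zip(G_HDPC, I_H):
            A.append(r1 + r2)
        # remaining K' rows: G_ENC
        A.extend(row[:] for row in G_ENC)
        return A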
5.3.4. A second encoding step: encoding
In the second encoding step, the repair symbol with ISI X (X >= K') is generated by applying the generator Enc[K', (C[0], C[1], …, C[L-1]), (d, a, b, d1, a1, b1)] defined in section 5.3.5.3 to the L intermediate symbols C[0], C[1], …, C[L-1], using the tuple (d, a, b, d1, a1, b1) = Tuple[K', X].
5.3.5. Generator
5.3.5.1. Random number generator
The random number generator Rand[y, i, m] is defined as follows, where y is a non-negative integer, i is a non-negative integer less than 256, m is a positive integer, and the value produced is an integer between 0 and m-1. Let V0, V1, V2, and V3 be the arrays provided in section 5.5.
Let:
○x0=(y+i)mod 2^^8
○x1=(floor(y/2^^8)+i)mod 2^^8
○x2=(floor(y/2^^16)+i)mod 2^^8
○x3=(floor(y/2^^24)+i)mod 2^^8
Then
Rand[y,i,m]=(V0[x0]^V1[x1]^V2[x2]^V3[x3])%m
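In Python, Rand[y, i, m] can be sketched as follows, with V0 through V3 being the 256-entry tables of section 5.5 (the function name is ours):

    def rand(y, i, m, V0, V1, V2, V3):
        x0 = (y + i) % 256
        x1 = (y // 2**8 + i) % 256
        x2 = (y // 2**16 + i) % 256
        x3 = (y // 2**24 + i) % 256
        return (V0[x0] ^ V1[x1] ^ V2[x2] ^ V3[x3]) % m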
5.3.5.2. Degree generator
The degree generator Deg[v] is defined as follows, where v is a non-negative integer less than 2^^20 = 1048576. Given v, find the index d in Table 1 such that f[d-1] <= v < f[d], and set Deg[v] = min(d, W-2). Recall that W is derived from K' as described in section 5.3.3.3.
Table 1: degree distribution for the encoding symbols
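Since the entries f[0], …, f[30] of Table 1 are not reproduced here, the following sketch takes the f column as a parameter; it assumes f is strictly increasing with f[0] = 0 and f[30] = 2^^20, as the definition above requires (the names are ours):

    import bisect

    def deg(v, f, W):
        # find the index d such that f[d-1] <= v < f[d]
        d = bisect.bisect_right(f, v)
        return min(d, W - 2)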
5.3.5.3. Encoding symbol generator
The encoding symbol generator Enc[K', (C[0], C[1], …, C[L-1]), (d, a, b, d1, a1, b1)] takes the following inputs:
k' is the number of source symbols of the extended source block. Let L, W, B, S, P and P1 be derived from K' as described in section 5.3.3.3.
o (C[0], C[1], …, C[L-1]) is the array of L intermediate symbols generated as described in section 5.3.3.4.
o (d, a, b, d1, a1, b1) is a source tuple determined from ISI X using the tuple generator defined in section 5.3.5.4, whereby
* d is a positive integer denoting the LT degree of the encoding symbol
* a is a positive integer between 1 and W-1, inclusive
* b is a non-negative integer between 0 and W-1, inclusive
* d1 is a positive integer, with value 2 or 3, denoting the PI degree of the encoding symbol
* a1 is a positive integer between 1 and P1-1, inclusive
* b1 is a non-negative integer between 0 and P1-1, inclusive
The encoding symbol generator produces a single encoding symbol as output (referred to as result), according to the following algorithm:
○result=C[b]
○For j=1,…,d-1 do
*b=(b+a)%W
*result=result+C[b]
○While(b1>=P)do b1=(b1+a1)%P1
○result=result+C[W+b1]
○For j=1,…,d1-1 do
*b1=(b1+a1)%P1
*While(b1>=P)do b1=(b1+a1)%P1
*result=result+C[W+b1]
○Return result
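A direct Python transcription of this algorithm, again modeling symbols as integers so that symbol addition is XOR (W, P, and P1 are derived from K' as in section 5.3.3.3; the function name is ours):

    def enc(C, d, a, b, d1, a1, b1, W, P, P1):
        result = C[b]
        for _ in range(d - 1):
            b = (b + a) % W
            result ^= C[b]
        while b1 >= P:
            b1 = (b1 + a1) % P1
        result ^= C[W + b1]
        for _ in range(d1 - 1):
            b1 = (b1 + a1) % P1
            while b1 >= P:
                b1 = (b1 + a1) % P1
            result ^= C[W + b1]
        return result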
5.3.5.4. Tuple generator
Tuple generator Tuple [ K', X ] takes the following inputs:
o K' - the number of source symbols in the extended source block
o X - an ISI
Let:
o L be determined from K' as described in section 5.3.3.3
o J(K') be the systematic index associated with K', as defined in Table 2 in section 5.6
The output of the tuple generator is the tuple (d, a, b, d1, a1, b1), determined as follows:
○A=53591+J*997
○if(A%2==0){A=A+1}
○B=10267*(J+1)
○y=(B+X*A)%2^^32
○v=Rand[y,0,2^^20]
○d=Deg[v]
○a=1+Rand[y,1,W-1]
○b=Rand[y,2,W]
○If(d<4){d1=2+Rand[X,3,2]}else{d1=2}
○a1=1+Rand[X,4,P1-1]
○b1=Rand[X,5,P1]
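A sketch of the tuple generator follows, where J = J(K') is the systematic index from Table 2, W and P1 are derived from K' as in section 5.3.3.3, and rand and deg are the generators sketched in the previous subsections (here assumed to be callables closed over their tables); the function name is ours:

    def tuple_gen(X, J, W, P1, rand, deg):
        A = 53591 + J * 997
        if A % 2 == 0:
            A = A + 1
        B = 10267 * (J + 1)
        y = (B + X * A) % 2**32
        v = rand(y, 0, 2**20)
        d = deg(v)
        a = 1 + rand(y, 1, W - 1)
        b = rand(y, 2, W)
        d1 = 2 + rand(X, 3, 2) if d < 4 else 2
        a1 = 1 + rand(X, 4, P1 - 1)
        b1 = rand(X, 5, P1)
        return (d, a, b, d1, a1, b1)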
5.4. Example FEC decoder
5.4.1. Overview
This section describes efficient decoding algorithms for RaptorQ codes introduced in this specification. Note that each received code symbol is a known linear combination of intermediate symbols. Thus, each received encoded symbol provides a linear equation between the intermediate symbols, which together with the known linear precoding relationship between the intermediate symbols gives a system of linear equations. Thus, any algorithm used to solve the system of linear equations can successfully decode the intermediate symbols and hence the source symbols. However, the chosen algorithm has a large impact on the computational efficiency of decoding.
5.4.2. Decoding extended source blocks
5.4.2.1. Overview
It is assumed that the decoder knows the structure of the source block it is to decode, including the symbol size T, the number of symbols in the source block K, and the number of source symbols in the extended source block K'.
According to the algorithm described in section 5.3, the RaptorQ decoder can compute the total number L = K'+S+H of intermediate symbols and determine how they were generated from the extended source block to be decoded. In this description, it is assumed that the received encoding symbols of the extended source block to be decoded are passed to the decoder. Furthermore, for each such encoding symbol, it is assumed that the number and set of intermediate symbols whose sum is equal to the encoding symbol are passed to the decoder. In the case of source symbols (including padding symbols), the source symbol tuples described in section 5.3.3.2 indicate the number and set of intermediate symbols that sum to give each source symbol.
Let N >= K' be the number of received encoding symbols (including the padding symbols of the extended source block) to be used for decoding, and let M = S+H+N. Then, with the notation of section 5.3.3.4.2, A*C = D.
Decoding the extended source block is equivalent to decoding C from the known A and D. Clearly, C can be decoded if and only if the rank of A is L. Once C has been decoded, missing source symbols can be obtained by using the source symbol tuples to determine the number and set of intermediate symbols that must be summed to obtain each missing source symbol.
The first step in decoding C is to form a decoding schedule. In this step, A is converted, using Gaussian elimination (row operations and row and column reordering) and after discarding M-L rows, into the L x L identity matrix. The decoding schedule consists of the sequence of row operations and row and column reorderings that occur during the Gaussian elimination process, and it depends only on A, not on D. The decoding of C from D can take place concurrently with the forming of the decoding schedule, or the decoding can take place afterwards based on the decoding schedule.
The correspondence between the decoding schedule and the decoding of C is as follows. Initially, c[0] = 0, c[1] = 1, …, c[L-1] = L-1, and d[0] = 0, d[1] = 1, …, d[M-1] = M-1. Then:
o Each time a multiple β of row i of A is added to row i' in the decoding schedule, the symbol β*D[d[i]] is added to the symbol D[d[i']] in the decoding process.
o Each time row i of A is multiplied by an octet β in the decoding schedule, the symbol D[d[i]] is also multiplied by β in the decoding process.
o Each time row i is exchanged with row i' in the decoding schedule, the value of d[i] is exchanged with the value of d[i'] in the decoding process.
o Each time column j is exchanged with column j' in the decoding schedule, the value of c[j] is exchanged with the value of c[j'] in the decoding process.
From this correspondence it is clear that the total number of operations on symbols in the decoding of the extended source block is the number of row operations (not exchanges) in the Gaussian elimination. Since A is the L x L identity matrix after the Gaussian elimination and after discarding the last M-L rows, it is clear that at the end of successful decoding the L symbols D[d[0]], D[d[1]], …, D[d[L-1]] are the values of the L symbols C[c[0]], C[c[1]], …, C[c[L-1]].
The order in which Gaussian elimination is performed to form the decoding schedule has no bearing on whether the decoding is successful. However, the speed of the decoding depends heavily on the order in which Gaussian elimination is performed. (Furthermore, maintaining a sparse representation of A is crucial, although that is not described here.) The remainder of this section describes an order in which Gaussian elimination can be performed that is relatively efficient.
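For illustration only, the following dense Gaussian elimination over GF(256) solves A*C = D directly; it is far less efficient than the staged algorithm of sections 5.4.2.2 through 5.4.2.6, but it shows the underlying computation. Symbols are modeled as single octets (T = 1), so symbol operations reduce to the octet operations mul and div of section 5.7 plus XOR; all names are ours:

    def gf256_solve(A, D, mul, div):
        M, L = len(A), len(A[0])
        A = [row[:] for row in A]
        D = D[:]
        for col in range(L):
            # choose a pivot row; if none exists, rank(A) < L (decoding failure)
            pivot = next((r for r in range(col, M) if A[r][col] != 0), None)
            if pivot is None:
                raise ValueError("rank of A is less than L: decoding failure")
            A[col], A[pivot] = A[pivot], A[col]
            D[col], D[pivot] = D[pivot], D[col]
            inv = div(1, A[col][col])             # scale the pivot row to 1
            A[col] = [mul(inv, x) for x in A[col]]
            D[col] = mul(inv, D[col])
            for r in range(M):                    # zero out the rest of the column
                if r != col and A[r][col] != 0:
                    beta = A[r][col]
                    A[r] = [x ^ mul(beta, y) for x, y in zip(A[r], A[col])]
                    D[r] ^= mul(beta, D[col])
        return D[:L]                              # C[0], ..., C[L-1]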
5.4.2.2. First stage
In the first stage of Gaussian elimination, matrix A is conceptually divided into sub-matrices, and additionally matrix X is created. This matrix has as many rows and columns as a, and it will be a lower triangular matrix throughout the first stage. At the beginning of this phase, matrix A is copied into matrix X.
The submatrix sizes are parameterized by non-negative integers i and u, which are initialized to 0 and to P, the number of PI symbols, respectively. The submatrices of A are:
1. The submatrix I defined by the intersection of the first i rows and the first i columns. This is the identity matrix at the end of each step of the stage.
2. The submatrix defined by the intersection of the first i rows and all but the first i columns and the last u columns. All entries of this submatrix are zero.
3. The submatrix defined by the intersection of the first i columns and all but the first i rows. All entries of this submatrix are zero.
4. The submatrix U defined by the intersection of all rows and the last u columns.
5. The submatrix V defined by the intersection of all but the first i columns and the last u columns with all but the first i rows.
FIG. 6 illustrates the submatrices of A. At the beginning of the first stage, V = A. In each step, a row of A is selected.
FIG. 6: Submatrices of A in the first stage
The following graph, defined by the structure of V, is used to determine which row of A is selected. The columns that intersect V are the nodes of the graph, and the rows that have exactly 2 non-zero entries in V and are not HDPC rows are the edges of the graph, connecting the two columns (nodes) in the positions of the two non-zero entries. A component in this graph is a maximal set of nodes (columns) and edges (rows) such that there is a path between each pair of nodes/edges in the graph. The size of a component is the number of nodes (columns) in the component.
There are at most L steps in the first stage. The stage ends successfully when i + u = L, i.e., when V and the all-zeros submatrix above V have disappeared and A consists of I, the all-zeros submatrix below I, and U. The stage ends unsuccessfully, in decoding failure, if at some step before V disappears there is no non-zero row in V to select. In each step, a row of A is selected as follows:
if all entries of V are 0, then no row is selected and decoding fails.
Let r be the smallest integer, so that at least one row of A has exactly r 1 s in V.
If r! Then the row with the smallest primeness among all such rows with exactly r 1 s in V is chosen, except that no HDPC row should be chosen before all non-HDPC rows have been processed.
If r is 2, then any row with exactly 21 s in V is taken that is part of the maximum size component defined by V in the above-described graph.
After the row has been selected in this step, the first row of A that intersects V is exchanged with the selected row, so that the selected row is the first row that intersects V. The columns of A that intersect V are reordered so that one of the r non-zero entries in the selected row appears in the first column of V and the remaining r-1 non-zero entries appear in the last columns of V. The same row and column operations are also performed on the matrix X. Then, an appropriate multiple of the selected row is added to each of the other rows of A below the selected row that have a non-zero entry in the first column of V. Specifically, if such a row has entry β in the first column of V, and the selected row has entry α in the first column of V, then β/α multiplied by the selected row is added to that row, leaving a zero value in the first column of V. Finally, i is incremented by 1 and u is incremented by r-1, which completes the step.
Note that efficiency can be improved if the row operations identified above are not actually performed until the affected row is itself selected during the decoding process. This avoids processing row operations for rows that are not eventually used in the decoding process, and in particular avoids those rows for which β != 1 until they are actually needed. Furthermore, the row operations required for the HDPC rows can be performed for all such rows in one process, using the algorithm described in section 5.3.3.3.
5.4.2.3. Second stage
At this point, all entries of X outside the first i rows and i columns are discarded, so that X has lower triangular form; X now has i rows and i columns. The submatrix U is further partitioned into the first i rows, U_upper, and the remaining M-i rows, U_lower. Gaussian elimination is performed on U_lower in the second stage, either to determine that its rank is less than u (decoding failure) or to convert it into a matrix where the first u rows form the identity matrix (success of the second stage). Call this u x u identity matrix I_u. The M-L rows of A that intersect U_lower - I_u are discarded. After this stage, A has L rows and L columns.
5.4.2.4. Third stage
After the second stage, the portion of A that still needs to be zeroed out to finish converting A into the L x L identity matrix is U_upper. The number of rows i of the submatrix U_upper is generally much larger than its number of columns u. Moreover, at this point U_upper is typically dense, i.e., it has a large number of non-zero entries. To reduce this matrix to a sparse form, the sequence of operations performed to obtain U_upper needs to be reversed. To this end, the matrix X is multiplied with the submatrix of A consisting of the first i rows of A. After this operation, the submatrix of A consisting of the intersection of the first i rows and i columns equals X, whereas the matrix U_upper is transformed to a sparse form.
5.4.2.5. Fourth stage
For each of the first i rows of U_upper, do the following: if the row has a non-zero entry at position j, and if the value of that non-zero entry is b, then add b multiplied by row j of I_u to this row. After this step, the submatrix of A consisting of the intersection of the first i rows and i columns is equal to X, the submatrix U_upper consists of zeros, the submatrix consisting of the intersection of the last u rows and the first i columns consists of zeros, and the submatrix consisting of the last u rows and columns is the matrix I_u.
5.4.2.6. Fifth stage
For j from 1 to i, the following operations are performed:
1. if A [ j, j ] is not 1, then the row j of A is divided by A [ j, j ].
2. For l from 1 to j-1, if A[j,l] is non-zero, then add A[j,l] multiplied by row l of A to row j of A.
After this stage, A is the L x L identity matrix and a complete decoding schedule has been successfully formed. The corresponding decoding, consisting of summing known encoding symbols, can then be executed to recover the intermediate symbols based on the decoding schedule. The tuples associated with all source symbols are computed according to section 5.3.3.2. The tuples for received source symbols are used in the decoding. The tuples for missing source symbols are used to determine which intermediate symbols need to be summed to recover the missing source symbols.
5.5. Random numbers
The 4 arrays V0, V1, V2 and V3 used in section 5.3.5.1 are provided below. There are 256 entries in each of these 4 arrays. The index into each array starts at 0 and the entries are 32-bit unsigned integers.
5.5.1. TABLE V0
251291136,3952231631,3370958628,4070167936,123631495,3351110283,
3218676425,2011642291,774603218,2402805061,1004366930,
1843948209,428891132,3746331984,1591258008,3067016507,
1433388735,504005498,2032657933,3419319784,2805686246,
3102436986,3808671154,2501582075,3978944421,246043949,
4016898363,649743608,1974987508,2651273766,2357956801,689605112,
715807172,2722736134,191939188,3535520147,3277019569,1470435941,
3763101702,3232409631,122701163,3920852693,782246947,372121310,
2995604341,2045698575,2332962102,4005368743,218596347,
3415381967,4207612806,861117671,3676575285,2581671944,
3312220480,681232419,307306866,4112503940,1158111502,709227802,
2724140433,4201101115,4215970289,4048876515,3031661061,
1909085522,510985033,1361682810,129243379,3142379587,2569842483,
3033268270,1658118006,932109358,1982290045,2983082771,
3007670818,3448104768,683749698,778296777,1399125101,1939403708,
1692176003,3868299200,1422476658,593093658,1878973865,
2526292949,1591602827,3986158854,3964389521,2695031039,
1942050155,424618399,1347204291,2669179716,2434425874,
2540801947,1384069776,4123580443,1523670218,2708475297,
1046771089,2229796016,1255426612,4213663089,1521339547,
3041843489,420130494,10677091,515623176,3457502702,2115821274,
2720124766,3242576090,854310108,425973987,325832382,1796851292,
2462744411,1976681690,1408671665,1228817808,3917210003,
263976645,2593736473,2471651269,4291353919,650792940,1191583883,
3046561335,2466530435,2545983082,969168436,2019348792,
2268075521,1169345068,3250240009,3963499681,2560755113,
911182396,760842409,3569308693,2687243553,381854665,2613828404,
2761078866,1456668111,883760091,3294951678,1604598575,
1985308198,1014570543,2724959607,3062518035,3115293053,
138853680,4160398285,3322241130,2068983570,2247491078,
3669524410,1575146607,828029864,3732001371,3422026452,
3370954177,4006626915,543812220,1243116171,3928372514,
2791443445,4081325272,2280435605,885616073,616452097,3188863436,
2780382310,2340014831,1208439576,258356309,3837963200,
2075009450,3214181212,3303882142,880813252,1355575717,207231484,
2420803184,358923368,1617557768,3272161958,1771154147,
2842106362,1751209208,1421030790,658316681,194065839,3241510581,
38625260,301875395,4176141739,297312930,2137802113,1502984205,
3669376622,3728477036,234652930,2213589897,2734638932,
1129721478,3187422815,2859178611,3284308411,3819792700,
3557526733,451874476,1740576081,3592838701,1709429513,
3702918379,3533351328,1641660745,179350258,2380520112,
3936163904,3685256204,3156252216,1854258901,2861641019,
3176611298,834787554,331353807,517858103,3010168884,4012642001,
2217188075,3756943137,3077882590,2054995199,3081443129,
3895398812,1141097543,2376261053,2626898255,2554703076,
401233789,1460049922,678083952,1064990737,940909784,1673396780,
528881783,1712547446,3629685652,1358307511
5.5.2. TABLE V1
807385413,2043073223,3336749796,1302105833,2278607931,541015020,
1684564270,372709334,3508252125,1768346005,1270451292,
2603029534,2049387273,3891424859,2152948345,4114760273,
915180310,3754787998,700503826,2131559305,1308908630,224437350,
4065424007,3638665944,1679385496,3431345226,1779595665,
3068494238,1424062773,1033448464,4050396853,3302235057,
420600373,2868446243,311689386,259047959,4057180909,1575367248,
4151214153,110249784,3006865921,4293710613,3501256572,998007483,
499288295,1205710710,2997199489,640417429,3044194711,486690751,
2686640734,2394526209,2521660077,49993987,3843885867,4201106668,
415906198,19296841,2402488407,2137119134,1744097284,579965637,
2037662632,852173610,2681403713,1047144830,2982173936,910285038,
4187576520,2589870048,989448887,3292758024,506322719,176010738,
1865471968,2619324712,564829442,1996870325,339697593,4071072948,
3618966336,2111320126,1093955153,957978696,892010560,1854601078,
1873407527,2498544695,2694156259,1927339682,1650555729,
183933047,3061444337,2067387204,228962564,3904109414,1595995433,
1780701372,2463145963,307281463,3237929991,3852995239,
2398693510,3754138664,522074127,146352474,4104915256,3029415884,
3545667983,332038910,976628269,3123492423,3041418372,2258059298,
2139377204,3243642973,3226247917,3674004636,2698992189,
3453843574,1963216666,3509855005,2358481858,747331248,
1957348676,1097574450,2435697214,3870972145,1888833893,
2914085525,4161315584,1273113343,3269644828,3681293816,
412536684,1156034077,3823026442,1066971017,3598330293,
1979273937,2079029895,1195045909,1071986421,2712821515,
3377754595,2184151095,750918864,2585729879,4249895712,
1832579367,1192240192,946734366,31230688,3174399083,3549375728,
1642430184,1904857554,861877404,3277825584,4267074718,
3122860549,666423581,644189126,226475395,307789415,1196105631,
3191691839,782852669,1608507813,1847685900,4069766876,
3931548641,2526471011,766865139,2115084288,4259411376,
3323683436,568512177,3736601419,1800276898,4012458395,1823982,
27980198,2023839966,869505096,431161506,1024804023,1853869307,
3393537983,1500703614,3019471560,1351086955,3096933631,
3034634988,2544598006,1230942551,3362230798,159984793,491590373,
3993872886,3681855622,903593547,3535062472,1799803217,772984149,
895863112,1899036275,4187322100,101856048,234650315,3183125617,
3190039692,525584357,1286834489,455810374,1869181575,922673938,
3877430102,3422391938,1414347295,1971054608,3061798054,
830555096,2822905141,167033190,1079139428,4210126723,3593797804,
429192890,372093950,1779187770,3312189287,204349348,452421568,
2800540462,3733109044,1235082423,1765319556,3174729780,
3762994475,3171962488,442160826,198349622,45942637,1324086311,
2901868599,678860040,3812229107,19936821,1119590141,3640121682,
3545931032,2102949142,2828208598,3603378023,4135048896
5.5.3. TABLE V2
1629829892,282540176,2794583710,496504798,2990494426,3070701851,
2575963183,4094823972,2775723650,4079480416,176028725,
2246241423,3732217647,2196843075,1306949278,4170992780,
4039345809,3209664269,3387499533,293063229,3660290503,
2648440860,2531406539,3537879412,773374739,4184691853,
1804207821,3347126643,3479377103,3970515774,1891731298,
2368003842,3537588307,2969158410,4230745262,831906319,
2935838131,264029468,120852739,3200326460,355445271,2296305141,
1566296040,1760127056,20073893,3427103620,2866979760,2359075957,
2025314291,1725696734,3346087406,2690756527,99815156,4248519977,
2253762642,3274144518,598024568,3299672435,556579346,4121041856,
2896948975,3620123492,918453629,3249461198,2231414958,
3803272287,3657597946,2588911389,242262274,1725007475,
2026427718,46776484,2873281403,2919275846,3177933051,1918859160,
2517854537,1857818511,3234262050,479353687,200201308,2801945841,
1621715769,483977159,423502325,3689396064,1850168397,3359959416,
3459831930,841488699,3570506095,930267420,1564520841,2505122797,
593824107,1116572080,819179184,3139123629,1414339336,1076360795,
512403845,177759256,1701060666,2239736419,515179302,2935012727,
3821357612,1376520851,2700745271,966853647,1041862223,715860553,
171592961,1607044257,1227236688,3647136358,1417559141,
4087067551,2241705880,4194136288,1439041934,20464430,119668151,
2021257232,2551262694,1381539058,4082839035,498179069,311508499,
3580908637,2889149671,142719814,1232184754,3356662582,
2973775623,1469897084,1728205304,1415793613,50111003,3133413359,
4074115275,2710540611,2700083070,2457757663,2612845330,
3775943755,2469309260,2560142753,3020996369,1691667711,
4219602776,1687672168,1017921622,2307642321,368711460,
3282925988,213208029,4150757489,3443211944,2846101972,
4106826684,4272438675,2199416468,3710621281,497564971,285138276,
765042313,916220877,3402623607,2768784621,1722849097,3386397442,
487920061,3569027007,3424544196,217781973,2356938519,3252429414,
145109750,2692588106,2454747135,1299493354,4120241887,
2088917094,932304329,1442609203,952586974,3509186750,753369054,
854421006,1954046388,2708927882,4047539230,3048925996,
1667505809,805166441,1182069088,4265546268,4215029527,
3374748959,373532666,2454243090,2371530493,3651087521,
2619878153,1651809518,1553646893,1227452842,703887512,
3696674163,2552507603,2635912901,895130484,3287782244,
3098973502,990078774,3780326506,2290845203,41729428,1949580860,
2283959805,1036946170,1694887523,4880696,466000198,2765355283,
3318686998,1266458025,3919578154,3545413527,2627009988,
3744680394,1696890173,3250684705,4142417708,915739411,
3308488877,1289361460,2942552331,1169105979,3342228712,
698560958,1356041230,2401944293,107705232,3701895363,903928723,
3646581385,844950914,1944371367,3863894844,2946773319,
1972431613,1706989237,29917467,3497665928
5.5.4. TABLE V3
1191369816,744902811,2539772235,3213192037,3286061266,
1200571165,2463281260,754888894,714651270,1968220972,3628497775,
1277626456,1493398934,364289757,2055487592,3913468088,
2930259465,902504567,3967050355,2056499403,692132390,186386657,
832834706,859795816,1283120926,2253183716,3003475205,1755803552,
2239315142,4271056352,2184848469,769228092,1249230754,
1193269205,2660094102,642979613,1687087994,2726106182,446402913,
4122186606,3771347282,37667136,192775425,3578702187,1952659096,
3989584400,3069013882,2900516158,4045316336,3057163251,
1702104819,4116613420,3575472384,2674023117,1409126723,
3215095429,1430726429,2544497368,1029565676,1855801827,
4262184627,1854326881,2906728593,3277836557,2787697002,
2787333385,3105430738,2477073192,748038573,1088396515,
1611204853,201964005,3745818380,3654683549,3816120877,
3915783622,2563198722,1181149055,33158084,3723047845,3790270906,
3832415204,2959617497,372900708,1286738499,1932439099,
3677748309,2454711182,2757856469,2134027055,2780052465,
3190347618,3758510138,3626329451,1120743107,1623585693,
1389834102,2719230375,3038609003,462617590,260254189,3706349764,
2556762744,2874272296,2502399286,4216263978,2683431180,
2168560535,3561507175,668095726,680412330,3726693946,4180630637,
3335170953,942140968,2711851085,2059233412,4265696278,
3204373534,232855056,881788313,2258252172,2043595984,3758795150,
3615341325,2138837681,1351208537,2923692473,3402482785,
2105383425,2346772751,499245323,3417846006,2366116814,
2543090583,1828551634,3148696244,3853884867,1364737681,
2200687771,2689775688,232720625,4071657318,2671968983,
3531415031,1212852141,867923311,3740109711,1923146533,
3237071777,3100729255,3247856816,906742566,4047640575,
4007211572,3495700105,1171285262,2835682655,1634301229,
3115169925,2289874706,2252450179,944880097,371933491,1649074501,
2208617414,2524305981,2496569844,2667037160,1257550794,
3399219045,3194894295,1643249887,342911473,891025733,3146861835,
3789181526,938847812,1854580183,2112653794,2960702988,
1238603378,2205280635,1666784014,2520274614,3355493726,
2310872278,3153920489,2745882591,1200203158,3033612415,
2311650167,1048129133,4206710184,4209176741,2640950279,
2096382177,4116899089,3631017851,4104488173,1857650503,
3801102932,445806934,3055654640,897898279,3234007399,1325494930,
2982247189,1619020475,2720040856,885096170,3485255499,
2983202469,3891011124,546522756,1524439205,2644317889,
2170076800,2969618716,961183518,1081831074,1037015347,
3289016286,2331748669,620887395,303042654,3990027945,1562756376,
3413341792,2059647769,2823844432,674595301,2457639984,
4076754716,2447737904,1583323324,625627134,3076006391,345777990,
1684954145,879227329,3436182180,1522273219,3802543817,
1456017040,1897819847,2970081129,1382576028,3820044861,
1044428167,612252599,3340478395,2150613904,3397625662,
3573635640,3432275192
5.6. Systematic index and other parameters
Table 2 below specifies the supported values of K'. The table also specifies, for each supported value of K', the systematic index J(K'), the number of HDPC symbols H(K'), the number of LDPC symbols S(K'), and the number of LT symbols W(K'). For each value of K', the corresponding values of S(K') and W(K') are prime numbers.
The systematic index J(K') is designed to have the property that the set of source symbol tuples (d[0], a[0], b[0], d1[0], a1[0], b1[0]), …, (d[K'-1], a[K'-1], b[K'-1], d1[K'-1], a1[K'-1], b1[K'-1]) is such that the L intermediate symbols are uniquely defined, i.e., the matrix A in FIG. 5 has full rank and is therefore invertible.
Table 2: system index and other parameters
5.7. Operations on octets, symbols, and matrices
5.7.1. Overview
The remainder of this section describes the arithmetic operations used to generate encoding symbols from source symbols and to generate source symbols from encoding symbols. Mathematically, octets can be regarded as elements of a finite field, i.e., the finite field GF(256) with 256 elements, for which addition and multiplication operations, identity elements, and inverses with respect to both operations are defined. Matrix operations and symbol operations are defined in terms of the arithmetic operations on octets. This allows these operations to be implemented fully without needing to understand the underlying mathematics of finite fields.
5.7.2. Arithmetic operations on octets
Octets are mapped to non-negative integers in the range 0 to 255 in the usual way: a single octet of data from a symbol, with bits B[7], B[6], B[5], B[4], B[3], B[2], B[1], B[0], where B[7] is the most significant bit and B[0] is the least significant bit, is mapped to the integer i = B[7]*128 + B[6]*64 + B[5]*32 + B[4]*16 + B[3]*8 + B[2]*4 + B[1]*2 + B[0].
The addition of two octets u and v is defined as the bitwise exclusive-or operation, i.e.,
u + v = u ^ v.
Subtraction is defined in the same way, so that
u - v = u ^ v.
The zero element (additive identity) is the octet represented by the integer 0. The additive inverse of u is simply u, i.e.,
u + u = 0.
The multiplication of two octets is defined with the help of two tables, OCT_EXP and OCT_LOG, which are given in sections 5.7.3 and 5.7.4, respectively. The table OCT_LOG maps octets (other than the zero element) to non-negative integers, and OCT_EXP maps non-negative integers to octets. For two octets u and v, we define
u*v=
0, if u or v is 0,
OCT _ EXP [ OCT _ LOG [ u ] + OCT _ LOG [ v ] ], otherwise.
Note that the '+' on the right-hand side above is ordinary integer addition, since its arguments are ordinary integers.
The division u/v of two octets u and v, where v != 0, is defined as follows:
u/v=
0, if u = 0,
OCT _ EXP [ OCT _ LOG [ u ] -OCT _ LOG [ v ] +255], otherwise.
The one element (multiplicative identity) is the octet represented by the integer 1. For an octet u that is not the zero element, the multiplicative inverse of u is
OCT_EXP[255-OCT_LOG[u]]。
The octet denoted by α is the octet with integer representation 2. For a non-negative integer i with 0 <= i < 256, we have
α^^i=OCT_EXP[i].
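These operations translate directly into Python; the sketch below assumes OCT_EXP and OCT_LOG are the tables of sections 5.7.3 and 5.7.4, with OCT_LOG stored with a dummy entry at index 0 so that OCT_LOG[u] is valid for u = 1, …, 255 (the function names are ours):

    def octet_mul(u, v, OCT_EXP, OCT_LOG):
        if u == 0 or v == 0:
            return 0
        return OCT_EXP[OCT_LOG[u] + OCT_LOG[v]]

    def octet_div(u, v, OCT_EXP, OCT_LOG):
        # v must be non-zero
        if u == 0:
            return 0
        return OCT_EXP[OCT_LOG[u] - OCT_LOG[v] + 255]

    # Addition and subtraction are both the bitwise XOR: u + v == u - v == u ^ v.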
5.7.3. TABLE OCT_EXP
The table OCT_EXP contains 510 octets. The index starts from 0 and ranges up to 509, and the entries are octets with a positive integer representation:
1,2,4,8,16,32,64,128,29,58,116,232,205,135,19,38,76,152,45,90,180,117,234,201,143,3,6,12,24,48,96,192,157,39,78,156,37,74,148,53,106,212,181,119,238,193,159,35,70,140,5,10,20,40,80,160,93,186,105,210,185,111,222,161,95,190,97,194,153,47,94,188,101,202,137,15,30,60,120,240,253,231,211,187,107,214,177,127,254,225,223,163,91,182,113,226,217,175,67,134,17,34,68,136,13,26,52,104,208,189,103,206,129,31,62,124,248,237,199,147,59,118,236,197,151,51,102,204,133,23,46,92,184,109,218,169,79,158,33,66,132,21,42,84,168,77,154,41,82,164,85,170,73,146,57,114,228,213,183,115,230,209,191,99,198,145,63,126,252,229,215,179,123,246,241,255,227,219,171,75,150,49,98,196,149,55,110,220,165,87,174,65,130,25,50,100,200,141,7,14,28,56,112,224,221,167,83,166,81,162,89,178,121,242,249,239,195,155,43,86,172,69,138,9,18,36,72,144,61,122,244,245,247,243,251,235,203,139,11,22,44,88,176,125,250,233,207,131,27,54,108,216,173,71,142,1,2,4,8,16,32,64,128,29,58,116,232,205,135,19,38,76,152,45,90,180,117,234,201,143,3,6,12,24,48,96,192,157,39,78,156,37,74,148,53,106,212,181,119,238,193,159,35,70,140,5,10,20,40,80,160,93,186,105,210,185,111,222,161,95,190,97,194,153,47,94,188,101,202,137,15,30,60,120,240,253,231,211,187,107,214,177,127,254,225,223,163,91,182,113,226,217,175,67,134,17,34,68,136,13,26,52,104,208,189,103,206,129,31,62,124,248,237,199,147,59,118,236,197,151,51,102,204,133,23,46,92,184,109,218,169,79,158,33,66,132,21,42,84,168,77,154,41,82,164,85,170,73,146,57,114,228,213,183,115,230,209,191,99,198,145,63,126,252,229,215,179,123,246,241,255,227,219,171,75,150,49,98,196,149,55,110,220,165,87,174,65,130,25,50,100,200,141,7,14,28,56,112,224,221,167,83,166,81,162,89,178,121,242,249,239,195,155,43,86,172,69,138,9,18,36,72,144,61,122,244,245,247,243,251,235,203,139,11,22,44,88,176,125,250,233,207,131,27,54,108,216,173,71,142
5.7.4. TABLE OCT_LOG
The table OCT_LOG contains 255 non-negative integers. The table is indexed by octets interpreted as integers. The octet corresponding to the zero element, represented by the integer 0, is excluded as an index, so the index starts at 1 and ranges up to 255, and the entries are as follows:
0,1,25,2,50,26,198,3,223,51,238,27,104,199,75,4,100,224,14,52,141,239,129,28,193,105,248,200,8,76,113,5,138,101,47,225,36,15,33,53,147,142,218,240,18,130,69,29,181,194,125,106,39,249,185,201,154,9,120,77,228,114,166,6,191,139,98,102,221,48,253,226,152,37,179,16,145,34,136,54,208,148,206,143,150,219,189,241,210,19,92,131,56,70,64,30,66,182,163,195,72,126,110,107,58,40,84,250,133,186,61,202,94,155,159,10,21,121,43,78,212,229,172,115,243,167,87,7,112,192,247,140,128,99,13,103,74,222,237,49,197,254,24,227,165,153,119,38,184,180,124,17,68,146,217,35,32,137,46,55,63,209,91,149,188,207,205,144,135,151,178,220,252,190,97,242,86,211,171,20,42,93,158,132,60,57,83,71,109,65,162,31,45,67,216,183,123,164,118,196,23,73,236,127,12,111,246,108,161,59,82,41,157,85,170,251,96,134,177,187,204,62,90,203,89,95,176,156,169,160,81,11,245,22,235,122,117,44,215,79,174,213,233,230,231,173,232,116,214,244,234,168,80,88,175
5.7.5. Operations on symbols
Operations on symbols have the same semantics as operations on octet vectors of length T in this specification. Thus, if U and V are two symbols formed by the octets u[0], …, u[T-1] and v[0], …, v[T-1], respectively, then the sum of the symbols, U + V, is defined as the component-wise sum of the octets, i.e., it is equal to the symbol D formed by the octets d[0], …, d[T-1] such that
d[i] = u[i] + v[i], 0 <= i < T.
Furthermore, if β is an octet, then the product β*U is defined as the symbol D obtained by multiplying each octet of U by β, i.e.,
d[i] = beta*u[i], 0 <= i < T.
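In Python, with symbols represented as byte strings of length T and octet_mul as sketched in section 5.7.2, these two symbol operations can be written as follows (the names are ours):

    def symbol_add(U, V):
        # component-wise octet addition, i.e., bytewise XOR
        return bytes(u ^ v for u, v in zip(U, V))

    def symbol_scale(beta, U, OCT_EXP, OCT_LOG):
        # beta * U: multiply every octet of U by the octet beta
        return bytes(octet_mul(beta, u, OCT_EXP, OCT_LOG) for u in U)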
5.7.6. Operations on matrices
All matrices in this specification have entries that are octets, and hence matrix operations and definitions (e.g., matrix multiplication, matrix rank, and matrix inverse) are defined in terms of the underlying octet arithmetic.
5.8. Requirements for compatible decoders
If a RaptorQ-compatible decoder receives a mathematically sufficient set of coded symbols generated according to the encoder specification in section 5.3 to reconstruct a source block, such a decoder should recover the entire source block.
For a source block with K 'source symbols, a RaptorQ-compatible decoder should have the following recovery properties for all K' values in table 2, section 5.6.
1. If the decoder receives K' encoded symbols generated according to the encoder specification in section 5.3, with corresponding ESIs chosen independently and uniformly at random from the range of possible ESIs, then on average the decoder fails to recover the entire source block at most 1 time in 100.
2. If the decoder receives K'+1 encoded symbols generated according to the encoder specification in section 5.3, with corresponding ESIs chosen independently and uniformly at random from the range of possible ESIs, then on average the decoder fails to recover the entire source block at most 1 time in 10,000.
3. If the decoder receives K'+2 encoded symbols generated according to the encoder specification in section 5.3, with corresponding ESIs chosen independently and uniformly at random from the range of possible ESIs, then on average the decoder fails to recover the entire source block at most 1 time in 1,000,000. (These three bounds follow a single pattern; see the note after this list.)
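The three numbered properties can be summarized compactly (a restatement for readability, not an additional normative requirement): if the decoder receives K' + a encoded symbols, with ESIs chosen independently and uniformly at random, then for a = 0, 1, 2 the failure probability is bounded by

P_fail(a) <= (0.01)^(a+1) = 10^(-2*(a+1)),

so each received symbol beyond K' buys roughly two additional decimal orders of magnitude of reliability. The same bound appears as a success probability of at least 1 - (0.01)^(a+1) in claim 35 below.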
Note that the example FEC decoder specified in section 5.4 satisfies both of these requirements, namely:
1. as long as it receives a mathematically sufficient set of encoded symbols generated according to the encoder specification in section 5.3, it can reconstruct the source block;
2. it satisfies the recovery properties stated above.
6. Security considerations
Data delivery may be subject to denial-of-service attacks by attackers who send corrupted packets that are accepted as legitimate by receivers. This is particularly a concern for multicast delivery, because a corrupted packet may be injected into the session close to the root of the multicast tree, in which case it will reach many receivers. It is especially a concern when the code described in this document is used, because the use of even one corrupted packet containing encoded data may result in the decoding of an object that is completely corrupted and unusable. It is therefore recommended that source authentication and integrity checking be applied to decoded objects before delivering them to an application. For example, a SHA-1 hash [SHA1] of the object may be appended before transmission, and the SHA-1 hash computed and checked after the object is decoded but before it is delivered to the application. Source authentication should be provided, for example, by including a digital signature, verifiable by the receiver, computed over the hash value. It is also recommended that a packet authentication protocol such as TESLA [RFC4082] be used to detect and discard corrupted packets upon arrival; this method may also be used to provide source authentication. Furthermore, it is recommended that reverse-path-forwarding checks be enabled in all network routers and switches along the path from the sender to receivers, to limit the possibility of a bad agent injecting corrupted packets into the multicast tree data path.
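As an illustrative sketch of the hash check recommended above, using Python's standard hashlib; the function and variable names are illustrative, not part of any specified protocol, and the expected digest must itself come from an authenticated source:

```python
# Illustrative sketch of the recommended integrity check. The expected
# digest should be obtained in an authenticated way (e.g., appended
# before transmission and covered by a digital signature).

import hashlib

def object_is_intact(decoded_object: bytes, expected_sha1_hex: str) -> bool:
    """True only if the decoded object's SHA-1 digest matches the
    authenticated digest; deliver to the application only on success."""
    return hashlib.sha1(decoded_object).hexdigest() == expected_sha1_hex
```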
Another security concern is that some FEC information may be obtained by receivers out-of-band in a session description, and if the session description is forged or corrupted, the receivers will not use the correct protocol for decoding content from received packets. To avoid these problems, it is recommended that measures be taken to prevent receivers from accepting incorrect session descriptions, e.g., by using source authentication to ensure that receivers only accept legitimate session descriptions from authorized senders.
7. IANA considerations
Values of FEC encoding IDs and FEC instance IDs are subject to IANA registration. For general guidelines on IANA considerations as they apply to this document, see [RFC5052]. This document assigns the fully-specified FEC encoding ID 6 (TBC) under the "ietf:rmt:fec:encoding" namespace to "RaptorQ Code".
8. Acknowledgements
Thanks are due to Ranganathan (Ranga) Krishnan, who was very helpful in finding and resolving implementation details and in finding the systematic indices. In addition, Habeeb Mohiuddin Mohammed and Antonios Pitarokoilis, both from the Technical University of Munich (TUM), and Alan Shinsato have done independent implementations of the RaptorQ encoder/decoder that have helped to clarify and resolve issues with this specification.
9. References
9.1. Normative references
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC4082] Perrig, A., Song, D., Canetti, R., Tygar, J., and B. Briscoe, "Timed Efficient Stream Loss-Tolerant Authentication (TESLA): Multicast Source Authentication Transform Introduction", RFC 4082, June 2005.
[SHA1] "Secure Hash Standard", Federal Information Processing Standards Publication (FIPS PUB) 180-1, April 1995.
[RFC5052] Watson, M., Luby, M., and L. Vicisano, "Forward Error Correction (FEC) Building Block", RFC 5052, August 2007.
9.2. Informative references
[RFC3453] Luby, M., Vicisano, L., Gemmell, J., Rizzo, L., Handley, M., and J. Crowcroft, "The Use of Forward Error Correction (FEC) in Reliable Multicast", RFC 3453, December 2002.
[RFC5053] Luby, M., Shokrollahi, A., Watson, M., and T. Stockhammer, "Raptor Forward Error Correction Scheme for Object Delivery", RFC 5053, October 2007.
Authors' addresses
Michael Luby
Qualcomm Incorporated
3165 Kifer Road
Santa Clara,CA 95051
U.S.A.
Email: luby@qualcomm.com
Amin Shokrollahi
EPFL
Laboratoire d'algorithmique
EPFL
Station 14
Batiment BC
Lausanne 1015
Switzerland
Email: amin.shokrollahi@epfl.ch
Mark Watson
Qualcomm Incorporated
3165 Kifer Road
Santa Clara,CA 95051
U.S.A.
Email: watson@qualcomm.com
Thomas Stockhammer
Nomor Research
Brecherspitzstrasse 8
Munich 81541
Germany
Email: stockhammer@nomor.de
Lorenz Minder
Qualcomm Incorporated
3165 Kifer Road
Santa Clara,CA 95051
U.S.A.
Email: lminder@qualcomm.com
Claims (35)
1. A method of electronically transmitting data via one or more transmitters capable of outputting an electronic signal, wherein the data to be transmitted is represented by an ordered set of source symbols and the data is transmitted as a sequence of encoded symbols representing at least a portion of the electronic signal, the method comprising:
obtaining the ordered set of source symbols in an electronically readable form;
generating a set of intermediate symbols from the ordered set of source symbols, wherein the source symbols are reproducible from the set of intermediate symbols;
specifying sets of intermediate symbols such that each intermediate symbol is specified as a member of one of the sets of intermediate symbols, there being at least a first set of intermediate symbols and a second set of intermediate symbols, wherein each set of intermediate symbols has associated distinct encoding parameters and has at least one intermediate symbol as a member,
wherein the first set of intermediate symbols is designated as symbols to be used for belief propagation decoding and the second set of intermediate symbols is designated as symbols that are inactivated for belief propagation decoding; and
generating a plurality of encoded symbols, wherein an encoded symbol is generated from one or more of the intermediate symbols, wherein at least one encoded symbol is generated directly or indirectly from a plurality of intermediate symbols selected from a plurality of the sets of intermediate symbols.
2. The method of claim 1, wherein each encoded symbol is generated from a combination of a first symbol generated from one or more intermediate symbols in the first set of intermediate symbols and a second symbol generated from one or more intermediate symbols in the second set of intermediate symbols.
3. The method of claim 2, wherein the distinct encoding parameters include at least distinct degree distributions, such that each encoded symbol is generated from a combination of a first symbol generated from one or more intermediate symbols in the first set of intermediate symbols having a first degree distribution and a second symbol generated from one or more intermediate symbols in the second set of intermediate symbols having a second degree distribution different from the first degree distribution.
4. The method of claim 2, wherein the first symbol is generated using a chain reaction encoding process applied to the first set of intermediate symbols.
5. The method of claim 2, wherein the second symbol is an exclusive or of a fixed number of symbols randomly chosen from the second set of intermediate symbols.
6. The method of claim 2, wherein the second symbol is an exclusive-or of a first number of symbols randomly selected from the second set of intermediate symbols, and wherein the first number depends on a second number equal to the number of symbols selected from the first set to generate the first symbol.
7. The method of claim 2, wherein the combination is an exclusive or of the first symbol and the second symbol.
8. The method of claim 1, wherein the intermediate symbols comprise the ordered set of source symbols and a set of redundant source symbols generated from the ordered set of source symbols.
9. The method of claim 8, wherein at least some of the redundant symbols are generated using GF[2] operations and other redundant symbols are generated using GF[256] operations.
10. The method of claim 1, wherein the intermediate symbols are generated from the source symbols during encoding using a decoding process, wherein the decoding process is based on a set of linear relationships between the intermediate symbols and the source symbols.
11. The method of claim 10, wherein at least some of the linear relationships are relationships over GF[2] and other linear relationships are relationships over GF[256].
12. The method of claim 1, wherein a number of distinct encoded symbols that can be generated from a given ordered set of source symbols is independent of a number of source symbols in the ordered set.
13. The method of claim 1, wherein an average number of symbol operations performed to generate encoded symbols is limited by a constant independent of a number of symbols in the ordered set.
14. The method of claim 1, wherein the first set of symbols is an order of magnitude greater than the second set of symbols.
15. A method of receiving data from a source, wherein the data is received at a destination over a packet communication channel, and wherein the data is representable by a set of encoded symbols derived from an ordered set of source symbols representing the data sent from the source to the destination, the method comprising:
obtaining a set of received encoded symbols;
decoding a set of intermediate symbols from the set of received encoded symbols;
associating each of the intermediate symbols with a set of intermediate symbols, wherein the intermediate symbols are associated into at least two sets, and wherein a first set comprises symbols designated by the source for belief propagation decoding and a second set comprises symbols designated by the source as a set of permanently inactive symbols to schedule a decoding process for recovering the intermediate symbols from the received encoded symbols; and
recovering at least some source symbols of the ordered set of source symbols from the set of intermediate symbols according to the decoding process.
16. The method of claim 15, wherein the decoding process comprises at least: a first decoding stage in which a reduced set of encoded symbols is generated in dependence on the second set of permanently inactive symbols and a third set of dynamically inactive symbols which is a subset of the first set of symbols; and a second decoding stage in which the reduced set of encoded symbols is used to decode the second set of permanently inactive symbols and the third set of dynamically inactive symbols; and a third decoding stage in which the decoded second and third sets of permanently inactive symbols and the set of received encoded symbols are used to decode at least some of the remaining intermediate symbols in the first set of symbols.
17. The method of claim 16, wherein the first decoding stage uses belief propagation decoding in combination with inactivation decoding, and/or the second decoding stage uses Gaussian elimination.
18. The method of claim 16, wherein the third decoding stage uses back substitution, or backward scanning followed by forward scanning.
19. The method of claim 16, wherein the decoding process operates on the third set of dynamically inactive symbols, where the number of symbols in the third set of dynamically inactive symbols is less than 10% of the number of source symbols and/or less than 10% of the number of symbols in the second set of permanently inactive symbols.
20. The method of claim 15, wherein the received encoded symbols are operated on as LDPC code generated symbols or Reed-Solomon code generated symbols.
21. The method of claim 15, wherein each received encoded symbol of the set of received encoded symbols is operated on as a combination of a first symbol generated from one or more symbols of the first set of symbols and a second symbol generated from one or more symbols of the second set of symbols.
22. The method of claim 21, wherein each received encoded symbol is operated on as the combination being an exclusive or of the first symbol and the second symbol.
23. The method of claim 21, wherein each received encoded symbol is operated on as if the second symbol is an exclusive or of a fixed number of symbols randomly selected from the second set.
24. The method of claim 21, wherein each received encoded symbol is operated on as if the second symbol is an exclusive or of a first number of symbols randomly selected from the second set, wherein the first number of symbols depends on a second number of symbols selected from the first set to generate the first symbol.
25. The method of claim 21, wherein the decoding process operates as if the first symbol was selected from the first set of symbols based on a chain reaction code.
26. The method of claim 15, wherein the decoding process operates as if the size of the second set of permanently inactive symbols is proportional to the square root of the number of source symbols.
27. The method of claim 15, wherein the decoding process operates as if the intermediate symbols comprise the ordered set of source symbols and a set of redundant symbols generated from the ordered set of source symbols.
28. The method of claim 27, wherein the decoding process operates as if at least some of the redundant symbols were generated using GF[2] operations and other redundant symbols were generated using GF[256] operations.
29. The method of claim 15, wherein the decoding process operates as if the intermediate symbols comprise the ordered set of source symbols.
30. The method of claim 15, wherein the decoding process operates as if the intermediate symbols were symbols generated from the source symbols using a decoding process based on a set of linear relationships between the intermediate symbols and the source symbols.
31. The method of claim 30, wherein the decoding process operates as if at least some of the linear relationships are relationships over GF[2] and other linear relationships are relationships over GF[256].
32. The method of claim 15, wherein the decoding process operates as if the number of different possible encoded symbols that can be received is independent of the number of source symbols in the ordered set.
33. The method of claim 15, wherein an average number of symbol operations to perform to decode the set of source symbols from the set of received encoded symbols is limited by a constant multiplied by a number of source symbols, wherein the constant is independent of the number of source symbols.
34. The method of claim 15, wherein the decoding process operates as if the number of symbols in the first set of symbols is an order of magnitude greater than the number of symbols in the second set of permanently inactive symbols.
35. The method of claim 15, wherein the decoding process operates such that recovering the set of all K source symbols from a set of N = K + a encoded symbols has, for some K, N, and a, a probability of success lower-bounded by 1 - (0.01)^(a+1), where a is 0, 1, or 2, and the lower bound is independent of the number of source symbols.
Applications Claiming Priority (11)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US23528509P | 2009-08-19 | 2009-08-19 | |
| US61/235,285 | 2009-08-19 | ||
| US12/604,773 | 2009-10-23 | ||
| US12/604,773 US7956772B2 (en) | 2002-06-11 | 2009-10-23 | Methods and apparatus employing FEC codes with permanent inactivation of symbols for encoding and decoding processes |
| US25714609P | 2009-11-02 | 2009-11-02 | |
| US61/257,146 | 2009-11-02 | ||
| US35391010P | 2010-06-11 | 2010-06-11 | |
| US61/353,910 | 2010-06-11 | ||
| US12/859,161 | 2010-08-18 | ||
| US12/859,161 US9419749B2 (en) | 2009-08-19 | 2010-08-18 | Methods and apparatus employing FEC codes with permanent inactivation of symbols for encoding and decoding processes |
| PCT/US2010/046027 WO2011022555A2 (en) | 2009-08-19 | 2010-08-19 | Methods and apparatus employing fec codes with permanent inactivation of symbols for encoding and decoding processes |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| HK1172164A1 HK1172164A1 (en) | 2013-04-12 |
| HK1172164B true HK1172164B (en) | 2015-11-20 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN102640422B (en) | Method and apparatus for encoding and decoding processes using FEC codes with permanent deactivation of symbols | |
| Luby et al. | Raptor forward error correction scheme for object delivery | |
| JP4971144B2 (en) | File download and streaming system | |
| US9270414B2 (en) | Multiple-field based code generator and decoder for communications systems | |
| Luby et al. | RFC 5053: Raptor forward error correction scheme for object delivery | |
| HK1172164B (en) | Methods and apparatus employing fec codes with permanent inactivation of symbols for encoding and decoding processes |